TU Wien:Grundlagen des Information Retrieval VU (Rauber)/Prüfung 2024-11-21
The exam lasted 1 hour and was split into two parts: multiple choice first, then open questions. You already got 4 points just for writing your surname on the exam sheet, which was nice.
Multiple Choice Questions (each 4 pt., 2 point reduction per wrong answer)
- What is true about LSI:
- Is based on the term-document matrix
- Is calculated using "Singular Volume Delusion" (sic; presumably Singular Value Decomposition)
- Solves the synonymy problem
- Is calculated with tf-idf
- Query Logs can be used for
- Query spelling correction
- Training Learning to Rank algorithms
- Creating a fire
- Optimising search engine cache replacement policies
- Which techniques can be used to pre-process documents before indexing
- Lemmatization
- Tuning
- Stop word addition
- Tokenization
- Very similar to question 3 of the 2024-01-09 exam, with the options in a different order and "fire lightning" as the single false answer: Query Logs can be used for:
- training Learning to Rank algorithms
- fire lightning
- query spelling correction
- optimizing search engine cache replacement policies
Open Questions (Write your own answer)
- What is an inverted index? What does the data structure look like? (10 pt.)
- What aspects are used in the ranking of BM25? (10 pt.)
- What are document surrogates? What do they contain? Name three characteristics of good surrogates. (10 pt.)
- What is teleportation in the context of Web Search? Why is it used? (10 pt.)
- Name two methods for answering phrase queries, e.g. "Universität Wien". Describe advantages and disadvantages of each method. (20 pt.)
- Why is cosine similarity used instead of Euclidean distance in Vector Space Models? (10 pt.)
- What is pooling and why is it used? Briefly describe how it works. (10 pt.)
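As a study aid for the inverted index and phrase query questions above, here is a minimal sketch of a positional inverted index in Python. It is not taken from the exam or the lecture material: the function names `build_positional_index` and `phrase_query`, the whitespace tokenization, and the three sample documents are all assumptions made for illustration. A positional index is only one way to answer phrase queries (a biword index is a common alternative).

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenization (assumption): lowercase and split on whitespace.
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_query(index, phrase):
    """Return sorted doc ids in which the terms of `phrase` occur consecutively."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Candidate documents must contain every term of the phrase.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in candidates:
        # A start position p matches if term i occurs at position p + i.
        for p in index[terms[0]][doc_id]:
            if all(p + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical sample documents, invented for this sketch.
docs = {
    1: "Universität Wien is in Vienna",
    2: "Wien has more than one Universität",
    3: "Technische Universität Wien",
}
index = build_positional_index(docs)
result = phrase_query(index, "Universität Wien")
```

Document 2 contains both terms but not next to each other, so it is a candidate after the intersection step and is only rejected by the position check. That extra check is the price of storing positions: the index is larger than a plain term-to-documents index, but any phrase length can be answered without false positives.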