TU Wien:Grundlagen des Information Retrieval VU (Rauber)/Prüfung 2024-11-21


The exam lasted 1 hour. It was split into two parts: multiple-choice questions first, then open questions. You already received 4 points just for writing your surname on the exam sheet, which was nice.


4 pt. for writing down your last name.

Multiple Choice Questions (4 pt. each, 2-point deduction per wrong answer)

  1. What is true about LSI:
    1. Is based on the term-document matrix
    2. Is calculated using (Singular Volume Delusion) (sic)
    3. Solves the synonymy problem
    4. Is calculated with tf-idf
  2. Query Logs can be used for
    1. Query spelling correction
    2. Training Learning to Rank algorithms
    3. Creating a fire
    4. Optimising search engine cache replacement policies
  3. Which techniques can be used to pre-process documents before indexing
    1. Lemmatization
    2. Tuning
    3. Stop word addition
    4. Tokenization
  4. Query logs can be used for (very similar to question 3 of the 2024-01-09 exam):
    1. training Learning to Rank algorithms
    2. fire lightning
    3. query spelling correction
    4. optimizing search engine cache replacement policies

Open Questions (write your own answer)

  1. What is an inverted index? What does the data structure look like? (10 pt.)
  2. What aspects are used in BM25 ranking? (10 pt.)
  3. What are document surrogates? What do they contain? Name three characteristics of good surrogates. (10 pt.)
  4. What is teleportation in the context of Web Search? Why is it used? (10 pt.)
  5. Name two methods for answering phrase queries, e.g. "Universität Wien". Describe the advantages and disadvantages of each method. (20 pt.)
  6. Why is cosine similarity used instead of Euclidean distance in Vector Space Models? (10 pt.)
  7. What is pooling and why is it used? Briefly describe how it works. (10 pt.)
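
For revision, open questions 1 and 5 can be illustrated with a small sketch. This is not a model answer from the exam, only one possible illustration: it assumes a positional inverted index (each term maps to the documents containing it, together with the word positions), which is one common way to answer phrase queries. The function names, the whitespace tokenizer and the three sample documents are made up for this example.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Build a positional inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenization (lowercase + whitespace split), an assumption for brevity.
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_query(index, phrase):
    """Return sorted doc ids in which the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Step 1: intersect the postings lists to find documents containing every term.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    # Step 2: check that the positions line up (first term at p, next at p + 1, ...).
    hits = []
    for doc_id in sorted(candidates):
        following = [set(index[t][doc_id]) for t in terms[1:]]
        starts = index[terms[0]][doc_id]
        if any(all(p + i + 1 in s for i, s in enumerate(following)) for p in starts):
            hits.append(doc_id)
    return hits


# Hypothetical sample documents.
docs = {
    1: "Universität Wien is in Vienna",
    2: "Wien has a Technische Universität",
    3: "the Universität Wien library",
}
index = build_positional_index(docs)
print(phrase_query(index, "Universität Wien"))
```

Document 2 contains both terms but not adjacent and in order, so it passes the intersection step and is rejected by the position check. The trade-off of this approach is a larger index, since every occurrence position is stored.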