TU Wien:Advanced Information Retrieval VU (Knees)/Exam 2025-06-12

Format

Printed, automatically graded multiple-choice test containing only true/false questions. Points are deducted for incorrect answers. There were 12 questions (24 points) on the lecture content and 16 questions (16 points) on a paper that was shared 48 hours before the exam. There were at least two groups with slightly different questions.

Exam Paper

Ivica Kostric and Krisztian Balog. 2024. A Surprisingly Simple yet Effective Multi-Query Rewriting Method for Conversational Passage Retrieval. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '24). Association for Computing Machinery, New York, NY, USA, 2271–2275. https://doi.org/10.1145/3626772.3657933

Questions

On the lecture (2 points each)

IR evaluation

  • Datasets with sparse judgments trade reduced document coverage for increased question coverage.
  • The M in MAP refers to calculating the mean over all questions evaluated.
  • Mean Reciprocal Rank calculates the average over the reciprocal ranks of all relevant documents.
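
For orientation when judging the MAP and MRR statements above, here is a minimal worked example of the standard definitions (binary relevance assumed; the rankings and numbers are illustrative, not from the exam):

```python
# Worked example of MAP and MRR under binary relevance.

def average_precision(ranking):
    """Mean of precision@k over the ranks k at which relevant docs occur."""
    hits, precisions = 0, []
    for k, rel in enumerate(ranking, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

def reciprocal_rank(ranking):
    """Reciprocal rank of the FIRST relevant document only."""
    for k, rel in enumerate(ranking, start=1):
        if rel:
            return 1.0 / k
    return 0.0

# Two queries; each ranking is a list of 0/1 relevance labels in rank order.
rankings = [[1, 0, 1], [0, 1, 0]]

# The "M" in MAP and MRR: the mean over all evaluated queries.
map_score = sum(average_precision(r) for r in rankings) / len(rankings)
mrr_score = sum(reciprocal_rank(r) for r in rankings) / len(rankings)
print(round(map_score, 4), mrr_score)  # 0.6667 0.75
```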

Tokenization and text representation

  • To reduce vocabulary sizes, BERT relies on lemmatization.
  • When using word embeddings, n-grams can be modelled by projecting word representations to joint representations via sliding-window CNNs.
  • In BERT, special tokens control the segment embeddings, which are added to the token embeddings and position embeddings.
  • BERT operates on a word level. Tokens found in the input text are never split up or modified.
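
The tokenization statements can be checked against actual BERT behaviour with a minimal sketch; the use of the HuggingFace transformers package is an assumption made here, not something the lecture prescribes:

```python
# Sketch: BERT's WordPiece tokenizer DOES split rare words into subwords,
# and special tokens ([CLS], [SEP]) delimit the segments whose ids feed
# the segment embeddings. Requires the `transformers` package.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")

print(tok.tokenize("tokenization"))
# typically: ['token', '##ization'], i.e. input words are split into subwords

enc = tok("first segment", "second segment")
print(tok.convert_ids_to_tokens(enc["input_ids"]))
# ['[CLS]', 'first', 'segment', '[SEP]', 'second', 'segment', '[SEP]']
print(enc["token_type_ids"])
# [0, 0, 0, 0, 1, 1, 1]  (segment ids, delimited by the special tokens)
```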

Retrieval augmented generation

  • Naïve RAG mainly consists of three parts: indexing, retrieval and generation.
  • For optimizing LLMs through the strategy of fine-tuning, the aspect of requiring external knowledge outweighs the aspect of requiring model adaption [sic!].
  • Retrieval Augmented Generation is used as a strategy to avoid hallucinations of LLMs.
  • Advanced RAG proposes multiple optimization strategies around pre-retrieval and post-retrieval, following a chain-like structure.
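
The three parts named in the first statement can be illustrated with a minimal sketch; scikit-learn TF-IDF stands in for the retriever here, and call_llm is a hypothetical placeholder rather than any real API:

```python
# Sketch of naive RAG's three parts: indexing, retrieval, generation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "RAG grounds generation in retrieved passages.",
    "BM25 is a classic sparse retrieval model.",
]

# 1) Indexing: vectorize the corpus once, up front.
vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(docs)

def retrieve(query, k=1):
    # 2) Retrieval: score all documents against the query, keep top-k.
    scores = cosine_similarity(vectorizer.transform([query]), index)[0]
    return [docs[i] for i in scores.argsort()[::-1][:k]]

def call_llm(prompt):
    # Hypothetical generator stub; replace with an actual LLM call.
    return f"(answer conditioned on: {prompt!r})"

def rag(query):
    # 3) Generation: condition the model on retrieved context, which
    # reduces (but does not eliminate) hallucinations.
    context = "\n".join(retrieve(query))
    return call_llm(f"Context:\n{context}\n\nQuestion: {query}")

print(rag("What does RAG ground its answers in?"))
```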

On the provided paper (1 point each)

  1) The rewrite score is length-normalized, since longer token sequences otherwise always lead to lower scores.
  2) The proposed multi-query rewriting strategy utilizes the internal sequence generation and scoring process to obtain query rewrites basically for free.
  3) One of the central issues in query rewriting is that the user intent might be misrepresented if context is not taken into account.
  4) So-called "hallucinations" might still occur with the retrieved texts, as the used representations can lead to the generation of word sequences representing falsehoods.
  5) Dense retrieval approaches increase the efficiency of inverted indexes by removing data sparsity.
  6) The underlying idea of multi-query rewriting for dense retrieval models is that a weighted combination of query rewrite vectors should represent a more robust expression of the original user's intent and information need.
  7) For the majority of experiments, CMQR produces significantly better results at p<0.05 than its single-query counterpart.
  8) For the experiments, TREC-CAsT query-answer pairs serve only for evaluation purposes, not for training. Consequently, TREC-CAsT is the only dataset for which manually rewritten queries are not outperformed by any automatic approach.
  9) De-contextualization refers to the creation of a single stand-alone query at every point in a conversation, capturing the intent of the entire conversation.
  10) The authors of the paper claim that, to the best of their knowledge, this is the first work to address neural query rewriting.
  11) The authors experiment with varying numbers of rewrites and conclude that a value of 10 is best.
  12) The beam search algorithm is applied to calculate and rank sequences of words based on their overall probability, rather than selecting individual words based on their probability.
  13) As an overall trend, it can be observed that CMQR outperforms its BM25 counterparts in the bag-of-words scenario, but not in the continuous embedding space scenario.
  14) In the sparse retrieval method, the weights associated with each occurring term are calculated as a normalized sum of the query rewrite scores of the query rewrites containing the term (see the sketch below).
  15) The paper contains a sentence with misspellings of both the words "retrieval" and "hyperparameters".
  16) For evaluation purposes, only recall-oriented metrics are considered, to reflect the importance of covering a wider range of documents through query expansion methods.
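
The sparse term-weighting scheme described in item 14 can be sketched as follows; the rewrites, scores, and names are illustrative, and the exact formulation should be verified against the paper itself:

```python
# Sketch of the weighting in item 14: a term's weight is the normalized
# sum of the rewrite scores of all query rewrites containing that term.
from collections import defaultdict

def term_weights(rewrites, scores):
    total = sum(scores)
    weights = defaultdict(float)
    for rewrite, score in zip(rewrites, scores):
        for term in set(rewrite.lower().split()):  # each rewrite counted once per term
            weights[term] += score / total          # normalize by the score sum
    return dict(weights)

# Illustrative rewrites with (e.g. length-normalized) sequence scores.
rewrites = ["who directed dune", "dune 2021 director"]
scores = [0.75, 0.25]
print(term_weights(rewrites, scores))
# key order may vary:
# {'who': 0.75, 'directed': 0.75, 'dune': 1.0, '2021': 0.25, 'director': 0.25}
```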