A Brief Survey of Information Retrieval Models
Abstract
Retrieval models are one of the essential concepts in an information retrieval (IR) system. In this work, we present some basic retrieval models used to find the top-k answers to a given user query. Selecting the most relevant results from a huge number of documents is not a trivial task. One of the most important families of models used in IR systems is the statistical model, which includes the vector space model described by Salton and McGill (1983). This model represents documents and queries as vectors in a multidimensional space and computes the similarity between each document and the query; documents are then ranked by this similarity score. Another popular approach to representing documents and queries is language modeling, first proposed for IR by Ponte and Croft (1998). In this approach, each document is viewed as a statistical language model, that is, a probability distribution over all possible words in a language. Finally, we also present the learning-to-rank strategy, which has become very popular in recent years. This strategy learns a ranking function from implicit or explicit relevance data. In the following sections, we give the details of each approach.
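As a rough illustration of the vector space idea summarized above, the following minimal sketch represents documents and a query as sparse term-weight vectors and ranks the documents by cosine similarity. The TF-IDF weighting, the smoothed IDF formula, the whitespace tokenizer, and all function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `rank`) are assumptions made for this example; they are one common choice, not necessarily the formulation used in the works cited.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def build_idf(docs_tokens):
    # Document frequency: number of documents that contain each term.
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    # Assumption: a smoothed IDF variant, chosen so weights stay positive.
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    # Sparse vector as a dict: term -> raw term frequency * IDF.
    # Terms unseen in the collection are dropped.
    tf = Counter(tokens)
    return {t: f * idf[t] for t, f in tf.items() if t in idf}


def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs, k=3):
    # Return the top-k (document index, score) pairs, best first.
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    doc_vecs = [tfidf_vector(tokens, idf) for tokens in docs_tokens]
    q_vec = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q_vec, dv), i) for i, dv in enumerate(doc_vecs)]
    # Sort by descending score; ties are broken by document index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(i, score) for score, i in scored[:k]]
```

Calling `rank("retrieval models", docs, k=3)` on a small list of strings returns document indices paired with their similarity scores; a document that shares no term with the query receives a score of 0.0 and falls to the bottom of the ranking.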