
Term frequency-inverse document frequency, or tf-idf, is a statistic that measures the importance of a word, or term, to a document in the context of a collection of documents (a corpus). It is the product of term frequency and inverse document frequency: the first measures how often a term occurs in a document, and the second measures how rare that term is across the documents in the corpus. A term therefore scores highly when it is frequent in one document but uncommon in the rest of the collection. tf-idf is commonly used to rank documents against a search query.
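As a rough illustration of the product described above, the following Python sketch computes tf-idf weights for a tiny corpus. The entry does not fix an exact formula, so the choices here are assumptions: term frequency is the term's count divided by the document length, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the term. The function name `tf_idf` and the sample corpus are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical three-document corpus, tokenized on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
scores = tf_idf(corpus)

# Print the weights for the first document, highest first.
for term, weight in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Under these assumptions, a term that appears in every document (here "the") gets a weight of zero, while a term unique to one document (here "mat") gets the highest weight in that document. Practical systems often use smoothed or differently scaled variants of both factors.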
Related concepts:
Bag-of-Words Model