Author: Olaf Kopp
Reading time: 3 Minutes

What is BM25?


BM25 is a popular ranking function used in information retrieval systems to estimate the relevance of documents to a given search query. It belongs to a family of scoring functions known as probabilistic information retrieval models, which are based on the probabilistic relevance framework.

How BM25 Works:

BM25 calculates a score for each document relative to a specific query, where higher scores indicate a greater relevance of the document to the query. The score is based on the query terms appearing in each document, taking into account the frequency of each term in the document and across all documents in the collection. Here’s a breakdown of the main components of the BM25 formula:

  1. Term Frequency (TF): This reflects how often a query term appears in a document. More occurrences of the term usually suggest higher relevance.
  2. Inverse Document Frequency (IDF): This measures the informativeness of a term. If a term appears in many documents, it is less likely to be significant for determining relevance. The IDF component of BM25 penalizes terms that are too common across documents.
  3. Document Length Normalization: This aspect of BM25 adjusts for the length of the document. Longer documents may have higher term frequencies simply due to their length, so BM25 normalizes for this, preventing longer documents from inherently receiving higher scores unless they are more relevant.
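
The three ingredients above can be computed directly from a tokenized collection. The following is a minimal sketch, assuming a hypothetical three-document toy corpus and plain whitespace tokenization; the helper names are illustrative and not part of any library.

```python
from collections import Counter

# Hypothetical toy collection; whitespace tokenization is a simplification.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat around the garden all afternoon",
    "dogs and cats are popular pets",
]
tokenized = [d.split() for d in docs]

def term_frequency(term, tokens):
    """How often the term occurs in one document."""
    return Counter(tokens)[term]

def document_frequency(term, corpus):
    """In how many documents the term occurs at least once (basis for IDF)."""
    return sum(1 for tokens in corpus if term in tokens)

def length_ratio(tokens, corpus):
    """Document length relative to the average length in the collection."""
    avgdl = sum(len(t) for t in corpus) / len(corpus)
    return len(tokens) / avgdl

print(term_frequency("the", tokenized[1]))     # occurrences of "the" in the second document
print(document_frequency("cat", tokenized))    # number of documents containing "cat"
print(length_ratio(tokenized[1], tokenized))   # above 1.0 means longer than average
```

BM25 combines exactly these quantities: the term count is dampened, the document frequency is turned into an IDF weight, and the length ratio scales the dampening.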

The BM25 Formula:

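The formula for BM25 is as follows:

$$\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$

where:

  • $q_i$ is the $i$-th term of the query $Q$,
  • $f(q_i, D)$ is the term frequency of $q_i$ in the document $D$,
  • $|D|$ is the length of the document in words,
  • avgdl is the average document length in the text collection,
  • $k_1$ and $b$ are free parameters, usually chosen empirically (common values are $k_1$ between 1.2 and 2.0 and $b = 0.75$),
  • $\text{IDF}(q_i)$ is the inverse document frequency of $q_i$.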
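
To make the formula concrete, here is a minimal sketch of a BM25 scorer in Python. Several details are assumptions rather than something the article specifies: lowercase whitespace tokenization, the non-negative IDF variant ln(1 + (N - n + 0.5) / (n + 0.5)), the function name `bm25_scores`, and the toy documents.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=2.0, b=0.75):
    """Score every document in `docs` against `query` with BM25.

    Assumptions (not from the article): lowercase whitespace tokenization
    and the non-negative IDF variant ln(1 + (N - n + 0.5) / (n + 0.5)).
    """
    corpus = [d.lower().split() for d in docs]
    n_docs = len(corpus)
    avgdl = sum(len(tokens) for tokens in corpus) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in corpus:
        df.update(set(tokens))

    scores = []
    for tokens in corpus:
        tf = Counter(tokens)
        norm = 1 - b + b * len(tokens) / avgdl  # document length normalization
        score = 0.0
        for term in query.lower().split():
            f = tf[term]
            if f == 0:
                continue  # terms absent from the document contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * f * (k1 + 1) / (f + k1 * norm)
        scores.append(score)
    return scores

docs = [
    "BM25 is a ranking function for search",
    "TF-IDF weights terms by frequency and rarity",
    "Search engines rank documents with BM25 and other signals",
]
scores = bm25_scores("bm25 ranking", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

In this toy collection the first document matches both query terms and is shorter than the third, so it ranks first; the second document matches neither term and scores zero.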

Applications and Usage of BM25

BM25 is widely used in search engines and various information retrieval applications due to its effectiveness and efficiency. It is particularly well-regarded for its balance between simplicity and performance, making it a foundational component in many modern search systems, including those that use more complex machine learning models.

In summary, BM25 is a robust method for scoring documents based on their relevance to a query, efficiently balancing term frequency, document frequency, and document length.

Difference between BM25 and TF-IDF

The difference between BM25 (Best Matching 25) and TF-IDF (Term Frequency-Inverse Document Frequency) lies mainly in how they evaluate the relevance of documents concerning a search query. Here are the main differences:

1. Calculation and Weighting of Terms

TF-IDF:

  • Term Frequency (TF): Measures how often a term appears in a document. The more frequently a term appears, the higher its weighting.
  • Inverse Document Frequency (IDF): Measures how rare a term is across the entire document collection. Rare terms have a higher weighting as they are considered more relevant.

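In its most common form, the TF-IDF weighting is calculated as:

$$\text{tf-idf}(t, d) = \text{tf}(t, d) \cdot \text{idf}(t), \qquad \text{idf}(t) = \log \frac{N}{\text{df}(t)}$$

where $\text{tf}(t, d)$ is the number of times term $t$ occurs in document $d$, $N$ is the number of documents in the collection, and $\text{df}(t)$ is the number of documents that contain $t$.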
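
As a minimal sketch, one common TF-IDF variant (raw term count times log(N / df)) can be written in a few lines of Python. The function name `tf_idf` and the toy corpus are assumptions for illustration; real systems often use smoothed or log-scaled variants.

```python
import math
from collections import Counter

def tf_idf(term, doc_tokens, corpus):
    """Raw term count times log(N / df); one common TF-IDF variant."""
    tf = Counter(doc_tokens)[term]
    df = sum(1 for tokens in corpus if term in tokens)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)

# Hypothetical toy corpus, already tokenized on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog barked at the cat".split(),
    "the bird sang".split(),
]
print(tf_idf("the", corpus[0], corpus))  # 0.0: "the" occurs in every document
print(tf_idf("mat", corpus[0], corpus))  # log(3): a rare term gets a higher weight
```

Note that the weight grows linearly with the term count and ignores document length, which are the two points BM25 changes.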

BM25:

  • BM25 is an extension of TF-IDF that introduces additional parameters to make the weighting more flexible and adaptive.
  • BM25 uses a saturating function for Term Frequency (TF), reflecting that the relevance of a term does not increase linearly with its frequency: each additional occurrence adds less to the score.
  • BM25 also takes document length into account and normalizes for it, so that longer documents do not receive higher scores simply because they contain more words.

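The BM25 weighting of a single query term $q_i$ in a document $D$ is calculated as:

$$\text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$

with $f(q_i, D)$ the term frequency, $|D|$ the document length, and avgdl the average document length in the collection. The document score is the sum of these weights over all query terms.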
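
The saturation effect is easy to see numerically. The sketch below evaluates only the term-frequency part of BM25 for a document of average length (so the length normalization factor is 1); the function name `saturated_tf` and the default k1 = 2.0 are illustrative assumptions.

```python
def saturated_tf(f, k1=2.0, norm=1.0):
    """BM25 term-frequency component; norm stands for 1 - b + b * |D| / avgdl."""
    return f * (k1 + 1) / (f + k1 * norm)

# A raw count grows without bound; the BM25 component approaches k1 + 1.
for f in (1, 2, 5, 10, 100):
    print(f"raw count {f:>3} -> BM25 TF component {saturated_tf(f):.3f}")
```

Going from 1 to 2 occurrences raises the component noticeably, while going from 10 to 100 barely changes it, whereas a plain TF-IDF count would grow tenfold.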

2. Adaptability and Relevance Scoring

TF-IDF:

  • Relatively simple and straightforward.
  • Suitable for smaller or less complex document collections.
  • The weighting is based solely on term frequency and inverse document frequency.

BM25:

  • More flexible and adaptive due to the use of hyperparameters k1 and b, which control term frequency saturation and document length normalization.
  • Generally provides better results for larger and more complex document collections, especially in information retrieval.
  • Considers not only the frequency of a term but also the document length and term saturation.
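
A short sketch can illustrate what the two hyperparameters do. It assumes a hypothetical average document length of 100 words and a helper named `bm25_tf` that computes only the term-frequency part of the BM25 weight; both are illustrative choices.

```python
def bm25_tf(f, doc_len, avgdl, k1=2.0, b=0.75):
    """Term-frequency part of BM25 with document length normalization."""
    norm = 1 - b + b * doc_len / avgdl
    return f * (k1 + 1) / (f + k1 * norm)

avgdl = 100  # assumed average document length in words

# Effect of b: the same term count (3) in a short and a long document.
for b in (0.0, 0.75, 1.0):
    short = bm25_tf(3, 50, avgdl, b=b)
    long_ = bm25_tf(3, 400, avgdl, b=b)
    print(f"b={b}: short doc {short:.3f}, long doc {long_:.3f}")

# Effect of k1: share of the maximum (k1 + 1) reached at 10 occurrences.
for k1 in (0.5, 1.2, 2.0):
    share = bm25_tf(10, avgdl, avgdl, k1=k1) / (k1 + 1)
    print(f"k1={k1}: {share:.3f} of the maximum")
```

With b = 0 document length is ignored and both documents get the same value; larger b penalizes the long document more strongly. Smaller k1 makes the term frequency saturate sooner.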

Summary

While TF-IDF is a simple and intuitive method for weighting terms based on their frequency and rarity, BM25 offers an advanced and fine-tuned method that considers additional factors such as document length and frequency saturation. As a result, BM25 is often better suited for more complex applications in information retrieval.

About Olaf Kopp

Olaf Kopp is Co-Founder, Chief Business Development Officer (CBDO) and Head of SEO & Content at Aufgesang GmbH. He is an internationally recognized industry expert in semantic SEO, E-E-A-T, LLMO, AI and modern search engine technology, content marketing and customer journey management. As an author, Olaf Kopp writes for national and international magazines such as Search Engine Land, t3n, Website Boosting, Hubspot, Sistrix, Oncrawl, Searchmetrics, Upload … . In 2022 he was a top contributor for Search Engine Land. His blog is one of the most famous online marketing blogs in Germany. In addition, Olaf Kopp is a speaker on SEO and content marketing at SMX, SERP Conf., CMCx, OMT, OMX, Campixx...
