Author: Olaf Kopp
Reading time: 3 Minutes

What is BM25?

BM25 is a popular ranking function used in information retrieval systems to estimate the relevance of documents to a given search query. It belongs to a family of scoring functions known as probabilistic information retrieval models, which are based on the probabilistic relevance framework.

How BM25 Works:

BM25 calculates a score for each document relative to a specific query, where higher scores indicate a greater relevance of the document to the query. The score is based on the query terms appearing in each document, taking into account the frequency of each term in the document and across all documents in the collection. Here's a breakdown of the main components of the BM25 formula:

  1. Term Frequency (TF): This reflects how often a query term appears in a document. More occurrences of the term usually suggest higher relevance.
  2. Inverse Document Frequency (IDF): This measures the informativeness of a term. If a term appears in many documents, it is less likely to be significant for determining relevance. The IDF component of BM25 penalizes terms that are too common across documents.
  3. Document Length Normalization: This aspect of BM25 adjusts for the length of the document. Longer documents may have higher term frequencies simply due to their length, so BM25 normalizes for this, preventing longer documents from inherently receiving higher scores unless they are more relevant.
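
To make the IDF component more concrete, here is a minimal Python sketch. It assumes the smoothed IDF variant ln((N - n + 0.5) / (n + 0.5) + 1), which is one common choice and is not specified in this article. The function name `bm25_idf` and the collection size of 1,000 documents are hypothetical and chosen only for illustration.

```python
import math


def bm25_idf(num_docs: int, docs_with_term: int) -> float:
    """Smoothed IDF (assumed variant): ln((N - n + 0.5) / (n + 0.5) + 1)."""
    return math.log((num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0)


# Hypothetical collection of 1,000 documents.
common_idf = bm25_idf(1000, 900)  # term appears in 900 documents
rare_idf = bm25_idf(1000, 5)      # term appears in only 5 documents

print(f"common term IDF: {common_idf:.3f}")
print(f"rare term IDF:   {rare_idf:.3f}")
```

The rare term receives a much larger weight than the common one, which is the penalty for overly common terms described in point 2.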

The BM25 Formula:

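The formula for BM25 is as follows. For a query $Q$ containing the terms $q_1, \dots, q_n$, the score of a document $D$ is:

$$\text{score}(D, Q) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$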

where:

  • $q_i$ is a query term,
  • $f(q_i, D)$ is $q_i$'s term frequency in the document $D$,
  • $|D|$ is the length of the document $D$ in words,
  • avgdl is the average document length in the text collection,
  • $k_1$ and $b$ are free parameters, usually chosen empirically (common values are $k_1 \in [1.2, 2.0]$ and $b = 0.75$),
  • $\text{IDF}(q_i)$ is the IDF for $q_i$.
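
The scoring described by these components can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function name `bm25_scores`, the whitespace tokenization, the toy documents, the value $k_1 = 1.5$ and the smoothed IDF variant ln((N - n + 0.5) / (n + 0.5) + 1) are all assumptions that do not come from this article.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization factor for this document.
        norm = k1 * (1 - b + b * len(d) / avgdl)
        score = 0.0
        for q in query:
            f = tf[q]
            if f == 0:
                continue  # terms missing from the document contribute nothing
            # Assumed smoothed IDF variant.
            idf = math.log((n_docs - df[q] + 0.5) / (df[q] + 0.5) + 1.0)
            score += idf * f * (k1 + 1) / (f + norm)
        scores.append(score)
    return scores


# Toy collection, tokenized by whitespace (an assumption for this sketch).
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat across the yard".split(),
    "stock markets fell sharply today".split(),
]
scores = bm25_scores("cat mat".split(), docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {' '.join(doc)}")
```

The first document matches both query terms and is shorter than average, so it ranks highest. The third document contains neither term and scores zero.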

Applications and Usage of BM25

BM25 is widely used in search engines and various information retrieval applications due to its effectiveness and efficiency. It is particularly well-regarded for its balance between simplicity and performance, making it a foundational component in many modern search systems, including those that use more complex machine learning models.

In summary, BM25 is a robust method for scoring documents based on their relevance to a query, efficiently balancing term frequency, document frequency, and document length.

Difference between BM25 and TF-IDF

The difference between BM25 (Best Matching 25) and TF-IDF (Term Frequency-Inverse Document Frequency) lies mainly in how they evaluate the relevance of documents to a search query. Here are the main differences:

1. Calculation and Weighting of Terms

TF-IDF:

  • Term Frequency (TF): Measures how often a term appears in a document. The more frequently a term appears, the higher its weighting.
  • Inverse Document Frequency (IDF): Measures how rare a term is across the entire document collection. Rare terms have a higher weighting as they are considered more relevant.

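The TF-IDF weighting of a term $t$ in a document $d$ is, in its common form, calculated as:

$$\text{TF-IDF}(t, d) = \text{TF}(t, d) \cdot \text{IDF}(t)$$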

BM25:

  • BM25 is an extension of TF-IDF that introduces additional parameters to make the weighting more flexible and adaptive.
  • BM25 uses a saturated frequency function for Term Frequency (TF), considering that the relevance of a term does not increase linearly with its frequency.
  • BM25 also takes the length of documents into account and normalizes for it, so that longer documents are not favored simply because they contain more words.
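
The saturation effect can be illustrated with a short Python sketch that compares a raw, linear term frequency with the BM25 term frequency component. The function names, the value $k_1 = 1.5$ and the simplification that the document has exactly average length (so the length normalization factor equals 1) are assumptions made for this illustration.

```python
def linear_tf(f: int) -> float:
    """Raw term frequency, as used in the simplest TF-IDF variants."""
    return float(f)


def bm25_tf(f: int, k1: float = 1.5) -> float:
    """BM25 term frequency component for a document of average length."""
    return f * (k1 + 1) / (f + k1)


for f in (1, 2, 5, 10, 50):
    print(f"f={f:>2}  linear={linear_tf(f):>5.1f}  bm25={bm25_tf(f):.3f}")
```

The linear weight keeps growing with every additional occurrence, while the BM25 component flattens out and never exceeds $k_1 + 1$.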

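The BM25 weighting of a query term $q_i$ in a document $D$ is calculated as:

$$\text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}$$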

2. Adaptability and Relevance Scoring

TF-IDF:

  • Relatively simple and straightforward.
  • Suitable for smaller or less complex document collections.
  • The weighting is based solely on term frequency and inverse document frequency.

BM25:

  • More flexible and adaptive due to the use of hyperparameters $k_1$ and $b$, which control term frequency saturation and document length normalization.
  • Generally provides better results for larger and more complex document collections, especially in information retrieval.
  • Considers not only the frequency of a term but also the document length and term saturation.
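
How the parameter $b$ controls document length normalization can be seen in the following Python sketch. It evaluates only the term frequency part of BM25 (the IDF factor is left out). The function name `bm25_term_weight`, the document lengths and the average length of 100 words are hypothetical values chosen for illustration.

```python
def bm25_term_weight(f, doc_len, avgdl, k1=1.5, b=0.75):
    """Term frequency part of BM25 for one term, without the IDF factor."""
    return f * (k1 + 1) / (f + k1 * (1 - b + b * doc_len / avgdl))


avgdl = 100  # hypothetical average document length in words
weights = {}
for b in (0.0, 0.75, 1.0):
    short_doc = bm25_term_weight(3, doc_len=50, avgdl=avgdl, b=b)
    long_doc = bm25_term_weight(3, doc_len=400, avgdl=avgdl, b=b)
    weights[b] = (short_doc, long_doc)
    print(f"b={b:<4}  short doc: {short_doc:.3f}  long doc: {long_doc:.3f}")
```

With $b = 0$ the document length has no influence and both documents get the same weight. As $b$ approaches 1, the same three occurrences count for less in the long document than in the short one.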

Summary

While TF-IDF is a simple and intuitive method for weighting terms based on their frequency and rarity, BM25 offers an advanced and fine-tuned method that considers additional factors such as document length and frequency saturation. As a result, BM25 is often better suited for more complex applications in information retrieval.

About Olaf Kopp

Olaf Kopp is Co-Founder, Chief Business Development Officer (CBDO) and Head of SEO & Content at Aufgesang GmbH. He is an internationally recognized industry expert in semantic SEO, E-E-A-T, LLMO, AI and modern search engine technology, content marketing and customer journey management. As an author, Olaf Kopp writes for national and international magazines such as Search Engine Land, t3n, Website Boosting, Hubspot, Sistrix, Oncrawl, Searchmetrics, Upload … . In 2022 he was a top contributor for Search Engine Land. His blog is one of the most famous online marketing blogs in Germany. In addition, Olaf Kopp is a speaker on SEO and content marketing at SMX, SERP Conf., CMCx, OMT, OMX, Campixx...
