Revisiting Text Ranking in Deep Research
Topics: AI Mode, Chunk Relevance, LLM Readability, LLMO / GEO, Passage based retrieval, Ranking, Retrieval Augmented Generation (RAG)
This paper investigates how well-established text ranking methods from information retrieval (IR) perform in the context of “deep research” — a task where AI agents iteratively search the web and reason over results to answer complex, multi-step questions. Because most prior work relies on opaque, black-box web search APIs, the authors replace these with transparent, reproducible retrieval pipelines and systematically test different retrievers, re-rankers, and query strategies on a fixed benchmark dataset. Their key findings are twofold: classic lexical retrieval (BM25) combined with a re-ranker rivals much larger and more expensive neural systems, and translating the keyword-style queries agents emit into natural-language questions significantly improves neural ranker performance.
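The two-stage retrieve-then-rerank pattern at the heart of the paper's finding can be sketched in pure Python. This is a minimal illustration, not the authors' pipeline: `bm25_scores`, `retrieve_then_rerank`, and the toy overlap-based reranker below are hypothetical names, and a real deployment would replace the second stage with a learned cross-encoder.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Classic BM25 over pre-tokenized documents (lists of terms)."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def retrieve_then_rerank(query, docs, rerank_fn, k=2):
    """Stage 1: BM25 shortlist of k candidates; stage 2: re-score with rerank_fn."""
    first = bm25_scores(query, docs)
    shortlist = sorted(range(len(docs)), key=lambda i: first[i], reverse=True)[:k]
    return sorted(shortlist, key=lambda i: rerank_fn(query, docs[i]), reverse=True)

# Toy usage: a term-overlap stand-in for a neural cross-encoder re-ranker.
docs = [s.split() for s in [
    "bm25 is a lexical ranking function",
    "neural rankers embed queries and passages",
    "bm25 ranking with a reranker rivals neural systems",
]]
overlap = lambda q, d: len(set(q) & set(d))  # placeholder scoring, not a real model
ranked = retrieve_then_rerank("bm25 ranking reranker".split(), docs, overlap)
```

The cheap lexical stage prunes the candidate pool so the expensive re-ranker only scores a handful of passages, which is why the combination can rival end-to-end neural retrieval at a fraction of the cost.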
