Evaluating Verifiability in Generative Search Engines
Topics: AI (Deep Learning), AIOverviews, Retrieval Augmented Generation (RAG)
The paper evaluates the verifiability of four generative search engines: Bing Chat, NeevaAI, perplexity.ai, and YouChat. The study finds that these systems often produce fluent and seemingly helpful responses, but the responses frequently contain unsupported statements and inaccurate citations. On average, only 51.5% of generated sentences are fully supported by their citations (citation recall), and only 74.5% of citations actually support their associated sentences (citation precision). These findings highlight significant challenges in the trustworthiness of current generative search engines.
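To make the two reported metrics concrete, here is a minimal sketch of how citation recall and citation precision could be computed from human support annotations. The `Sentence` class, field names, and toy data below are illustrative assumptions, not the paper's actual annotation schema or data; only the high-level metric definitions (fraction of sentences fully supported, fraction of citations that support their sentence) follow the paper's description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sentence:
    """A verification-worthy sentence with hypothetical human support annotations."""
    text: str
    # Whether the sentence is fully supported by the union of its citations.
    fully_supported: bool
    # For each citation attached to the sentence: does that citation support it?
    citation_supports: List[bool] = field(default_factory=list)


def citation_recall(sentences: List[Sentence]) -> float:
    """Fraction of sentences that are fully supported by their citations."""
    if not sentences:
        return 0.0
    return sum(s.fully_supported for s in sentences) / len(sentences)


def citation_precision(sentences: List[Sentence]) -> float:
    """Fraction of all citations that support their associated sentence."""
    judgments = [j for s in sentences for j in s.citation_supports]
    if not judgments:
        return 0.0
    return sum(judgments) / len(judgments)


# Toy example (made-up annotations, not data from the paper).
annotated = [
    Sentence("Claim A.", fully_supported=True, citation_supports=[True]),
    Sentence("Claim B.", fully_supported=False, citation_supports=[False, True]),
]
print(f"citation recall:    {citation_recall(annotated):.1%}")
print(f"citation precision: {citation_precision(annotated):.1%}")
```

Under this reading, a system can score well on one metric and poorly on the other: a response whose sentences each carry one correct citation plus several irrelevant ones keeps recall high while precision drops.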