System and Method for Processing Queries Against Semantic Cache Entries Using Unique Distance-based Thresholds
Topics: Indexing, Microsoft, Probably in use, Search Query Processing, Semantic Search
This Microsoft patent describes a semantic cache intended to make Large Language Model (LLM) applications more efficient. Instead of applying a single, fixed similarity threshold to every cached item, the system generates “synthetic” versions of each cached question and uses them to calculate a unique, optimized distance threshold for that entry. This lets the system recognize when a new user query means the same thing as a previously answered one, even if the wording differs, and return the stored answer without triggering a new, expensive LLM request.
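To make the idea of per-entry thresholds concrete, here is a minimal Python sketch. It is not the patent's implementation, and several things in it are assumptions made for illustration: questions are already represented as embedding vectors (plain lists of floats), distance is cosine distance, and each entry's threshold is simply the largest distance between the cached question and its synthetic variants. The names `SemanticCache`, `CacheEntry`, and `entry_threshold` are hypothetical, and the vectors are toy values.

```python
import math
from dataclasses import dataclass


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def entry_threshold(original, synthetic_variants):
    # Assumption: the threshold is the widest distance from the cached
    # question to any of its synthetic rephrasings. The patent's actual
    # optimization rule may differ.
    return max(cosine_distance(original, v) for v in synthetic_variants)


@dataclass
class CacheEntry:
    embedding: list
    answer: str
    threshold: float  # unique to this entry


class SemanticCache:
    def __init__(self):
        self.entries = []

    def add(self, embedding, answer, synthetic_variants):
        threshold = entry_threshold(embedding, synthetic_variants)
        self.entries.append(CacheEntry(embedding, answer, threshold))

    def lookup(self, query_embedding):
        # Find the nearest cached question, then test the query against
        # that entry's own threshold rather than a global one.
        best, best_dist = None, math.inf
        for entry in self.entries:
            dist = cosine_distance(entry.embedding, query_embedding)
            if dist < best_dist:
                best, best_dist = entry, dist
        if best is not None and best_dist <= best.threshold:
            return best.answer  # cache hit: no LLM call needed
        return None  # cache miss: caller would fall back to the LLM


# Toy 2-D embeddings standing in for real model output.
cache = SemanticCache()
cache.add([1.0, 0.0], "answer A", [[0.9, 0.1], [0.8, 0.3]])  # loose threshold
cache.add([0.0, 1.0], "answer B", [[0.05, 1.0]])             # tight threshold

hit = cache.lookup([0.85, 0.25])   # within entry A's threshold
miss = cache.lookup([0.25, 0.85])  # nearest to B, but outside B's threshold
```

In this sketch the two queries sit at the same distance from their nearest entries, yet only the first is served from the cache, because entry A's synthetic variants produced a looser threshold than entry B's. A single global threshold could not treat the two cases differently.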
