Phrase-Based Detection of Duplicate Documents in an Information Retrieval System
Topics: AI (Deep Learning), Anna Lynn Patterson, Document Classification, Indexing, Information Gain, Phrase-Based Indexing
The information retrieval system described in this patent uses phrases to index, retrieve, organize, and describe documents. It identifies phrases that predict the presence of other phrases in documents, indexes documents based on their included phrases, and clusters related phrases to improve search result relevance and eliminate duplicates.
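The summary gives no formulas, so the following is only a minimal sketch of the ideas it names, not the patent's actual procedure. It assumes each document has already been reduced to a set of phrases. It indexes documents by phrase, scores how strongly one phrase predicts another as the ratio of actual to expected co-occurrence (one common way to express information gain, chosen here as an assumption), and flags likely duplicates with Jaccard similarity over phrase sets, which is a stand-in technique. All function names, the sample data, and the 0.8 threshold are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_phrase_index(docs):
    """Map each phrase to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, phrases in docs.items():
        for phrase in phrases:
            index[phrase].add(doc_id)
    return index


def information_gain(index, num_docs, a, b):
    """Ratio of actual to expected co-occurrence of phrases a and b.

    The expected rate assumes the two phrases occur independently.
    This formula is an illustrative assumption, not quoted from the patent.
    """
    docs_a = index.get(a, set())
    docs_b = index.get(b, set())
    actual = len(docs_a & docs_b) / num_docs
    expected = (len(docs_a) / num_docs) * (len(docs_b) / num_docs)
    return actual / expected if expected else 0.0


def near_duplicates(docs, threshold=0.8):
    """Return pairs of doc ids whose phrase sets overlap heavily.

    Uses Jaccard similarity as a stand-in duplicate test (an assumption).
    """
    pairs = []
    for id_a, id_b in combinations(sorted(docs), 2):
        union = docs[id_a] | docs[id_b]
        if union and len(docs[id_a] & docs[id_b]) / len(union) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical sample data: document id -> phrases found in that document.
docs = {
    "d1": {"australian shepherd", "herding dog", "blue merle"},
    "d2": {"australian shepherd", "herding dog", "blue merle"},
    "d3": {"stock market", "interest rate"},
    "d4": {"herding dog", "border collie"},
}
index = build_phrase_index(docs)

gain_related = information_gain(index, len(docs), "australian shepherd", "blue merle")
gain_unrelated = information_gain(index, len(docs), "australian shepherd", "stock market")
duplicates = near_duplicates(docs)
```

A ratio well above 1 suggests that one phrase predicts the other, while a ratio near or below 1 suggests no relationship. In a result list, only one document from each pair in `duplicates` would be kept.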