Selection of documents to place in search index
Topics: Backlinks, Document Classification, E-E-A-T, Indexing, User Signals
The Google patent describes a system for selecting documents to include in a search index based on predicting how useful they will be as search results. Key points:
- The system analyzes historical search data to determine a “utility score” for documents previously included in the index, based on how often they were selected or presented as search results.
- It uses this data to train a model that can predict utility scores for new documents based on their features.
- When considering new documents to add to the index, it uses the model to predict their utility scores and ranks them accordingly.
- It then selects a number of the top-ranked documents to include in the index, potentially weighing factors such as document size or indexing cost and quotas for certain types of content.
- This allows the system to prioritize including documents that are likely to be useful search results, given limited index space.
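
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's actual method: the `Doc` fields, the `utility_score` weighting, the lookup-table "model" (mean observed utility per feature value), and the treatment of quotas as per-category maximum counts are all hypothetical choices made for the example. The summary does not specify the features, the model type, or how quotas are applied.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    features: tuple  # hypothetical categorical features, e.g. ("news", "en")
    size: int        # hypothetical indexing cost, e.g. kilobytes
    category: str    # used only for quota checks


def utility_score(impressions: int, selections: int) -> float:
    """Assumed utility: selections count fully, impressions at 10%."""
    return selections + 0.1 * impressions


def train(history):
    """Fit a stand-in model: mean observed utility per feature value.

    history is an iterable of (Doc, impressions, selections) tuples
    for documents that were previously in the index.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    all_scores = []
    for doc, impressions, selections in history:
        score = utility_score(impressions, selections)
        all_scores.append(score)
        for feature in doc.features:
            totals[feature] += score
            counts[feature] += 1
    means = {f: totals[f] / counts[f] for f in totals}
    default = sum(all_scores) / len(all_scores) if all_scores else 0.0
    return means, default


def predict(model, doc):
    """Average the learned means of the features this document has."""
    means, default = model
    known = [means[f] for f in doc.features if f in means]
    return sum(known) / len(known) if known else default


def select_for_index(model, candidates, budget, quotas=None):
    """Greedily take top-ranked documents that fit the budget and quotas."""
    quotas = quotas or {}
    used = defaultdict(int)
    selected = []
    remaining = budget
    ranked = sorted(candidates, key=lambda d: predict(model, d), reverse=True)
    for doc in ranked:
        if doc.size > remaining:
            continue  # does not fit in the remaining index space
        if used[doc.category] >= quotas.get(doc.category, float("inf")):
            continue  # assumed quota: maximum documents per category
        selected.append(doc)
        used[doc.category] += 1
        remaining -= doc.size
    return selected


# Made-up historical data: (document, impressions, selections).
history = [
    (Doc("a", ("news", "en"), 10, "news"), 100, 20),
    (Doc("b", ("forum", "en"), 10, "forum"), 10, 1),
]
model = train(history)

candidates = [
    Doc("c", ("news", "de"), 40, "news"),
    Doc("d", ("forum", "en"), 30, "forum"),
    Doc("e", ("news", "en"), 50, "news"),
]
chosen = select_for_index(model, candidates, budget=80, quotas={"news": 1})
print([d.doc_id for d in chosen])  # prints ['c', 'd']
```

In this toy run, document `e` has the second-highest predicted utility but is skipped because it exceeds the remaining space (and the assumed news quota is already used), so the lower-ranked `d` is indexed instead. A production system would likely replace the lookup table with a learned regression or ranking model and might rank by utility per unit of cost rather than raw utility.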