Most interesting Google Patents for SEO in 2022
In this article I would also like to contribute to archiving well-founded knowledge from Google patents. Enjoy!
Researching Google patents is one of the smartest ways to understand modern search engines like Google. A pioneer in researching Google patents was the unforgettable Bill Slawski. In his blog SEObythesea he published the insights from hundreds of Google patents and thus did an essential job for the entire SEO industry. He inspired me to substantiate my own thoughts and theories via Google patents.
A patent application does not mean that the theory described there will eventually find its way into practice. One indication of whether a methodology or technology is interesting enough for Google to be put into practice is whether the patent has been filed only in the USA or in other countries as well. A claim of patent priority for other countries must be made within 12 months of the first filing.
Regardless of whether a patent finds its way into practice, it makes sense to deal with Google patents, as you get an indication of the topics and challenges that product developers are dealing with.
Below are summaries of the most interesting Google patents from 2022.
DISTANCE BASED SEARCH RANKING DEMOTION
The Google patent “Distance based search ranking demotion” was filed in 2018 and published in September 2022. There were various prior versions dating back to 2018 and 2020. The original patent is from 2015. The patent has a scheduled expiration date of 2035. The patent has been filed in the US, Spain, Germany and China. This makes it very likely that the patent is used in practice.
The patent is about ranking local documents in relation to local search queries. More precisely, it is about the downgrading of documents whose associated location is far away from the location of the device on which the search is performed.
A local search result document is a “distant” search result document when the location associated with the local search result document is determined to not meet a proximity threshold. A proximity threshold may be met, for example, when the location for the local search result document and the location for the user device are within a same geographic region (e.g., a same state), or within a threshold distance (e.g., 100 miles).
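The proximity check quoted above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the patent does not name a distance formula, so the haversine formula, the field names (`region`, `lat`, `lon`) and the way the two example criteria are combined with a logical OR are hypothetical choices. Only the 100-mile figure and the "same state" example come from the patent text.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def meets_proximity_threshold(doc, user, max_miles=100.0):
    """Assumption: the threshold is met if document and user share a region
    OR are within max_miles of each other."""
    same_region = doc["region"] == user["region"]
    distance = haversine_miles(doc["lat"], doc["lon"], user["lat"], user["lon"])
    return same_region or distance <= max_miles


def is_distant(doc, user, max_miles=100.0):
    """A local document is 'distant' when the proximity threshold is not met."""
    return not meets_proximity_threshold(doc, user, max_miles)
```

A document whose location is a few hundred miles away and in a different region would be flagged as distant here, while a document in the same region would not, regardless of the measured distance.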
Documents that are too far away from the user’s location or do not serve a local search intention or do not have a sufficient ranking score are downgraded.
Two ranking components are described in the patent: an information retrieval score and an authority score.
The search results are ranked based on scores related to the resources identified by the search results, such as information retrieval (“IR”) scores, and optionally a separate ranking of each resource relative to other resources (e.g., an authority score).
Possible criteria for classifying a document as local are addresses included in the document or a high rate of selection by users from the region relative to users outside the region. The classification is made by a subsystem for local search results.
For example, the local result subsystem 120 may determine a document is a local document if the document includes an address; or if search results for the document have a high rate of selection from user devices in a given location relative to user devices outside of the particular location; or if the local document has been specified by the publisher as being local to a particular location; etc.
Excitingly, the patent describes that for local search queries that include a geographic identifier such as city, zip code or similar, the distance to the user’s location is not as important.
For such queries, search result documents that are local to the location specified by the location phrase may be determined to be more relevant than search result documents that are not local to the location. In particular, the location of the user device may be determined to be of little, if any, relevance, as the user has explicitly specified a location.
As soon as the search query has an implicit local search intention, i.e. no geographic identifier explicitly appears in the search query, but there is still a local search intention, the distance to the user plays an important role.
However, if the query does have an implicit local intent, and is not an explicitly local query, e.g., such as the query “coffee shops,” then the local result subsystem 120 performs a distance adjustment process 122.
An implicit local search intent can be determined via user behavior.
A downgrading of results can take place if the ranking score does not reach a certain threshold or the object described in the document is too far away from the user. Depending on the degree of locality of the search query, non-local documents can also rank, but can also be pushed down a few places by documents with better local relevance.
The process 200 adjusts the search score of the local document eligible for demotion to demote its ranking in the first order so that the rank of the demoted local document relative to the rank of the sufficiently ranked non-local document is decreased. In some implementations, the demotion can be such that the demoted local document is ranked at least one position below sufficiently ranked non-local document.
If the local object described in a document is too far away, it can also lead to a downgrade. The distance from the user plays a role here. If local objects are too far away from the user, they are not ranked. The maximum distance to the user differs depending on the object. When searching for a restaurant, the distance will be smaller than when searching for a hospital.
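To make the demotion logic more tangible, here is a minimal Python sketch. It is not the patent's implementation: the field names, the `min_score` threshold for a "sufficiently ranked" non-local document, and the decision to push a distant local document just below the weakest such non-local document are all assumptions made for illustration.

```python
def demote_distant_local(results, explicit_location=False,
                         min_score=0.5, epsilon=1e-6):
    """Return results sorted by score after demoting distant local documents.

    results: list of dicts with 'url', 'score', 'is_local', 'is_distant'
             (hypothetical field names).
    explicit_location: True if the query names a place (e.g. a city);
             in that case no distance-based demotion is applied.
    """
    adjusted = [dict(r) for r in results]  # do not mutate the input
    if not explicit_location:
        # Scores of non-local documents that rank well enough on their own.
        non_local_ok = [r["score"] for r in adjusted
                        if not r["is_local"] and r["score"] >= min_score]
        if non_local_ok:
            floor = min(non_local_ok) - epsilon
            for r in adjusted:
                if r["is_local"] and r["is_distant"]:
                    # Demote below the weakest sufficiently ranked
                    # non-local document (illustrative rule).
                    r["score"] = min(r["score"], floor)
    return sorted(adjusted, key=lambda r: r["score"], reverse=True)


results = [
    {"url": "a", "score": 0.9, "is_local": True, "is_distant": True},
    {"url": "b", "score": 0.8, "is_local": False, "is_distant": False},
    {"url": "c", "score": 0.7, "is_local": True, "is_distant": False},
    {"url": "d", "score": 0.3, "is_local": False, "is_distant": False},
]
implicit_order = [r["url"] for r in demote_distant_local(results)]
explicit_order = [r["url"] for r in demote_distant_local(results, explicit_location=True)]
```

In this toy example the distant local document "a" drops one position below the non-local document "b" for an implicit local query, while the original order is kept when the query names a location explicitly.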
Search result ranking and presentation
The Google patent “Search result ranking and presentation” was filed in 2019 and published in August 2022. There were various prior versions dating back to 2012. The patent has a scheduled expiration date of Aug. 16, 2032. The patent describes the basic features of a semantic or entity-based search.
„In some implementations, a computer implemented method for providing search results comprises determining, using one or more processors, an entity reference from a search query. A ranked list of properties associated with a type of the entity reference is identified based on a knowledge graph. A property for generating a presentation of search results from the ranked list of properties is identified, based at least in part on the search query and on the type of the entity reference.“
„In some implementations, a computer implemented search method for search comprises identifying a modifying concept based on a search query. A rule for ranking search results is determined based at least in part on the modifying concept and on a knowledge graph from which at least one of the search results was obtained. Search results are ranked based at least in part on the rule.“
The patent addresses the fact that some search queries call for display formats other than a plain list of links as a response.
„In some implementations, it may be desired to present search results using a technique that reflects the content of the search query, the content of the search results, or both. For example, it may be useful for the search system to present search results that include geographic locations on a map, and to present search results that include chronological dates on a timeline. For example, a search results for the search query “Cities in California” may automatically be presented on a map, while search results for the search query “Paintings by Van Gogh” may be presented in an image gallery view.“
For search queries such as “the tallest building”, it is useful to output a listing of the entities with the largest values for the “height” property.
In an example, where search query block 102 includes the search query “Tallest Building,” the search system may retrieve a collection of buildings from data structure block 104 and/or webpages block 110, determine that the sorting property is “Height,” and may output a ranked list of buildings by height to ranked search results block 108.
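The “Tallest Building” example can be sketched as a small lookup from a modifier in the query to a sorting property. The rule table, the property names and the building data below are hypothetical and only serve to illustrate the idea; the patent does not specify how the rule is stored or matched.

```python
# Hypothetical mapping: modifier -> (property to sort by, sort descending?)
MODIFIER_RULES = {
    "tallest": ("height_m", True),
    "oldest": ("year_built", False),
}

# Made-up entities standing in for a collection retrieved from a data structure.
BUILDINGS = [
    {"name": "Tower A", "height_m": 300, "year_built": 1990},
    {"name": "Tower B", "height_m": 450, "year_built": 2012},
    {"name": "Tower C", "height_m": 380, "year_built": 1975},
]


def rank_entities(query, entities):
    """Sort entities by the property implied by the first known modifier."""
    for token in query.lower().split():
        if token in MODIFIER_RULES:
            prop, descending = MODIFIER_RULES[token]
            return sorted(entities, key=lambda e: e[prop], reverse=descending)
    return list(entities)  # no modifier found: keep the given order


tallest = [b["name"] for b in rank_entities("Tallest Building", BUILDINGS)]
oldest = [b["name"] for b in rank_entities("oldest building", BUILDINGS)]
```

Here `tallest` lists the towers by descending height and `oldest` by ascending construction year.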
In addition, it is pointed out that it is partly necessary to access data from a structured database in order to create the search results. This can be a knowledge graph, for example.
„Data structure block 104 includes a data structure including piece of information defined in part by the relationships between them. In some implementations, data structure block 104 includes any suitable data structure, data graph, database, index, list, linked list, table, any other suitable information, or any combination thereof. In an example, data structure block 104 includes a collection of data stored as nodes and edges in a graph structure. In some implementations, data structure block 104 includes a knowledge graph.“
The graphic from the patent has similarities to a graphic I created showing the interaction between the classic search index and the Knowledge Graph. In the patent's graphic the interface between the two databases is called “Processing Block”. In my graphic I call it Entity Processing, which could be built on the foundation of Hummingbird.
The Processing Block is used to derive entity references from the search query. This is done by Natural Language Processing.
„The search system determines an entity reference from the search query by parsing, by partitioning, by using natural language processing, by identifying parts of speech, by heuristic techniques, by identifying root words, by any other suitable technique, or any combination thereof. In some implementations, the entity reference includes text or other suitable content referencing any suitable topic, subject, person, place, thing, or any combination thereof.“
The modifiers shown in the graphic can be, for example, superlatives like best, oldest or highest.
More about this in my article HOW DOES GOOGLE UNDERSTAND SEARCH TERMS BY SEARCH QUERY PROCESSING?
An entity reference is the reference to a concept that stands for a real-world thing. The Processing Block creates a ranked list of the entity's properties.
Additionally, the entity’s properties can be enriched with other formats such as links, images, and videos.
„In some implementations, ranked search results, presentation techniques, or both, are output to ranked search results block 108. In some implementations, search results include, for example, entities from data structure 104, other data from data structure 104, a link to a web page, a brief description of the target of the link, contextual information related to the search result, an image related to the search result, video related to the search result, any other suitable information, or any combination thereof.“
If a search query can refer to multiple entity references, a Popularity Score per entity is taken into account. The most popular entity is prioritized in the delivery of search results.
„In some implementations, the search system selects one of the more than one identified entity references based on a global popularity score of that entity reference, a relevance and/or closeness to some or all elements of a search query, user input, user history, user preferences, relationships between the entity references as described in a data structure, any other suitable information, or any combination thereof.“
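How such a selection between several candidate entities could look is sketched below. The quoted passage only lists possible signals; the weighted sum, the weights, the `terms` field and the popularity values are my own assumptions for illustration.

```python
def select_entity(query, candidates, w_popularity=0.5, w_relevance=0.5):
    """Pick the candidate with the best mix of popularity and query overlap.

    candidates: list of dicts with 'name', 'popularity' (0..1) and
                'terms' (set of descriptive words); all hypothetical fields.
    """
    query_tokens = set(query.lower().split())

    def score(candidate):
        if query_tokens:
            overlap = len(query_tokens & candidate["terms"]) / len(query_tokens)
        else:
            overlap = 0.0
        return w_popularity * candidate["popularity"] + w_relevance * overlap

    return max(candidates, key=score)


CANDIDATES = [
    {"name": "Jaguar (animal)", "popularity": 0.6,
     "terms": {"jaguar", "cat", "animal"}},
    {"name": "Jaguar (car brand)", "popularity": 0.8,
     "terms": {"jaguar", "car", "vehicle"}},
]

ambiguous = select_entity("jaguar", CANDIDATES)["name"]
specific = select_entity("jaguar cat", CANDIDATES)["name"]
```

With these made-up numbers the bare query falls back to the more popular entity, while the additional term “cat” shifts the selection to the animal.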
Read more in my articles How Google creates knowledge panels (SEL) and KNOWLEDGE PANELS & SERPS FOR AMBIGUOUS SEARCH QUERIES
It is exciting that the patent describes that not only entities, but also complete lists can be stored in the Knowledge Graph, which can then be delivered directly upon search query.
„In some implementations, the ranked list of properties is stored in a data structure such as a knowledge graph, in a database, in any other suitable data storage arrangement, or any combination thereof. In some implementations, a schema table is preprocessed. In some implementations, the ranked list is predetermined, is based on the received search, or any combination thereof.“
The ranking of the lists can be based on the following:
- Search history
- User habits
- Input from developers
- Trends in general search behavior
- Recent search patterns
- Domain related ranking
This ranking takes place in the Processing Block, or, in my words, in Entity Processing.
The relationships between entities can be established using a “phrase tree”. The phrase tree is a theoretical construct that represents the relationships between entities.
DYNAMIC INJECTION OF RELATED CONTENT IN SEARCH RESULTS
This Google patent was published 07.05.2022 and filed on 06.08.2020. For me, it is one of the most exciting Google patents of 2022. It is only registered in the US and China. It is therefore unlikely that it is currently in international use. But still exciting!
It describes how a search engine automatically suggests further links and alternative search queries within a box, based on the dwell time in the SERPs. The appearance of these suggestions is reminiscent of the “others also searched for” suggestions that appear when you return to the SERP after clicking on a search result.
The patent appears to build on this functionality and to integrate further suggestions, such as links, into the SERP. The difference from the existing functionality is that the triggering event is not a click on a search result but the dwell time.
“Implementations use a dwell signal to display related suggested items and/or to influence “next page” search results for dynamic pagination. For example, some implementations may calculate related suggestions for a search result presented in response to a query. The suggestions may include refined queries and/or links to specific items. “
If a threshold value for the dwell time is reached, a box with suggestions is automatically displayed, because it can be assumed that the user has not found what they are looking for.
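The trigger itself is simple to picture in code. The following Python sketch assumes a precomputed mapping from a search result to its related items; the mapping, the URLs, the five-second threshold and the limit of three items are hypothetical values, as the patent does not give concrete numbers here.

```python
# Hypothetical precomputed suggestions per search result URL.
RELATED_ITEMS = {
    "https://example.com/washing-machines": [
        "new washing machine",
        "washing machine repair",
        "https://example.com/dryer-guide",
        "washing machine reviews",
    ],
}


def suggestions_for(result_url, dwell_seconds, related_items,
                    threshold_seconds=5.0, max_items=3):
    """Return suggestions for a result once the dwell threshold is reached."""
    if dwell_seconds < threshold_seconds:
        return []  # user has not lingered long enough; show nothing
    return related_items.get(result_url, [])[:max_items]


short_dwell = suggestions_for("https://example.com/washing-machines", 2.0, RELATED_ITEMS)
long_dwell = suggestions_for("https://example.com/washing-machines", 7.5, RELATED_ITEMS)
```

Below the threshold nothing is shown; once it is reached, the first few related queries and links for that result are returned.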
The suggestions are intended to steer the user's search in a slightly different direction and to suggest similar content of the same category or class.
In addition, or instead, the suggestions may offer tangential suggestions that take the user in a slightly different direction, e.g., offering related queries, alternate interpretations of the query terms, and/or documents in a same category/classification as the particular search result but not highly similar to the result.
Besides links and search query refinements, the suggestions can also include images, videos, PDFs, audio recordings and more. Entities can also be suggested.
In finding responsive items, the query system 120 may be responsible for searching one or more indices, represented collectively as item index 140. The item index 140 may include a web document index, e.g., an inverted index that associates terms, phrases, and/or n-grams with documents. Web documents can be any content accessible over the Internet, such as web pages, images, videos, PDF documents, word processing documents, audio recordings, etc. The item index 140 may also include an index of entities, for example from a knowledge base or knowledge graph
It is also interesting to note that suggestions can be generated based on the user journeys of other searchers.
In some implementations, the suggested follow-on queries may be related to a specific responsive item. For example, the responsive item may be associated with one or more queries, e.g., because the responsive item has been selected often after being presented as a search result for the related queries. If the responsive item has related queries these queries may be included as suggested items for the responsive item. For example, the suggested items 135 can include parts of a topic journey that other users have taken. For instance, if the current query is “jobs in Pittsburgh” the search system may suggest “housing in Pittsburgh” or “best elementary schools in Pittsburgh” as a suggested item 135.
For ambiguous search queries, refinement suggestions are offered based on other interpretations of the query, in the form of terms with similar meanings, or in the form of explicit questions that open up a new perspective.
As another example, the suggested items 135 may include alternate interpretations of a query term. For instance, the query “jaguar” may result in “jaguar car,” “jaguar cat,” and/or “jaguar team” as suggestions. Similarly, suggested items 135 may include alternate possibilities. For example, a query of “washing machine” may have as suggested items 135 “new washing machine” or “washing machine repair” while a query of “university” may include “trade school” or “journey program” as a suggested item 135. Another example of suggestions tangential to a query are alternate viewpoints. For instance, a query of “How long should I foam roll after running?” may have as a suggested item “Should I foam roll after running?” or “Alternatives to foam rolling after running.”
In addition to the suggestions in a box, a “Next page” function can let the user refresh the search results completely, or at least the first ten, without having to load a completely new set of hundreds of results.
The next page may include another small set of results, which may include some of the original smaller set that were not included in the first page as well as results added due to the dwell score signals. Thus, implementations may support dynamic pagination of search results and use a dwell score (or scores) to determine which search results are provided next. Dynamic pagination may be utilized irrespective of manual pagination; in other words, the user may interact with a “next page” type UI element or via automatic in-line pagination, which appends new results to the existing page.
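As a rough sketch of dynamic pagination, the next small set of results could be chosen by boosting results that resemble those the user dwelled on. The additive boost, the per-category dwell scores and the field names are my own assumptions; the patent only states that dwell scores influence which results come next.

```python
def next_page(remaining, dwell_by_category, page_size=3, boost=0.1):
    """Choose the next results, boosted by dwell scores per category.

    remaining: results not yet shown, dicts with 'url', 'score', 'category'.
    dwell_by_category: hypothetical dwell score (0..1) per result category.
    """
    def adjusted(result):
        return result["score"] + boost * dwell_by_category.get(result["category"], 0.0)

    return sorted(remaining, key=adjusted, reverse=True)[:page_size]


remaining = [
    {"url": "x", "score": 0.50, "category": "repair"},
    {"url": "y", "score": 0.55, "category": "buy"},
    {"url": "z", "score": 0.40, "category": "repair"},
]
# The user lingered on a repair-related result on the first page.
page_two = [r["url"] for r in next_page(remaining, {"repair": 1.0}, page_size=2)]
```

Because the user dwelled on a repair-related result, the repair document "x" overtakes "y" on the next page even though its original score was lower.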
The advantage would be a faster display of search results.
ACCELERATED LARGE-SCALE SIMILARITY CALCULATION
This Google patent was first published in 2019 and republished on 2022-05-07.
The patent describes the process for determining a similarity of two entities based on the similarity of attributes. The degree of similarity is determined by a similarity score. The purpose of this process is to determine a response to a query.
“For example, the query might seek information indicating which domain names 20-year-old males in the U.K. find more interesting relative to the general population in the U.K. The system computes the correlations by executing a specific type of correlation algorithm (e.g., a jaccard similarity algorithm) to calculate correlation scores that characterize relationships between entities of the different datasets.”
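The Jaccard similarity mentioned in the quote is the size of the intersection of two sets divided by the size of their union. A minimal Python version looks like this; the domain sets are invented for illustration, and the large-scale GPU execution the patent is actually about is not shown.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical domains visited by two groups of users.
cohort_domains = {"news.example", "sports.example", "games.example"}
general_domains = {"sports.example", "games.example", "shop.example"}

similarity = jaccard(cohort_domains, general_domains)  # 2 shared / 4 total = 0.5
```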
Google's machine learning platform TensorFlow is used as the basis for determining the similarity score.
“The system includes a tensor data flow interface that is configured to pre-load at least two data arrays (e.g., tensors) for storage at a memory device of the GPU.”
For example, a Knowledge Graph and/or the Knowledge Vault or any kind of semantic database can be used as the entity database accessed by the algorithm.
The following example from the patent shows possible entities and attributes for a comparison:
“For example, entities of one dataset can be persons or users of a particular demographic (e.g., males in their 20’s) that reside in a certain geographic region (e.g., the United Kingdom (U.K.)). Similarly, entities of another dataset can be users of another demographic (e.g., the general population) that also reside in the same geographic region.”
The exciting thing about the patent is that in addition to outputting search results, it can also be used to create groups or cohorts of similar users for Google Analytics, for example. You can find these sections in the patent:
“For situations in which the systems discussed here collect and/or use personal information about users, the users may be provided with an opportunity to enable/disable or control programs or features that may collect and/or use personal information (e.g., information about a user’s social network, social actions or activities, a user’s preferences or a user’s current location). In addition, certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information associated with the user is removed. For example, a user’s identity may be anonymized so that the no personally identifiable information can be determined for the user, or a user’s geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined.
System 100 is configured to analyze and process different sources of data accessible via storage device 106. For example, CPU 116 can analyze and process data sources that include impression logs that are interacted with by certain users or data sources such as search data from a search engine accessed by different users. In some implementations, entities formed by groups of users, or by groups of user identifiers (IDs) for respective users, can be divided into different groups based on age, gender, interests, location, or other characteristics of each user in the group.”
“In some implementations, hosting service 110 represents an information library that receives and processes queries to return results that indicate relationships such as similarities between entities or conditional probabilities involving different sets of entities. For example, a query (or command) can be “what are all the conditional probabilities associated with some earlier query?” Another query may be related to the conditional probabilities of all the ages of people that visit a particular website or URL. Similarly, another query can be “what are the overlapping URL’s visited by 30-year-old females living in the U.S. relative to 40-year-old males living in the U.K.?”
This Google patent shows that similarities and thus relationships between entities are important to Google and that attributes are the basis for determining these. It also shows that organizing around entities in terms of Internet users can also be a solution to the privacy challenges of building cohorts of similar users based on certain attributes.
METHODS, SYSTEMS, AND MEDIA FOR PROVIDING A MEDIA SEARCH ENGINE
This Google patent was first published in 2011 and republished on 2022-08-02 under a new patent number. The status is active and the anticipated expiration is January 2031. It is classified in Operations research or analysis, machine learning and Marketing, e.g. market research and analysis, surveying, promotions, advertising, buyer profiling, customer management or rewards; Price estimation or determination.
This patent describes how an algorithm is trained via supervised machine learning using various methods (logistic regression, support vector machines, Bayesian approaches, decision trees, etc.) in order to classify the content. Content is labeled by people in order to then make it available to a learning algorithm as sample training data. It should be noted that these learning approaches are useful in situations when the classes considered are significantly biased (pornography or adult content, children’s content, hate speech, bombs, weapons especially, ammunition, alcohol, offensive language, tobacco, spyware, unwanted code, illegal drugs, downloading music, certain types of entertainment, illegality, profanity, etc.) and where there are limited resources to get information from people.
In addition, the approaches can be used to prevent the display of advertising on critical pages or content. Classification can be based on URL, text, anchor texts, DMOZ categories, third party classification, images on a page…
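To illustrate the general idea of training a classifier on human-labeled examples, here is a tiny Naive Bayes text classifier in plain Python, as one instance of the “Bayesian approaches” mentioned above. It is a toy sketch, not the method from the patent: the labels, the training sentences and the whitespace tokenization are assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Count words per class from (text, label) pairs labeled by people."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}


def classify(model, text):
    """Return the most probable label using Laplace-smoothed log probabilities."""
    tokens = text.lower().split()
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_logp = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        logp = math.log(n_docs / total_docs)  # class prior
        for token in tokens:
            logp += math.log((counts[token] + 1) / (total_words + vocab_size))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Hypothetical hand-labeled training examples.
TRAINING = [
    ("discount tobacco and alcohol shop", "sensitive"),
    ("buy alcohol online cheap", "sensitive"),
    ("family recipes for dinner", "safe"),
    ("weather forecast for the weekend", "safe"),
]
model = train_naive_bayes(TRAINING)
label_a = classify(model, "cheap alcohol shop")
label_b = classify(model, "weather for dinner")
```

In a real system the features would go far beyond words on the page (URL, anchor texts, third-party classifications, images), and the training set would be much larger.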
This patent describes approaches that Google could use to evaluate E-A-T or to classify websites as spam, scam, or other content that Google does not want indexed.
- Most interesting Google Patents for SEO in 2022 - 28. December 2022