Author: Olaf Kopp
Reading time: 9 minutes

Helpful content: What does Google really evaluate?


Since the first Helpful Content Update in 2022, the SEO world has been thinking about how to create helpful content or optimize it accordingly. Hypotheses are put forward; analyses, checklists and audits are created. I don’t find most of these approaches useful because they are derived from the perspective of a human, not from that of a machine or algorithm.

Google is a machine, not a human!

My SEO mantra is: “Think like an engineer, act like a human.”

Many statements and opinions on the topic of helpful content revolve around what Google rates as helpful and what it does not.

The focus here is often on the nature of the content. But does Google really evaluate content according to helpfulness?

With this article I would like to invite you to a discussion.

Helpful content, what is it anyway?

Helpful content is a term that Google introduced as part of the first Helpful Content Update in August 2022. Google initially announced that the Helpful Content System was a “sitewide classifier”. It was later announced that it would also be used to rate individual documents.

Our helpful content system is designed to better ensure people see original, helpful content written by people, for people, in search results, rather than content made primarily to gain search engine traffic.

Our central ranking systems are primarily designed for use at page level. Various signals and systems are used to determine the usefulness of individual pages. There are also some site-wide signals that are taken into account.

I already commented on the first Helpful Content Update that it was primarily a PR update, and not just because of its meaningful title. You can read my reasoning and criticism in detail here.

One of Google’s PR goals is to encourage website operators to make crawling, indexing and therefore ranking easier. At least that was the aim of the biggest updates, such as the Page Speed Update, Page Experience Update, Spam Update … These updates have one thing in common: through their meaningful, concrete titles they imply a recommendation for action and thus help Google with information retrieval.

I would have preferred to call the Helpful Content System a “User Satisfaction System”. But more on that later.

What is helpful?

In order to answer this question, you should take a closer look at the information retrieval terms relevance, pertinence and usefulness. As explained in my article “Relevance, pertinence and quality in search engines”, these terms can be described as follows:

Something is relevant for a search engine if a document or piece of content is significant in relation to the search query. The search query describes the situation and the context. Google determines this relevance using text analysis methods such as BM25 and TF-IDF or embedding-based models such as Word2Vec.

Pertinence describes the subjective importance of a document for the user. This means that in addition to the match with the search query, a subjective user level is added.

In addition to the conditions for relevance and pertinence, usefulness also requires novelty: the content must offer the user information they did not already have.

For me, pertinence and usefulness are the two levels that stand for helpfulness.
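To make the relevance level more tangible: a text-statistical score such as BM25 can be calculated purely from term frequencies, document lengths and document frequencies. The following Python sketch uses a toy corpus and textbook default parameters; it only illustrates the principle and is, of course, far simpler than Google’s actual scoring.

```python
import math
from collections import Counter

def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each document against the query with classic Okapi BM25.
    `documents` is a list of token lists; corpus and parameters are illustrative."""
    N = len(documents)
    avg_len = sum(len(d) for d in documents) / N
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc) / avg_len))
            score += idf * norm
        scores.append(score)
    return scores

# Toy example: which document is more relevant to the query "helpful content"?
docs = [
    ["helpful", "content", "update", "google", "ranking"],
    ["apple", "pie", "recipe", "baking", "time"],
]
print(bm25_scores(["helpful", "content"], docs))
```

Pertinence and usefulness, by contrast, cannot be read off the text alone, which is exactly the point of the next section.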

How can you algorithmically measure helpfulness, pertinence and usefulness?

Pertinence and usefulness can be determined by user satisfaction with the content. The best way to determine user satisfaction is to measure and interpret user behavior. In addition to the relevance of the content to the search query, this provides a better indication of whether users really find the content helpful in the respective context. Analyzing document or content properties alone provides only limited information about how helpful a search result is, as the user is not taken into account.

There are various possible metrics for this, which emerge from the Google API leak:

  1. CTR (click-through rate)
    • ctrWeightedImpressions: This attribute records the weighted impressions for the calculation of the CTR.
    • Source: GoogleApi.ContentWarehouse.V1.Model.IndexingSignalAggregatorAdaptiveIntervalData
  2. Good clicks
    • goodClicks: This attribute tracks the number of good clicks.
    • lastGoodClickDateInDays: Shows the date on which the document received the last good click.
    • Source: GoogleApi.ContentWarehouse.V1.Model.QualityNavboostCrapsCrapsClickSignals
  3. Bad clicks
    • badClicks: This attribute records the number of bad clicks.
    • Source: GoogleApi.ContentWarehouse.V1.Model.QualityNavboostCrapsCrapsClickSignals
  4. Long clicks
    • lastLongestClicks: This attribute tracks the number of clicks that were the last and longest in related user queries.
    • Source: GoogleApi.ContentWarehouse.V1.Model.QualityNavboostCrapsCrapsClickSignals
  5. Short clicks
    • While there is no direct attribute called “short clicks”, the absence of long clicks or a high number of bad clicks could indicate shorter interactions.
    • Source: GoogleApi.ContentWarehouse.V1.Model.QualityNavboostCrapsCrapsClickSignals

Source: Google API Leak Analyzer
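How Google condenses such click signals into a single satisfaction signal is not documented anywhere. The following sketch is only meant to illustrate the idea: the attribute names echo the leaked modules, but the user_satisfaction_score function and its weights are my own assumptions.

```python
def user_satisfaction_score(impressions, good_clicks, bad_clicks, last_longest_clicks):
    """Hypothetical aggregation of Navboost-style click signals into one score.
    The weights are assumptions for illustration, not values from the leak."""
    if impressions <= 0:
        return 0.0
    ctr = (good_clicks + bad_clicks) / impressions
    # Good clicks and "last longest" clicks raise the score, bad clicks lower it.
    click_quality = (good_clicks + 2 * last_longest_clicks - bad_clicks) / impressions
    return max(0.0, 0.3 * ctr + 0.7 * click_quality)

# Example: plenty of impressions, but users mostly bounce back to the results page.
print(user_satisfaction_score(impressions=1000, good_clicks=40,
                              bad_clicks=120, last_longest_clicks=10))
```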

Other factors that I have researched in Google patents are:

  1. Click-Through-Rate (CTR):
    • Search result interaction: The percentage of users who click on a website link when it appears in search results.
    • Ad performance: CTR for the ads displayed on the website.
  2. Dwell time:
    • Average time spent on the website: The average time users spend on the site after clicking on a search result.
    • Bounce rate: The percentage of visitors who leave the website after viewing only one page.
  3. Good clicks and bad clicks:
    • User engagement metrics: Metrics such as page interactions (likes, shares, comments), bounce rates and revisits.
    • View duration: Longer views have a higher relevance, which indicates good clicks, while shorter views have a lower relevance, which indicates bad clicks.
  4. Long clicks and short clicks:
    • View duration: Measures the time users spend viewing a document. Longer views (long clicks) are considered more relevant.
    • Weighting functions: Continuous and discontinuous weighting functions are applied to adjust relevance scores based on viewing duration.

Patents:

      • “Ranking factors or scoring criteria”
      • “Increased importance of metrics for user engagement”
      • “User engagement as a ranking factor”

Source: Database Research Assistant
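The patents mention continuous and discontinuous weighting functions for the viewing duration without specifying them. The following sketch shows one plausible interpretation: a sigmoid as a continuous weighting and simple thresholds as a discontinuous bucketing into short and long clicks. The thresholds and the sigmoid parameters are my assumptions, chosen for illustration only.

```python
import math

def dwell_time_weight(seconds, midpoint=30.0, steepness=0.15):
    """Continuous weighting: a sigmoid over dwell time in seconds.
    Visits around `midpoint` get a weight of ~0.5, long visits approach 1.0."""
    return 1.0 / (1.0 + math.exp(-steepness * (seconds - midpoint)))

def click_label(seconds, short_threshold=10.0, long_threshold=60.0):
    """Discontinuous weighting: bucket a visit into short, medium and long clicks."""
    if seconds < short_threshold:
        return "short click (tends to count as a bad click)"
    if seconds >= long_threshold:
        return "long click (tends to count as a good click)"
    return "medium click"

for t in (5, 30, 120):
    print(t, "s ->", round(dwell_time_weight(t), 2), click_label(t))
```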

Usefulness can also be determined by search engines using an information gain score.

Information gain refers to a score that indicates how much additional information a document contains over and above the information in the documents previously viewed by a user.

This score helps determine how much new information a document offers the user compared to what the user has already seen.
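As a rough approximation, such a score can be thought of as the share of information in a new document that is not already covered by the documents the user has seen. The term-based sketch below is a deliberate simplification of this idea; a real system would more likely work with embeddings or extracted facts.

```python
def information_gain_score(new_doc_terms, seen_docs_terms):
    """Share of terms in the new document that none of the previously
    viewed documents contain (a deliberately simplified novelty proxy)."""
    seen = set().union(*seen_docs_terms) if seen_docs_terms else set()
    new_terms = set(new_doc_terms)
    if not new_terms:
        return 0.0
    return len(new_terms - seen) / len(new_terms)

# Documents the user has already viewed vs. a candidate document
already_seen = [
    {"helpful", "content", "update", "google"},
    {"ranking", "signals", "google"},
]
candidate = {"helpful", "content", "user", "signals", "navboost"}
print(information_gain_score(candidate, already_seen))  # 0.4
```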

You can find out more about information gain in the article “Information gain: How is it calculated? Which factors are crucial?”.

Identification of helpful document properties based on user signals

Another possibility is to use positive user signals, collected in statistically valid quantities, to identify document properties or patterns that appear to be helpful for users.

The Google patent “Ranking Search Result Documents” describes a method that compares the properties of search queries with document properties based on past user interactions.
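A highly simplified sketch of this idea: past good clicks are counted per combination of query property and document property, and new documents are then scored by how often their properties have satisfied similar queries before. The property names and the scoring function below are purely hypothetical; the patent does not disclose a concrete implementation.

```python
from collections import defaultdict

# Hypothetical store: how often a (query property, document property) pair
# led to a good click in past interactions. All property names are made up.
pair_good_clicks = defaultdict(int)

def record_interaction(query_props, doc_props, was_good_click):
    """Learn from past behavior: count good clicks per property pair."""
    if not was_good_click:
        return
    for qp in query_props:
        for dp in doc_props:
            pair_good_clicks[(qp, dp)] += 1

def property_match_score(query_props, doc_props):
    """Score a new document by how often its properties satisfied
    users with similar query properties in the past."""
    return sum(pair_good_clicks[(qp, dp)]
               for qp in query_props for dp in doc_props)

record_interaction({"question", "how-to"}, {"step_list", "images"}, True)
record_interaction({"question"}, {"step_list"}, True)
print(property_match_score({"question"}, {"step_list", "video"}))  # 2
```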

However, this method would require a lot of computing resources. In addition, such a methodology would always involve a considerable time delay before the results become meaningful.

The interaction between initial ranking and reranking

In order to understand at which point in the ranking process helpful content is determined, a brief digression into parts of the information retrieval process is required.

There are three steps in the ranking process:

  1. Document evaluation
  2. Classification of quality
  3. Re-ranking
  • Document scoring is responsible for the initial ranking of the top n documents. The Ascorer is used here to calculate IR scores. How large n is can only be guessed; for performance reasons, I assume a maximum of a few hundred documents.
  • Signals relating to E-E-A-T play a particularly important role in quality classification. Here, it is not the quality of individual documents that is evaluated; instead, site-wide classifiers are used.
  • Twiddlers are used for re-ranking.

Twiddlers are components within Google’s Superroot system that are used to re-evaluate search results from a single corpus. They work with ranked sequences rather than isolated results and make adjustments to the original ranking created by the Ascorer. There are two types of Twiddlers: Predoc and Lazy.

      1. Predoc Twiddlers:
        • Operation: They work with thin answers (initial search results with minimal information).
        • Functions: Changing IR scores, reordering results and making remote procedure calls (RPCs).
        • Use case: Suitable for comprehensive, initial adjustments and promoting results based on preliminary data.
      2. Lazy Twiddlers:
        • Operation: They work with fully populated results (detailed document information).
        • Functions: Reorganizing and filtering results based on detailed content analysis.
        • Use case: Ideal for fine-tuning and filtering based on specific content attributes.

More detailed information can be found in the “Twiddler Quick Start Guide”, which you can download here.

Source: Database Research Assistant
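To make the re-ranking step more concrete, here is a minimal sketch of how a twiddler-like component could adjust an initial Ascorer ordering on the basis of a user-satisfaction signal. The interface, the boost and demotion logic and the thresholds are my own assumptions and are not taken from the leaked Twiddler guide.

```python
from dataclasses import dataclass

@dataclass
class Result:
    url: str
    ir_score: float      # initial score from the scorer
    satisfaction: float  # hypothetical user-signal score between 0 and 1

def helpfulness_twiddler(results, boost=0.2, demotion=0.3):
    """Hypothetical re-ranking step: promote results with strong user signals,
    demote results with weak ones, then re-sort by the adjusted score."""
    for r in results:
        if r.satisfaction > 0.7:
            r.ir_score *= (1 + boost)
        elif r.satisfaction < 0.3:
            r.ir_score *= (1 - demotion)
    return sorted(results, key=lambda r: r.ir_score, reverse=True)

initial_ranking = [
    Result("a.example", ir_score=2.1, satisfaction=0.2),
    Result("b.example", ir_score=1.9, satisfaction=0.9),
    Result("c.example", ir_score=1.8, satisfaction=0.5),
]
for r in helpfulness_twiddler(initial_ranking):
    print(r.url, round(r.ir_score, 2))
```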

According to the API leak, these Twiddlers can also be used for evaluation at domain level in addition to the document level.

Twiddlers are used in Google’s ranking and indexing processes to adjust the relevance and ranking of documents. They are essentially factors or signals that can be “twiddled”, i.e. adjusted, to fine-tune search results. Here are some key points about Twiddlers based on the documents provided:

    1. Classification of domains:
      • Twiddlers can be used to classify the domain of a document, which helps to understand the context and relevance of the content.
      • Source: “qualityTwiddlerDomainClassification” – Google-Leak_API-Module_summarized
    2. Spam detection:
      • Twiddlers play a role in spam detection and mitigation. They can adjust the ranking of documents that are flagged by spam detection algorithms.
      • Source: “spamBrainSpamBrainData” – Google-Leak_API-Module_summarized
    3. Content quality:
      • Twiddlers can influence the perceived quality of content by adjusting scores based on various quality signals.
      • Source: “commonsenseScoredCompoundReferenceAnnotation” – Google-Leak_API-Module_summarized
    4. Shopping and ads:
      • For e-commerce and shopping-related search queries, Twiddlers can adjust the relevance of shopping annotations and ads.
      • Source: “adsShoppingWebpxRawShoppingAnnotation” – Google-Leak_API-Module_summarized

Source: Google API Leak Analyzer

The Twiddlers are part of Google’s Superroot and are responsible for a downstream quality assessment, including with regard to helpfulness, at document and domain level.

Source: Internal Google presentation “Ranking for Research”, November 2018

Objective ranking factors, with the exception of information gain, make no sense for the evaluation of helpful content, as they do not focus on the user. These factors are primarily taken into account in the initial ranking via the Ascorer.

It makes sense that Google evaluates helpful content primarily on the basis of the various possible user signals and an information gain score, both of which can also be evaluated individually for each user in a highly personalized way.

Helpful content correlates with content properties, but is causally linked to user signals

As mentioned at the beginning, I am skeptical about many analyses and checklists regarding helpful content because I think that Google evaluates helpfulness primarily on the basis of user signals and not on the basis of document properties. In other words, I think that analyzing individual pieces of content in terms of helpfulness without insight into user data is only of limited value.

Of course, you want to improve user signals by optimizing content, but in the end it is the user who decides whether a piece of content is helpful, not the SEO who optimizes certain properties of a document according to a checklist.

In addition, whether a user finds a piece of content helpful depends on the topic and context. In other words, the recommendations for optimization also always depend on this.

There may be correlations between document properties and helpful content, but the causal link is to the user signals.

In other words: If you optimize a piece of content and the user signals do not improve, it will not become more helpful. Google must first learn what is helpful based on the user signals.

Source: Internal Google presentation “Google is magical”, October 2017

This thesis is underpinned by findings from the antitrust proceedings against Google, according to which the understanding and quality of content can only be derived to a limited extent from the document itself.

Source: Internal Google presentation “Ranking for Research”, November 2018

The desire for a blueprint, preferably in the form of checklists, is great in the SEO industry. That is why checklists always get a lot of attention and are popular. However, they lag behind reality, as the need for content, and therefore its helpfulness, can be highly dynamic for each search query.

There is also a great desire for clarity, e.g. regarding Google updates and possible reasons for a penalty. This is why analyses of Google updates are also very popular.

But if content is king, user signals are queen, and they ultimately determine how helpful Google rates a piece of content. Since most analyses of core updates and helpful content are based on the characteristics of documents and domains, they capture correlations at most, not causalities.

Theories such as Google devaluing websites because of affiliate links or because they do not mention the right entities or keywords make no sense. Google devalues websites because the user signals are not appropriate and they do not offer any information gain; they therefore do not meet user needs and are not helpful for many users. Google does not devalue pages in the re-ranking because of certain document properties.

For me, the Helpful Content System is more of a framework that brings together all the user signals used and the rating systems based on them. That is why I would rather call it a “User Satisfaction System”.

What is your opinion? Let’s discuss!

About Olaf Kopp

Olaf Kopp is Co-Founder, Chief Business Development Officer (CBDO) and Head of SEO & Content at Aufgesang GmbH. He is an internationally recognized industry expert in semantic SEO, E-E-A-T, modern search engine technology, content marketing and customer journey management. As an author, Olaf Kopp writes for national and international magazines such as Search Engine Land, t3n, Website Boosting, Hubspot, Sistrix, Oncrawl, Searchmetrics, Upload … In 2022 he was a top contributor for Search Engine Land. His blog is one of the best-known online marketing blogs in Germany. In addition, Olaf Kopp is a speaker on SEO and content marketing at conferences such as SMX, CMCx, OMT, OMX, Campixx …

Comments



  • Simon

    22.07.2024, 02:14 Uhr

    Olaf, thank you for another informative article. So just to be clear, is your view that AI writers that analyze entities contained in the top results and seek to add these to an article are just a waste of time?

    Another question: is there a place for a tool that measures user interaction on the page and comes up with some sort of helpfulness metric to guide owners as to the helpfulness of content?

    • Olaf Kopp

      22.07.2024, 08:06 Uhr

      Hi Simon, no. You have to differentiate between the different steps of ranking and the ranking systems. The Helpful Content System is one of them and part of re-ranking. In my opinion, helpful content is a quality classifier that is activated in the re-ranking process. The initial ranking happens in the Ascorer or scoring process, and here content-based relevance signals are important.

  • Lee Stuart

    23.08.2024, 05:10 Uhr

    Olaf, thanks for the interesting and reasoned view. I was wondering what your view is on this now that, in the latest core update, it appears that some sites previously heavily impacted by the HCU have recovered. The point of contention is that those user signals have been next to zero for a long time for some of these. So do you think that G is using historical data beyond its normal look-back period, or is there another re-ranking component added, or perhaps some kind of manual intervention? Interested to hear your thoughts.

    • Olaf Kopp

      23.08.2024, 08:22 Uhr

      Hi Lee, good question. The Helpful Content System is only one part of the Ranking Core. Other systems and concepts are e.g. E-E-A-T. Adjustments to the search intents can also have an influence. I think you can find at least as many examples of websites that have not recovered. This is the problem with the analysis of core updates. You will never get a complete overview, but only focus on examples that support your theses. You are subject to confirmation bias.
