E-E-A-T: Discovery and evaluation of high-quality resources
Assessing the quality and authority of websites is crucial for search engines and users alike. With the amount of information on the Internet constantly growing, including AI-generated content, it is becoming increasingly important to identify reliable, high-quality sources and distinguish them from less trustworthy content.
This article explores the concept of E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness), a framework that search engines like Google use to assess the quality and trustworthiness of websites and domains.
During my research, I analyzed numerous research papers and Google patents from the SEO Research Suite to learn more about the methodology, metrics, and types of classifiers involved. I cover both automated systems for source selection and evaluation and manual, heuristic approaches.
The goal is to develop a deeper understanding of what sets a website apart in terms of quality and authority and how these aspects can be made measurable.
The sources used are listed at the end of the article.
How can good-quality websites be discovered?
Discovering good-quality websites is a multidimensional process that includes automated analysis of content and structure, evaluation of links, verification of content correctness, and analysis of user behavior.
Automated systems analyze the content and structure of websites to identify valuable sources for expanding knowledge. They look for new information, ease of annotation, and cost-effective data extraction; well-structured websites are preferred.
In summary, good-quality websites can be discovered through a combination of automated content analysis, evaluation of various website characteristics, analysis of the link structure, verification of content correctness, and consideration of user behavior.
How can websites be rated for quality?
Websites can be assessed for quality using a variety of automated and user-centered methods:
- Evaluation based on usefulness for knowledge expansion: Systems evaluate web sources based on how well they are suited to filling knowledge gaps. The criteria are the proportion of new facts, the ease of annotating the data, and the economic efficiency of data acquisition. MIDAS prefers sources that provide valuable new facts and are easy to process; domain authority is also taken into account (sketched below).
- Assessment by analyzing website signals and machine learning: This approach identifies website signals that are predictive of quality. Machine learning creates models that characterize the relationships between human quality scores and these signals. The models can then be applied to unrated websites to generate calculated quality scores. Such ratings may consider factors like the originality of the content, the ratio of original to copied content, the layout of the site, grammar, spelling, and the presence of inappropriate content (sketched below).
- Rating based on the quality of incoming links: The quality of a website can also be assessed by analyzing the quality of the resources that link to it. A link quality score for the target website is determined from the quality ratings of the linking resources; a low score can lead to the site being classified as low quality (sketched below).
- Assessment of content accuracy (Knowledge-Based Trust): The trustworthiness of a website can be estimated from the accuracy of the facts it provides. This approach looks at the content itself rather than just external signals. A probabilistic model can identify trustworthy sources with low popularity and flag popular but less trustworthy websites (sketched below).
- Rating based on user behavior: The duration of user visits to a website's resources can serve as a measure of their quality. By calculating statistical metrics from these time measurements, a site quality score can be determined (sketched below).
- Rating by propagating quality: Quality ratings can also be transferred between linked or related websites. A quality model can take both neighboring features and page-specific features into account (sketched below).
- Predicting quality through phrase models: It is also possible to predict the quality of a website from linguistic patterns and phrases. Phrase models relate the frequency of certain phrases on already-rated websites to their quality ratings. Aggregated quality scores can then be determined for new websites based on the frequency of these phrases (sketched below).
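To make the knowledge-expansion criteria concrete, here is a minimal sketch of a MIDAS-style utility score. The `WebSource` fields, the weighting formula, and all numbers are illustrative assumptions, not the published algorithm.

```python
# Hypothetical utility scoring in the spirit of MIDAS-style source selection.
from dataclasses import dataclass

@dataclass
class WebSource:
    name: str
    new_facts: int          # facts not yet in the knowledge base
    annotation_ease: float  # 0..1, how easily the data can be annotated
    extraction_cost: float  # relative cost of extracting the data
    domain_authority: float # 0..1 external authority estimate

def source_utility(src: WebSource,
                   w_facts: float = 1.0,
                   w_ease: float = 0.5,
                   w_authority: float = 0.5) -> float:
    """Benefit of a source per unit of extraction cost (assumed formula)."""
    benefit = (w_facts * src.new_facts
               + w_ease * src.annotation_ease
               + w_authority * src.domain_authority)
    return benefit / max(src.extraction_cost, 1e-9)

sources = [
    WebSource("site-a.example", new_facts=120, annotation_ease=0.9,
              extraction_cost=2.0, domain_authority=0.6),
    WebSource("site-b.example", new_facts=40, annotation_ease=0.4,
              extraction_cost=0.5, domain_authority=0.8),
]
# Prefer sources that deliver many new, easily annotated facts cheaply.
for s in sorted(sources, key=source_utility, reverse=True):
    print(s.name, round(source_utility(s), 2))
```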
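The signal-based approach can be sketched as a small supervised-learning problem: fit a model on human-rated sites, then score unrated ones. The specific signal set, the toy data, and the choice of ridge regression are assumptions for illustration, not Google's actual model.

```python
# Learn a quality model from website signals and human ratings.
import numpy as np
from sklearn.linear_model import Ridge

# Columns (assumed): originality, original/copied ratio, layout score,
# grammar error rate, spelling error rate, inappropriate-content flag.
X_rated = np.array([
    [0.9, 0.95, 0.8, 0.01, 0.00, 0],
    [0.3, 0.40, 0.5, 0.08, 0.05, 1],
    [0.7, 0.80, 0.9, 0.02, 0.01, 0],
])
human_scores = np.array([0.9, 0.2, 0.8])  # quality ratings from raters

model = Ridge(alpha=1.0).fit(X_rated, human_scores)

# Apply the learned relationship to an unrated website.
X_new = np.array([[0.6, 0.7, 0.7, 0.03, 0.02, 0]])
print("predicted quality:", model.predict(X_new)[0])
```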
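A link quality score can be sketched as an aggregate over the quality ratings of the linking resources; the simple averaging and the low-quality threshold below are assumed, not taken from the patent.

```python
# Derive a link quality score for a target site from the quality
# ratings of the resources that link to it.

def link_quality_score(linker_scores: list[float]) -> float:
    """Average the quality of all resources linking to the target."""
    if not linker_scores:
        return 0.0
    return sum(linker_scores) / len(linker_scores)

LOW_QUALITY_THRESHOLD = 0.3  # assumed cutoff

linkers = [0.8, 0.2, 0.1, 0.15]  # quality of pages linking to the target
score = link_quality_score(linkers)
print("link quality:", round(score, 2),
      "| classified low quality:", score < LOW_QUALITY_THRESHOLD)
```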
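Knowledge-Based Trust can be illustrated with a toy fixed-point iteration that alternates between estimating how likely each claimed fact is true and how accurate each source is. This simplified voting model only conveys the idea; it is not the published KBT algorithm, and all triples are made up.

```python
# Toy Knowledge-Based-Trust-style estimation over (source, subject, value).
from collections import defaultdict

claims = [  # hypothetical extracted facts
    ("site-a", "capital_of_australia", "Canberra"),
    ("site-b", "capital_of_australia", "Canberra"),
    ("site-c", "capital_of_australia", "Sydney"),
    ("site-a", "boiling_point_c", "100"),
    ("site-c", "boiling_point_c", "90"),
]

accuracy = {src: 0.8 for src, _, _ in claims}  # uniform prior

for _ in range(10):  # fixed-point iteration
    # 1) Support for each candidate value = accuracy of its supporters.
    belief = defaultdict(float)
    for src, subj, val in claims:
        belief[(subj, val)] += accuracy[src]
    totals = defaultdict(float)
    for (subj, _), b in belief.items():
        totals[subj] += b
    truth = {k: b / totals[k[0]] for k, b in belief.items()}
    # 2) A source's accuracy = mean truth probability of its claims.
    per_source = defaultdict(list)
    for src, subj, val in claims:
        per_source[src].append(truth[(subj, val)])
    accuracy = {s: sum(v) / len(v) for s, v in per_source.items()}

# site-c ends up with the lowest accuracy because it disagrees twice.
for src, acc in sorted(accuracy.items()):
    print(src, round(acc, 2))
```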
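For the user-behavior approach, a site quality score can be derived from simple statistics over visit durations; using the median with a cap is my assumption about which "statistical metric" to compute.

```python
# Map dwell-time measurements to a 0..1 site quality score.
import statistics

dwell_times_s = [12.0, 240.0, 95.0, 300.0, 8.0, 180.0]  # per-visit seconds

def dwell_quality(times: list[float], cap_s: float = 300.0) -> float:
    """Median dwell time, capped at cap_s and scaled to 0..1 (assumed)."""
    return min(statistics.median(times), cap_s) / cap_s

print("dwell-time quality:", round(dwell_quality(dwell_times_s), 2))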
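Quality propagation can be sketched as an iterative update in which each site mixes its own page-specific score with the average score of its linked neighbors; the mixing weight and the tiny graph are illustrative, and site "c" starts without any rating of its own.

```python
# Propagate quality scores along links between related sites.
graph = {  # hypothetical link neighborhoods
    "a": ["b", "c"],
    "b": ["a"],
    "c": ["a", "b"],
}
own_score = {"a": 0.9, "b": 0.5, "c": None}  # "c" is unrated

score = {n: (s if s is not None else 0.5) for n, s in own_score.items()}

ALPHA = 0.6  # assumed weight on page-specific vs. neighbor features
for _ in range(20):
    new = {}
    for n, nbrs in graph.items():
        nbr_mean = sum(score[m] for m in nbrs) / len(nbrs)
        own = own_score[n] if own_score[n] is not None else score[n]
        new[n] = ALPHA * own + (1 - ALPHA) * nbr_mean
    score = new

print({n: round(s, 2) for n, s in score.items()})
```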
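Finally, a phrase model can be sketched by relating phrase frequencies on rated sites to their quality scores and then aggregating a score for a new site. The frequency-weighted per-phrase average below is a deliberately simple stand-in for the patented method, and the phrases are invented.

```python
# Phrase model: phrase frequencies on rated sites -> aggregated score.
from collections import defaultdict

rated = [  # (phrase frequency per 1k tokens, human quality score)
    ({"peer reviewed": 2.0, "buy now": 0.0}, 0.9),
    ({"peer reviewed": 0.0, "buy now": 5.0}, 0.2),
    ({"peer reviewed": 1.0, "buy now": 1.0}, 0.6),
]

# Learn an average quality score per phrase, weighted by frequency.
weight_sum, freq_sum = defaultdict(float), defaultdict(float)
for freqs, quality in rated:
    for phrase, f in freqs.items():
        weight_sum[phrase] += f * quality
        freq_sum[phrase] += f

phrase_quality = {p: weight_sum[p] / freq_sum[p]
                  for p in freq_sum if freq_sum[p] > 0}

def site_score(freqs: dict[str, float]) -> float:
    """Frequency-weighted average of the learned phrase scores."""
    total = sum(freqs.get(p, 0.0) for p in phrase_quality)
    if total == 0:
        return 0.5  # assumed neutral prior for sites with no known phrases
    return sum(freqs.get(p, 0.0) * q
               for p, q in phrase_quality.items()) / total

print("new site:", round(site_score({"peer reviewed": 1.5, "buy now": 0.5}), 2))
```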
Metrics for quality assessment
- Number of new facts a website provides.
- Ease of annotating the data found on the website.
- Homogeneity of the website's content (focus on a single topic or entity).
- Level of structuring of the data (presence of tables, lists, etc.).
- Domain authority of the website.
- Economic efficiency of data collection from the website.
- Coverage: percentage of relevant data present on the website.
- Accuracy: correctness of the information on the website.
- Timeliness: up-to-dateness of the data on the website.
- Cost: costs associated with accessing the website's data.
- Metrics based on a knowledge graph:
  - Relatedness Metric
  - Notable Type Metric
  - Contribution Metric
  - Price Metric
- Website signals:
  - Originality of content.
  - Ratio of original to copied content.
  - Website layout.
  - Grammar.
  - Spelling.
  - Presence of inappropriate content.
- Quality of inbound links from other resources.
- Link Quality Score based on the quality of the linking resources.
- Duration of user visits to the site's resources.
- CTR of the website in search results.
- Selection duration: average time the website is displayed when selected.
- Layout score: evaluation of the quality of the website layout.
- Number of incoming links to the website.
- Number of unqualified sources that link to the website.
- Phrase-specific relative frequency of certain phrases on the website.
- Aggregated quality score based on the frequency of phrases.
- Web source accuracy (probability that the information provided is correct).
- Precision and recall of the extractor (performance of the programs that extract information from the website).
- Relevance of the content to the topic.
- Non-triviality of the facts presented.
- Target audience focus (e.g. broad or niche).
You can find more E-E-A-T signals in our detailed overview "How Google evaluates E-E-A-T? 80+ ranking factors for E-E-A-T" and the accompanying graphic:
What types of classifiers can be used for quality assessment?
Resources for this research in the SEO Research Suite
- Website quality signal generation
- Evaluating quality based on neighbor features
- Resource scoring adjustment based on entity selections
- Producing a ranking for pages using distances in a web-link graph
- Obtaining authoritative search results
- Determining a quality measure for a resource
- Classifying sites as low quality sites
- Scoring site quality
- Finding Quality in Quantity: The Challenge of Discovering Valuable Sources for Integration
- Analysis of MIDAS: Finding the Right Web Sources to Fill Knowledge Gaps
- Classifying resources using a deep network
- Predicting site quality