A major SEO industry leak is being unpacked, with lots of theories from the last decade finally getting some validation. Here’s what this leak means for your site’s SEO.
What was leaked?
The leak consists of internal API documentation used by the Google Search team. It describes over 14,000 attributes (with accompanying descriptions!) that can be used as part of the search algorithm.
While these attributes aren’t necessarily the exact levers used in search ranking, the documentation pulls back the curtain and gives SEOs some valuable insights.
How did it leak?
An anonymous source shared thousands of leaked Google search API documents with Rand Fishkin, co-founder of SparkToro.
Rand reported, “The leak appears to come from GitHub, and the most credible explanation for its exposure matches what my anonymous source told me on our call: these documents were inadvertently and briefly made public (many links in the documentation point to private GitHub repositories and internal pages on Google’s corporate site that require specific, Google-credentialed logins). During this probably accidental, public period between March and May of 2024, the API documentation was spread to Hexdocs (which indexes public GitHub repos) and found/circulated by other sources.”
For his detailed coverage of the leak, see the SparkToro article in the recommended reading list below.
What does it mean?
- SEOs are getting a behind-the-scenes look at the different levers that could be factored into ranking in the organic results.
- Just because an attribute exists doesn’t mean it’s in active use, but it does have the potential to be used and can be factored into SEO experimentation.
- Many of the attributes contradict things Google has said in the past, such as its position on site authority as a factor and on the potential penalisation of smaller sites.
Historically, most SEO theories have been based on the impacts the community as a whole has observed from optimisation testing.
(For example, optimising on-page headings can boost CTR, as Google sometimes uses these elements to replace page titles on the SERP.)
These attributes serve as extra context for those theories, helping SEOs frame their testing methodologies around potential levers.
All that being said, there’s already some division on the leak’s real impact.
Time will tell as the leak gets tried and tested by SEOs in the trenches.
To sum up the highlights:
- Google has repeatedly said that they don’t use anything like Domain Authority, but the leak revealed a potential attribute called siteAuthority.
- Google claims they don’t use clicks for rankings, but there are modules focused on click signals that represent users as voters and their clicks, including Chrome and cookie data, as votes.
- Google has said they don’t use Chrome for ranking, but there are modules related to Chrome clickstream data, particularly around sitelinks. Here, it looks like the list of sitelinks might be generated primarily from a site’s top pages in Chrome clickstream data.
- Google has an attribute called hostAge, which many theorise is used to sandbox fresh domains before determining where to rank them.
- Google has a specific flag that marks a site as a “small personal site”, which may harm those sites’ potential to compete in some niches.
- Industry leaders are not surprised to see confirmation of whitelists for certain topics like COVID-19 or politics.
- Google historically stated that quality rater scores didn’t affect rankings. However, several attributes reference using these scores as factors with specific notes that these ratings impact the model’s quality.
From content to backlinking, this leak further validates our holistic, user-focused approach to SEO for website builds and audits.
We’re still unpacking the trickle of thoughts and opinions – watch this space for more updates.
Recommended reading list:
- SparkToro: An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them
- iPullRank: Secrets from the Algorithm: Google Search’s Internal Engineering Documentation Has Leaked
- Search Engine Land: How SEO moves forward with the Google Content Warehouse API leak