Marketing Pulse Blog


Demystifying Google’s Algorithm: Top Learnings from the Leak

Written by
Chris Tatum

The inner workings of Google’s search engine are among the internet’s best-kept secrets. These systems are extremely complicated and crucial, and most outsiders don’t fully understand them. However, a recent significant leak exposed thousands of internal Google Search API documents, offering a rare look into how Google ranks content and operates its search engine. The WITHIN Team has carefully reviewed these documents, uncovering key insights and exploring their implications for marketers and brands.


Three Major Myths Dispelled By The Leak

Myth: Google Doesn’t Consider Site Authority When Ranking Sites

Questions have long been raised about the role of site authority in Google’s ranking algorithm. Although Google has previously denied using “site authority” as a measurement metric, the leaked documents suggest it does assess factors contributing to a website’s authority when ranking search results. While the exact impact on overall rankings and scoring remains unclear, the significance of site authority is supported by the longstanding importance of backlinks in Google’s algorithm. (Backlinks from reputable and relevant websites serve as a strong indicator of the quality and reliability of the content on the linked site.)

Myth: Clicks and Click Engagement Have No Impact On Organic Rankings

Despite common assumptions, clicks play a crucial role in your overall ranking within Google’s NavBoost system. (NavBoost uses past searches to predict which websites are most helpful for navigation-based queries, meaning searches where the user’s main goal is to find a specific website or location.) The leaked documents highlight that clicks, click-through rates, and the engagement of those clicks are key factors in the NavBoost ranking system. It was also revealed that Google uses a squashing function to compress and normalize click data. This method helps prevent manipulation by dampening the effect of very large click counts, so the data reflects genuine user behavior rather than artificial boosts from bots or coordinated click efforts.
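To make the idea of a squashing function more concrete, here is a minimal Python sketch. The leaked documents do not disclose the actual formula Google uses, so the function below (`squash`, with an invented `scale` parameter) is purely an assumption for illustration: it maps any raw click count onto a value between 0 and 1, with diminishing returns as the count grows.

```python
def squash(clicks: float, scale: float = 100.0) -> float:
    """Compress a raw click count into the range [0, 1).

    Hypothetical formula for illustration only; the leak does not
    reveal the real function or any of its parameters.
    """
    if clicks < 0:
        raise ValueError("clicks must be non-negative")
    return clicks / (clicks + scale)


for raw in (10, 100, 1_000, 100_000):
    print(raw, round(squash(raw), 3))
# 10 0.091
# 100 0.5
# 1000 0.909
# 100000 0.999
```

Notice that going from 1,000 to 100,000 clicks barely moves the squashed value. That is the property that would make a sudden flood of artificial clicks far less rewarding than it looks in raw numbers.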

Myth: Google Does Not Use a Sandbox or Site Segregation Tactics

There’s been a long-standing debate about whether Google uses a “sandbox” or site segregation measures for new websites. In search-engine optimization (SEO) terms, a “sandbox” refers to a probationary period where new websites or those lacking trust signals are kept separate from the main search index so they don’t have to compete with established sites. The leaked documents suggest that Google might use various factors, including data from Chrome users who interact with the site, to assess a website’s trustworthiness. Essentially, these sites likely go through a review process before being fully integrated into the main search index and competing for higher rankings.


A Deep Dive Into Google’s Multi-Faceted Ranking Systems

The recent data leak revealed various ranking systems employed by Google, each playing a unique role in search and information processing. These systems are integrated through Spanner, a database management system that synchronizes and processes data across Google’s global network. Below is a look at some of these key systems:

  • Trawler – The web crawling system
  • Alexandria – Core indexing system
  • SegIndexer – Places documents into tiers within the index
  • TeraGoogle – Secondary indexing system for long-term document storage
  • HtmlrenderWebkitHeadless – Rendering system for JavaScript
  • LinkExtractor – Extracts links from pages
  • WebMirror – System for managing canonicalization and duplication
  • Mustang – Primary scoring, ranking and serving system
  • Ascorer – Primary rankings algorithm that ranks pages prior to re-ranking adjustments
  • NavBoost – Re-ranking system based on clicks and user behaviors
  • FreshnessTwiddler – Re-ranking system for documents based on freshness
  • WebChooserScorer – Defines feature names used in snippet scoring
  • Google Web Server – The server that the frontend of Google interacts with
  • SuperRoot – The brain of Google search that sends and manages everything to present the results
  • SnippetBrain – System that generates snippets for results
  • Glue – System that pulls together universal results using individual user behavior
  • Cookbook – System for generating signals


Note: This information was taken from Mike King’s article titled “Secrets from the Algorithm: Google Search’s Internal Engineering Documentation Has Leaked.”
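The list above describes a primary ranking step followed by re-ranking adjustments based on clicks and freshness. The Python sketch below shows that general pattern only. It is not Google’s implementation: the `Doc` fields, the `click_boost` and `freshness_boost` functions, and every number in them are invented for illustration, and the leak does not reveal how the real adjustments are calculated or combined.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Doc:
    url: str
    base_score: float    # stand-in for an initial relevance score
    click_signal: float  # hypothetical click signal between 0 and 1
    age_days: int


def click_boost(doc: Doc) -> float:
    # Invented multiplier: stronger click signals raise the score.
    return 1.0 + 0.5 * doc.click_signal


def freshness_boost(doc: Doc) -> float:
    # Invented multiplier: small lift for documents under 30 days old.
    return 1.1 if doc.age_days <= 30 else 1.0


def rank(docs: List[Doc], rerankers: List[Callable[[Doc], float]]) -> List[Doc]:
    """Sort by base score after applying each re-ranking multiplier."""
    def final_score(doc: Doc) -> float:
        score = doc.base_score
        for adjust in rerankers:
            score *= adjust(doc)
        return score
    return sorted(docs, key=final_score, reverse=True)


docs = [
    Doc("old-guide", 1.0, 0.0, 400),
    Doc("fresh-popular", 0.8, 0.9, 10),
    Doc("fresh-unclicked", 0.9, 0.0, 5),
]
print([d.url for d in rank(docs, [])])
print([d.url for d in rank(docs, [click_boost, freshness_boost])])
```

In this toy example, the page with the lowest initial score moves to the top once the click and freshness adjustments are applied, which is the kind of effect a re-ranking layer can have on results.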


Leveraging Google’s Leaked Insights

While the leak doesn’t provide an exact roadmap, it does provide valuable insights. Here’s how to leverage them going forward:

1. Create a Comprehensive SEO Content Strategy: Focus on creating high-quality, valuable content that is relevant to the page it is posted on and any pages it links to. Relevant, well-linked content can improve a page’s authority in Google’s algorithm, while incorrectly linked content can have the opposite effect. At WITHIN, our approach has been in line with this best practice, emphasizing the importance of relevance in both on-site and off-site content.

2. Enhance Page Engagement: Every piece of content and each page should provide users with exactly what they are searching for. At WITHIN, our focus on optimizing user experience and providing relevant content has proven effective in boosting engagement and maintaining or enhancing SERP standings. Keep in mind that poor engagement indicators, such as high bounce rates and low interaction rates, can negatively impact your SERP rankings, thus reducing visibility and site traffic.

3. Optimize Site Navigation: Implement structured site navigation to clearly define hierarchy, categorize content, and guide users effectively. WITHIN’s approach to building clear and effective navigation has long supported these best practices, as it enhances user experiences while also helping Google’s bots efficiently find, crawl, and index your pages – which is crucial for maintaining your site’s ranking. 

4. Manage Product Reviews: Google consistently highlights the importance of reviews, and the recent leak suggests that negative reviews can adversely affect search engine results page (SERP) indexing. At WITHIN, we guide our clients to adopt best practices for managing reviews, emphasizing thoughtful responses to negative feedback, which can improve the customer’s experience and potentially result in more positive reviews. Additionally, ensure all reviews on your site are properly displayed and use structured data to accurately communicate this information to Google’s bots.
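As a rough illustration of the structured data mentioned above, the Python sketch below builds a schema.org `Product` object with an `AggregateRating` and serializes it as JSON-LD. The product name and rating figures are placeholder values, and `review_markup` is a hypothetical helper name, not part of any library.

```python
import json


def review_markup(name: str, rating_value: float, review_count: int) -> str:
    """Return JSON-LD describing a product and its aggregate rating."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,
            "reviewCount": review_count,
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only.
print(review_markup("Example Widget", 4.4, 89))
```

The resulting JSON would typically be placed on the product page inside a `<script type="application/ld+json">` tag. The values should always match the reviews actually shown to users on that page.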

5. Develop Localized SEO Strategies: Tailor your SEO strategies so that users can easily find your products or services in their local area. Ensure your website includes language options for the countries you serve and that stock and inventory information is up to date for each e-commerce site location. Keeping your business’s name, address, and phone number consistent across all directories is also crucial for local SEO. At WITHIN, integrating these practices into our clients’ strategies is standard, ensuring they align with local requirements and the latest industry insights.
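One simple way to audit name, address, and phone number consistency is to normalize each listing and compare it against a reference. The Python sketch below assumes you have already collected your listings by hand or by export; the directory names and business details are made-up sample data, and `normalize_nap` and `inconsistent_listings` are hypothetical helper names.

```python
import re
from typing import Dict, List, Tuple

Nap = Tuple[str, str, str]  # (name, address, phone)


def normalize_nap(name: str, address: str, phone: str) -> Nap:
    """Lowercase text, collapse whitespace, and keep only phone digits."""
    def clean(text: str) -> str:
        return " ".join(text.lower().split())
    return clean(name), clean(address), re.sub(r"\D", "", phone)


def inconsistent_listings(listings: Dict[str, Nap]) -> List[str]:
    """Return the sources whose details differ from the first listing."""
    items = list(listings.items())
    reference = normalize_nap(*items[0][1])
    return [src for src, nap in items[1:] if normalize_nap(*nap) != reference]


# Made-up sample data for illustration only.
listings = {
    "website": ("Example Cafe", "12 Market Street, Springfield", "(555) 010-0123"),
    "directory_a": ("example cafe", "12 Market Street,  Springfield", "555-010-0123"),
    "directory_b": ("Example Cafe", "12 Market Street, Springfield", "555-010-0199"),
}
print(inconsistent_listings(listings))  # ['directory_b']
```

This check only catches differences in formatting-insensitive text and digits. Variations such as “St.” versus “Street” or a missing country code would still be flagged and need a manual review.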

6. Build Authority with Links: WITHIN has always advocated for adopting ethical (or “white hat”) strategies to create valuable content that naturally attracts more links to your site. Supplementing these efforts with performance PR and sponsored content can boost your brand’s visibility. We understand that these strategies are crucial for gaining high-quality backlinks and improving your site’s authority — key factors in Google’s ranking algorithm. Be aware that Google closely watches these practices, and manipulating them could lead to penalties. 

7. Optimize Title Tags and Click Rates: The recently released documents highlight the importance of title tags in search result determinations, emphasizing the need for optimized title tags to ensure Google indexes your pages accurately. Additionally, click-through rates have been deemed a crucial metric for user engagement within Google’s algorithm, underscoring the importance of crafting compelling titles that drive clicks. At WITHIN, optimizing these elements has always been a core practice, validated further by the latest leaks.

8. Create Concise, Valuable Content: We have always advised our clients to create concise and original content, which not only aligns with Google’s preferences but also enhances user engagement and comprehension. When creating content for SEO, consider these three key points:

  • The length of your content, including blogs and homepages, isn’t as important as you might think. Google often truncates long content, so it’s crucial to place the most important information at the beginning.
  • Google values succinct and precise content. Adding unnecessary length won’t enhance your rankings, so keep your writing to the point.
  • Google prioritizes unique and original content. Crafting engaging and distinctive content is essential for standing out and boosting your search rankings.


At WITHIN, we have always prioritized understanding as much as possible about SERPs and the essential strategies for driving website traffic. We take pride in our proactive approach to SEO, from managing reviews and optimizing navigation to building authority through ethical link-building practices. 


What Does This Mean For SEO Going Forward?

While the leaked documents offer a rare look into Google’s ranking algorithms, they ultimately reinforce the core principles of user-centric SEO. The key takeaways are clear: prioritizing user experience with high-quality content, optimized navigation, and ethical practices is essential for long-term success. These documents confirm that Google’s systems are designed to prevent manipulation and reward websites that genuinely meet user needs. By partnering with SEO experts like the team at WITHIN, and focusing on genuine user benefits rather than trying to manipulate Google’s algorithms, you can position your brand for success.
