Google’s Position on Crawl Budget

http://webmasters.googleblog.com/2017/01/what-crawl-budget-means-for-googlebot.html

  • Crawl budget is something smaller sites (fewer than a few thousand pages) do not have to worry about
  • Crawl rate
    • How many simultaneous connections Googlebot uses to crawl a site, and how long it waits between fetches
    • 2 factors affect crawl rate
      • Crawl health: fast site, faster crawl; slow site or lots of errors (5xx), slower crawl
      • Search Console limit: mostly useful for slowing down crawl
  • Crawl demand
    • Low demand, less crawling
    • 2 primary factors
      • Popularity: popular sites are crawled more often
      • Staleness: keeping URLs from becoming stale in the index
  • Crawl budget: number of URLs Google can and wants to crawl
  • Other factors
    • Low-value URLs can reduce crawling and indexing (see the log-analysis sketch after this list)
      • Faceted navigation and session IDs
      • On-site duplicate content
      • Soft error pages
      • Hacked pages
      • Infinite spaces and proxies
      • Low-quality and spam content
    • Googlebot would rather focus on valuable pages on the site
    • Redirect chains are bad for crawling (see the redirect-chain sketch after this list)
    • Alternate URLs (AMP, hreflang) and embedded content (CSS, JavaScript) all count toward crawl budget
  • Crawling helps get your content indexed but is not a ranking signal
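As a rough illustration of how low-value URLs (faceted navigation, session IDs) and 5xx errors eat into crawl budget, the sketch below groups Googlebot requests from a server access log by path. It assumes a combined-log-format file named `access.log`; both the file name and the log pattern are placeholders and will likely need adjusting for a real server.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Minimal combined-log-format pattern; adjust to match your server's actual format.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<url>\S+) [^"]+" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def crawl_profile(log_path):
    """Count Googlebot hits per path, flagging parameterized URLs and 5xx responses."""
    paths, parameterized, errors = Counter(), Counter(), Counter()
    with open(log_path) as fh:
        for line in fh:
            m = LOG_LINE.match(line)
            # Note: the user-agent string can be spoofed; this is only a first pass.
            if not m or "Googlebot" not in m.group("agent"):
                continue
            parts = urlsplit(m.group("url"))
            paths[parts.path] += 1
            if parts.query:                        # faceted navigation / session IDs show up as query strings
                parameterized[parts.path] += 1
            if m.group("status").startswith("5"):  # 5xx responses slow Googlebot's crawl rate
                errors[parts.path] += 1
    return paths, parameterized, errors

if __name__ == "__main__":
    paths, parameterized, errors = crawl_profile("access.log")  # hypothetical file name
    print("Top crawled paths:", paths.most_common(10))
    print("Parameterized URLs consuming budget:", parameterized.most_common(10))
    print("Paths returning 5xx to Googlebot:", errors.most_common(10))
```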
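Similarly, here is a minimal sketch for spotting redirect chains using only the Python standard library. The sample URL is a placeholder, and some servers answer HEAD requests differently than GET, so treat the output as indicative.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Disable automatic redirect following so every hop in a chain is visible."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

_opener = urllib.request.build_opener(_NoRedirect)

def redirect_chain(url, max_hops=10):
    """Return the list of URLs visited before a non-redirect response (or the hop limit)."""
    chain = [url]
    for _ in range(max_hops):
        try:
            _opener.open(urllib.request.Request(chain[-1], method="HEAD"))
            break                                  # 2xx: the chain ends here
        except urllib.error.HTTPError as err:
            location = err.headers.get("Location")
            if err.code in (301, 302, 303, 307, 308) and location:
                chain.append(urljoin(chain[-1], location))  # follow one hop and keep looking
            else:
                break                              # 4xx/5xx or a redirect without a Location header
        except urllib.error.URLError:
            break                                  # network failure: stop here
    return chain

if __name__ == "__main__":
    # "https://example.com/old-page" is a placeholder; each extra hop wastes crawl budget.
    for hop in redirect_chain("https://example.com/old-page"):
        print(hop)
```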