URL Parameters in Webmaster Tools

Maile Ohye

Developer Programs Tech Lead

Google

URL Parameters in Webmaster Tools

Advanced feature

Some sites already have high crawl coverage as determined by Google.

Improper actions can result in pages not appearing in search.

URL Parameters hint (vs. directive)

  • URL Parameters settings can be a helpful hint

  • robots.txt or meta noindex is more of a directive than a hint

Issue: Inefficient crawling

www.googlestore.com

  • 158 products
  • 380,000 URLs identified by Googlebot!

Issue: Crawling redundant or duplicative content/items

http://www.googlestore.com/googlesearch.aspx?category=You%20Tube

http://www.googlestore.com/googlesearch.aspx?category=You%20Tube&size=M

Helps Google understand parameters so it can crawl your site more efficiently

  • Crawl your site more efficiently (decrease number of duplicates)
    • Saves bandwidth
    • Helps more unique, fresh content get indexed

  • For removals, go to URL Removals in Webmaster Tools

On-page markup can still be applied

Page-level markup

  • rel="canonical"
  • rel="prev" and rel="next"
  • rel="alternate" hreflang="x"
  • noindex

is still taken into consideration if the page is crawled.

* Make sure we can crawl your page (i.e., it is not disallowed by robots.txt) if you want page-level markup applied!
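
One way to check the point above locally is with Python's standard urllib.robotparser module. This is only a sketch: the robots.txt rules and the is_crawlable helper are made up for illustration, and the standard-library parser may not match Googlebot's own rule matching in every case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT,
                 agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Page-level markup on this URL can be seen, because the page can be crawled.
print(is_crawlable("http://www.example.com/page.php?key=value"))
# Markup on this URL would not be seen, because the path is disallowed.
print(is_crawlable("http://www.example.com/private/page.html"))
```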

URLs eligible for the feature

http://www.googlestore.com/googlesearch.aspx?category=office

http://www.googlestore.com/googlesearch.aspx?category=Wearables

http://www.googlestore.com/googlesearch.aspx?category=Wearables&size=M

http://www.example.com/page.php?key=value&key2=value2

which we interpret as equivalent to

http://www.example.com/page.php?key2=value2&key=value
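
The equivalence above can be illustrated by sorting the key=value pairs before comparing two URLs. This is a minimal sketch using Python's urllib.parse; the normalize_param_order helper is a hypothetical name, and it is not a description of how Google compares URLs internally.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def normalize_param_order(url: str) -> str:
    """Return the URL with its query parameters sorted by key, then value."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    return urlunsplit(parts._replace(query=urlencode(sorted(pairs))))

a = normalize_param_order("http://www.example.com/page.php?key=value&key2=value2")
b = normalize_param_order("http://www.example.com/page.php?key2=value2&key=value")
print(a == b)  # both orderings normalize to the same string
```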

Ineligible URLs

http://www.example.com/Wearables++Youtube++size+M.axd

http://example.com/cancun+hotel+zone-hotels-1-23-a7a141343.html

http://example.com/hotels/cancun/a7a141343.html
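
The difference between the eligible and ineligible examples is whether values are carried as key=value pairs in the query string or folded into the path. A rough sketch of that check, assuming a hypothetical helper named has_key_value_parameters:

```python
from urllib.parse import urlsplit, parse_qsl

def has_key_value_parameters(url: str) -> bool:
    """True if the URL's query string holds at least one key=value pair.

    parse_qsl drops keys with blank values by default, so "?foo" alone
    does not count as a pair in this sketch.
    """
    return len(parse_qsl(urlsplit(url).query)) > 0

# Eligible: the category and size are expressed as key=value pairs.
print(has_key_value_parameters(
    "http://www.googlestore.com/googlesearch.aspx?category=Wearables&size=M"))
# Ineligible: the same kind of information is encoded in the path.
print(has_key_value_parameters(
    "http://www.example.com/Wearables++Youtube++size+M.axd"))
```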

Step 1: Specify parameters that do not change content

  • Do I have parameters that don't affect the page content (e.g., SID, affiliateID, or tracking-id)?

If so, likely mark them as "does not change content."

  • This results in the "One representative URL" setting in Webmaster Tools
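
To see what "one representative URL" means in practice, the sketch below strips parameters that do not change content so that several URL variants collapse to one. The parameter names come from the examples in this step (lowercased for matching); the representative_url helper is hypothetical and only illustrates the idea.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed list of parameters that do not change page content.
NO_CONTENT_CHANGE = {"sid", "affiliateid", "tracking-id"}

def representative_url(url: str) -> str:
    """Drop parameters that do not affect content, keeping the rest in order."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in NO_CONTENT_CHANGE]
    return urlunsplit(parts._replace(query=urlencode(kept)))

variants = [
    "http://www.example.com/page.php?key=value&SID=abc123",
    "http://www.example.com/page.php?key=value&affiliateID=77",
    "http://www.example.com/page.php?key=value",
]
# All three variants collapse to a single representative URL.
print({representative_url(u) for u in variants})
```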

Step 2a: Specify parameters that change content

Step 2b: Specify Googlebot's preferred behavior

Sort parameter

Changes the order content is presented

  • sort=price_ascending
  • rankBy=bestSelling
  • order=highest-rated
  • sort=newest

Example of sort parameter pulldown menu

1. Identify the sort parameter

2. Specify Googlebot's preferred behavior for URLs with this parameter

Option 1: Sort parameter never displayed by default?

  • Is the sort parameter optional throughout my entire site (i.e. not displayed by default, but only with manual selection)?
  • Can Googlebot discover everything useful when the sort parameter isn't displayed?

If "yes," you can likely specify "Crawl No URLs" for this parameter.

Verify that the example URLs displayed aren't canonical, and that the canonical can be reached with JavaScript turned off during navigation.

Option 2: Same sort values site-wide?

  • Are the same sort values used consistently across my entire site? (A counterexample: sort=year-issued offered as a sort option for coins but not for coin albums.)
  • When a user changes the sort value, is the total number of items unchanged?

If "yes," you can likely specify "Only URLs with value x" for your sort parameter, where x is one of the sorting values used site-wide.

Option 3: Let Googlebot decide

If neither rule applies, select "Let Googlebot decide."
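
The three options for a sort parameter can be read as a small decision procedure. The sketch below encodes it with a hypothetical function whose boolean inputs are your own answers to the questions under Options 1 and 2; the returned strings mirror the setting names quoted in these slides. It is a reading aid, not a tool that inspects your site.

```python
def suggest_sort_setting(optional_everywhere: bool,
                         discoverable_without_sort: bool,
                         same_values_sitewide: bool,
                         item_count_unchanged: bool) -> str:
    """Map answers to the Option 1 and Option 2 questions onto a likely setting."""
    # Option 1: the sort parameter never appears by default, and nothing
    # useful is lost when it is absent.
    if optional_everywhere and discoverable_without_sort:
        return "Crawl No URLs"
    # Option 2: the same sort values are used site-wide, and sorting does
    # not change the total number of items.
    if same_values_sitewide and item_count_unchanged:
        return "Only URLs with value x"
    # Option 3: neither rule applies.
    return "Let Googlebot decide"

print(suggest_sort_setting(False, False, True, True))
```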

Narrows

Filters the content on the page by showing a subset of the total items.

size=M

less_than=25

color=blue

Example of "narrows" on an e-commerce site

Narrows

  • If the "narrows" parameter shows less useful content that's a subset of the content from the more useful URL without the "narrows" parameter, you might be able to specify "Crawl No URLs."

Useful: category=You%20Tube

Less useful: category=You%20Tube&size=M

    • But verify a few things first...

Narrows and "Crawl No URLs"

  • Be sure the "narrows" parameter won't also filter out useful pages you'd like crawled and surfaced in search results (e.g., perhaps brand or category pages)

  • Verify that the example URLs shown in Webmaster Tools provide content that is less useful to searchers than the parent URL

Narrows (cont.)

If "Crawl No URLs" isn't optimal for your site, then perhaps select "Let Googlebot decide."

Specifies

Determines the content displayed on a page.

  • itemid=android-t-shirt
  • SKU=495

Usually "Crawl every URL."

Translates

Unless you want to exclude certain languages from being crawled or made available in search results (e.g., auto-generated translations), select "Crawl every URL."

Translates (cont.)

It is a best practice to place languages in a subdirectory or subfolder rather than in a parameter, to help search engines more easily understand your site structure.

Paginates

Displays one component page of a multi-page sequence.

  • page=3
  • viewItems=10-30
  • start-index=20

Nearly always "Crawl every URL."

Multiple parameters in one URL

example.com/item.php?sku=234&page=3&sortBy=price&sortOrder=lowToHigh

Imagine all URLs begin as eligible for crawling, then apply each setting as a process of elimination, not inclusion.
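
The process of elimination can be sketched as follows: every URL starts out eligible, and each parameter's setting can only rule it out. The settings table, the rule encoding, and the is_eligible helper are all hypothetical and chosen for illustration; treating unknown parameters as eliminating nothing is also an assumption of this sketch.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical settings: parameter name -> rule.
#   "every"      keeps all URLs
#   "none"       eliminates any URL that carries the parameter
#   ("only", x)  eliminates URLs whose value for the parameter is not x
SETTINGS = {
    "sku": "every",
    "page": "every",
    "sortBy": ("only", "price"),
    "sortOrder": ("only", "lowToHigh"),
}

def is_eligible(url: str, settings=SETTINGS) -> bool:
    """Start from eligible; return False as soon as any rule eliminates the URL."""
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        rule = settings.get(key, "every")  # assumption: unknown keys eliminate nothing
        if rule == "none":
            return False
        if isinstance(rule, tuple) and rule[0] == "only" and value != rule[1]:
            return False
    return True

print(is_eligible("example.com/item.php?sku=234&page=3&sortBy=price&sortOrder=lowToHigh"))
print(is_eligible("example.com/item.php?sku=234&page=3&sortBy=price&sortOrder=highToLow"))
```

No single setting "includes" a URL here; a URL remains eligible only if none of its parameters eliminates it.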

Recap

Utilize URL Parameters for more efficient crawling

    • Specify parameters that do not change content
    • Specify parameters that change content
      • If you can't determine, don't guess: "Let Googlebot decide"

Recap (cont.)

  • Sorts
    • If parameter never exists in URL by default: "Crawl no URLs"
    • If parameter values are used consistently site-wide: "Crawl URLs with value x"
  • Narrows: for non-useful filters "Crawl no URLs" (but be sure to double-check :)
  • Specifies: usually "Crawl every URL"
  • Translates: usually "Crawl every URL"
  • Paginates: usually "Crawl every URL"

Thanks for your time!

URL Parameters blog post

http://goo.gl/MDgoE

URL Parameters help center article

http://goo.gl/pC1Eu

Google Webmaster Central

http://www.google.com/webmasters/