Thursday, July 21, 2011

URL parameters

Google’s goal is to crawl your site as efficiently as possible. Crawling and indexing pages with identical content is an inefficient use of our resources. It can limit the number of pages we can crawl on your site, and duplicate content in our index can hinder your pages' performance in our search results. Duplicate content often occurs when sites make the same content available via different URLs—for example, by using session IDs or other parameters, like this:

http://www.example.com/products/women/dresses/green.htm
http://www.example.com/products/women?category=dresses&color=green
http://example.com/shop/index.php?product_id=32&highlight=green+dress&cat_id=1&sessionid=123&affid=431

In this case, all these URLs point to the same content: a collection of real green dresses.

When Google detects duplicate content, such as variations caused by URL parameters, we group the duplicate URLs into one cluster and select what we think is the "best" URL to represent the cluster in search results. We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL. Consolidating properties from duplicates into one representative URL often provides users with more accurate search results.
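
To make the idea of clustering concrete, here is a minimal Python sketch of grouping URLs that differ only in parameters that do not change the page content. It is an illustration only, not how Google actually detects duplicates: the `IGNORED_PARAMS` set, the function names, and the "shortest URL wins" rule for picking a representative are all assumptions made for this example.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters assumed not to affect page content.
IGNORED_PARAMS = {"sessionid", "affid"}


def canonical_key(url, ignored=IGNORED_PARAMS):
    """Return the URL with ignored parameters dropped and the rest sorted."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in ignored
    )
    # Lowercase the host and drop the fragment; neither selects different content.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def cluster_urls(urls, ignored=IGNORED_PARAMS):
    """Group URLs that share the same canonical key."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[canonical_key(url, ignored)].append(url)
    return dict(clusters)


def pick_representative(cluster):
    """Assumed heuristic: prefer the shortest URL, breaking ties alphabetically."""
    return min(cluster, key=lambda u: (len(u), u))
```

Calling `cluster_urls` on a list of crawled URLs returns a dictionary that maps each canonical key to the URLs sharing it, and `pick_representative` then chooses one URL per cluster. A real system would also compare the page content itself, which is why the first two example URLs above would not be grouped by this parameter-only sketch.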

To improve this process, we recommend using the parameter handling tool to give Google information about how to handle URLs containing specific parameters. We'll do our best to take this information into account; however, in some cases the provided suggestions may do more harm than good for a site.
