Wednesday, April 28, 2010

Why Did Google Index My Blogger Archive?

Because the archive pages of your blog repeat the content of your posts by default, the "duplicate content penalty" in search engine ranking is a recurring controversy on Blogger discussion forums. Some people doubt that such a penalty exists and call it a myth. Myth or not, you might want to stop search engines from indexing your archive pages so that they never appear in search results; visitors arriving from a search would then land only on the permalinks of your actual posts, not on the archives.

But wait! Doesn't your Blogger template already contain the following lines, which instruct Googlebot not to index your archive pages?
<b:if cond='data:blog.pageType == &quot;archive&quot;'>
<meta content='NOINDEX' name='ROBOTS'/>
</b:if>
Why, then, do the archive pages still appear in search results? One reason could be that you changed your template. The template may be correct now (you can verify this by viewing the page source of a rendered archive page and checking for a meta robots tag), but at some point in the past your archive pages did not contain this tag, and that is when Google indexed them. To confirm this hypothesis, check the source of the cached copy of the archive page on Google.
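
If you would rather not scan the page source by eye, the check can be scripted. The following is only a minimal sketch using Python's standard library: it assumes you have already saved the page source (of either the live archive page or the cached copy) into a string, and the names `RobotsMetaFinder` and `has_noindex` are made up for this illustration, not part of Blogger or any Google tool.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the content value of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us;
        # self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.directives.append(attr_map.get("content", "").lower())


def has_noindex(html_source):
    """Return True if any meta robots tag in the source contains 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html_source)
    return any("noindex" in directive for directive in finder.directives)


# Hypothetical page source, pasted in by hand for the example.
sample = "<html><head><meta content='NOINDEX' name='ROBOTS'/></head><body></body></html>"
print(has_noindex(sample))
```

Run it once on the source of the live archive page and once on the source of the cached copy. If the live page reports the tag and the cached copy does not, that supports the explanation above.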