Thursday, 31 May 2007

Blogger feeds, Google crawls - Sitemap test over, y'all please come back now!






My attempt at a new sitemap has "taken" as much as it could, so the experiment's over and I've deleted my "don't come if you don't want a slow-loading page" post. Google's picking up the same number of URLs from the home page as before, no more.

I should have checked more thoroughly first. The only Blogger feed Google Webmaster Central will accept for their Webmaster tools sitemaps (quickie on Google Sitemaps for those unfamiliar with 'em) is the atom.xml feed, because it's the only one in the root directory of files stored by Google for Blogger blogs.

Kirk pointed out that New Blogger's base atom.xml feed shows just 25 entries anyway, regardless of how many posts you've set to display on your main blog home page, whether fewer or more than 25. This is different from Old Blogger, where the feed showed exactly the number of posts you'd set to show on your home page. And boy do I have some editing of old posts to do, as many of them mentioned the importance of this point.

Plus, that base feed is ordered by "date updated", not "date published" (for more on Blogger feed ordering and sorting, see Kirk's post for the full lowdown). Unfortunately you can't change that when you add the feed URL to Webmaster Central to use it as a sitemap, even though you can change it elsewhere, e.g. in the feed URL that you give to services like Feedburner.

This means that the base Blogger feed shows the 25 latest posts that you've published or updated - yes, including old posts that you've just edited - rather than your 25 newest posts.
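To see why that matters, here's a minimal sketch in Python of the two orderings. The post titles and dates are made up purely for illustration, the limit is 2 rather than 25 to keep it short, and `latest` is just a hypothetical helper of mine, not anything Blogger provides.

```python
from datetime import date

# Hypothetical posts as (title, published, updated); the dates are invented.
posts = [
    ("old post, just edited", date(2006, 1, 10), date(2007, 5, 30)),
    ("newest post", date(2007, 5, 29), date(2007, 5, 29)),
    ("second newest", date(2007, 5, 20), date(2007, 5, 20)),
    ("older, untouched", date(2007, 4, 1), date(2007, 4, 1)),
]

def latest(posts, key_index, limit):
    """Return the titles of the `limit` most recent posts by the chosen date field."""
    ordered = sorted(posts, key=lambda p: p[key_index], reverse=True)
    return [p[0] for p in ordered[:limit]]

# Ordered by "date updated" (what the base feed does): the edited old post
# jumps to the top and pushes a genuinely new post out of the list.
by_updated = latest(posts, 2, 2)

# Ordered by "date published": only the newest posts appear.
by_published = latest(posts, 1, 2)

print(by_updated)
print(by_published)
```

With the "updated" ordering, editing one old post is enough to knock a newer post off the end of the feed.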

So the feed only pointed the Googlebot to my last 25 updated posts, not the universe of posts I've put up since my domain name change. Wail. However, Kirk also pointed out, quite rightly, that using a feed as a sitemap isn't really appropriate for blogs. Hopefully all the pages on the blog will get picked up by Google eventually, as they link to each other and the Googlebot follows links. It's just going to take longer than I'd hoped for Google to crawl and index my post-domain name change posts. Bear with me as I may well publish a post with those links, just to help it on its way.

And I hope my sudden 30% drop in daily visitors since the domain name change was just a blip with Google rather than the result of the change. Quite depressing, as I'd managed to build up a PageRank of 6 before the change and now it's 0, at least until the next Google public update. Unfortunately my domain name change came at the wrong time, so it never got picked up in the last update, which I gather was around the end of April or beginning of May. Ah well, c'est la vie.

I'll be posting on my trials and tribulations following the domain name change, with tips and pitfalls to avoid, of course, once I've actually dug myself out of the current pits!

2 comments:

Phil Rossi said...

Hi there!

Did you solve this blogger base feed of 25 posts issue? It's just about to screw me.

Thanks!
Phil Rossi

Improbulus said...

Phil, yes. Going to be doing a separate post just on Blogger feeds soon so keep your eyes peeled. Meanwhile, use this for your feed URL:
http://YOURBLOGNAME.blogspot.com/feeds/posts/default?max-results=100 (or whatever number you want, I've tried up to 500!)

If you blog via FTP use
http://www.blogger.com/feeds/YOURBLOGID/posts/default?max-results=100
(how to find out your blog ID)
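
If you'd rather build those URLs in a script than type them by hand, here's a rough Python sketch. The two URL patterns are the ones given above; the function names and the default of 100 are just my own choices for illustration, and the blog name and ID are placeholders you'd swap for your own.

```python
from urllib.parse import urlencode

def blogspot_feed_url(blog_name, max_results=100):
    """Build the feed URL for a blog hosted on blogspot.com (hypothetical helper)."""
    query = urlencode({"max-results": max_results})
    return f"http://{blog_name}.blogspot.com/feeds/posts/default?{query}"

def ftp_blog_feed_url(blog_id, max_results=100):
    """Build the feed URL for a blog published via FTP, using its blog ID (hypothetical helper)."""
    query = urlencode({"max-results": max_results})
    return f"http://www.blogger.com/feeds/{blog_id}/posts/default?{query}"

# Placeholders only: replace YOURBLOGNAME / YOURBLOGID with your own values.
print(blogspot_feed_url("YOURBLOGNAME"))
print(ftp_blog_feed_url("YOURBLOGID", max_results=500))
```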