Recently I found a paper, "SVMs for the Blogosphere: Blog Identification and Splog Detection" by Pranam Kolari, Tim Finin and Anupam Joshi, on identifying blogs and distinguishing them from non-blogs. That's not necessarily as easy as it might sound, especially if you're trying to automate the process and have software rather than humans do it.
It concentrated on the detection of spam blogs, but part of it dealt with how, as part of their research, they identified blogs hosted on what they found to be the most-used blogging software or blogging services: Blogger's Blogspot, Microsoft's MSN and the like (mainly for the purpose of then excluding those blogs from their study). They based this analysis on the websites returned from random searches on Technorati, the leading blogosphere search engine, from which they collected about half a million "live" blog home pages. The study was conducted from May to August 2005 and the paper is copyright 2006, so the data seems relatively up to date. They also regularly monitored Weblogs.com for data on blog updates, to confirm that the Technorati data was indeed a good sampling of the blogosphere. The roughly 5 million blog home pages they collected in this way generally matched the Technorati results in the relative order of blog hosting popularity, though not in the exact position of each host.
Based on the Technorati queries part of their study, it seems that the most commonly-used blogging platforms are:
Note that the method used in the SVMs paper only looks at the relative number of blog hosts, gleaned from the domain names, rather than at blogging software. A lot of Blogger users publish to their own domains or servers instead of using Blogspot.com, and similarly many users of Wordpress, Movable Type and so on have their own domains, so just going by the domain name will leave those users out of the equation. However, I still think it's useful as a rough guide. Certainly, these results reassure me that if I want to keep producing posts which are helpful to the greatest number of bloggers, then continuing to focus mainly on Blogger is fine, as Blogger clearly still has the biggest user base.
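To make the domain-name approach (and its blind spot) a bit more concrete, here's a minimal sketch in Python of how you might tally blog hosts from a list of blog URLs. This is only my illustration, not the code or method from the paper: the suffix table, the labels, the function names and the example URLs are all made up for the purpose, and the MSN domain in particular is my assumption.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical suffix-to-host table, for illustration only.
# The MSN entry is an assumption, not taken from the paper.
HOST_SUFFIXES = {
    "blogspot.com": "Blogger (Blogspot)",
    "spaces.msn.com": "MSN",
    "livejournal.com": "LiveJournal",
}

def classify_host(url):
    """Return a host label based purely on the URL's domain name."""
    hostname = (urlparse(url).hostname or "").lower()
    for suffix, label in HOST_SUFFIXES.items():
        # Match the domain itself or any subdomain of it.
        if hostname == suffix or hostname.endswith("." + suffix):
            return label
    # Blogs published to the owner's own domain all end up here,
    # whatever blogging software actually produced them.
    return "other / own domain"

def tally_hosts(urls):
    """Count how many URLs fall under each host label."""
    return Counter(classify_host(url) for url in urls)

if __name__ == "__main__":
    sample = [
        "http://example.blogspot.com/",
        "http://someone.livejournal.com/",
        "http://www.example.com/blog/",  # could be Blogger via FTP, Wordpress, etc.
    ]
    for label, count in tally_hosts(sample).most_common():
        print(label, count)
```

The last sample URL shows the caveat: a blog on its own domain lands in the catch-all bucket, so the tally says nothing about which tool produced it.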
I also found an analysis by Elise Bauer carried out back in February 2005, where she looked at how many sites Google reported as linking to sites on Blogspot.com, Livejournal.com etc, what she called "Google Share". It's obviously out of date now, but still useful as a comparison. Again, it looks only at the domain names, so the same caveats apply.
In February 2005, according to this method, the top two weblog tools were Blogger and LiveJournal, with Diaryland next, quite some way behind. The two analyses used different methods, so they can't be compared exactly, but it looks as though MSN has taken over LiveJournal's no. 2 spot in the blogosphere in just a few months (remember, the SVMs study ended in August 2005). It would certainly be interesting to compare the results if she carries out a similar analysis again soon.
I'd love to see more accurate statistics, though, pulled out just for the purpose of analysing the relative "market share" of the various blogging systems (looking at the meta "generator" tags of blogs, for instance, which would be a more accurate way of identifying the blogging software used than looking at their domains). It would be very interesting to track the changes in the comparative popularity of the different blogging platforms over time, and see if there are any broad trends.
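For what it's worth, reading the generator tag isn't hard in itself. Here's a rough sketch in Python, using only the standard library's html.parser module, that pulls the content of a meta "generator" tag out of a page's HTML. The class and function names are my own invention, and it assumes you've already fetched the page's HTML as a string. Bear in mind that not every blog includes a generator tag, and templates can remove or change it, so this would still only be a rough guide.

```python
from html.parser import HTMLParser

class GeneratorFinder(HTMLParser):
    """Records the content of the first <meta name="generator"> tag seen."""

    def __init__(self):
        super().__init__()
        self.generator = None

    def handle_starttag(self, tag, attrs):
        # Only look at meta tags, and stop once a generator has been found.
        if tag != "meta" or self.generator is not None:
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "generator":
            self.generator = attr_map.get("content", "").strip() or None

def find_generator(html_text):
    """Return the generator string from the HTML, or None if there isn't one."""
    finder = GeneratorFinder()
    finder.feed(html_text)
    return finder.generator

if __name__ == "__main__":
    page = '<html><head><meta name="generator" content="WordPress 2.0"></head></html>'
    print(find_generator(page))
```

Run over a large sample of blog home pages, the strings returned could then be counted up to give a per-software tally rather than a per-domain one.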
Someone with more expertise than me should be able to employ Technorati's API to dig out that kind of information - perhaps Technorati CEO David Sifry could include this sort of data in a future edition of his authoritative "state of the blogosphere" updates?
Technorati Tags: Blog, blogs, blogosphere, blogging, blogging platforms, blogging systems, blogging software, blogging tools, weblog tools, blog tools, stats, statistics, metrics, analytics, market share, popularity, Improbulus, A Consuming Experience, Consuming Experience