Sunday, 14 January 2007

Google: hit clustering, related searches

Last week Google announced the introduction of "a new spin on" Results Hit Clustering for their Google Search Appliance for enterprise search, which will "Enable users to drill down on a specific subject and more easily refine their searches with automated grouping of search results by topic" - i.e. it produces:
"groups of dynamically formed sub-categories based on the results of each search query. These clusters appear at the top of search results and help searchers refine their queries from possible ambiguous terms. For example, if an employee searches for "customer" on the company network, a set of categories could appear at the top of the results with groups of topics such as "customer support" or "customer contacts" to help guide the search. Administrators can customize the location and appearance of Results Hit Clustering within search results."
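To make the idea of grouping hits by topic concrete, here's a toy sketch in Python. It's purely an illustration of the "customer" example in the quote: the function name cluster_hits and the sample titles are made up, and labelling each hit by the word that follows the query term is an assumption chosen for simplicity, not a description of how Google's appliance actually works.

```python
from collections import defaultdict

def cluster_hits(query, titles):
    """Group result titles under a label made of the query term plus the
    word that follows it (hypothetical heuristic, for illustration only)."""
    q = query.lower()
    clusters = defaultdict(list)
    for title in titles:
        words = title.lower().split()
        label = None
        # Look for the query term followed by at least one more word.
        for i, word in enumerate(words[:-1]):
            if word == q:
                label = f"{q} {words[i + 1]}"
                break
        # Titles with no usable label fall into a catch-all group.
        clusters[label or "other"].append(title)
    return dict(clusters)

# Made-up sample results for the query "customer".
hits = [
    "Customer support handbook",
    "Customer contacts spreadsheet",
    "Escalating customer support tickets",
    "Quarterly sales report",
]
clusters = cluster_hits("customer", hits)
for label, group in clusters.items():
    print(label, "->", group)
```

With these sample titles the two support documents end up together under "customer support", the contacts spreadsheet sits under "customer contacts", and the unrelated report falls into "other". A real clustering engine would presumably look at the documents' content rather than just their titles.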

I wonder when they're going to introduce this feature for standard Internet searches? I noticed something earlier this month, which might be it, when I searched on Google for ".mmf" (with the dot):

See the horizontal lines across, separating the results into sections, and the "See results for: .mmf files" heading up the second section? I hadn't encountered that before; I wonder if they're separating what might be different clusters?

The top section isn't so good, granted - I think it's trying to be the "music" cluster but there's a mobile result in there too. The second is obviously stuff to do with the mmf file format, and the final is on money and finance (though another music result crept into it too). I'd be interested to know if Google really are introducing clusters in this way.

If anyone has got their enterprise search with clustering, does it look similar to this? I do wonder why Google are rolling out clustering for Enterprise Search first, before doing it generally. I'd have thought they would want to test and tweak the clustering functionality extensively first (and on who better than their zillions of general Internet searchers, an army of free beta testers), before unleashing it on businesses, who normally don't like being guinea pigs and want features that are robust and have been thoroughly tested. But who can understand the mind of Google?

(As an aside, another interesting thing I noticed is that the search results are different depending on whether you search on or do the same search on - an attempt at localisation, maybe?)

In trying to see whether this kind of clustering, or at least section separation, appears when doing other searches, I tried searching on Google for .doc (another dotted search term) and noticed something else: a section at the end headed "Searches related to.." (this "related searches" feature, as already noted, started cropping up in Google searches in December - and of course also involves a kind of clustering of search results):

Interestingly, when I did the same search on there were no "related searches" links at the end.

Looks like Google are definitely going to be rolling out "related searches" generally. I wonder if they'll be doing the same with clustering? Or are they the same thing? If so, what's the horizontal line separation feature about then?


Yakov said...

Please check how Quintura helps refine a search query through an interactive tag cloud and let us know your views.

Improbulus said...

Thanks Yakov, I'll take a look when I've a chance. Of course refining is only the second stage; any good search engine has to return sufficient relevant results in the first place.