Midomi beta (FAQ) is a music search engine with a difference - a singing search engine, if you like. It lets you search online for songs and other music in any of 6 languages, free. Just sing, hum, whistle, play etc the tune into your computer microphone or headset (or, in future, pick it out on an on-screen piano keyboard). And it actually works! What's more, it's a classic Web 2.0 service with user generated content and social networking for singers and music fans. Their searchable database of music (which their music recognition search engine uses for matching voice search queries) is 100% user-generated - it's been built up entirely by their registered members recording and submitting the music of their choice. And registered users can rate and comment on other members' renditions, share recordings they like, add recordings to their playlists etc, too. It also serves as a digital music store, if you want to buy the music that you've found through your search.
So to put a name to that tune that's been stuck in your head buzzing round and round, to figure out the unknown song or earworm bugging you which you can't get out of your mind, why not try midomi? Even if you can't remember the whole song, you can simply feed it the few notes you can remember. It's able to identify a song from a fragment of the melody, from just a few bars or lines - although the longer the sequence that you can give it, the more accurate its recognition is likely to be.
You can click to play items in the search results list to make sure you find the right song, and it'll play just the part which their software thinks matches what you sang rather than the whole song. You can hear sample extracts of original commercial recordings of the music discovered through your search - and even buy and download digital tracks legally for US $0.99 via a program you have to download from the site. They've licensed 2.5 million songs for sale.
I mentioned Sandra Uitdenbogerd's music information retrieval research in my post about Yahoo! Audio Search results now including audio samples as did others. A commercial version of that system is still a few years away; but midomi is already here, and (mostly) working, taking a much simpler yet very workable approach: why try to analyse the "signatures" of polyphonic audio with multiple lines, multiple instruments or sound sources and possibly complex harmonies, when you can make use of monophonic renditions of single melodies?
Midomi seem to be on an upward growth curve too. Their CEO Keyvan Mohajer gave A Consuming Experience the following statistics:
- Total number of recordings as of Aug 2007: well over 100,000 (when they launched they started with 12,000 songs)
- Number of new songs added per day: between 1000 and 3000
- Unique visitors in August 2007: close to 2 million per month
- Expected unique visitors in September 2007: over 3 million per month.
Remember Jakob Nielsen's 90-9-1 rule that generally 1% of users account for most of the contributions in social networks and online communities? If that holds true here, the figures on numbers of recordings (which would be by the most active users i.e. the 1%) suggest pretty impressive overall user numbers.
It would also be interesting to know how many users in each language have signed up, and whether users from cultures that are more involved in singing outnumber the English language users, e.g. are there more Japanese language users given the popularity of karaoke?
I don't know how I missed midomi before, as it was launched back in January 2007. In a way, though, it was just as well I did, as its database gets better with more additions of user generated content over time. I might not have been very happy with it if I'd looked at it when it first launched, because the data set was too small then for the degree of accuracy I'd like, and initial reports weren't that good. I think it's now reached the "enough samples to work properly" stage, though.
So, catching up, this post contains a description, howto and review, and some commentary on the interesting ways in which I think midomi exemplifies Web 2.0.
How does midomi work?
With midomi, I think audio music search is coming into its own - it really does seem to be "next generation" voice search, or as they put it in their January 2007 press release, the "first search engine that can match human voice to human voice".
Midomi makes use of proprietary patent-pending MARS (Multimodal Adaptive Recognition System) technology, "the only way to match human voice to human voice by matching multiple features and without any manual intermediate indexing", invented by California-based Melodis Corporation - the company behind midomi.
Melodis was formed by a team of Stanford University graduates, the CEO being Keyvan Mohajer, whom I have already mentioned. He is himself a music lover and musician as well as an electrical engineer - in fact he started playing the piano when he was 7. He says he is fairly good at music, especially music theory, and he enjoys all sorts of music, including classical, traditional music of various countries, and pop/rock music.
He confirmed that in matching human voice to human voice to perform the search, their search engine (or "sing engine" as I've seen it called) matches many features: pitch variation, rhythm information, location of pauses, phonetic content and speech content, etc. In addition, for every query they adapt to the parameters that seem more important. For example, if you only hum a song, it ignores speech content. But if you sing it with the lyrics, it uses that information to make the search more accurate. He said the search is independent of key, tempo, language and singing quality. Yes, you saw that right, singing quality. Even if you're a bit out of tune, it will still work. If you're completely tone deaf though that's a different matter...
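To get a feel for what "independent of key and tempo" could mean in practice, here's a minimal Python sketch. It is not MARS, whose internals aren't public. It simply assumes a tune has already been transcribed into (MIDI pitch, duration in seconds) pairs, and the function names are my own. The idea: compare the jumps between notes rather than the notes themselves, and compare how long each note is relative to the previous one rather than its absolute length.

```python
from typing import List, Tuple

# A note is (MIDI pitch number, duration in seconds). This representation is
# an assumption for illustration; midomi's real features are not published.
Note = Tuple[int, float]


def pitch_intervals(notes: List[Note]) -> List[int]:
    """Semitone jumps between consecutive notes (unchanged by transposition)."""
    return [b[0] - a[0] for a, b in zip(notes, notes[1:])]


def duration_ratios(notes: List[Note]) -> List[float]:
    """Each note's length relative to the previous note (unchanged by tempo)."""
    return [round(b[1] / a[1], 2) for a, b in zip(notes, notes[1:])]


def fingerprint(notes: List[Note]) -> Tuple[List[int], List[float]]:
    """Combine both views into a key- and tempo-independent description."""
    return pitch_intervals(notes), duration_ratios(notes)


# Opening of "Happy Birthday" starting on G: G G A G C B
original = [(67, 0.375), (67, 0.125), (69, 0.5), (67, 0.5), (72, 0.5), (71, 1.0)]

# The same phrase sung three semitones higher and twice as fast
higher_and_faster = [(pitch + 3, duration / 2) for pitch, duration in original]

print(fingerprint(original) == fingerprint(higher_and_faster))
```

Both versions produce the same fingerprint, which is the property Keyvan describes. A real system would of course also have to get from raw audio to notes in the first place, which is the hard part.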
He also confirmed my speculation about how their technology and database work:
"The more people contribute the same song, the more accurate the recognition result of that song will be. If a song has been contributed multiple times, then we have more information about that song and the search becomes more accurate. As you know different people sing the same song differently, and this is the most effective method for creating our index. The good news is that the more popular songs get contributed more often to our database and this makes our engine even more accurate. Basically we use every piece of information we can get to make the search more accurate."
The browser user interface into midomi employs Adobe Flash Player (download Flash Player) for both the audio searching and recordings by users (on which more below).
The next part of this post is a practical how-to, and I'll end with some general thoughts on midomi.
How to set up midomi
Midomi is (mostly) very easy to use. All you need is:
- a working microphone connected to your computer, e.g. built into a webcam, or even a headset or phone handset that you use for internet telephony
- soundcard and speakers.
The initial steps are:
- Go to the midomi website.
- Make sure your mic or headset etc (e.g. a Skype headset works fine) is connected to your computer, and working.
- Click "Test microphone", "Allow" camera and microphone access, choose your mike from the dropdown (e.g. headset or soundcard), click "Record 5 second sample" and make some noise (no need to sing if you don't want to!).
- After the 5 seconds it's supposed to play back your noise so that you can hear it's working.
- When it works for sure, click Save Settings.
Problem no. 1 - playback of sound test sometimes isn't audible
When I tried to test my headset mic, at first I couldn't hear my recorded sounds. It could be an issue with my browser, my XP PC, or Flash, or midomi, who knows, but I got nothing. The status during your test is supposed to show in turn as Recording, Uploading, Buffering, Playing, Stopped. The activity level % was going up and down with the volume, so the microphone was definitely registering something. But for me it skipped the Playing stage - it didn't play back my sound recording at all.
Tried both Internet Explorer and Firefox, nothing. Finally, I tried selecting my soundcard Creative SB Live! Series from the dropdown, gave it the 5 secs (nothing happened of course, no recording, no sound playback, because I don't have a mic connected direct to the soundcard). Then I switched back to headset and tried again, and this time it worked. Go figure. A troubleshooting tip: if you have difficulties too, you could try the same steps. You'll know it's finally working when you hear your recorded sound played back to you and it shows "Playing".
After I shut down Internet Explorer, however, it stopped working yet again (in all browsers, inexplicably; I had both IE and Firefox open). The same trick above didn't work. However, I kept reloading the "Test your microphone" page, and finally it worked. So that's another tip to try.
Is there a solution?
I put the problems I had (there were a couple of others, see below) to Keyvan. He said all but one of the problems I'd encountered were very random and not a well known problem with a known solution, but as they were constantly improving the service he felt that many of these problems would go away.
My suggestion: just keep trying, maybe with different browsers. I agree they are random and inexplicable, but at least if it doesn't work one time it'll usually work the next.
Flash Player and mic access & how to adjust your settings
Being asked to allow midomi mic access is a Flash Player, not midomi, feature. Flash Player has security and privacy protections so that when Flash content tries to use any camera or microphone attached to your computer, e.g. to record from your Webcam or headset etc, you'll get a prompt asking you if you want to allow or refuse the request to access your camera or mike.
If you think it's too much of a pain to have to click to give midomi access every time you search or record, you can choose to always permit access by a particular website by checking "Remember" or "Don't want to see this box again" etc.
If you later change your mind, in your browser just visit the Flash settings manager page and via your website privacy settings panel you can delete a site etc. I had trouble accessing those settings with Firefox, but then there seem to be issues with Firefox and Flash on occasion. You can always adjust it in Internet Explorer or Opera, which is what I did in the end.
How to search by singing via midomi
If you're shy, don't worry - Keyvan has confirmed that midomi won't automatically save your singing or humming if you're simply searching, and remember you don't need to register on their site if all you want to do is search. Before midomi will store what you've sung into their database you have to actively choose to register, then go to their "recording studio", on which more below, record your singing, and then submit the recording you like best.
You don't actually need to test your microphone first before searching, you could just go ahead and try it, but until some random problems with that are ironed out I'd recommend you do that just to save yourself some frustration if it doesn't work.
So, once you've done the "Test microphone", got it working and saved your settings, click "Start Voice Search", allow camera/mic access again if necessary, then sing or hum etc your search melody or song snippet. The longer the better, it needs at least about 5 seconds' worth of the tune.
Click Done when finished, and midomi will come up with the results most closely matching your search.
You can click to hear a contributor's own version of a song, and even play a sample of the original commercial recording.
Problem no. 2 - sometimes can't hear original clip on attempted playback
Sometimes I found that there was a problem with the playback of the original 30-second samples, not so much from the search results page but more often from the recording studio pages, see below. You could see the playback bar progressing, but no sound was audible at all. Others have encountered this problem too.
Keyvan said they were aware of that problem and would fix it soon: "It is a known problem and we know the solution." I certainly hope they fix it on the recording studio pages too, where the problem seems more prevalent.
Problem no. 3 - voice search from search results pages doesn't always work
On my system anyway, if you want to do another search next, the "Start Voice Search" button doesn't always work when you click Start Voice Search from the search results page. You are supposed to allow mic access, then click Record when ready, but on hitting Record, a lot of the time all I got was a continuous "Connecting..." I also tried the "Start Voice Search" link in the left sidebar of the search results page, and exactly the same thing happened, here's a screenshot from a few weeks back (it seems to be working better now):
Another tip: the fix for me, for this particular problem, was simply to go back to the Midomi home page and do the new voice search from there. So again, the voice search function or link seems a bit erratic. It didn't work from the search results page more often than it did.
What if you can't sing? Are there any other searching methods?
Even if you're a little out of tune, indeed even if you make mistakes and sing some notes completely wrong, it seems to work. That's one of the interesting features of the MARS technology - it looks for similarity, not identicality. The "input query" doesn't need to exactly match the melodies in the database. As long as there's a certain amount of similarity, their music recognition engine can match and return results sorted by relevance (hear the deliberately wrong versions of "Happy Birthday" etc in the video demo mentioned below!).
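Here's a rough Python illustration of "similarity, not identicality" with results sorted by relevance. To be clear, this is my own toy, not how MARS does it: it uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure, works on melodies already written out as MIDI pitch numbers, and the three-song "database" is made up for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def intervals(pitches: List[int]) -> List[int]:
    """Semitone steps between consecutive notes, so the key doesn't matter."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def similarity(query: List[int], candidate: List[int]) -> float:
    """Score from 0.0 (nothing in common) to 1.0 (identical interval sequences)."""
    matcher = SequenceMatcher(None, intervals(query), intervals(candidate), autojunk=False)
    return matcher.ratio()


def rank(query: List[int], database: Dict[str, List[int]]) -> List[Tuple[float, str]]:
    """Return (score, title) pairs, best match first."""
    scored = [(similarity(query, pitches), title) for title, pitches in database.items()]
    return sorted(scored, reverse=True)


# Hypothetical mini database of opening phrases as MIDI pitch numbers
database = {
    "Happy Birthday": [67, 67, 69, 67, 72, 71, 67, 67, 69, 67, 74, 72],
    "Twinkle Twinkle Little Star": [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62],
    "Ode to Joy": [64, 64, 65, 67, 67, 65, 64, 62, 60, 60, 62, 64],
}

# "Happy Birthday" sung two semitones higher, with the sixth note wrong
sung = [69, 69, 71, 69, 74, 74, 69, 69, 71, 69, 76, 74]

for score, title in rank(sung, database):
    print(f"{score:.2f}  {title}")
```

Despite the wrong note and the different key, "Happy Birthday" still comes out on top, with the other tunes trailing behind it rather than being excluded outright.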
But ideally you still have to feed the search engine some approximation of the right tune. So, what if you really can't sing for toffee? Well, it's still usable.
Play a musical instrument - and, sometime, an online keyboard
You can try whistling. Or if you're totally tone deaf and can't even whistle roughly in tune, you can try playing an instrument into the microphone. I asked Keyvan that question, and he confirmed that midomi works with instruments as well e.g. a piano.
It works best though if it is only one instrument, and not multiple notes playing at the same time. Their search matches both melody and lyrics (if lyrics are detected). If you play an instrument, the search automatically ignores matching lyrics and only uses the melody. There is only one possible gotcha to note here: since most users sing, the search engine is tuned, if you'll forgive the pun, to only consider notes that are in a lower kHz range (i.e. the normal frequency range of the human voice). So if you play the very high notes on the piano it may not work, but the middle and lower notes should work. I guess Queen of the Night coloratura or "whistle register" searches are out then!
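To put some numbers on that gotcha, here's a small Python sketch that converts notes to frequencies and drops anything above a cutoff. The note-to-frequency formula is the standard equal-temperament one (A above middle C = 440 Hz), but the 1000 Hz cutoff is purely a figure I've picked for illustration, as Keyvan didn't give the actual range midomi uses.

```python
from typing import List

# Hypothetical upper limit; the real range midomi listens to is not stated.
ASSUMED_CUTOFF_HZ = 1000.0


def midi_to_hz(midi_note: int) -> float:
    """Equal-temperament frequency of a MIDI note (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)


def keep_singable(notes: List[int], cutoff_hz: float = ASSUMED_CUTOFF_HZ) -> List[int]:
    """Keep only the notes whose fundamental frequency is at or below the cutoff."""
    return [note for note in notes if midi_to_hz(note) <= cutoff_hz]


middle_c = 60        # about 262 Hz
c_above = 72         # about 523 Hz
queen_of_night = 89  # the high F6 in the Queen of the Night aria, about 1397 Hz

for note in (middle_c, c_above, queen_of_night):
    print(note, round(midi_to_hz(note), 1))

print(keep_singable([middle_c, c_above, queen_of_night]))
```

With that assumed cutoff the two lower notes survive and the high F is thrown away, which is the effect described above for the top end of the piano.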
What about people who don't want to sing or whistle, or can't, and don't have a musical instrument? If you could search by picking out notes on a Flash on-screen piano keyboard with your mouse (as George Ou has suggested), and perhaps even allow inexperienced musicians to record several renditions till they are happy with one that's closest to what they want to search for, that would be useful to many. Or maybe let the computer keyboard be used as a "piano" to pick out notes.
Keyvan told me that in fact they do have a Flash piano interface - see this video demo (WMV, 7.8 MB) whose link he kindly sent me, which hasn't been made public previously, and which interestingly mentions that their music recognition engine can also search digital sheet music databases and that it may also be possible to input notes through a MIDI interface e.g. electronic piano keyboard connected to the computer. However, when they tested it on beta users, those users did not use it nearly as much as they used the voice search. So midomi decided to postpone its launch to avoid confusing the users for now. But they will release it in the future. My suggestion: if their search engine isn't sensitive to tempo because of MARS, they shouldn't make the user choose the exact duration of each note as shown in the demo (crotchets, quavers etc, or quarter notes etc in the USA) - just record the time intervals between different notes being picked out, and use that, or people will be put off, especially those who can't read music.
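What I have in mind with that suggestion is something like the Python sketch below. It's only my own illustration (the function name and the idea of timestamping each key press in seconds are assumptions, not anything midomi has described): take the time at which each note was clicked, work out the gaps, and express each gap relative to the first one, so that it doesn't matter how fast or slowly the user picked out the tune.

```python
from typing import List


def relative_rhythm(onsets: List[float]) -> List[float]:
    """Turn note-press timestamps (seconds) into gaps relative to the first gap.

    The last note has no following press, so it contributes no gap.
    """
    if len(onsets) < 2:
        return []
    gaps = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    if gaps[0] <= 0:
        raise ValueError("timestamps must be strictly increasing")
    return [round(gap / gaps[0], 2) for gap in gaps]


# Someone picks out short-short-long at a brisk pace...
brisk = [0.0, 0.5, 1.0, 2.0]
# ...and someone else hunts for the same notes at half the speed
hesitant = [0.0, 1.0, 2.0, 4.0]

print(relative_rhythm(brisk))
print(relative_rhythm(hesitant))
```

Both users end up with the same rhythm pattern without ever having to know what a crotchet is.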
What if a searcher nevertheless plays part of a song complete with full backing to the mic? That's what happens with Shazam's music recognition search service in the UK - dial their number, hold your mobile out towards the music that's playing, get a text with the song info (they charge for the text, obviously). How well can midomi's search engine handle polyphonic input? Well, it can still work because Keyvan said their engine tries very hard to extract the main melody from background noise and background music - but if the background music is very loud, obviously it may not work as accurately. Of course, if you are able to play a particular song into the mic, you probably already have a recording of it and don't need to search to find out what it's called!
Text search for song name or singer
Covering all bases, midomi also has, as a backup or alternative, a "Search with your Keyboard" text search option where you can just type the name of the song or singer if you already know it, or indeed both song and artist names, to look for certain music, hear extracts with different people singing it, hear the original version, and then maybe buy it from their website. (The search will also return the names of midomi registered users, more on that below.)
Does it work? Is it any good?
Given what's been said about how the technology works, it wasn't surprising that I found it works best when you sing the lyrics, not just la la la or humming or whistling, if you happen to know the lyrics that is. That's because, as Keyvan mentioned, it takes account of the lyrics info too if you can provide that. Remember too that it's similarity it looks for, it needn't be an exact match, so even if you misheard or misremembered the original lyrics and sing some of the words wrong a la "kiss this guy", you may still find the right song!
It does indeed seem independent of key and tempo. I tried singing the same song snippet at different pitches on different occasions, and also at different speeds i.e. faster and slower, and it still worked. Very clever. Of course, there are limits - if you sing chipmunk fast or too slowly it seems to confuse the search engine, so a middle ground is best.
I found it a bit variable, but largely very usable with all 3 browsers I tried. Most of the time it came up with the right song, if not at the top, at least on the first page of the search results. But as mentioned above I found it sometimes erratic in terms of whether and when it supported my microphone and speakers. It's best to do the "Test your microphone" thing in the browser you're going to use immediately before you start a search session, just in case.
A couple of times, humming the same tune (that I knew was in their database) doing a repeat search, at a different pitch or a different speed, did produce different results. But it's relatively early days yet for midomi. The way it works is, the more users who contribute to their database for the same song, the more accurate their search engine will be. The examples I mentioned where the same search at different times produced different results had only one user who sang that particular song, and when I sang it with the lyrics it worked much better.
The site seems rather slow too sometimes - possibly a sign of more users than they expected? Presumably they'll up their server capacity as their visitor numbers expand.
But it's a lot of fun listening to other people singing such a huge variety of songs, even songs you already know - and even if (especially if?) they're really bad!
What sort of music can you find through midomi? Pop, classical, instrumentals?
The recordings are in different languages and different genres. Their database is 100% user generated. So if someone adds particular types of content to midomi's database, they will be searched, whether pop, rock, soul, music theatre or classical, whether vocal or indeed instrumental music. For example, they have the main theme of Godfather, and also the Imperial March from Star Wars, and there's one guy who's tried putting in the most famous tune from O Fortuna in Carmina Burana!
At the moment though their database seems to contain mostly popular music, not surprisingly, with a lot of non-English language pop songs.
How do you record a song to contribute to midomi?
Anyone can search on midomi, but if you want to contribute your own recording of a song or indeed instrumental (Jaws, anyone?!), you need to register, which is free. Make sure your mic is plugged into your computer and working, of course. Do the usual Test to check.
Log in to midomi, go to My Studio or Recording Studio, and search for the song you want to sing - you can include the artist's name as well as the title of the song to help narrow it down.
Then click Sing next to the version you want to record, sing it, and Stop when you've finished: ta-da, online karaoke!
One issue here is that if there are different commercial versions of the same song, e.g. live or studio, acoustic or electric, or sung by different artists, you have to plump for one of them. That is clearly very much from a sellers' "which commercial recording to flog to users" viewpoint, and I get that.
But I think from a contributor user's viewpoint, they won't really care in most cases what version, they just want to sing a particular song, and it can be confusing and offputting as well as taking up unnecessary time to make them choose a version at that point. It would be much better if the would-be contributor was just presented with one version of the song. The search results page is where I think midomi should then be listing the different commercial versions available for sale, instead (and making sufficient versions available!). But I don't know if their setup has been designed in such a way that it wouldn't be possible to change this aspect.
The instructions are pretty clear and it's straightforward, click the red-on-yellow button to start recording, then sing away and click on the Stop button (similar) to stop; the green bar shows how far along you are. A warning will pop up if you're too loud at any point, so keep an eye on the screen, you may have to re-do the recording if so. It's a bit difficult not to be too loud if you're using a headset, I have yet to work out the best approach then - maybe remove the headset and just have the mike a few inches away!
You may have noticed the "Find Lyrics" link under the name of the song. That doesn't yet produce a new web page or window with the official lyrics of the song you want to record, which would have been helpful. At the moment all it does is a search on Google for the lyrics, which I guess is the next best thing as it usually works. Keyvan confirmed that they have already licensed and display lyrics on midomi.co.jp, their official Japanese site. The English lyrics will most likely also appear in the near future once they clear the license, which will be a great help.
You may also have noticed the "Preview Original" link. This plays a bit of the original song, so you can remind yourself of how it goes if you need to. Or rather, it should play it. See problem 2, above - I've not been able to hear any original sample from the Recording Studio page, so far. You can see the progress bar, but there's no sound.
You can also see that you have several "trials" or chances to get down the best rendition you can. You may have to delete an older version if you run out of trial slots. When you're happy with a particular version just hit Submit for it. You can obviously play back any of your recorded trials to compare them first.
Now you're supposed to just record one vocal line, because that's how it's designed to work, matching one voice singing one musical line. But some contributors try to record a vocal line with instrumental backing (or even with several voices in harmony) - and midomi don't discourage them. The single voice or instrument is ideal for the voice search, but a lot of users enjoy doing recordings with background music etc, and Keyvan said their search engine is smart enough to deal with it. It either won't search it or it will successfully extract the melody line from it, he said.
Social networking aspects
It seems midomi have thought about the community aspects too, helping people with similar musical tastes get together. You can rate other users' recordings, comment on them (within reason, as you'd expect), add favorites and check your fan clubs, create playlists, get updates whose settings you can configure, share midomi user recordings with your friends by email; and they show midomi "stars" on the front page, have "up and coming" charts and rankings etc.
You have your own profile, as is standard, where you can add info about yourself, photos, post notes (typed or voice notes). And again as standard you can send other users private messages, etc.
What about copyright? DRM?
2.5 million songs have been licensed for sale by download through midomi. Strictly, including recordings of copyrighted songs itself needs a licence from the songwriter/composer, even if they are amateur renditions rather than the original commercial recordings, and Keyvan confirmed that they had cleared licenses for streaming user generated renditions too: "We pay the publishers any time you listen to a user generated recording on the site." So it's legal, and the original writers get paid when people play back your recordings of a composer's song.
Why can't you buy more easily from midomi's website though? Why do you have to download special software before you can buy a song? That's because the download software is required by the record labels concerned for DRM (digital rights management) control, stopping buyers from copying the downloaded song etc. This isn't very good at all as far as I'm concerned, but I've made no secret of the fact that I'm anti-DRM (e.g. for BBC iPlayer). I think DRM technology unduly restricts legitimate customers while not actually being effective to stop organised large-scale pirating. The music industry seems however finally to be moving away from DRM, and about time too, so with any luck maybe they'll drop the DRM requirement from midomi - certainly the extra step of having to download the DRM software will be a barrier to many people buying from midomi, and of course restrictions on how/where they can play the music they've bought would be a big offputting factor too. I must admit I personally would hesitate to buy a DRM-crippled song, from midomi or anywhere else.
Other interesting points, and the future
Voice search is from MARS, text search is from, well, MARS
By which feeble attempt at that book reference I meant, Melodis's MARS technology can be used for more than just voice search:
"Our text search [i.e. on the midomi site] used to use MARS, but since we are internationalizing the site, we stopped using it for now, and we will use it in the near future again after we apply it to other languages. The text search will allow you to search without knowing the exact spelling of the song. Our text search in fact is being used by www.passalong.com. Try searching for "enrike eglecias" and you will find "Enrique Iglesias" That shows the power of MARS as applied to text search. As long as what you type "sounds similar" and/or "looks similar" to what you are looking for then MARS will find it. It will be available on midomi in the near future as well."
So it seems that MARS's "similarity search" technology can be used for text as well as audio search, coping with mistakes in spelling, typos etc - clearly there's a lot more potential for other applications of MARS and I shall look forward to that with interest.
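For the curious, here's how a very crude "looks similar" lookup can be knocked up in Python with the standard library's difflib.get_close_matches. I stress this is a stand-in of my own and not MARS: it only compares spelling, not how words sound, and the artist list and function name are made up for the example.

```python
from difflib import get_close_matches
from typing import List, Optional


def suggest_artist(typed: str, artists: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the known artist whose spelling is closest to what was typed.

    Returns None if nothing scores at least `cutoff` (0.0 to 1.0).
    """
    by_lowercase = {name.lower(): name for name in artists}
    hits = get_close_matches(typed.lower(), list(by_lowercase), n=1, cutoff=cutoff)
    return by_lowercase[hits[0]] if hits else None


# Hypothetical catalogue of artist names
artists = ["Enrique Iglesias", "Julio Iglesias", "Elvis Presley"]

print(suggest_artist("enrike eglecias", artists))
print(suggest_artist("zzzz", artists))
```

The misspelt query from Keyvan's example still lands on the right artist, while complete gibberish returns nothing. A "sounds similar" search would need some phonetic processing on top, which this sketch doesn't attempt.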
Searching for lyrics
To be a complete song search site, in my opinion midomi needs to be able to allow visitors to search for songs by snippets of lyrics too.
I hope that when the licensing of lyrics is sorted, they'll enable text searching of lyrics and not just provide them for contributors to use when recording.
Searching for songwriters or composers
I've suggested that midomi
Keyvan liked the idea and said it was more of a data limitation than a technology limitation. Once they had found a good reliable source of data, that info would be added to the text search engine. Again of course they could have their users assist in this by asking them to add composer info when they do a new recording; if there was a Composer or Songwriter field some of them might well complete it and help generate that data too.
Mobile phones etc
Melodis have in mind that their technology could be used for
Musings on midomi beta
I wanted to end with some general thoughts on midomi as I think it's a classic example of a Web 2.0 service and social networking website, combining UGC with digital distribution of music / audio, a mashup in that way, if you like.
Why do I say it's the epitome of a Web 2.0 application and social sharing service? For several reasons. Taking Jyri Engestrom's 5 principles for Web 2.0 success, this startup hits the spot on almost all of them - with the exception that their business model seems to involve making money from ads and from users buying songs through their site. More likely, they'll probably just make a shedload from being bought up by Google, Yahoo or Microsoft. From the way they've set things up, it seems that midomi are run by very smart people too in a business and marketing as well as technical sense.
You don't have to register as a member just to search on midomi. But registered users (over 13s only as with most American sites) can contribute to their database of songs and other music, rate other users' renditions and buy music via midomi. Here, midomi are neatly harnessing the power of user-generated content.
Asking users to contribute tunes has two benefits. First, of course, it builds up midomi's database - the more songs that are contributed, the bigger their searchable database, and the more people who contribute the same song, the more accurate their engine's recognition of that song will be.
Second, many people are performers at heart, oh all right look-at-me show-offs!, and this website hooks into the addictiveness of karaoke, people just liking to hear the sound of their own singing voices, the competitive instinct of users who think they can sing a song better than another user who's already uploaded a song to their database (though midomi claim it's not a talent contest), the ego-stroking feeling of garnering "fans", perhaps the desire of some X-Factor, American Idol etc. wannabes to be "discovered" and make it as a pop or rock recording star.
Recording extracts for midomi can be quite addictive if you're a singer, you get a feeling of achievement too especially if it's a song no one else has uploaded yet - I know, I've been doing it, and it's hard to stop! (I'm not registered as "Improbulus" obviously, as my voice is too distinctive and recognisable and I blog anonymously. And no I'm not giving out my midomi username!)
But members don't have to contribute music - they can register just to rate other people's performances, become "fans", get to know other fans of similar music, etc. And again that taps into people's need for self-expression, for their views and opinions to count and make a difference, so successfully exploited by TV shows from Big Brother to American Idol and other talent shows where the audience can phone or text etc to vote for contestants. Not to mention the community aspects.
Midomi are clearly building up communities fast. As they put it, they are "humanizing social networking through voice. On midomi the human voice is getting people connected. We are a social network where you can experience and discover people that you like and experience and discover music that you like." Cleverly, given the (probably greater) popularity of singing in many non-English speaking cultures, they launched very quickly in French, Spanish, Italian, Chinese and Japanese too.
I've already mentioned that I love the simplicity of midomi's approach, cutting the Gordian knot of music recognition by sticking to simple monophonic vocal lines and harnessing their users to generate their database, and the fact that the accuracy of their recognition can only get better and better with time and more user contributions.
If they iron out the issues I mentioned above I think they'll be a massive success - unless of course a competitor with better audio searching and without the bugs comes along fast! There are alternative music search engines, but they don't seem to work as well (e.g. Naiyo), do quite the same thing (e.g. Songtapper - rhythm search), work in the same way (e.g. Meldex), or are still in development (e.g. Amazon Web Services proof of concept mashup "What's That Tune"). So I think midomi have a head start here.
Do try out midomi - I would be interested to hear what others think.