Maestro is an OMR (Optical Music Recognition) application. You can think of it as OCR for music. It was my school project in 2005, built by a team of four people.
In the team, I was responsible for line detection, stanza deduction, audio playback and the user interface.
During the second year of undergraduate study, we had to build an algorithm-oriented project. It doesn’t matter whether it has nice graphics; it must be powered by interesting algorithms.
A lot of people just do an OCR, which is basically text extraction from a scanned sheet. This is a perfect example of what I just said: all the focus is on an efficient algorithm. But we wanted to innovate!
Finally, we chose to create an application able to recognize sheet music and play the melody. We named our OMR Maestro.
Here are the basic steps:
First of all, we must clean the image. As a result, we obtain a binary image (each pixel is black or white). Then we rotate the image in order to have straight lines.
After that we detect the lines and deduce the stanzas. We cut out each stanza and send it to the musicscore recognition module. Once the musicscores are recognized, we look for chains like flats, fast times, etc.
Finally, we convert the result into a MIDI file and play it. The MIDI file format is easy to create from scratch and gives good quality.
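To give an idea of how little is needed to write a MIDI file from scratch, here is a minimal sketch in Python. It is not the original Maestro code: the function names, the (pitch, duration) note representation, the fixed velocity and the 480 ticks per beat are all assumptions chosen for illustration. It writes a single-track (format 0) file where each note starts as soon as the previous one ends.

```python
import struct

def variable_length(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        # Every byte except the last one has its high bit set.
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))

def build_midi(notes, ticks_per_beat=480):
    """Build a format-0 MIDI file from (pitch, duration_in_ticks) pairs.

    The note representation is hypothetical, not Maestro's internal format.
    """
    track = bytearray()
    for pitch, duration in notes:
        # Note on, channel 0, velocity 64, right after the previous event.
        track += variable_length(0) + bytes([0x90, pitch, 64])
        # Note off for the same pitch, `duration` ticks later.
        track += variable_length(duration) + bytes([0x80, pitch, 0])
    # Mandatory end-of-track meta event.
    track += variable_length(0) + bytes([0xFF, 0x2F, 0x00])
    # Header chunk: length 6, format 0, one track, time division.
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_beat)
    return header + b"MTrk" + struct.pack(">I", len(track)) + bytes(track)

# Three notes starting at middle C (MIDI pitch 60): two quarter notes, one half note.
melody = [(60, 480), (62, 480), (64, 960)]
data = build_midi(melody)
```

Writing `data` to a file opened in binary mode should give something a standard MIDI player can open. Chords, rests, tempo and instrument changes are left out to keep the sketch short.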
For the line detection, I created a very simple algorithm. I start from the top middle of the image and go down. If I find a black pixel, I go to the left and to the right until I find a white pixel. Then I know the length of a line by counting the black pixels.
At first, I thought this would be the definition of a line: if the length is greater than half the page width, it is a line. But it didn’t work, because imperfections in the image can leave white pixels inside a line, and some lines that are not part of a stanza can occur. I had to redefine a line.
I introduced the concept of imperfection. We now have a variable initialized at 5. Every time we find a white pixel, this variable is decremented, and when it reaches 0, we stop the traversal. But if we find a black pixel, we increment the variable (up to 5).
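The traversal described above can be sketched roughly as follows. This is a Python reconstruction from the description, not the original code. It assumes the binary image is a list of rows where 1 means black and 0 means white, and the names `run_length` and `detect_lines` are made up for the example.

```python
MAX_TOLERANCE = 5  # the "imperfection" budget described in the text

def run_length(row, start, step, max_tolerance=MAX_TOLERANCE):
    """Count black pixels from `start`, walking left (step=-1) or right (step=+1)."""
    tolerance = max_tolerance
    count = 0
    x = start
    while 0 <= x < len(row) and tolerance > 0:
        if row[x]:
            count += 1
            # A black pixel gives back one unit of tolerance, capped at the maximum.
            tolerance = min(tolerance + 1, max_tolerance)
        else:
            # A white pixel uses up one unit; at 0 the walk stops.
            tolerance -= 1
        x += step
    return count

def detect_lines(image):
    """Return the y positions of rows whose black run exceeds half the width."""
    width = len(image[0])
    middle = width // 2
    lines = []
    for y, row in enumerate(image):
        if not row[middle]:
            continue  # nothing black in the middle column on this row
        length = run_length(row, middle, -1) + run_length(row, middle + 1, +1)
        if length > width / 2:
            lines.append(y)
    return lines

blank = [0] * 20
full = [1] * 20
gap = [1] * 6 + [0] * 2 + [1] * 12   # a line with a small hole in it
short = [0] * 8 + [1] * 4 + [0] * 8  # a small blob, not a line
print(detect_lines([blank, full, blank, gap, short]))  # [1, 3]
```

On a real scan a line is usually several pixels thick, so this sketch would report several adjacent rows for a single line; merging them is left out here.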
Once we had all the lines of the sheet, we had to group them into stanzas. My first approach was to group them five by five. But the problem was that if we missed a line, all the following stanzas became wrong.
So I had to calculate the average distance between two lines, so that I could predict during the grouping whether we had left the stanza or not. In the end, we were able to detect stanzas perfectly even if we missed some lines.
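Here is one way the grouping could look, as a hedged sketch rather than the actual Maestro implementation. It takes the sorted y positions of the detected lines and starts a new stanza whenever the gap to the previous line is much larger than the typical spacing. Two things are assumptions of this sketch: it uses the median gap instead of a plain average, so that the large gaps between stanzas do not skew the estimate, and the threshold `gap_factor=2.5` is an arbitrary value chosen so that one missed line (a gap of twice the spacing) does not split a stanza.

```python
from statistics import median

def group_into_stanzas(line_ys, gap_factor=2.5):
    """Split sorted line positions into stanzas wherever the gap is unusually large."""
    if not line_ys:
        return []
    if len(line_ys) == 1:
        return [list(line_ys)]
    # Typical distance between two neighbouring lines of the same stanza.
    gaps = [b - a for a, b in zip(line_ys, line_ys[1:])]
    spacing = median(gaps)
    stanzas = [[line_ys[0]]]
    for previous, current in zip(line_ys, line_ys[1:]):
        if current - previous > gap_factor * spacing:
            stanzas.append([current])    # far from the previous line: new stanza
        else:
            stanzas[-1].append(current)  # close enough: same stanza
    return stanzas

# Three stanzas; the line expected at y=220 was missed by the detection.
lines = [100, 110, 120, 130, 140, 200, 210, 230, 240, 300, 310, 320, 330, 340]
stanzas = group_into_stanzas(lines)
```

With this input the second stanza comes out with four lines instead of five, but the third one still starts at the right place, which is exactly what fixed groups of five cannot do.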
This was the first time I had to use Qt, and I was not sure if I should use Qt Designer or do everything in Emacs. Finally, I coded the interface only with Emacs, to be sure to fully understand how Qt works.
The user interface was pretty simple, since it was just a school project: each step was available at the top of the application. You just had to click to go through all the processing.
At first I tried to use TiMidity++ to play the generated MIDI file, but the configuration was way too hard, and we wanted our software to be simple to use. After a while, I decided to try SDL with SDL_mixer, and it was so simple that we had a functional MIDI player in one night.
With the Qt framework, it is quite easy to create a multiplatform application.
One advantage of the Mac version was that Qt was directly able to play a MIDI file without SDL because it was calling QuickTime. We provided a package file to install it on Mac OS X, which also shipped the Qt libraries so that Mac users didn’t have to install Qt to use Maestro (unlike Linux users).
Our software doesn’t recognize every musicscore and can still be improved, but we are proud of it. Its flaws are only due to the lack of time at the end of the year, because our recognition engine is flexible enough to be enhanced.
We couldn’t implement everything in time, even with our coding nights before the final presentation. Who knows, it could have become a reference, since there are not many similar applications. Thanks to Maestro, we took second place among all the projects of our class.
Qt: Cross-platform application framework from Trolltech