music recognition/transcription?

Does anyone have any experience with any audio music recognition/transcription software, such as Sibelius's AudioScore Ultimate?
http://www.sibelius.com/products/audioscore/ultimate.html

I'm curious as to how good that technology really is.

Thanks!

Edit - This topic is not intended to be about visual music recognition/transcription software, such as SharpEye!

Re: music recognition/transcription?

Reply #1
There is only one piece of software <grin> that will do a decent job and that lies between your ears.


Re: music recognition/transcription?

Reply #2
I use SharpEye quite extensively.  It works very well on excellent images of well engraved music.  Its accuracy gets progressively worse as the quality of the scan/music degrades.  Its interface is quirky, but once you're used to it (and the keyboard shortcuts) you can make corrections quite quickly.  I think you already know that it exports MusicXML and its ability to understand different voices and put these into layered staves is pretty good.

Re: music recognition/transcription?

Reply #3
Hi Phil,

Sorry I wasn't more clear - I'm looking at software for audio recognition this time, not visual recognition!  Instead of scanning sheet music, I'm wondering how well software does at listening to MP3s/CDs and picking out at least all the notes/harmonics present, and possibly connecting them into voices/parts by the timbre of the instruments making them.  With my engineering background, I know it can be done, and will be done quite well at some point in the future.  But looking at the link above makes me wonder if the future has arrived to some degree, based on their sales pitch!

Randy

Re: music recognition/transcription? (WAV to MIDI)

Reply #4
The semi-annual WAV to MIDI topic

What makes this one unusual is that it wasn't started by someone within hours of registering as a new member.
Registered user since 1996

Re: music recognition/transcription?

Reply #5
Another thing that makes this one different is that there is a specific product, from a company (Avid) with real credibility in the audio/visual field, being presented.

My son uses other Avid products professionally.  I'm going to ask him if he is aware of this application, and whether he may have access to a trial of the full polyphonic version as part of the suite of Avid products available to him.
I plays 'Bones, crumpets, coronets, floosgals, youfonymums 'n tubies.

Re: music recognition/transcription?

Reply #6
Hmm, it seems that the company my son currently works for isn't using any Avid products so that avenue is closed.

Re: music recognition/transcription?

Reply #7
Thanks for checking, Lawrie!

1.  I'm actually not at all interested in WAV or MIDI!  I'm interested only in a program that can go from MP3 to MusicXML (then from MusicXML to NWC using my importer).  I guess I could have thought to search the forum for WAV instead of MP3, and/or MIDI instead of MusicXML, but I was really only interested in the very latest technology.  From what I read at the link, they seem to have jumped light years ahead just recently!  I just wish I had a place where I could talk to other interested folks without bothering the uninterested.

2.  I think the term "semi-annual" (as well as "biannual") means every six months, when it's been more like "every other year" for the WAV to MIDI discussions?  Searching back, I found mostly a bunch of inapplicable MIDI to WAV posts - roughly 3 times as many as the WAV to MIDI posts.  For the latter, there was this in early 2009 (https://forum.noteworthycomposer.com/?topic=6748.0), but the FAQ it points to has not been updated in a year!  Going back further, I found an early 2007 post, and a 2005 post, etc.

So I apologize for not thinking to search for WAV and/or MIDI, and I apologize that I had not yet joined the forum at the time of the most recent WAV to MIDI post.

Re: music recognition/transcription?

Reply #8
Quote
I just wish I had a place where I could talk to other interested folks without bothering the uninterested.

You know what, Randy?  Nobody is forced to read your message or the rest of this thread, so I wouldn't worry about bothering those who do read it.

Some of us will lurk until you give us the verdict, then if the verdict is positive and we can afford it, some of us will benefit tremendously from your research.

So keep it coming, good sir!






Re: music recognition/transcription?

Reply #9
G'day Randy,
MIDI import is, of course, easy.  mp3 and/or wav to notation is the hard one and is essentially the same problem: how to unscramble an egg, or how to unbake a cake...  :)

I too am very interested in the results of your research.  If they've finally managed to analyse the complex waveform that makes a polyphonic wav or mp3 well enough to extract instruments and note/chord information reliably then I for one will be not only very impressed, but I'll also be very happy!  I may even need to eat some crow 'cos I have a feeling I once predicted that it wouldn't happen any time soon...  Hope it was longer than I think it was... ;)

Re: music recognition/transcription?

Reply #10
It's clearly theoretically possible to convert played music to a list of the notes.  In the simplest case of a single line (no chords or polyphony), a Fourier transform of the music gives a set of frequencies, and analysing how those change over time gives notes and durations.  In principle, there's nothing stopping this from being performed with polyphonic music: the transform again gives a set of frequencies.  These are complicated by the fact that each note has harmonics, but it should still be possible to resolve them into the original notes with fair accuracy.
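
As a rough illustration of the single-line case, here is a minimal Python sketch using only the standard library. It synthesises a test tone (a 440 Hz fundamental with two weaker harmonics), runs a naive discrete Fourier transform over it, and names the strongest frequency. The sample rate, the signal and all function names are assumptions made for the example, and it takes for granted that the fundamental is the loudest component, which real instruments do not always guarantee.

```python
import cmath
import math

SAMPLE_RATE = 4000   # samples per second (assumed; kept low so the naive DFT stays fast)
NUM_SAMPLES = 800    # 0.2 s of audio, which gives 5 Hz per frequency bin
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def synth_tone(fundamental_hz, harmonic_amps):
    """Build a test signal: harmonic 1, 2, 3... at the given relative amplitudes."""
    return [
        sum(amp * math.sin(2 * math.pi * fundamental_hz * (h + 1) * t / SAMPLE_RATE)
            for h, amp in enumerate(harmonic_amps))
        for t in range(NUM_SAMPLES)
    ]

def strongest_frequency(samples):
    """Naive O(N^2) DFT; return the frequency (Hz) of the bin with the largest magnitude."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * SAMPLE_RATE / n

def note_name(freq_hz):
    """Nearest equal-tempered note name, taking A4 = 440 Hz (MIDI note 69)."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

tone = synth_tone(440.0, [1.0, 0.5, 0.25])
peak = strongest_frequency(tone)
print(peak, note_name(peak))
```

A real recording would need this repeated over short, overlapping windows to recover durations, and a proper FFT rather than the slow transform used here.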

The question for us on this forum is - would this be worth it?  Most of us are trying to create music (which wouldn't benefit from the above) or recreate something from the original.  Generally, the original was written music and the layout will be complex - many voices, slurs, ties, accidentals, key changes, etc., etc.  This complexity will be very poorly represented in the performed music, and so any transcription from that performed music will be a very poor representation of the original.

So I'll stick to starting with the printed music....

Re: music recognition/transcription?

Reply #11
G'day Phil,
for the most part, I'm right with you.  But there are times when the written music simply isn't available, or if you have a need to extract particular performance information that isn't in the music...  Lots of reasons that make this a useful thing.

There are several tools that make a tolerable job of extracting monophonic data; it's the polyphonic material that creates the real challenge.

Re: music recognition/transcription?

Reply #12
   Going back to Randy's original question - does anyone have any experience with any music recognition/transcription software, such as Sibelius's AudioScore Ultimate? - I assume, no-one having come forward to say they have, that the answer is "No".

   And presumably you, Randy, have not tried the free Demo version?

   I had a look round the Web.  There are several glowing reviews, but - like  others - I find it hard to believe the software is really able to live up to its promise.  And here are two very contrary reviews from the Amazon Website where the program is on sale.

   Enjoy

   MusicJohn, 21/Dec/10

========================================

This program claims to be able to transcribe audio to MIDI data in multiple voices by processing MP3, WAV files, or direct recordings. It certainly does put MIDI notes in place. Unfortunately, those MIDI notes only marginally resemble the original audio. I threw some very simple piano recordings at it, some acoustic guitar, etc., with no percussion and very little reverb. The program consistently failed miserably every single time.

I decided to go for the "ultimate" in simplicity. I recorded a track from scratch using the program's OWN metronome as the only sound. It's an audio click with a high pitch on beat one and slightly lower pitches on beats two, three and four. This program couldn't even transcribe this correctly.

AudioScore Ultimate might be worth $20, but I can't believe they actually try to get people to pay $250 for this piece of junk. Fortunately, I ran my tests on the free demo download rather than blindly buying the product.

This is pretty much false advertising for a product to claim it can transcribe multiple parts when it can't even transcribe one part.

===========

I too checked out the demo software for AudioScore and tested it the same way David did in the previous review. I second David's opinion... it's junk and don't waste your money.

Re: music recognition/transcription?

Reply #13
Well, I too tested some demo/free programs some time ago and I found the results usually awful.

In just one case I had "good" results, but it was very easy.
I was playing my recorder after having "trained" the program.
BTW I can't remember the name of the program.

So, as far as I can tell, the best program for this purpose is between my ears... even if sometimes it proves insufficient.

Re: music recognition/transcription?

Reply #14
I have had rather impressive results with "PDF to Music Pro" from www.myriad-online.com.  This program takes a PDF file that was generated by any music software, plays it (and even sings it), and exports it in various formats.

I use it to export a file in MusicXML format, then use "Music XML to NWC Converter" from www.niversoft.com to generate input into NWC.  This combo is definitely not perfect, but can save a passel of time, especially with a large score.

Re: music recognition/transcription?

Reply #15
Phil:

Obviously our musical interests are in different worlds.
I often have to produce arrangements from mp3 files.
The sheet music for most of them was never published, and if it was, the harmonies and arrangement are nothing like the sheet music.
Transcription from audio is the bread and butter of staff arrangers in TV.
It's a challenging exercise to reproduce the content of a recording as a score.
If a program could simply recreate the correct chords it would be a great time saver.
I don't believe that full score reproduction will happen in my lifetime.

Debaumann:
This thread is about the production of a score from an audio source not a print source.
There are several programs that handle print sources well, but I have yet to find one that can reproduce a score from a recording.

Barry Graham
Melbourne, Australia

Re: music recognition/transcription?

Reply #16
Well, exCUUUUUUUUse me.  I will limit my responses henceforth to "subject" topics!

Re: music recognition/transcription?

Reply #17
Thank you MusicJohn/Maurizio/Lawrie/David/Phil/Barry!  I see now that perhaps they haven't made a big jump in technology, as they appeared to be claiming.  (I'm a computer and electrical engineer by trade, and having dabbled with running Fourier transforms on music myself, I still believe there will be a big breakthrough one day.)  But I will download the demo to see for myself, and report back, though I now expect to be quite disappointed.  Perhaps I will start a new thread in 2 years, when the next "breakthrough" software comes out (unless a 4-hr-old member beats me to it ;-).

Debaumann - It was my fault for not being clear off the top that I was looking into audio recognition!  I will see if I can retroactively change the title of this thread...

Re: music recognition/transcription?

Reply #18
I'm not surprised to read the negative reviews on Amazon. Lawrie is right: the problem is akin to unscrambling an egg. I have serious doubts that it will ever be satisfactorily done.

What makes a clarinet sound different from a kazoo is the overtone structure of the sound - which of the partials of a given note (the high notes that are sounding softly from an instrument) are sounding how loud in relation to the fundamental (the note you actually hear the instrument producing). Sorting out which instrument is producing any particular partial is a next-to-impossible task. Yes, you can relate some of the upper partials to their fundamentals by mathematical analysis. Others you can't, because more than one note present in a chord may be producing the same overtone. The problem may be soluble for instruments with radically different timbres, like a piano and a cello. But untangling a string quartet is well nigh impossible.
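
The point about two notes in a chord producing the same overtone can be made concrete with a small Python sketch. It lists the partials that two notes have in common, assuming ideal harmonics at exact integer multiples and an idealised (just) fifth of 110 Hz and 165 Hz. The function names and the 1 Hz tolerance are choices made for this example only.

```python
def harmonics(fundamental_hz, count):
    """Return the first `count` harmonics (integer multiples) of a fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]

def shared_partials(f1_hz, f2_hz, count=12, tolerance_hz=1.0):
    """List (harmonic number of note 1, harmonic number of note 2, frequency)
    for partials of the two notes that coincide within `tolerance_hz`."""
    shared = []
    for i, a in enumerate(harmonics(f1_hz, count), start=1):
        for j, b in enumerate(harmonics(f2_hz, count), start=1):
            if abs(a - b) <= tolerance_hz:
                shared.append((i, j, a))
    return shared

# A at 110 Hz and the E a just fifth above it at 165 Hz (idealised tuning).
for i, j, freq in shared_partials(110.0, 165.0):
    print(f"harmonic {i} of A and harmonic {j} of E both sit at {freq:.0f} Hz")
```

Energy measured at any of those shared frequencies could belong to either note, or to both, which is exactly the ambiguity described above.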

And then you add the performer into the mix. Subtly different attacks, tuning, dynamics, and tempi, not only among ensemble performers but from note to note by the same performer. It's hard enough for software to get this straight when it's being fed from a MIDI keyboard. Sorting it out from a sound wave adds a whole new level of complexity. I am willing to be surprised, but I don't think it's gonna happen. Like Maurizio says, the best software to do this job is the piece between your ears. Admittedly, it isn't an easy task for untrained ears. But any music school should offer ear training courses, and it is a learnable skill. Those who need to transcribe music from performances should take the time to learn the skill instead of pinning their hopes on software development.

Just my grouchy take on the matter....

Bill

Re: music recognition/transcription?

Reply #19
I would be happy with something that just tells me the likely fundamental notes going on at certain points in time - I'd "connect the dots" myself, and assign instruments to the resulting lines, all by ear.  It'd be akin to SharpEye, where you don't necessarily get a good final picture out of the software, but you get something that jump-starts your manual entry, possibly tremendously.

Re: music recognition/transcription?

Reply #20
Quote
Thank you MusicJohn/Maurizio/Lawrie/David/Phil/Barry!
  <snip>
I will see if I can retroactively change the title of this thread...

I tried, but as usual, I get no thanks :-)

Quote
What AudioScore Recognizes
...
• Notes from chords that even the ear cannot distinguish
...

When bugs are dressed up as features, you should know your leg is being pulled.

Re: music recognition/transcription?

Reply #21
Quote
I'm not surprised to read the negative reviews on Amazon. Lawrie is right: the problem is akin to unscrambling an egg. I have serious doubts that it will ever be satisfactorily done.

I don't believe this.  This is not a case of unmixing something that is well known to be impossible to unmix.  See below.

Quote
What makes a clarinet sound different from a kazoo is the overtone structure of the sound - which of the partials of a given note (the high notes that are sounding softly from an instrument) are sounding how loud in relation to the fundamental (the note you actually hear the instrument producing). Sorting out which instrument is producing any particular partial is a next-to-impossible task.

Each note played only produces harmonics - i.e. multiples of the original frequency, and these diminish quite sharply as the frequency rises.  So a violin playing middle C will produce some C above middle C, some C above that, and so on.  It should be possible to recognise the patterns of specific instruments and match these to the frequency distribution.  As I said, it's not easy, or it would have been done, but it can't be computationally impossible - we do it, and we only have ears and a brain to hear and calculate.

Re: music recognition/transcription?

Reply #22
I definitely can't say it's impossible (never dare!)
But I can say it's quite difficult.
And I think the current state of the art is still primitive.

In some cases I analyzed some arrangements using both bare Fourier transforms and spectrograms.
That work sometimes helped me a lot; sometimes it really fooled me. :-(
As usual, the results must be carefully evaluated by the brain (and, in this case, by the ears :-).

25 years ago one of my professors was working on a speech recognition program.
He was able to get satisfactory recognition of the fluent speech of a single speaker, after that speaker had trained the program and "helped" it by speaking a bit more clearly than is customary for a human.
He was then working to extend the results to "speaker independence".
Many years have passed and I'm sure some progress was made, but...

Ok HAL, that's all for today. Shut yourself down, please. :-)

Re: music recognition/transcription?

Reply #23
Quote
25 years ago one of my professors was working on a speech recognition program.
He was able to get satisfactory recognition of the fluent speech of a single speaker, after that speaker had trained the program and "helped" it by speaking a bit more clearly than is customary for a human.
He was then working to extend the results to "speaker independence".
Many years have passed and I'm sure some progress was made, but...

One of the technical articles that came out at that time indicated some problems just with its title: How to Wreck a Nice Beach.
Since 1998

Re: music recognition/transcription?

Reply #24
Quote
Each note played only produces harmonics - i.e. multiples of the original frequency, and these diminish quite sharply as the frequency rises.  So a violin playing middle C will produce some C above middle C, some C above that, and so on.  It should be possible to recognise the patterns of specific instruments and match these to the frequency distribution.  As I said, it's not easy, or it would have been done, but it can't be computationally impossible - we do it, and we only have ears and a brain to hear and calculate.

First of all, Phil, harmonics aren't just multiples of the original frequency - they're fractional multiples. Straight multiples only produce octaves (220, 440 and 880 are all A's). The overtone series has octaves, fifths, fourths, thirds, seconds, and some intervals we don't have names for. Second, instrumental timbres are averages: one clarinet rarely sounds exactly like another, because their overtones don't match exactly. Third, our brains don't operate computationally. There is a large body of hard scientific evidence that they operate by gestalt - that patterns are grasped as wholes rather than as the sum of their parts. This isn't a computational problem, it's a problem in pattern recognition. For that, the ear and brain operating together are much better than a computer.

I can almost always sort the cello part out of a string ensemble, even when it goes higher than the violins (for an example of that, check the Shostakovitch op. 67 trio). My wife (who is a string player) usually cannot. I seriously doubt that a computer will ever be able to do it.

Re: music recognition/transcription?

Reply #25
On a much less serious note:
Quote
How to Wreck a Nice Beach

We have Wreck Beach in Vancouver out at the University of B.C.  It's a nudist beach. 

Re: music recognition/transcription?

Reply #26


   Getting a computer to use the real sound - or a WAV or MP3 or the like file of real sound - of a performed musical work to make a printed score (or PDF File or the like) or to make a MIDI File (or similar) is a desideratum that occupies many.  While making the score or MIDI File from the notes is no longer the problem, actually acquiring the note and instrument information is a serious difficulty.

   In general, attempts to achieve this assume that the right starting point is to look at and analyse the waveform of the sound, and from that to identify both the note(s) - frequency, duration and amplitude, and so on - being played and the instrument(s) doing the playing, and in theory, and given enough computing power, this ought to be eminently possible.  So far, however, it has proved remarkably difficult, even when attempted for a single instrument - such as a recorder - playing notes one at a time.  For a combination of instruments (such as an orchestra), or for a single instrument (such as a piano or guitar) playing several notes at once, it seems at present to be well-nigh impossible.

   Why?  Well, as previously pointed out, not only is the sound from individual instruments for the most part extremely complex, resulting in a waveform representation that is not easy to analyse with the degree of accuracy required, but the extraordinarily elaborate waveform representing the output of multiple instruments playing multiple notes simultaneously has so far defied attempts to extract meaningful data from it.

   The nature of the problem is often likened to that of separating a cake into its original components (flour, fruit, and so on) or scrambled egg back into separated white and yolk, but actually those are not appropriate analogies, for in each case the cooking changes the ingredients in such a way that the original components no longer exist as such.  A better analogy is that of separating a simple but intimate mixture of essentially non-interacting substances, such as one of sand, salt and petroleum jelly (while it is true that sound waves combine and interfere, and thus "interact", nevertheless the original sound is still there, and can in theory be separated out).  With enough effort it is possible to pick out from the mixture every particle of sand and every grain of salt - and with some lateral thinking a little less effort will extract the salt by dissolving it in water, and the jelly by dissolving it in petrol (gasoline).  Filtering can also help.

   The sound of an orchestra is an incredible mixture, and yet we humans can pick out, can recognise, all the individual instruments - an expert musician (the conductor, say) will do this better than the rest of us, and will be able not only to identify, for example, the clarinets, and distinguish them from their neighbours the oboes and the bassoons, but also to listen to the notes they are playing almost as if they were doing so in isolation.  And if we can do it, why shouldn't the computer be able to?  Why shouldn't the computer be able to analyse the waveform representing the real sound, match it to the known waveforms of individual instruments, and separate them out, note by note?  It's not impossible; it's just that it is a very hard computational problem, rather like the three-body problem, but it can be cracked if enough effort (and money) is thrown at it (a lot of expensive and intensive, and iterative, computing deals with the three-body problem, and allows NASA et al. to know how to ensure that satellites and space probes go, to the necessary exactness, where they're wanted).

   Maybe the technique presently being used by Google to effect language translation might be applicable; this relies not on a real analytical translation, as would be done by a human interpreter, but a statistical comparison of word combinations in the source language to find known matches in the target language.  Though this might seem to be a technique which can only really be done by computer, and at which computers should be extremely good, in fact this is a typical pattern matching exercise, and is in many ways akin to how children learn a language.  And, come to think of it, even a human interpreter will match simple phrases from one language to the other rather than "translating" each word.  So, perhaps it might be possible, and helpful, to match a burst of orchestral sound output against an enormous database of known sound - "known" in the sense that its instrumental and note components are already recorded - and so deduce what notes are being played, and by which instruments.
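
A toy version of the "match against a database of known sound" idea might look like the Python sketch below. It compares an observed set of harmonic strengths with stored templates by cosine similarity and picks the closest one. The two templates and their numbers are invented purely for illustration (they are not measurements of real instruments), and a practical system would need a vastly larger database and a far richer description of each sound.

```python
import math

# Hypothetical templates: relative strengths of harmonics 1..5.
# These numbers are invented for illustration, not measured from real instruments.
TEMPLATES = {
    "instrument_a": [1.0, 0.1, 0.6, 0.1, 0.3],   # odd harmonics dominate
    "instrument_b": [1.0, 0.8, 0.6, 0.4, 0.2],   # smoothly decaying harmonics
}

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(observed, templates=TEMPLATES):
    """Return the name of the template whose harmonic profile is closest to `observed`."""
    return max(templates, key=lambda name: cosine_similarity(observed, templates[name]))

# A made-up "observation" with strong odd harmonics.
observed = [0.9, 0.15, 0.5, 0.05, 0.35]
print(best_match(observed))
```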

   Bear in mind that the things you can do on a laptop today are things that 20 years ago you'd have been hard-pressed to do on a mainframe - and that 40 years ago would have been dismissed as quite impossible.  Perhaps we'll have it in 20 years' time ... like fully-controllable fusion reactors, perfect speech recognition, and no doubt a host of other things that are always just around the corner???

   MusicJohn, 22/Dec/10

Re: music recognition/transcription?

Reply #27
Quote
First of all, Phil, harmonics aren't just multiples of the original frequency - they're fractional multiples. Straight multiples only produce octaves (220, 440 and 880 are all A's). The overtone series has octaves, fifths, fourths, thirds, seconds, and some intervals we don't have names for.

What I said wasn't quite accurate, but neither is this.  The harmonics are categorically only straight multiples of the fundamental.  Thus if we play a low A (110 Hz) we get all the multiples of 110 Hz in varying degrees.  So we get 220 Hz (the next A), 330 Hz (approximately the E above that A), 440 Hz (the next A), 550 Hz (approximately C#) and so forth.  We get no other frequencies in between.  Therefore, if I provide the waveform from an instrument playing the low A, it's easy to determine that this is the fundamental being played.  Similarly, if I provide the waveform of one instrument playing a low A and another playing a low C at the same time, a Fourier transform makes it plain that this is what's being played by both instruments.  As I've said, I'm not downplaying the difficulty of doing this with multiple instruments and a time-varying waveform; I'm simply saying that (in contrast to what's quoted below) it is a computationally difficult problem, but not an impossible one.  Unscrambling eggs is impossible and is a poor analogy.
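
The arithmetic of the harmonic series is easy to check. This short Python sketch lists the first five integer multiples of 110 Hz and the nearest equal-tempered note for each, with the deviation in cents, assuming A4 = 440 Hz; the helper name is made up for the example. The third harmonic lands within a couple of cents of an E and the fifth lands closest to C#, about 14 cents flat.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz):
    """Return (note name, deviation in cents) for the nearest equal-tempered pitch,
    taking A4 = 440 Hz as MIDI note 69."""
    semitones = 69 + 12 * math.log2(freq_hz / 440.0)
    midi = round(semitones)
    cents = round((semitones - midi) * 100)
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}", cents

# Harmonics 1..5 of a low A at 110 Hz.
for n in range(1, 6):
    freq = 110.0 * n
    name, cents = nearest_note(freq)
    print(f"harmonic {n}: {freq:.0f} Hz -> {name} ({cents:+d} cents)")
```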

Quote
Second, instrumental timbres are averages: one clarinet rarely sounds exactly like another, because their overtones don't match exactly. Third, our brains don't operate computationally.

This is not relevant.  The different timbres come from the varying proportions of the harmonics - 2nd, 3rd, 4th, etc.  But these all still exist at the straight multiples as described above.  It seems to me that characterising an instrument (like training voice recognition) would make it easier to identify the fundamental of that instrument and remove all the higher harmonics, thus simplifying the problem of identifying the other instruments, but this is an improvement, not a necessity.

Quote
There is a large body of hard scientific evidence that they operate by gestalt - that patterns are grasped as wholes rather than as the sum of their parts. This isn't a computational problem, it's a problem in pattern recognition. For that, the ear and brain operating together are much better than a computer.

When I was a lad, we were often told that chess requires human intelligence and that a computer will never beat the best men.  This has been shown to be incorrect.

Quote
I can almost always sort the cello part out of a string ensemble, even when it goes higher than the violins (for an example of that, check the Shostakovitch op. 67 trio). My wife (who is a string player) usually cannot. I seriously doubt that a computer will ever be able to do it.

I'm fairly certain it will.  In the first instance, I would expect that the computer will simply identify the tones, rather than the instrument playing.  I do think that it will also be possible to do both at some point.

Re: music recognition/transcription?

Reply #28
Quote
In the first instance, I would expect that the computer will simply identify the tones, rather than the instrument playing.  I do think that it will also be possible to do both at some point.

I also believe this, and much greater achievements will be possible.  If you look back at the history of computing, you will see the massive computing advances in what is a relatively very short time (in comparison to other advances mankind has made).  You also may be aware that the advances have, to a certain extent, reached a plateau. This is really due to hardware technology holding back advances.

I believe that when neural net technology becomes established, you will again see massive advances in computing power, including unscrambling the audio scrambled egg.

Today's computers are good at processing data very quickly and therefore finding answers that a human brain would have found given the time to analyse the data. Tomorrow's computers with neural net technology should be able to solve problems that mankind does not have a clue how to solve. A bit worrying in a way.

 
Rich.

Re: music recognition/transcription?

Reply #29
Quote
The harmonics are categorically only straight multiples of the fundamental.

Correct. I was thinking of the way the harmonics are produced by the vibrating string or air column (or xylophone bar, or....). As you know, these are fractions - the vibrating body vibrates as a whole, in halves, in thirds, fourths, etc. Technically, to obtain the frequency of one of the overtones, you divide the fundamental by the fraction that's producing that particular overtone. However, since the numerator of these fractions is always one, this is equivalent to multiplying by the denominator. So you do get whole-number multiples of the fundamental, and it's OK to simplify to that.

Most of the rest of your post I have to respectfully disagree with. I think you are ignoring the great variety of complications that arise in live performance. An example I didn't mention in my previous post is overblown winds. When you overblow a wind instrument, the fundamental is no longer sounding; the note that appears to be the fundamental is actually the first overtone, unless the instrument is a clarinet, in which case it's the second overtone. And it doesn't stop there: the notes you hear from a good Bach trumpeter (or a good French horn player) are 'way up there among the upper partials, where they are mechanically out of tune but lipped into place. How is a computer going to deal with a sixteenth-note run on a Bach trumpet, where the "fundamentals" are actually a series of different harmonics - all well above the fundamental - with their pitches retuned by the performer? And then add in the strings and woodwinds in the accompaniment....some of the woodwinds overblowing, some not, and the strings possibly playing harmonics (but probably not).....and the differential tones produced by the chorus effect....and....and....

OK. Maybe. Maybe it can be done. But it's not going to be by simply finding the right algorithm and then applying computational power. It's going to be a whole bunch of complicated algorithms with another series of complicated algorithms necessary to decide which algorithm to use at any particular instant of time. I wouldn't bet any money on it. Of course, I'm also the guy who told his wife that the results from search engines on the Internet would never be relevant enough to be useful. That was about two months before Google came out. I've been hearing about that ever since.

Cheers,

Bill

Re: music recognition/transcription?

Reply #30
Going slightly off topic, but with some relevance.
It is common to have a TV news broadcast running a written interpretation of the speaker’s words across the bottom of the screen, often with hilarious results, as the computer program can be fooled by all sorts of things such as the speaker’s dialect or country of origin. Usually one can get the basic meaning of what has been spoken.
It is also common that one can get an instant translation from one language to another, again with variable results.
Put these two things together and you can get this scenario:
1)   You telephone someone in, say, China. You know absolutely no Chinese.
2)   The computer turns what it hears you say into a virtual printed script.
3)   The computer translates this virtual script into one in a suitable Chinese text.
4)   The computer then recites this Chinese text in a spoken language that your Chinese correspondent understands.
5)   He replies to you in the same way and, although he knows no English, you hear his reply in the language you understand.

I understand that this has already been done on an experimental basis.  In the explanation I was reading, it made the point that translation programs no longer work on fixed rules but by learning the association between different words and phrases. It would seem that this is the possible route for a practical program of the kind under discussion; but it looks a long way off!

Tony

Re: music recognition/transcription?

Reply #31
I wasn't intending to suggest that any and all musical performances will be seamlessly transcribed to musical notation (not in anything like the medium term, at the very least); I know to my cost that even many musical performance sheets require a lot of messing about with to re-represent them as a new score.  I meant simply that something like, say, a straightforward string quartet could well (at some point in the not too distant future) be recognised by computer and given a fair representation in musical notation.  This is pretty much where optical recognition is now.  In other words, I don't subscribe to the view "it will never be possible to convert MP3 to MIDI".  I do subscribe to the view "there will be lots of MP3s that are too difficult to transcribe accurately and automatically".

Re: music recognition/transcription?

Reply #32
When I was a lad, we were often told that chess requires human intelligence and that a computer would never beat the best men.  This has been shown to be incorrect.
Very off topic, I didn't see a computer beat a master the first time, but knew the player and programmers--it happened in my state.  Info on my blog.
Since 1998

Re: music recognition/transcription?

Reply #33
2.  I think the term "semi-annual" (as well as "biannual") means every six months, when it's been more like "every other year" for the WAV to MIDI discussions?
2008 was blissfully quiet, and 2010 isn't over yet. Semi-annual may have been an understatement :-)


An Incomplete List
1998-02-14  Audio to MIDI conversion?
1998-04-27  .wav file import ?
1998-06-29  WAV to MIDI conversion
1998-12-20  WAV file conversion

1999-05-13  Reading music from a CD
1999-08-21  MP3 to MIDI
1999-09-07  Im not sure, But I could use some help.

2000-04-13  MP3 or WAV to MIDI

2001-01-29  Wav to midi
2001-03-19  Using Media Player files in NWC
2001-04-01  WAVE or AIFF to MIDI Posibble?
2001-04-11  file conversion
2001-06-06  Mid files
2001-10-18  Winamp file, convert to Midi?
2001-12-27  File Conversion
2001-12-31  Converting A WAV Into A MIDI

2002-02-27  MP3 to NWC
2002-07-22  Importing Files
2002-10-09  Producing standard notation from imported existing performance
2002-10-24  Mp3 to sheet music

2003-03-19  List of WAV to MIDI or AUDIO to MIDI converter software
2003-09-07  Midi??
2003-09-18  WMA to Midi

2004-08-20  Want MP3 Music score! Is it Possible?
2004-09-26  Convert Wave to Midi
2004-10-13  Mp3 Files??

2005-03-05  Convert singer voice wav format into NWC midi sheet
2005-03-08  is there any way to convert MP3 files to MIDI files?
2005-04-22  WAV to MIDI at last?
2005-04-30  Realplayer Audio Files
2005-05-14  playing wav files with NWC

2006-01-20  how do i get the nwc files in the first place
2006-07-31  Can I play a CD so that NWC can convert it to sheet music?
2006-08-20  Converting WAV to NWC
2006-10-23  File types for Conversion

2007-02-06  MP3 file convertion
2007-04-26  Converting WMA/WAV/MP3 to NWC/MIDI
2007-10-10  MP3 to Midi converter software

2009-03-03  Convert to Midi
2009-06-30  can noteworthy convert a music file to sheet music

2010-12-18  music recognition/transcription?
Registered user since 1996

Re: music recognition/transcription?

Reply #34
...An example I didn't mention in my previous post is overblown winds. When you overblow a wind instrument, the fundamental is no longer sounding; the note that appears to be the fundamental is actually the first overtone, unless the instrument is a clarinet, in which case it's the second overtone. And it doesn't stop there: the notes you hear from a good Bach trumpeter (or a good French horn player) are 'way up there among the upper partials, where they are mechanically out of tune but lipped into place.
G'day Bill,
not sure I can completely agree with you here...

There are 2 fundamentals in play here.  A brass example: assuming you are using the same valve/slide combination for both, the first is the fundamental frequency of the instrument in question, the second is the fundamental frequency of the note you are playing.  The fact that the note you are playing is part of the overtone series of the instrument is not really relevant.  The note will have significant harmonics of its own that are an extension of the overtone series of the instrument, but are the 2nd, 3rd, 4th, etc. harmonics of that note.  The fact that they are also nth harmonics of the instrument is almost irrelevant, as notes in the series lower than the one being played don't exist in the sound.  (An analysis might find some trivial presence of lower notes, but they would be just that, trivial.)
I plays 'Bones, crumpets, coronets, floosgals, youfonymums 'n tubies.

 

Re: music recognition/transcription?

Reply #35
Thanks to Rick for resurrecting those threads.
The same discussion was going on 12 years ago.
And the technology hasn't advanced.

In 2000 I posted this:

Quote
To those who still believe that there are programs that will convert wav to midi - here's a challenge.
I will send you a few bars of a big band midi file converted to wav or (preferably) mp3.
All I ask as proof that your program works is to return to me the complete score in NWC format.
Couldn't be easier!

Over 10 years ago and still waiting!

Barry Graham
Melbourne, Australia


Re: music recognition/transcription?

Reply #36
Thanks, Rick, for the list. I remembered dealing with this topic before, but I didn't realize we had dealt with it quite so many times.

Lawrie, re the overtones of overtones, you have to remember that any overtone of a given harmonic is also an overtone of that harmonic's fundamental. If an instrument is overblown at the octave, its "fundamental" corresponds to the first overtone of the instrument's true fundamental; its "first overtone" corresponds to the third overtone of the true fundamental; its "second overtone" corresponds to the true fifth overtone; and so on. So I'm not sure that what you have stated re the overtone patterns of overblown notes - although certainly true - changes anything in regard to the difficulties that overblown notes present to a sound-transcription program.
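
For anyone who wants to check the arithmetic, here is a small sketch of that correspondence. It assumes an idealised, perfectly harmonic tube (real instruments deviate, which is why those upper partials have to be lipped into place), and the function names are made up purely for illustration.

```python
def true_overtone_index(overblown_harmonic: int, overtone_of_note: int) -> int:
    """Map an overtone of an overblown note onto the true fundamental's series.

    overblown_harmonic: the harmonic of the true fundamental at which the
        note sounds (2 = overblown at the octave, 3 = at the twelfth,
        as on a clarinet).
    overtone_of_note: 0 for the note's own "fundamental", 1 for its
        "first overtone", and so on.
    Assumes an ideal harmonic series (an illustration, not a model of a
    real instrument).
    """
    # Overtone k of any note is harmonic k + 1 of that note.
    harmonic_of_note = overtone_of_note + 1
    # The note itself is a harmonic of the true fundamental, so the
    # harmonic numbers multiply.
    harmonic_of_true = overblown_harmonic * harmonic_of_note
    # Convert the harmonic number back to an overtone number.
    return harmonic_of_true - 1


def partials_hz(true_fundamental_hz: float, overblown_harmonic: int, count: int) -> list[float]:
    """Frequencies of the first `count` partials of the overblown note."""
    return [true_fundamental_hz * overblown_harmonic * (k + 1) for k in range(count)]


if __name__ == "__main__":
    for k in range(3):
        print(f"overtone {k} of the octave-overblown note = "
              f"true overtone {true_overtone_index(2, k)}")
```

Run for the octave case, this gives true overtones 1, 3 and 5, matching the pattern above; every partial of the overblown note is also a partial of the true fundamental, so a program looking only at the spectrum cannot tell the two readings apart without further cues.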

Bill

Re: music recognition/transcription?

Reply #37
G'day Bill,
oops, upon re-reading what you originally wrote I realise that we are both on the same side of the argument...  Dunno what I was thinking, so I'll blame fatigue... ;)

I plays 'Bones, crumpets, coronets, floosgals, youfonymums 'n tubies.

Re: music recognition/transcription?

Reply #38
Off topic, but about the harmonics: do you know Shepard tones?

The very first time I heard them was at an exhibition on sensory perception, where they accompanied Escher's infinite staircase (the monks on the staircase, that is).
I was very impressed, and for a long time I wondered how it could be done (this was long before the internet ;-).
I'm pleased to say that I came very close to the solution by myself. :-)