

Compare audio files programmatically

Posted on 2013-12-05
Medium Priority
Last Modified: 2013-12-17
Hi everyone, I'm looking for a way to compare audio files of voice recordings.
One file will be recorded in the studio.
Users are asked to imitate that recording as closely as possible.
We need to compare each user recording to the original and determine programmatically, preferably using server-side scripting (PHP or Node.js), which ones are most similar to the original.

I was thinking we could make a spectrogram out of each recording and compare the bitmap data looking for similarities. I guess there must be other ways of doing this.
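
To make the spectrogram idea concrete, here is a minimal sketch in Python (the same arithmetic should port to PHP or Node.js). It is only an illustration under assumptions: the recordings are already decoded to lists of samples, the frame size, hop and test tones are made up, and cosine similarity is just one possible way to compare the two grids of magnitudes. It does not align recordings in time or normalise loudness, which real voice recordings would probably need.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a magnitude spectrogram: one list of bin magnitudes per frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window to reduce leakage between frequency bins.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        mags = []
        for k in range(frame_size // 2):
            # Plain DFT for bin k (slow, but needs no external library).
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def cosine_similarity(spec_a, spec_b):
    """Compare two spectrograms frame by frame; result lies between 0 and 1."""
    n = min(len(spec_a), len(spec_b))  # ignore the tail of the longer one
    a = [v for frame in spec_a[:n] for v in frame]
    b = [v for frame in spec_b[:n] for v in frame]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tone(freq, rate=8000, n=512):
    """Hypothetical stand-in for a decoded recording: a pure sine tone."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


reference = spectrogram(tone(440))
close_take = spectrogram(tone(450))   # almost the same pitch
far_take = spectrogram(tone(1500))    # clearly different pitch

score_close = cosine_similarity(reference, close_take)
score_far = cosine_similarity(reference, far_take)
print(score_close, score_far)
```

With these synthetic tones the near-pitch take should score higher than the far one; whether that ranking holds up for actual voices is exactly the open question.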

Tips & Tricks are more than welcome !
Question by: Dreammonkey
LVL 34

Assisted Solution

Slick812 earned 2000 total points
ID: 39698700
Greetings Dreammonkey, you say you need to "compare" audio files to determine "which ones are most similar to the original".
You will never be able to do this. Even if you use people, or a single person, to listen to the recordings, what "parameters" would the (human) judges use to tell the audio entries apart? With more than one judge there will be disagreements about what is "similar". You can probably get the spectrogram image for digital analysis (or use some other digital analysis method), BUT there is no way to program a series of IF-THEN tests that yields a subjective opinion about what is or is not "most similar", even if the audio files contain only one spoken word. The fine points and complexities of digital audio are massive: just changing the microphone for the same person can alter the digital footprint of the recording.

You might consider having your site users "Vote" on the audio entry that they think is "most similar", after they get to listen to the audio entries and form an opinion.

Author Comment

ID: 39699430
Thanks for your comment, Slick812,

I understand the challenge, that's part of the reason I posted the question on this forum ;)
I do understand the complexity of audio recordings and microphones, and you certainly have a point here ! ;)

Having read how Shazam works, I do believe it must be possible to compare pitch and rhythm between two recordings?

I found this open source library, will try to experiment with it tomorrow:

I know the mechanics are different:
I guess the software searches for a match (comparing highlights in a spectrogram), eventually resulting in a Boolean (or so I imagine). I wonder whether it could return a float instead of a Boolean, i.e. a value between 0.0 and 1.0?

I believe the difference is that this software compares the original version of an audio file with a recording of that original plus added ambient noise, ultimately filtering out the noise and either finding a match or not...
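
On the float-instead-of-Boolean question, one option might be dynamic time warping (DTW), which tolerates differences in speaking speed. The sketch below is not how the library above works. It assumes a pitch contour (one pitch value in Hz per time step) has already been extracted from each recording by some other tool, and the `tolerance` parameter and the example contours are invented for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i items of a with the first j of b
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[len(a)][len(b)]


def similarity_score(reference, attempt, tolerance=50.0):
    """Map the DTW distance to a float in (0.0, 1.0]; 1.0 means identical shape.

    `tolerance` (in Hz) is an assumed tuning knob: the larger it is,
    the more forgiving the score.
    """
    if not reference or not attempt:
        raise ValueError("both pitch contours must be non-empty")
    average = dtw_distance(reference, attempt) / (len(reference) + len(attempt))
    return 1.0 / (1.0 + average / tolerance)


# Hypothetical pitch contours in Hz, one value per time step.
reference = [220, 220, 247, 262, 262, 247, 220]
slow_copy = [220, 220, 220, 247, 247, 262, 262, 262, 247, 220, 220]
off_key = [p + 40 for p in reference]

print(similarity_score(reference, slow_copy))
print(similarity_score(reference, off_key))
```

The slower imitation keeps the same melody, so warping absorbs the timing difference, while the off-key attempt is penalised. Ranking user recordings would then be a matter of sorting by this score.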

I'll keep you posted about my findings, looking forward to your thoughts...
LVL 34

Accepted Solution

Slick812 earned 2000 total points
ID: 39699714
I went to the library's site and read their "How it works" page. They reduce the audio to a simpler, less variable "11kHz mono signal", then step through the file one time spot at a time and pick up a reference at each spot (evaluating the frequencies and volumes). They then combine these into a sort of "average" (as a checksum hash does for the bytes of a file) and use that hash in a lookup table to see if there is a match; if a match is found, the song is found. However, I do not see any way this could work except for comparing the exact same digital recording, as distributed under copyright by retailers and download sites. I really doubt that if another group performed the exact same song (same guitars, same drum set, same bass, same key, same tempo, but different singers and players) it could make any sort of match. This is designed to make EXACT matches, in order to pick out a specific song from thousands of other songs. But I do not know if I understand all of the factors in the way they "pick up a reference".
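
To show roughly what I mean by "pick up a reference, then look it up", here is a toy sketch in Python. It is my own simplification, not the library's actual algorithm: it takes the loudest frequency bin in each frame, pairs consecutive peaks, and counts how many pairs two recordings share. The frame size, sample rate and test melodies are all assumptions.

```python
import cmath
import math


def peak_bins(samples, frame_size=64):
    """Index of the loudest frequency bin in each non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        # Plain DFT magnitudes for bins 1 .. frame_size/2 - 1 (bin 0 is skipped).
        mags = [
            abs(sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            ))
            for k in range(1, frame_size // 2)
        ]
        peaks.append(1 + mags.index(max(mags)))
    return peaks


def fingerprint(samples):
    """Set of (peak, next_peak) pairs; the set acts as the lookup table."""
    peaks = peak_bins(samples)
    return set(zip(peaks, peaks[1:]))


def match_ratio(reference_fp, candidate_fp):
    """Fraction of the reference's pairs that also occur in the candidate."""
    if not reference_fp:
        return 0.0
    return len(reference_fp & candidate_fp) / len(reference_fp)


def melody(freqs, rate=8000, frame_size=64, frames_per_note=4):
    """Hypothetical test signal: a sequence of pure sine notes."""
    out = []
    for f in freqs:
        out.extend(
            math.sin(2 * math.pi * f * i / rate)
            for i in range(frame_size * frames_per_note)
        )
    return out


original_fp = fingerprint(melody([500, 750, 1000, 625]))
# Same tune shape, but sung in a different register.
cover_fp = fingerprint(melody([1250, 1500, 1750, 1375]))

print(match_ratio(original_fp, original_fp))
print(match_ratio(original_fp, cover_fp))
```

The identical recording matches fully, while the same tune in another register shares far fewer pairs, which is why I think this kind of matching suits finding the exact recording rather than grading an imitation.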

Please get my main point:
if you have more than one (human) judge, there will be disagreements about what is "similar"!

Defining "similar" in a digital evaluation and comparison would be difficult, and deciding that one audio was "more similar" than 10 other audios would be nearly impossible (my opinion).


Question has a verified solution.
