Zack Smush

Matching two audio files using FFT (Android Studio)

I've been working on a part of my app for the past few days where I need to play an audio file and record audio at the same time. The task I need to accomplish is to compare the recording to the audio file that was played and return a matching percentage. Here's what I have done so far, along with some context for my questions:

 - The target API is >15
 - I decided to use a .wav audio file format to simplify decoding the file
 - I'm using AudioRecord for recording and MediaPlayer for playing the audio file
 - I created a decoder class that takes my audio file and converts it to PCM so that I can perform the matching analysis
 - I'm using the following specs for the recording AudioFormat (CHANNEL_MONO, 16 BIT, SAMPLE_RATE = 44100)
 - After I pass the audio file to the decoder, I then proceed to pass it to an FFT class in order to get the frequency domain data needed for my analysis.
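
For reference, the decode step described above can be sketched in plain Java. This is only a minimal sketch, assuming the PCM payload is 16-bit little-endian mono (the usual layout inside a .wav `data` chunk). The class name `PcmDecodeSketch` and its method names are hypothetical and are not taken from the asker's code. Locating the start of the `data` chunk is deliberately left out, since WAV headers are not always the same length.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: turns raw 16-bit PCM into doubles that an FFT class can consume.
class PcmDecodeSketch {

    /** Converts little-endian 16-bit PCM bytes to samples in the range [-1.0, 1.0). */
    static double[] pcm16ToDoubles(byte[] pcm, int offset, int length) {
        ByteBuffer buf = ByteBuffer.wrap(pcm, offset, length).order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[length / 2]; // two bytes per sample
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buf.getShort() / 32768.0;
        }
        return samples;
    }

    /** Same conversion for a buffer that already holds 16-bit samples. */
    static double[] shortsToDoubles(short[] pcm, int count) {
        double[] samples = new double[count];
        for (int i = 0; i < count; i++) {
            samples[i] = pcm[i] / 32768.0;
        }
        return samples;
    }

    public static void main(String[] args) {
        // Three hand-written samples: 0x4000, 0x8000, 0x7FFF (little-endian byte order).
        byte[] pcm = {0x00, 0x40, 0x00, (byte) 0x80, (byte) 0xFF, 0x7F};
        for (double s : pcm16ToDoubles(pcm, 0, pcm.length)) {
            System.out.println(s);
        }
    }
}
```

The `shortsToDoubles` variant is there because `AudioRecord.read(short[], int, int)` fills an in-memory buffer, so the recorded samples could in principle go to the analysis without being written to a file first. The encoding itself is chosen when the `AudioRecord` is constructed, for example with `AudioFormat.ENCODING_PCM_16BIT`.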

And below are a few questions that I have:

 - When I record the audio using AudioRecord, is the format PCM by default or do I need to specify this somehow?
 - I'm trying to pass the recording to the FFT class in order to acquire the frequency domain data to perform my matching analysis. Is there a way to do this without saving the recording on the user's device?
 - After performing the FFT analysis on both files, do I need to store the data in a text file in order to perform the matching analysis? What are some options or possible ways to do this?
 - After doing a fair amount of research, all the sources that I found cover how to match a recording against songs/music stored in a database. My goal is to see how closely two specific audio files match. How would I go about this? Do I need to create/use hash functions in order to accomplish my goal? A detailed answer to this would be really helpful.
 - Currently I have a separate thread for recording, a separate activity for decoding the audio file, and a separate activity for the FFT analysis. I plan to run the matching analysis in a separate thread as well, or in an AsyncTask. Do you think this structure is optimal or is there a better way to do it? Also, should I pass my audio file to the decoder in a separate thread as well, or can I do it in the recording thread or the MatchingAnalysis thread?
 - Do I need to perform windowing in my operations on audio files before I can do matching comparison?
 - Do I need to decode the .wav file or can I just compare 2 .wav files directly instead?
 - Do I need to perform low-pitching operations on audio files before comparison?
 - In order to perform my matching comparison, what data exactly do I need to generate (power spectrum, energy spectrum, spectrogram etc)?
 - Am I going about this the right way or am I missing something?
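
To make the windowing and "what data do I need" questions above more concrete, here is one possible way to score two equally sized frames of samples. It is a rough sketch under stated assumptions, not a recommended or complete matching method: it applies a Hann window, computes a magnitude spectrum with a naive O(n^2) DFT (a stand-in for whatever FFT class is actually used), and reports the cosine similarity of the two spectra as a percentage. The class `SpectrumMatchSketch` and all of its method names are hypothetical.

```java
// Hypothetical sketch: compares the magnitude spectra of two frames of samples.
class SpectrumMatchSketch {

    /** Returns a Hann-windowed copy of the frame to reduce spectral leakage. */
    static double[] hann(double[] frame) {
        int n = frame.length;
        double[] out = frame.clone();
        if (n < 2) return out; // window is undefined for a single sample
        for (int i = 0; i < n; i++) {
            out[i] = frame[i] * 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
        }
        return out;
    }

    /** Naive DFT magnitudes for bins 0..n/2; a real FFT would replace this loop. */
    static double[] magnitudeSpectrum(double[] x) {
        int n = x.length;
        double[] mag = new double[n / 2 + 1];
        for (int k = 0; k < mag.length; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im -= x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    /** Cosine similarity of two vectors; returns 0 if either one is all zeros. */
    static double cosineSimilarity(double[] a, double[] b) {
        int n = Math.min(a.length, b.length);
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < n; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) return 0.0;
        return dot / Math.sqrt(normA * normB);
    }

    /** Similarity of the two frames' spectra, scaled to 0..100. */
    static double matchPercent(double[] frameA, double[] frameB) {
        return 100.0 * cosineSimilarity(
                magnitudeSpectrum(hann(frameA)), magnitudeSpectrum(hann(frameB)));
    }

    /** Test-signal generator: a sine wave at the given frequency. */
    static double[] sine(double freqHz, int sampleRate, int n) {
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = Math.sin(2.0 * Math.PI * freqHz * i / sampleRate);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] a = sine(440.0, 44100, 1024);
        double[] b = sine(5000.0, 44100, 1024);
        System.out.println("same tone:      " + matchPercent(a, a));
        System.out.println("different tone: " + matchPercent(a, b));
    }
}
```

Because cosine similarity ignores overall scale, a quieter recording of the same sound still scores high, which may or may not be what is wanted. The sketch also compares a single frame only and ignores timing, so a real comparison would have to align the recording with the played file and combine scores across many frames.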
ASKER CERTIFIED SOLUTION
gheist
Very broad question. Had you disclosed which FFT library you are using, it would resonate better with people passing by.