Zack Smush
asked on
Matching two audio files using FFT (Android Studio)
I've been working on a part of my app for the past few days where I need to simultaneously play and record an audio file. The task I need to accomplish is to compare the recording to the audio file played and return a matching percentage. Here's what I have done so far and some context for my questions:
- The target API is >15
- I decided to use a .wav audio file format to simplify decoding the file
- I'm using AudioRecord for recording and MediaPlayer for playing the audio file
- I created a decoder class in order to take my audio file and convert it to PCM so I can perform the matching analysis
- I'm using the following specs for the recording AudioFormat (CHANNEL_IN_MONO, ENCODING_PCM_16BIT, sample rate 44100 Hz)
- After I pass the audio file to the decoder, I then proceed to pass it to an FFT class in order to get the frequency domain data needed for my analysis.
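To make the decoding step above concrete, here is a minimal sketch of pulling 16-bit PCM samples out of a WAV byte array. It assumes the canonical 44-byte RIFF/WAVE header with no extra chunks, mono, 16-bit little-endian samples; the class and method names are illustrative, not from the original post. A robust decoder would walk the chunk list instead of hard-coding the header size.

```java
// Hypothetical helper: extract 16-bit little-endian PCM samples from a
// canonical mono WAV byte array (44-byte header assumed, no extra chunks).
public class WavPcm {
    public static short[] toPcm(byte[] wav) {
        int headerSize = 44;                   // canonical RIFF/WAVE header
        int n = (wav.length - headerSize) / 2; // 2 bytes per 16-bit sample
        short[] pcm = new short[n];
        for (int i = 0; i < n; i++) {
            int lo = wav[headerSize + 2 * i] & 0xFF; // unsigned low byte
            int hi = wav[headerSize + 2 * i + 1];    // sign-extended high byte
            pcm[i] = (short) ((hi << 8) | lo);
        }
        return pcm;
    }
}
```

The same `short[]` representation also matches what `AudioRecord.read(short[], int, int)` hands back for the microphone side, so both signals end up in one common format before analysis.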
And below are a few questions that I have:
- When I record the audio using AudioRecord, is the format PCM by default or do I need to specify this somehow?
- I'm trying to pass the recording to the FFT class in order to acquire the frequency domain data to perform my matching analysis. Is there a way to do this without saving the recording on the user's device?
- After performing the FFT analysis on both files, do I need to store the data in a text file in order to perform the matching analysis? What are some options or possible ways to do this?
- After doing a fair amount of research, all the sources I found cover how to match a recording against songs/music contained in a database. My goal is to see how closely two specific audio files match. How would I go about this? Do I need to create/use hash functions to accomplish my goal? A detailed answer to this would be really helpful.
- Currently I have a separate thread for recording; separate activity for decoding the audio file; separate activity for the FFT analysis. I plan to run the matching analysis in a separate thread as well or an AsyncTask. Do you think this structure is optimal or is there a better way to do it? Also, should I pass my audio file to the decoder in a separate thread as well or can I do it in the recording thread or MatchingAnalysis thread?
- Do I need to perform windowing in my operations on audio files before I can do matching comparison?
- Do I need to decode the .wav file or can I just compare 2 .wav files directly instead?
- Do I need to perform low-pitching operations on audio files before comparison?
- In order to perform my matching comparison, what data exactly do I need to generate (power spectrum, energy spectrum, spectrogram etc)?
- Am I going about this the right way or am I missing something?
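To make the windowing and matching questions above concrete, here is one hedged sketch of a common approach (not the only one): apply a Hann window to a frame of samples, compute its magnitude spectrum, and score two spectra with cosine similarity, which maps directly to a "matching percentage". The naive O(n²) DFT below is purely illustrative; a real FFT library would replace that loop, and a full matcher would average this score over many frames (a spectrogram) rather than a single frame.

```java
// Illustrative sketch: Hann window + naive DFT magnitude + cosine similarity.
// An FFT library should replace the O(n^2) DFT loop in practice.
public class SpectrumMatch {
    // Hann window: reduces spectral leakage before the transform.
    public static double[] hann(double[] frame) {
        int n = frame.length;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = frame[i] * 0.5 * (1 - Math.cos(2 * Math.PI * i / (n - 1)));
        }
        return out;
    }

    // Magnitude spectrum via a naive DFT; for real input only the
    // first n/2 bins carry distinct information.
    public static double[] magnitude(double[] x) {
        int n = x.length;
        double[] mag = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * k * t / n;
                re += x[t] * Math.cos(ang);
                im += x[t] * Math.sin(ang);
            }
            mag[k] = Math.sqrt(re * re + im * im);
        }
        return mag;
    }

    // Cosine similarity in [0, 1] for non-negative magnitude spectra;
    // multiply by 100 for a matching percentage.
    public static double similarity(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12);
    }
}
```

Because the comparison happens on in-memory arrays, nothing here requires writing the recording or the spectra to a file; both spectra can live in memory for the duration of the analysis.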
ASKER CERTIFIED SOLUTION
This is a very broad question. Had you disclosed which FFT library you are using, it would be easier for passing experts to help.