MP3 Waveform Image Generation

I'm trying to generate an image of the waveform of an audio file within a script or through system() type commands and have not found a way to do it yet.  Ideally I could read an MP3 file (decode it if necessary) and output a gif or jpeg image of the waveform.  When I say 'waveform' I mean the pretty graphical representation of an audio file that you see in audio editing programs.

If you don't know how to do it could you at least tell me what a waveform is actually representing?  The way I understand it, it's the amplitude (volume) of the audio over time.  I will write a script to make the image for me based on that if I have to but I'm sure it's been done before.

I am looking to develop this in Perl or PHP although I am open to Actionscript and C++ (yuck) as well.
Dushan De Silva (Technology Architect) commented:
You can use GTK (with OpenGL if needed).

BR Dushan
Richard Quadling (Senior Software Developer) commented:
(I'm from the PHP pointer).

Follow this logic.

An MP3 is normally about 1/12th the size of the original audio.

Say the original audio is 16-bit stereo sampled at 44.1 kHz.

This means that in 1 second there will be 44,100 * 16 * 2 bits = 1,411,200 bits = 176,400 bytes of data.

But more importantly, 1 second will contain 44,100 samples per channel. That is 44,100 dots along the x-axis of the image.

So, you will need to resample this to get an image of sensible proportions.
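To double-check the arithmetic above, here is a small Python sketch. The function name `pcm_data_rate` is just something I made up for illustration, and it assumes plain uncompressed PCM.

```python
def pcm_data_rate(sample_rate_hz, bits_per_sample, channels):
    """Return (bits_per_second, bytes_per_second) for uncompressed PCM audio."""
    bits = sample_rate_hz * bits_per_sample * channels
    return bits, bits // 8


# CD-style audio: 44.1 kHz, 16-bit, stereo
bits, nbytes = pcm_data_rate(44_100, 16, 2)
print(bits, nbytes)  # 1411200 176400
```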

MP3 files use a form of compression called lossy compression.


Source => mp3 => Output

The Output is NOT exactly the same as the Source, only an audible equivalent. I can't give you figures, but there will be significant differences between the two.

You will need to decode the MP3 data (it is compressed DATA, it is NOT AUDIO SAMPLES) into samples before you can generate the waveform picture.

The waveforms will be VERY wide.

A 3 minute track will contain 7,938,000 samples. That is nearly 8 million samples. To plot that on a screen, even at 1280x1024, you have to resample and shrink that by a factor of 6,200. A significant shrink.
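The same numbers as a quick Python sketch, in case you want to plug in other track lengths or image widths. The helper name `samples_per_pixel` is hypothetical, and the count is per channel.

```python
def samples_per_pixel(duration_s, sample_rate_hz, image_width_px):
    """Return (total_samples, samples_per_pixel) for one audio channel."""
    total = duration_s * sample_rate_hz
    return total, total / image_width_px


# 3 minute track at 44.1 kHz, drawn 1280 pixels wide
total, factor = samples_per_pixel(180, 44_100, 1280)
print(total, factor)  # 7938000 6201.5625
```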

The waveform you see is normally drawn as 2 traces (left and right audio channels).

Each trace plots the value of the sample at that point in time (in increments of 1/44,100 of a second).
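As a rough illustration of what "the value of the sample" means: in 16-bit stereo PCM as stored in a standard WAV file, each frame is 4 bytes, a signed little-endian 16-bit integer for the left channel followed by one for the right. The sketch below assumes that layout; the function names are mine, not from any library.

```python
import struct


def decode_stereo_frame(frame: bytes):
    """Decode one 4-byte frame of 16-bit little-endian stereo PCM into (left, right)."""
    left, right = struct.unpack("<hh", frame)
    return left, right


def normalized(sample: int) -> float:
    """Scale a signed 16-bit sample (-32768..32767) to roughly -1.0..1.0."""
    return sample / 32768.0


# Build a fake frame so the example runs without an audio file.
frame = struct.pack("<hh", 16384, -32768)
left, right = decode_stereo_frame(frame)
print(left, right, normalized(left), normalized(right))  # 16384 -32768 0.5 -1.0
```

The sign only tells you which side of the centre line the trace is on; the magnitude is the amplitude at that instant.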

kamermans (Author) commented:
RQuadling - Thanks for the info!  Would it make sense for me to decode the MP3 to raw WAV, then resample it to something very low like 1 kHz mono just to cut down on CPU time, and then use some language to determine the average amplitude of each 1,000 samples?  For a 5 minute song (5 min * 60 sec = 300 sec) I would have 300 sec * 1 kHz = 300,000 samples.  If I take the average of every 1,000 samples I would be left with 300,000 / 1,000 = 300 values, one for each second, which would generate a nice 300px wide image?

Let me know if my logic is off.  I would also like to know if anyone knows how to determine the actual amplitude (volume) of an individual sample in PCM data.
Richard Quadling (Senior Software Developer) commented:
If you can, I would decode to a stream.


stream_WAV = new MP3_Decode_Stream('my.mp3')

left_sample = stream_WAV->left_sample()
right_sample = stream_WAV->right_sample()


The idea here is that the decoder presents the left and right samples on the fly and sequentially. That way you do not have to physically convert to a WAV file first and then process it. I have no idea how to decode an MP3 file, but I suspect there are good sources available, as the MP3 encoding does all the real work.
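I can't show the MP3 decoding itself, but the "samples on the fly" idea looks roughly like this in Python if you assume some external tool has already decoded the MP3 to a 16-bit stereo WAV. This is only a sketch: it uses the standard `wave` and `struct` modules, and `iter_stereo_samples` is a name I invented.

```python
import struct
import wave


def iter_stereo_samples(path, chunk_frames=4096):
    """Yield (left, right) pairs from a 16-bit stereo WAV, one chunk at a time."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM")
        while True:
            chunk = wav.readframes(chunk_frames)  # raw bytes, 4 per frame
            if not chunk:
                break
            # Each frame is two signed little-endian 16-bit integers: left, right.
            yield from struct.iter_unpack("<hh", chunk)


# Usage (the file name is hypothetical):
# for left, right in iter_stereo_samples("decoded.wav"):
#     ...
```

Only one chunk is held in memory at a time, so the length of the track does not matter much.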

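I would then work out how many samples are needed for a pixel. If you have a 300px wide image and your audio is 300,000 samples long, then you need to average 1,000 samples at a time. Add the values of 1,000 samples together and then divide by 1,000. Use the absolute values (or track the peaks), since raw signed samples swing above and below zero and would otherwise average out to almost nothing.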

This SHOULD provide a fairly reasonable waveform. Don't forget you would need to do left and right simultaneously, as the samples are interleaved.
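A minimal Python sketch of the per-pixel averaging, assuming the samples arrive interleaved as L0, R0, L1, R1, and so on. Note that it averages absolute values rather than the raw signed values, because a plain mean of signed samples comes out close to zero. The function name `waveform_columns` is made up for this example.

```python
def waveform_columns(interleaved, width):
    """Reduce interleaved stereo samples [L0, R0, L1, R1, ...] to `width` columns.

    Returns (left, right): the mean absolute amplitude per column for each channel.
    """
    frames = len(interleaved) // 2
    if width < 1 or frames < width:
        raise ValueError("need at least one frame per column")
    left_cols, right_cols = [], []
    for col in range(width):
        # Frame range covered by this pixel column.
        start = col * frames // width
        end = (col + 1) * frames // width
        left = interleaved[2 * start:2 * end:2]       # even indexes: left channel
        right = interleaved[2 * start + 1:2 * end:2]  # odd indexes: right channel
        left_cols.append(sum(abs(s) for s in left) / len(left))
        right_cols.append(sum(abs(s) for s in right) / len(right))
    return left_cols, right_cols


samples = [100, -10, -100, 10, 50, -20, -50, 20]  # 4 stereo frames
print(waveform_columns(samples, 2))  # ([100.0, 50.0], [10.0, 20.0])
```

Each value can then be drawn as a vertical line of that height, one per pixel column, with the left and right channels as separate traces.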
