urgent: PCM data structure

Posted on 2004-09-24
Last Modified: 2006-11-17

I'm getting a little confused.  (I guess it's what comes of working on too many things at once.)

I have a wave file which I'm analyzing so that I can visualise and process the data in various ways.  With some help from jrandallsexton I managed to get the file header data fairly easily.  The problem I have now is reading the data itself.  I think I am reading the data into a byte stream correctly, but I'm not sure how to process it.

I'm using for all my information, I'm not sure if it's the best source, but it seems pretty good.

The site describes the sample data as being in a structure of, for example "24 17 1e f3" etc, where "24 17" is the left channel and "1e f3" is the right channel.

Now, I'm getting results (from a different audio sample to the example given on the site) such as "113 5 102 97", with no letter characters involved at any point.  This makes me think I'm doing something wrong.

Also, even if I get the data in the correct format, I don't understand what the values are!?  "24 17"... is this hex?  If so, what should I do with it (programmatically) once I have an integer value?  Is this something like amplitude and frequency?

I'm afraid I need quite a lot of help with this, hence the points.  It's also fairly urgent...


Question by:w3tim

Expert Comment

ID: 12143987
as far as the conversion goes...

the first one could be hex (I would say definitely)....but don't forget, the second one could be hex as well.

as far as converting it to decimal, just use the convention

0-9 = exact value (0-9)
a = 10
b = 11
c = 12
d = 13
e = 14
f = 15

now take the number 24, in decimal we pronounce this "twenty four", in hex we pronounce it "two four", and here I'll show you why

in decimal 24 equates to (2 * 10^1) + (4 * 10^0) ==> (2 * 10) + (4 * 1) = 24

but in hex 24 equates to (2 * 16^1) + (4 * 16^0) ==> (2 * 16) + (4 * 1) = 36

to use a different example, the "113 5 102 97" you came up with could also be hex, and here's why

hex(113) = (1 * 16^2) + (1 * 16^1) + (3 * 16^0) ==> (1* 256) + (1 * 16) + (3 * 1) = 275
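
The positional arithmetic above can be checked mechanically. This is a minimal sketch in Python (not the VB used elsewhere in this thread), and the helper name `hex_to_decimal` is made up for illustration. One caveat: a single byte only spans hex 00 to FF (0 to 255), and hex 113 is 275, so a reading of 113 cannot be one byte written in hex. That suggests the values "113 5 102 97" are already decimal byte values.

```python
def hex_to_decimal(text):
    """Convert a hex string such as "24" to its decimal value, digit by digit."""
    digits = "0123456789abcdef"
    value = 0
    for ch in text.lower():
        # Shift what we have so far one hex place left, then add the new digit.
        value = value * 16 + digits.index(ch)
    return value

print(hex_to_decimal("24"))   # 36
print(hex_to_decimal("113"))  # 275
print(int("24", 16))          # Python's built-in conversion gives the same 36
```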

as far as what to do with these values, I'll have to get back to you


Author Comment

ID: 12144113
Many thanks!

I'm not really used to working at even a slightly lower level such as this!


Author Comment

ID: 12158698

I'm not sure I'm getting anywhere to be honest... I found a website that seemed to have some useful information on it ( but I don't know if it's sending me in the wrong direction.


Expert Comment

ID: 12170289
just to let you know I haven't forgotten about this.  I'm not too familiar with audio formatting, so this will be mostly a new adventure for me.  but I did work out how to change those pesky hex codes to decimal

create a button and put this in the event handler

MsgBox(CLng("&H" + "113"))



Author Comment

ID: 12177827
Ah ha, right... so...

There are 2 bytes per channel per sample!?

I read the byte (i.e. 113) then CLng("&H" + "113") so I'll end up with 2 decimal values per channel per sample!?  Is that right?!  Any ideas what the values would represent!?
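
One possible reading of those numbers, sketched in Python. It assumes the file is 16-bit stereo PCM, which is common for WAV but should be checked against the bits-per-sample and channel fields in the header. Under that assumption the bytes read from the stream are already numbers and need no hex conversion: "113 5 102 97" is simply the decimal spelling of the hex bytes "71 05 66 61". Each pair of bytes forms one signed 16-bit sample, low byte first, so there is one value per channel per sample rather than two.

```python
import struct

raw = bytes([113, 5, 102, 97])  # the four byte values quoted above

# The same bytes printed as two-digit hex, the way the reference site shows them.
as_hex = " ".join(f"{b:02x}" for b in raw)
print(as_hex)  # 71 05 66 61

# "<hh" = little-endian, two signed 16-bit integers (left, then right).
left, right = struct.unpack("<hh", raw)
print(left, right)  # 1393 24934
```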


Accepted Solution

bramsquad earned 500 total points
ID: 12193227
after looking through the net I've found about as much as you have...these two sites,1410,21300,00.html

have some info on them.  the first one explains pretty well what the data represents, the second has some C code to play RIFF files.

from what I've deciphered, these "samples" are numeric data which is interpreted by the sound card, or whatever device, to create a specific sound.

in saying that, it would not be wrong to assume that the two values we are looking at (two hex values for the right, and two hex values for the left) may just be one value.  

I'll try to illustrate what I mean here.  if our two hex values were "0A" and "11", in decimal you would have

"0A" = 10
"11" = 17

now in binary you would have

10 ==>  0000 1010
17 ==>  0001 0001

now this could be one "value" or in this case "sound byte" which is read into the computer as

0000 1010 0001 0001
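
Joining two bytes into one value can be sketched as below, in Python. Two assumptions to flag: WAV files store multi-byte values little-endian, so the first byte in the file is the low half (the reverse of the order drawn above), and 16-bit samples are signed (two's complement). The function name `combine_le16` is made up for illustration.

```python
def combine_le16(first_byte, second_byte):
    """Join two bytes (in file order) into one signed 16-bit sample."""
    value = first_byte | (second_byte << 8)  # second byte is the high half
    if value >= 0x8000:                      # top bit set means a negative sample
        value -= 0x10000
    return value

print(combine_le16(0x0A, 0x11))  # 4362  (0x110A)
print(combine_le16(0x24, 0x17))  # 5924  (left channel "24 17")
print(combine_le16(0x1E, 0xF3))  # -3298 (right channel "1e f3")
```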

the point of me going through this is that I'm not sure whether these values are separate, one meaning amplitude and one meaning frequency.

I'm kind of surprised we haven't run into any pages saying how this data is interpreted by the computer.  it may be out of the scope of your problem, in that these hex values are standard in terms of how the computer interprets sound...regardless of the specific format the sound file is in.  (we're getting pretty low level here.)

anyway, I'm going to try and search for how audio files are interpreted by the machine, and suggest you do too.

I don't think you're going to find any new information on this if you search for the PCM data structure.
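
On what the numbers mean: in PCM each sample is the amplitude of the waveform at one instant. Frequency is not stored as a separate value; it only shows up in how quickly the amplitudes rise and fall from one sample to the next. A rough sketch in Python using the standard `wave` module, assuming 16-bit PCM. The name `read_samples` and the file path passed to it are placeholders, not anything from this thread.

```python
import struct
import wave

def read_samples(path):
    """Return (sample_rate, channels); channels holds one list of samples per channel."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # Decode every pair of bytes as one little-endian signed 16-bit value.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Samples are interleaved (left, right, left, right, ...), so de-interleave.
    channels = [list(values[c::n_channels]) for c in range(n_channels)]
    return rate, channels
```

Plotting one channel's list against sample index divided by the sample rate gives the usual waveform view, and the largest absolute value in a list is that channel's peak amplitude.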


Author Comment

ID: 12345396
Sorry, I haven't abandoned this.  I had a busy few weeks trying to finish things up before going on holiday!  I'll be continuing to look at this, but I'm going to have to leave it for a bit.  I might get back onto it when I catch up with other work.

I'll give everything suggested another go and if I don't get anywhere I'll be back to EE.  I can't see me getting much further on my own though.  If anyone does have any more suggestions then let me know.

Thanks bramsquad, I'll award you the points for now as you have been very helpful!


Author Comment

ID: 12582178
I will return to this in a new post at a later date.  Other things have come up and time is too scarce to give to this project at the moment.  Many thanks for the help.  As I stated in my last post, I will award the points to bramsquad.




Question has a verified solution.
