jon8034

Have a set of C files I need to extract data from, but do not have source code and am not a C programmer - I do have a binary dump

I have a situation where I need to extract data from files (one time only, so it doesn't have to be pretty).  No source code or file layouts were provided for the files.  I have gotten a binary dump utility and can use that to parse out the ASCII data, but of course the numeric data is a bit more complicated.  Any idea on how to get this info out?  No, I do not have file layouts, but I only have a few files to do, and if I have to I can develop a data table through inspection of the binary and trial and error.  Hot item, need to do ASAP (of course, is there anything else??).  I am not a C programmer but do have other programming tools that I can use to restructure the data or to run the binary through a conversion into readable numerics.  Ideally a simple dump utility I could stick on the Red Hat Enterprise 3.0 O/S and say dump file xxxx.dat > ascii.dat would be great.  Thanks!
ASKER CERTIFIED SOLUTION
Sadrul

jon8034

ASKER

Will this bring in the numerics?

It should also bring in the alphabetic characters. Print the characters as char.

If you are looking for only alphanumeric data, then use

if(isalnum(buffer[i]))
...
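
The filtering idea above could be sketched roughly as follows. This is only an illustration, assuming the file contents have already been read into a byte buffer; the function name copy_alnum is hypothetical and is not taken from the accepted solution.

```c
#include <ctype.h>
#include <stddef.h>

/* Copy only the letters and digits from buffer[0..len) into out,
   NUL-terminate the result, and return the number of characters kept.
   out must have room for len + 1 bytes. */
size_t copy_alnum(const unsigned char *buffer, size_t len, char *out)
{
    size_t i, kept = 0;

    for (i = 0; i < len; i++)
    {
        /* buffer holds unsigned char values, which are safe to pass to isalnum */
        if (isalnum(buffer[i]))
            out[kept++] = (char)buffer[i];
    }
    out[kept] = '\0';
    return kept;
}
```

Note that this only keeps bytes that happen to be letters or digits. Numbers stored in binary form will not come out as readable digits this way; they have to be decoded according to the file layout.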


-- Adil
jon8034

ASKER

OK, I'll give it a try - thanks.
sunnycoder
Run this command and paste the output here:

file "filename"

Replace filename with the complete path to your binary file. This reports the type of the file ...

If it is a compiled executable, then you may need to fall back on some other tools ...
If it is a simple data file, then you can use a text editor to view the file ...
If it is a binary data file, you can use a hex editor to view the contents ...
try

# strings filename

and see the result, if that is what you want.
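
For anyone curious what strings is doing, here is a rough sketch of the idea in C: scan the bytes and print every run of printable ASCII characters that is long enough to look like text. The function name print_strings and the MIN_RUN constant are made up for this illustration, and the real utility has more options than this.

```c
#include <stdio.h>
#include <stddef.h>

#define MIN_RUN 4   /* shortest run worth printing (hypothetical constant) */

/* Print every run of at least MIN_RUN printable ASCII bytes found in
   data[0..len) to out, one run per line. Returns the number of runs printed. */
int print_strings(const unsigned char *data, size_t len, FILE *out)
{
    size_t i, start = 0;
    int in_run = 0, count = 0;

    /* Loop one step past the end so a run that reaches the last byte is flushed */
    for (i = 0; i <= len; i++)
    {
        int printable = (i < len) && data[i] >= 0x20 && data[i] < 0x7F;

        if (printable && !in_run)
        {
            start = i;               /* a new run begins here */
            in_run = 1;
        }
        else if (!printable && in_run)
        {
            if (i - start >= MIN_RUN)
            {
                fwrite(data + start, 1, i - start, out);
                fputc('\n', out);
                count++;
            }
            in_run = 0;
        }
    }
    return count;
}
```

Like strings itself, this only recovers the text portions; binary numeric fields show up as gaps between the runs.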
Hi jon8034,

If you haven't already written your program to dump the data, perhaps I can save you the trouble.  Here's a program that I've used a lot.  It works pretty well.  :)

Kent



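#include <stdlib.h>
#include <stdio.h>
#include <string.h>    /*  memcmp  */
#include <fcntl.h>     /*  open, O_RDONLY  */
#ifdef _WIN32
#include <io.h>        /*  read on DOS/Windows compilers  */
#else
#include <unistd.h>    /*  read on Linux/Unix  */
#endif

#ifndef O_BINARY
#define O_BINARY 0     /*  Not defined on Unix, where files are always binary  */
#endif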

#define DISPLAY_WIDTH 16

int  Input;
FILE *Output;

unsigned char BufferA[DISPLAY_WIDTH];
unsigned char BufferB[DISPLAY_WIDTH];

unsigned char *Buffer;
unsigned char *SaveBuffer;

unsigned char *Buffers[2] = {BufferA, BufferB};
int      BufferIndex = 0;

long     Address = 0;
int      Repeat = 0;

#pragma argsused
int main(int argc, char* argv[])
{
  int Length;
  int idx;

  if (argc != 2)
  {
    fprintf (stderr, "Usage:  dump <filename>\n");
    return (1);                /*  Nonzero exit status signals the error  */
  }
  Input = open (argv[1], O_RDONLY|O_BINARY, 0);
  if (Input < 0)
  {
    fprintf (stderr, "Could not open %s\n", argv[1]);
    return (1);
  }
  Output = stdout;
  Buffer = BufferA;
  SaveBuffer = NULL;

  while (1)
  {
    Length = read (Input, Buffer, DISPLAY_WIDTH);
    if (Length <= 0)
      break;

    if (SaveBuffer && Length == DISPLAY_WIDTH && memcmp (SaveBuffer, Buffer, DISPLAY_WIDTH) == 0)
    {
      Repeat = 1;
      Address += DISPLAY_WIDTH;
      continue;
    }
    if (Repeat)
      fprintf (Output, "        -- Above Line Repeated --\n");
    fprintf (Output, "%08lx  ", Address);   /*  Address is a long  */
    for (idx = 0; idx < Length; idx++)
    {
      if ((idx & 03) == 0)  /*  Add a space every 4 bytes  */
        fputc (' ', Output);
      fprintf (Output, " %02x", Buffer[idx]);
    }
    fputc ('\n', Output);      /*  fputs ("\r\n", Output); for Windows systems  */
    fprintf (Output, "          ");
    for (idx = 0; idx < Length; idx++)
    {
      if ((idx & 03) == 0)
        fputc (' ', Output);
      fprintf (Output, "  %c", Buffer[idx] >= 0x20 && Buffer[idx] < 0x7F ? Buffer[idx] : ' ');  /*  0x7F is DEL, not printable  */
    }
    fputc ('\n', Output);
    Repeat = 0;
    SaveBuffer = Buffer;
    Buffer = Buffers[(++BufferIndex) & 1];
    Address += DISPLAY_WIDTH;
  }
  if (Repeat)                           /*  Just in case the last line repeats  */
    fprintf (Output, "        -- Above Line Repeated --\n");
  return 0;
}
jon8034,

If I understand you correctly, you want to convert binary data to ASCII data from input that looks something like

Mary Johnson    $#@!   Fred Smith #$%#  Jack Jones $%^^

etc

Where some of the data is readable ascii and other data is binary representation of numbers - is this correct?

If so then a generic program is not going to cut it because you need to know what your data layout is.

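There are two possibilities: fixed-length fields and variable-length fields. If you are working with fixed-length fields then it is simple:

#include <stdio.h>
#include <string.h>

#define ASCII   1
#define NUMERIC 2
/* ... expand as needed */

typedef struct
{
    int size;
    int type;
} FIELD;

FIELD field[] =
{
    {10, ASCII},
    {4,  NUMERIC},
    /* ... */
};
int nFields = sizeof(field) / sizeof(field[0]);

int main(int argc, char *argv[])
{
    unsigned char buffer[256];   /* must be at least as large as the widest field */
    int value, i;
    FILE *fp;

    if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL)
        return 1;

    for (;;)
    {
        for (i = 0; i < nFields; i++)
        {
            if (fread(buffer, field[i].size, 1, fp) != 1)
                return 0;   /* end of file; checking fread avoids the feof() pitfall */
            if (field[i].type == ASCII)
                printf("%.*s ", field[i].size, (char *)buffer);
            else if (field[i].type == NUMERIC)
            {
                memcpy(&value, buffer, sizeof(value));   /* assumes a 4-byte int in native byte order */
                printf("%d ", value);
            }
            /* ... repeat for all types you want to cater for */
        }
        printf("\n");
    }
}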

Code is not perfect but it illustrates the point.

If you have variable-length fields you will need to modify the above accordingly. With variable-length fields, some fields will have a size or length value preceding them somewhere in the record. For some fields, like ints, the size is implied and therefore not specified; typically ASCII data would require a length attribute. You would need to modify the code to first read this length and then use that value to read the next field.
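
As a rough sketch of the variable-length case, the function below reads one length-prefixed text field. It assumes the length is a single unsigned byte stored immediately before the text, which is only a guess at one common layout; your files may use a 2- or 4-byte length, or keep it elsewhere in the record. The name read_prefixed_field is made up for this example.

```c
#include <stdio.h>

/* Read one length-prefixed ASCII field from fp into out.
   ASSUMPTION: the length is one unsigned byte directly before the text.
   Returns the field length, or -1 on end of file, a truncated field,
   or a field too long for the out buffer. */
int read_prefixed_field(FILE *fp, char *out, size_t out_size)
{
    int length = fgetc(fp);              /* the 1-byte length prefix */

    if (length == EOF || (size_t)length >= out_size)
        return -1;
    if (fread(out, 1, (size_t)length, fp) != (size_t)length)
        return -1;                       /* file ended inside the field */
    out[length] = '\0';                  /* so the field can be printed with %s */
    return length;
}
```

Calling this in a loop, interleaved with fixed-size reads for the numeric fields, would walk a record one field at a time.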

It is important to first understand the layout of the file - once you have this the rest is easy - without it there is not much you will be able to achieve.
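
When working out the layout by inspection, one thing worth checking is byte order. The sketch below decodes the same 4 bytes both ways so you can see which interpretation gives sensible values. It assumes the field is a 4-byte unsigned integer, which you would have to confirm against known data; the function names are made up for this example.

```c
#include <stdint.h>

/* Interpret 4 raw bytes as a 32-bit unsigned integer,
   least significant byte first (little-endian, as on x86). */
uint32_t decode_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Interpret the same 4 bytes most significant byte first (big-endian). */
uint32_t decode_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

For example, the bytes 39 30 00 00 in a hex dump decode to 12345 as little-endian but to a very large number as big-endian, which is usually a strong hint about which order the file uses.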