Have a set of C files I need to extract data from but do not have source code and am not a C programmer - I do have a binary dump

I have a situation where I need to extract data from files (one time only, so it doesn't have to be pretty).  No source code or file layouts are provided for the files.  I have gotten a binary dump util and can use that to parse out the ASCII data, but of course the numeric data is a bit more complicated.  Any idea on how to get this info out?  No, I do not have file layouts, but I only have a few files to do, and if I have to I can develop a data table through inspection of the binary and trial and error.  Hot item, need to do ASAP (of course, is there anything else??).  I am not a C programmer but do have other programming tools that I can use to restructure the data or to run the binary through a conversion into readable numerics.  Ideally a simple dump utility I could stick on the Red Hat Enterprise 3.0 OS and say dump file xxxx.dat > ascii.dat would be great.  Thanks!
jon8034 Asked:
 
Sadrul Commented:
Well, if you want to extract only the ASCII bytes (i.e., bytes with 7-bit values, 0 to 127) from a file, then the pseudocode would look like this:

size = read(file, buffer, BUFFER_SIZE);   /* file is a descriptor returned by open() */

/* we have read size number of bytes in buffer (size <= 0 means end of file or error) */

for (i = 0; i < size; i++)
{
      if (isascii(buffer[i]))    /* declared in <ctype.h> */
      {
            putchar(buffer[i]);  /* print the byte to the output */
      }
}


-- Adil
 
jon8034 (Author) Commented:
Will this bring in the numerics?
 
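Sadrul Commented:
It will bring in numbers that are stored as ASCII digit characters, and it should also bring in the letters. Print the characters as char. Numbers stored in binary form will not come out as readable digits this way.

If you are looking for only alphanumeric data, then use

if (isalnum(buffer[i]))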
...


-- Adil
 
jon8034 (Author) Commented:
OK, I'll give it a try - thanks.
 
sunnycoder Commented:
Run this command and paste the output here:

file "filename"

Replace filename with the complete path to your binary file. This will report the type of the file ...

If it is a compiled executable, then you may need to fall back on some other tools ...
If it is a simple text file, then you can use a text editor to view the file ...
If it is a binary data file, you can use a hex editor to view the contents ...
 
van_dy Commented:
Try

# strings filename

and see whether the result is what you want.
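
For anyone curious what strings does under the hood, here is a rough sketch in C. It is not the source of the real utility: the function name print_text_runs and the MIN_RUN constant are made up for this example, and it only treats bytes 0x20 to 0x7E as text. It scans a buffer and prints each run of at least four printable bytes on its own line.

```c
#include <stdio.h>
#include <stddef.h>

#define MIN_RUN 4   /* assumed threshold; strings also defaults to 4 */

/* Write every run of at least MIN_RUN printable ASCII bytes found in
   data[0..len) to out, one run per line.  Returns the number of runs. */
int print_text_runs(const unsigned char *data, size_t len, FILE *out)
{
    size_t start = 0;   /* index where the current run began */
    size_t i;
    int runs = 0;

    for (i = 0; i <= len; i++)
    {
        /* treat the end of the data as a non-printable byte */
        int printable = (i < len) && data[i] >= 0x20 && data[i] <= 0x7E;

        if (!printable)
        {
            if (i - start >= MIN_RUN)
            {
                fwrite(data + start, 1, i - start, out);
                fputc('\n', out);
                runs++;
            }
            start = i + 1;   /* the next run can begin after this byte */
        }
    }
    return runs;
}
```

To use it you would read the file (or a chunk of it) into a buffer and call print_text_runs(buffer, bytes_read, stdout). Binary numbers between the text runs are simply skipped, which is why strings alone will not recover the numeric fields.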
 
Kent Olsen (Data Warehouse Architect / DBA) Commented:
Hi jon8034,

If you haven't already written your program to dump the data, perhaps I can save you the trouble.  Here's a program that I've used a lot.  It works pretty well.  :)

Kent



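#include <stdlib.h>
#include <stdio.h>
#include <string.h>   /* memcmp */
#include <fcntl.h>    /* open, O_RDONLY */
#include <unistd.h>   /* read; on DOS/Windows compilers use <io.h> instead */

#ifndef O_BINARY      /* not defined on Linux/Unix, where every file is binary */
#define O_BINARY 0
#endif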

#define DISPLAY_WIDTH 16

int  Input;
FILE *Output;

unsigned char BufferA[DISPLAY_WIDTH];
unsigned char BufferB[DISPLAY_WIDTH];

unsigned char *Buffer;
unsigned char *SaveBuffer;

unsigned char *Buffers[2] = {BufferA, BufferB};
int      BufferIndex = 0;

long     Address = 0;
int      Repeat = 0;

#pragma argsused
int main(int argc, char* argv[])
{
  int Length;
  int idx;

  if (argc != 2)
  {
    fprintf (stderr, "Usage:  dump <filename>\n");
    return (1);               /*  non-zero exit status signals failure  */
  }
  Input = open (argv[1], O_RDONLY|O_BINARY, 0);
  if (Input < 0)
  {
    fprintf (stderr, "Could not open %s\n", argv[1]);
    return (1);
  }
  Output = stdout;
  Buffer = BufferA;
  SaveBuffer = NULL;

  while (1)
  {
    Length = read (Input, Buffer, DISPLAY_WIDTH);
    if (Length <= 0)
      break;

    if (SaveBuffer && Length == DISPLAY_WIDTH && memcmp (SaveBuffer, Buffer, DISPLAY_WIDTH) == 0)
    {
      Repeat = 1;
      Address += DISPLAY_WIDTH;
      continue;
    }
    if (Repeat)
      fprintf (Output, "        -- Above Line Repeated --\n");
    fprintf (Output, "%08lx  ", (unsigned long) Address);   /*  Address is a long, so use %lx  */
    for (idx = 0; idx < Length; idx++)
    {
      if ((idx & 03) == 0)  /*  Add a space every 4 bytes  */
        fputc (' ', Output);
      fprintf (Output, " %02x", Buffer[idx]);
    }
    fputc ('\n', Output);      /*  fputs ("\r\n", Output); for Windows systems  */
    fprintf (Output, "          ");
    for (idx = 0; idx < Length; idx++)
    {
      if ((idx & 03) == 0)
        fputc (' ', Output);
      fprintf (Output, "  %c", Buffer[idx] >= 0x20 && Buffer[idx] <= 0x7E ? Buffer[idx] : ' ');  /*  0x7F (DEL) is not printable  */
    }
    fputc ('\n', Output);
    Repeat = 0;
    SaveBuffer = Buffer;
    Buffer = Buffers[(++BufferIndex) & 1];
    Address += DISPLAY_WIDTH;
  }
  if (Repeat)                           /*  Just in case the last line repeats  */
    fprintf (Output, "        -- Above Line Repeated --\n");
  return 0;
}
 
Julian Hansen Commented:
jon8034,

If I understand you correctly, you want to convert binary data to ASCII data from input that looks something like

Mary Johnson    $#@!   Fred Smith #$%#  Jack Jones $%^^

etc

Where some of the data is readable ascii and other data is binary representation of numbers - is this correct?

If so then a generic program is not going to cut it because you need to know what your data layout is.

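There are two possibilities - fixed length fields and variable length fields. If you are working with fixed length fields then it is simple:

#define ASCII 1
#define NUMERIC 2
... // expand as needed

typedef struct
{
    int size ;
    int type ;
} FIELD ;

FIELD field[] =
  {
     {10, ASCII},
     {4, NUMERIC},
     ...
  } ;
int nFields = sizeof field / sizeof field[0] ;

int done = 0 ;
while ( !done )
{
  for ( i = 0; i < nFields; i++ )
  {
     // stop on a short read; testing feof() before reading would
     // process one bogus record at the end of the file
     if ( fread ( buffer, field[i].size, 1, fp ) != 1 )
     {
        done = 1 ;
        break ;
     }
     if ( field[i].type == ASCII )
        printf ( "%.*s", field[i].size, (char*)buffer ) ;  // field is not NUL-terminated
     else if ( field[i].type == NUMERIC )
        printf ( "%d", *(int*)buffer ) ;  // assumes a 4-byte int in the machine's own byte order
     else
        ... // repeat for all types you want to cater for
  }
  if ( !done )
     printf ("\n") ;
}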

Code is not perfect but it illustrates the point.

If you have variable length fields you will need to modify the above accordingly. With variable length fields, some fields will have a size or length value preceding them somewhere in the record. For some fields, like ints, the length is implied and therefore not specified - typically ASCII data would require a length attribute. You would need to modify the code to first read this length and then use that value to read the next field.
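
To make the variable length case concrete, here is a minimal sketch. It assumes a purely hypothetical layout in which each text field is preceded by a single byte holding its length; your files may use a two-byte length, a terminator byte, or something else entirely. The names read_prefixed_field and MAX_FIELD are invented for the example.

```c
#include <stdio.h>

#define MAX_FIELD 255   /* largest length a one-byte prefix can hold */

/* Read one length-prefixed text field from fp into out (which must hold
   at least MAX_FIELD + 1 bytes) and NUL-terminate it.
   Returns the field length, or -1 at end of file or on a short read. */
int read_prefixed_field(FILE *fp, char *out)
{
    int len = fgetc(fp);            /* the length byte comes first */

    if (len == EOF)
        return -1;
    if (fread(out, 1, (size_t)len, fp) != (size_t)len)
        return -1;                  /* file ended inside the field */
    out[len] = '\0';
    return len;
}
```

You would call it in a loop on a file opened with fopen(name, "rb"), printing each field until it returns -1. If the output turns to garbage after the first field or two, the guess about the length prefix is probably wrong for that file.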

It is important to first understand the layout of the file - once you have this the rest is easy - without it there is not much you will be able to achieve.
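
If you have to work the layout out by inspection, one thing that helps is printing the same four bytes under both byte orders and comparing against a value you expect to find in the record. The sketch below assumes the numbers are plain 32-bit unsigned integers, which may not hold for your files (they could be 2-byte, floating point, or packed decimal); the function names are made up for the example.

```c
#include <stdio.h>

/* Combine four bytes into a 32-bit value, least significant byte first
   (the order used by x86 machines). */
unsigned long le32(const unsigned char *p)
{
    return (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/* Same, but most significant byte first (network byte order). */
unsigned long be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24)
         | ((unsigned long)p[1] << 16)
         | ((unsigned long)p[2] << 8)
         |  (unsigned long)p[3];
}

/* Print both readings of the 4 bytes at offset so they can be compared
   with a value you expect to see. */
void print_candidates(const unsigned char *data, long offset)
{
    const unsigned char *p = data + offset;

    printf("%08lx: %02x %02x %02x %02x  little-endian=%lu  big-endian=%lu\n",
           (unsigned long)offset, p[0], p[1], p[2], p[3], le32(p), be32(p));
}
```

For example, the bytes 39 30 00 00 read as 12345 in little-endian order. Decoding byte by byte like this, rather than casting the buffer to an int pointer, also keeps the result independent of the machine the conversion runs on.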