Solved

Need to extract data from a set of files written by a C program; I do not have the source code and am not a C programmer, but I do have a binary dump

Posted on 2004-10-31
300 Views
Last Modified: 2010-04-15
I need to extract data from some files (one time only, so it doesn't have to be pretty). No source code or file layouts were provided. I have a binary dump utility and can use it to pull out the ASCII data, but of course the numeric data is more complicated. Any ideas on how to get this information out? I do not have file layouts, but there are only a few files, and if I have to I can build a data table by inspecting the binary and using trial and error. This is a hot item that I need to do ASAP (of course, is there anything else?). I am not a C programmer, but I do have other programming tools that I can use to restructure the data or to run the binary through a conversion into readable numerics. Ideally, a simple dump utility that I could put on the Red Hat Enterprise 3.0 O/S and run as dump file xxxx.dat > ascii.dat would be great. Thanks!
Question by:jon8034
    8 Comments
     
    LVL 2

    Accepted Solution

    by:
    Well, if you want to extract only the ASCII bytes (i.e., bytes whose values fit in 7 bits) from a file, then the pseudocode would look like this:

    size = read(file, buffer, BUFFER_SIZE);

    /* we have read size bytes into buffer */

    for (i = 0; i < size; i++)
    {
          if (isascii(buffer[i]))    /* declared in <ctype.h> */
          {
                /* print the byte */
                putchar(buffer[i]);
          }
    }


    -- Adil
     

    Author Comment

    by:jon8034
    Will this bring in the numerics?
     
    LVL 2

    Expert Comment

    by:Sadrul
    It should also bring in the letters. Print each byte as a char.

    If you are looking for only alphanumeric data, then use

    if (isalnum(buffer[i]))
    ...


    -- Adil
     

    Author Comment

    by:jon8034
    OK, i'll give it a try - thanks.
     
    LVL 45

    Expert Comment

    by:sunnycoder
    Run this command and paste the output here:

    file "filename"

    Replace filename with the complete path to your binary file. This will report the type of the file ...

    If it is a compiled executable, then you may need to fall back on some other tools ...
    If it is a simple data file, then you can use a text editor to view the file ...
    If it is a binary data file, you can use a hex editor to view the contents ...
     
    LVL 5

    Expert Comment

    by:van_dy
    Try

    # strings filename

    and see whether the result is what you want.
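
For readers who want to see the idea behind that command, here is a rough sketch in C of a strings-style scan over a buffer that has already been read into memory. It is not the real utility's source. The function names `printable_run` and `print_strings`, the offset prefix on each line, and the treatment of 0x20 to 0x7E as "printable" are all assumptions made for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Length of the run of printable ASCII bytes (0x20..0x7E) starting at buf[start]. */
size_t printable_run(const unsigned char *buf, size_t len, size_t start)
{
    size_t n = 0;
    while (start + n < len && buf[start + n] >= 0x20 && buf[start + n] <= 0x7E)
        n++;
    return n;
}

/* Print every printable run of at least min_len bytes, one per line,
   prefixed with its hexadecimal offset in the buffer.
   Returns the number of runs printed. */
size_t print_strings(const unsigned char *buf, size_t len, size_t min_len, FILE *out)
{
    size_t pos = 0, found = 0;
    while (pos < len) {
        size_t n = printable_run(buf, len, pos);
        if (n > 0 && n >= min_len) {
            fprintf(out, "%08lx  %.*s\n", (unsigned long)pos, (int)n,
                    (const char *)(buf + pos));
            found++;
        }
        pos += (n > 0) ? n : 1;   /* skip the run, or one non-printable byte */
    }
    return found;
}
```

Calling `print_strings(buffer, size, 4, stdout)` on the contents of a data file lists the text fields together with their offsets. The gaps between those offsets are a reasonable first guess at where the binary numeric fields sit.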
     
    LVL 45

    Expert Comment

    by:Kdo
    Hi jon8034,

    If you haven't already written your program to dump the data, perhaps I can save you the trouble.  Here's a program that I've used a lot.  It works pretty well.  :)

    Kent



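    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>     /*  memcmp (replaces the non-standard <mem.h>)  */
    #include <fcntl.h>
    #ifdef _WIN32
    #include <io.h>         /*  open, read on Windows compilers  */
    #else
    #include <unistd.h>     /*  read on Linux/Unix  */
    #endif

    #ifndef O_BINARY
    #define O_BINARY 0      /*  Linux/Unix has no separate binary mode  */
    #endif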

    #define DISPLAY_WIDTH 16

    int  Input;
    FILE *Output;

    unsigned char BufferA[DISPLAY_WIDTH];
    unsigned char BufferB[DISPLAY_WIDTH];

    unsigned char *Buffer;
    unsigned char *SaveBuffer;

    unsigned char *Buffers[2] = {BufferA, BufferB};
    int      BufferIndex = 0;

    long     Address = 0;
    int      Repeat = 0;

    #pragma argsused
    int main(int argc, char* argv[])
    {
      int Length;
      int idx;

      if (argc != 2)
      {
        fprintf (stderr, "Usage:  dump <filename>\n");
        return (1);                         /*  non-zero exit status signals failure  */
      }
      Input = open (argv[1], O_RDONLY|O_BINARY, 0);
      if (Input < 0)
      {
        fprintf (stderr, "Could not open %s\n", argv[1]);
        return (1);
      }
      Output = stdout;
      Buffer = BufferA;
      SaveBuffer = NULL;

      while (1)
      {
        Length = read (Input, Buffer, DISPLAY_WIDTH);
        if (Length <= 0)
          break;

        if (SaveBuffer && Length == DISPLAY_WIDTH && memcmp (SaveBuffer, Buffer, DISPLAY_WIDTH) == 0)
        {
          Repeat = 1;
          Address += DISPLAY_WIDTH;
          continue;
        }
        if (Repeat)
          fprintf (Output, "        -- Above Line Repeated --\n");
        fprintf (Output, "%08lx  ", (unsigned long) Address);
        for (idx = 0; idx < Length; idx++)
        {
          if ((idx & 03) == 0)  /*  Add a space every 4 bytes  */
            fputc (' ', Output);
          fprintf (Output, " %02x", Buffer[idx]);
        }
        fputc ('\n', Output);      /*  fputs ("\r\n", Output); for Windows systems  */
        fprintf (Output, "          ");
        for (idx = 0; idx < Length; idx++)
        {
          if ((idx & 03) == 0)
            fputc (' ', Output);
          fprintf (Output, "  %c", Buffer[idx] >= 0x20 && Buffer[idx] <= 0x7E ? Buffer[idx] : ' ');
        }
        fputc ('\n', Output);
        Repeat = 0;
        SaveBuffer = Buffer;
        Buffer = Buffers[(++BufferIndex) & 1];
        Address += DISPLAY_WIDTH;
      }
      if (Repeat)                           /*  Just in case the last line repeats  */
        fprintf (Output, "        -- Above Line Repeated --\n");
      return 0;
    }
     
    LVL 48

    Expert Comment

    by:Julian Hansen
    jon8034,

    If I understand you correctly, you want to convert binary data to ASCII data from input that looks something like

    Mary Johnson    $#@!   Fred Smith #$%#  Jack Jones $%^^

    etc

    where some of the data is readable ASCII and other data is a binary representation of numbers. Is this correct?

    If so, then a generic program is not going to cut it, because you need to know what your data layout is.

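    There are two possibilities: fixed-length fields and variable-length fields. If you are working with fixed-length fields, then it is simple:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    #define ASCII 1
    #define NUMERIC 2
    /* ... expand as needed */

    typedef struct
    {
        int size ;
        int type ;
    } FIELD ;

    FIELD field[] =
      {
         {10, ASCII},
         {4, NUMERIC},
         /* ... */
      } ;
    int nFields = sizeof field / sizeof field[0] ;

    /* fp is the data file, opened with fopen ( name, "rb" ) */
    void dump_records ( FILE *fp )
    {
        unsigned char buffer[256] ;   /* must be at least as large as the widest field */
        int32_t value ;
        int i, done = 0 ;

        while ( !done )
        {
            for ( i = 0; i < nFields; i++ )
            {
                /* stop at end of file or on a truncated record instead of testing feof() first */
                if ( fread ( buffer, field[i].size, 1, fp ) != 1 )
                {
                    done = 1 ;
                    break ;
                }
                if ( field[i].type == ASCII )
                    printf ( "%.*s ", field[i].size, (char*)buffer ) ;   /* field is not NUL-terminated */
                else if ( field[i].type == NUMERIC )
                {
                    memcpy ( &value, buffer, sizeof value ) ;   /* assumes a 4-byte integer in host byte order */
                    printf ( "%d ", (int)value ) ;
                }
                /* else ... repeat for all types you want to cater for */
            }
            if ( !done )
                printf ( "\n" ) ;
        }
    }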

    Code is not perfect but it illustrates the point.

    If you have variable-length fields, you will need to modify the above accordingly. With variable-length fields, some fields will have a size or length value preceding them somewhere in the record. For some fields, like ints, the length is implied and therefore not specified; typically ASCII data would require a length attribute. You would need to modify the code to first read this length and then use that value to read the next field.
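
As an illustration of reading a length and then using it to read the field that follows, here is a minimal sketch in C. It assumes a purely hypothetical layout in which each text field is stored as one length byte followed by that many ASCII bytes. The real files may use a wider length, a different position for it, or a terminator instead, so treat the layout and the name `read_counted_field` as assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: one length byte followed by that many ASCII bytes.
   Copies the field into out (NUL-terminated) and returns the number of
   bytes consumed from buf, or 0 if the field is truncated or does not
   fit in out. */
size_t read_counted_field(const unsigned char *buf, size_t len,
                          char *out, size_t out_size)
{
    size_t n;

    if (len < 1)
        return 0;
    n = buf[0];                          /* the length prefix */
    if (n > len - 1 || n + 1 > out_size)
        return 0;                        /* truncated input or output too small */
    memcpy(out, buf + 1, n);
    out[n] = '\0';
    return n + 1;                        /* prefix byte plus the field itself */
}
```

The return value tells the caller how far to advance, so successive fields in a record can be read by adding it to the current offset each time.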

    It is important to first understand the layout of the file - once you have this the rest is easy - without it there is not much you will be able to achieve.
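
When working out the layout by inspection, one thing to settle early is the byte order of the numeric fields. The sketch below, in C, decodes the same four bytes both ways so the two candidates can be compared against values you expect to see in the data. It assumes the numbers are plain 4-byte unsigned integers, which may not be true of these files, and the helper names `read_u32_le` and `read_u32_be` are made up for the example.

```c
#include <stdint.h>

/* Interpret 4 bytes as an unsigned integer, least significant byte first. */
uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Interpret 4 bytes as an unsigned integer, most significant byte first. */
uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

For example, the bytes 39 30 00 00 in the hex dump decode to 12345 with `read_u32_le`, and to a much larger, less plausible number with `read_u32_be`. If one interpretation consistently gives sensible values across several records, that is probably the byte order the files were written in.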