

Have a set of c files i need to extract data from but do not have source code and am not a c programmer - i do have a binary dump

Posted on 2004-10-31
Medium Priority
Last Modified: 2010-04-15
I have a situation where I need to extract data from files (one time only, so it doesn't have to be pretty).  No source code or file layouts are provided.  I have gotten a binary dump utility and can use it to parse out the ASCII data, but of course the numeric data is a bit more complicated.  Any idea on how to get this info out?  No, I do not have file layouts, but I only have a few files to do, and if I have to I can develop a data table through inspection of the binary and trial and error.  Hot item, need to do ASAP (of course, is there anything else??).  I am not a C programmer but do have other programming tools that I can use to restructure the data or to run the binary through a conversion into readable numerics.  Ideally a simple dump utility I could stick on the Red Hat Enterprise 3.0 O/S and say dump file xxxx.dat > ascii.dat would be great.  Thanks!
Question by:jon8034
Accepted Solution

Sadrul earned 1500 total points
ID: 12459237
well.. if you want to extract only the ascii bytes (ie, bytes having 7-bit unsigned values) from a file, then the pseudocode would look like this:

size = read(file, buffer, BUFFER_SIZE);

/* we have read size number of bytes in buffer */

for (i = 0; i < size; i++)
      if (isascii(buffer[i]))    /* defined in `ctype.h' */
            putchar(buffer[i]);  /* print the data in file */

-- Adil

Author Comment

ID: 12459261
will this bring in the numerics

Expert Comment

ID: 12459276
It will, along with the letters. Print the characters as char.

if you are looking for only alpha-numeric data, then use


-- Adil
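
The function name did not survive in the post above. As an assumption, a standard candidate for keeping only letters and digits is isalnum() from <ctype.h>. A minimal sketch follows; the helper name keep_alnum and its signature are made up for illustration and are not from the thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copy only the letters and digits from src[0..n) into dst.
   dst must have room for at least n bytes.
   Returns the number of bytes kept.
   (keep_alnum is a hypothetical helper, not part of any library.) */
size_t keep_alnum(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t i;
    size_t kept = 0;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < n; i++) {
        /* src is unsigned char, so the value is safe to pass to <ctype.h> */
        if (isalnum(src[i]))
            dst[kept++] = src[i];
    }
    return kept;
}
```

Note that a filter like this only recovers digits that are stored as text. Numbers stored in binary form come out as arbitrary bytes and are dropped or misread as letters.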
Author Comment

ID: 12459310
OK, i'll give it a try - thanks.

Expert Comment

ID: 12460561
run this command and paste the output here

file "filename"

Replace filename with the complete path to your binary file. This will report the type of the file ...

If it is a compiled executable then you may need to fall back on some other tools ...
If it is a simple text file, then you can use a text editor to view the file ...
If it is a binary data file, you can use a hex editor to view the contents ...

Expert Comment

ID: 12461003

# strings filename

and see the result, if that is what you want.
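
For anyone curious what strings does, here is a rough sketch of the idea: print every run of printable characters that is at least min_len bytes long (GNU strings uses 4 by default). The function name print_strings is invented for this example, and the real tool handles more cases (tabs, other encodings, large files).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Print each run of printable bytes in buf[0..n) that is at least
   min_len long (min_len is assumed to be >= 1), one run per line.
   Returns the number of runs printed.
   (print_strings is a hypothetical helper for illustration.) */
size_t print_strings(const unsigned char *buf, size_t n,
                     size_t min_len, FILE *out)
{
    size_t i;
    size_t start = 0;   /* index where the current run began */
    size_t found = 0;

    assert(buf != NULL && out != NULL);

    for (i = 0; i <= n; i++) {
        /* a run ends at a non-printable byte or at the end of the buffer */
        if (i == n || !isprint(buf[i])) {
            if (i - start >= min_len) {
                fwrite(buf + start, 1, i - start, out);
                fputc('\n', out);
                found++;
            }
            start = i + 1;
        }
    }
    return found;
}
```

To use it on a file, read the file into a buffer with fread() and pass the buffer, the byte count, 4, and stdout.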

Expert Comment

by:Kent Olsen
ID: 12462703
Hi jon8034,

If you haven't already written your program to dump the data, perhaps I can save you the trouble.  Here's a program that I've used a lot.  It works pretty well.  :)


#include <stdlib.h>
#include <stdio.h>
#include <string.h>     /*  memcmp  */
#include <fcntl.h>      /*  open, O_RDONLY  */
#include <unistd.h>     /*  read, close (DOS/Windows compilers use <io.h> instead)  */

#ifndef O_BINARY
#define O_BINARY 0      /*  Only DOS/Windows distinguish binary from text mode  */
#endif

#define DISPLAY_WIDTH 16

int  Input;
FILE *Output;

unsigned char BufferA[DISPLAY_WIDTH];
unsigned char BufferB[DISPLAY_WIDTH];

unsigned char *Buffer;
unsigned char *SaveBuffer;

unsigned char *Buffers[2] = {BufferA, BufferB};
int      BufferIndex = 0;

long     Address = 0;
int      Repeat = 0;

int main(int argc, char* argv[])
{
  int Length;
  int idx;

  if (argc != 2)
  {
    fprintf (stderr, "Usage:  dump <filename>\n");
    return (1);
  }
  Input = open (argv[1], O_RDONLY|O_BINARY, 0);
  if (Input < 0)
  {
    fprintf (stderr, "Could not open %s\n", argv[1]);
    return (1);
  }
  Output = stdout;
  Buffer = BufferA;
  SaveBuffer = NULL;

  while (1)
  {
    Length = read (Input, Buffer, DISPLAY_WIDTH);
    if (Length <= 0)
      break;

    /*  Collapse runs of identical lines into a single marker  */
    if (SaveBuffer && Length == DISPLAY_WIDTH && memcmp (SaveBuffer, Buffer, DISPLAY_WIDTH) == 0)
    {
      Repeat = 1;
      Address += DISPLAY_WIDTH;
      continue;
    }
    if (Repeat)
      fprintf (Output, "        -- Above Line Repeated --\n");

    /*  First line:  the address and the bytes in hex  */
    fprintf (Output, "%08lx  ", Address);
    for (idx = 0; idx < Length; idx++)
    {
      if ((idx & 03) == 0)  /*  Add a space every 4 bytes  */
        fputc (' ', Output);
      fprintf (Output, " %02x", Buffer[idx]);
    }
    fputc ('\n', Output);      /*  fputs ("\r\n", Output); for Windows systems  */

    /*  Second line:  the same bytes as characters, blank if not printable  */
    fprintf (Output, "          ");
    for (idx = 0; idx < Length; idx++)
    {
      if ((idx & 03) == 0)
        fputc (' ', Output);
      fprintf (Output, "  %c", Buffer[idx] >= 0x20 && Buffer[idx] <= 0x7E ? Buffer[idx] : ' ');
    }
    fputc ('\n', Output);

    Repeat = 0;
    SaveBuffer = Buffer;
    Buffer = Buffers[(++BufferIndex) & 1];
    Address += DISPLAY_WIDTH;
  }
  if (Repeat)                           /*  Just in case the last line repeats  */
    fprintf (Output, "        -- Above Line Repeated --\n");
  close (Input);
  return 0;
}

Expert Comment

by:Julian Hansen
ID: 12464123

If I understand you correctly you are wanting to convert binary data to ascii data from input that looks something like

Mary Johnson    $#@!   Fred Smith #$%#  Jack Jones $%^^


Where some of the data is readable ascii and other data is binary representation of numbers - is this correct?

If so then a generic program is not going to cut it because you need to know what your data layout is.

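There are two possibilities - fixed length fields and variable length fields. If you are working with fixed length fields then it is simple:
#define ASCII 1
#define NUMERIC 2
... // expand as needed

typedef struct
{
    int size ;
    int type ;
} FIELD ;

FIELD field[] =
{
     {10, ASCII},
     {4, NUMERIC},
     ... // one entry per field in the record
} ;
int nFields = sizeof(field) / sizeof(field[0]) ;
unsigned char buffer[256] ; // must be at least as large as the largest field
int i, value, done = 0 ;

while (!done)
{
  for ( i = 0; i < nFields; i++)
  {
     if ( fread ( buffer, field[i].size, 1, fp ) != 1 )
     {
        done = 1 ; // end of file or incomplete record
        break ;
     }
     if ( field[i].type == ASCII )
       printf ( "%.*s ", field[i].size, (char*)buffer) ; // field is not NUL-terminated
     else if ( field[i].type == NUMERIC )
     {
       memcpy ( &value, buffer, sizeof(value) ) ; // 4-byte integer, host byte order
       printf ( "%d ", value ) ;
     }
     ... // repeat for all types you want to cater for
  }
  printf ("\n") ;
}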

Code is not perfect but it illustrates the point.

If you have variable length fields you will need to modify the above accordingly. With variable length fields, some fields will have a size or length value preceding them somewhere in the record. For some fields, like ints, the size is implied and therefore not specified - typically ASCII data would require a length attribute. You would need to modify the code to first read this length and then use that value to read the next field.
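
As a rough sketch of the "read the length first" idea, assume (purely for illustration) that each text field is stored as a single length byte followed by that many characters. The real files may use a different prefix size or a terminator instead; the function name read_prefixed_field is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read one length-prefixed text field starting at buf[pos].
   Assumed layout (an illustration, not the real file format):
   one length byte, then that many characters.
   The text is copied to out as a NUL-terminated string.
   Returns the offset just past the field, or 0 if the field is
   truncated or does not fit in out. */
size_t read_prefixed_field(const unsigned char *buf, size_t n, size_t pos,
                           char *out, size_t out_size)
{
    size_t len;

    assert(buf != NULL && out != NULL);

    if (pos >= n)
        return 0;                       /* nothing left to read */
    len = buf[pos];                     /* the 1-byte length prefix */
    if (len > n - pos - 1)
        return 0;                       /* field runs past the data */
    if (len + 1 > out_size)
        return 0;                       /* no room for text plus NUL */

    memcpy(out, buf + pos + 1, len);
    out[len] = '\0';
    return pos + 1 + len;
}
```

Calling it repeatedly, feeding each returned offset back in as pos, walks through consecutive fields until it returns 0.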

It is important to first understand the layout of the file - once you have this the rest is easy - without it there is not much you will be able to achieve.
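
When working out the layout by inspection, it can help to decode the same four bytes both ways round and see which value looks plausible next to a name you recognise. The sketch below assumes the numbers are plain 32-bit unsigned integers, which is only a guess; they could equally be 16-bit values, floats, or packed decimal. The names le32 and be32 are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Interpret p[0..3] as a 32-bit unsigned integer stored with the
   least significant byte first (little-endian, as on x86). */
unsigned long le32(const unsigned char *p)
{
    assert(p != NULL);
    return  (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/* Interpret p[0..3] as a 32-bit unsigned integer stored with the
   most significant byte first (big-endian). */
unsigned long be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((unsigned long)p[0] << 24)
         | ((unsigned long)p[1] << 16)
         | ((unsigned long)p[2] << 8)
         |  (unsigned long)p[3];
}
```

For example, the bytes 39 30 00 00 in a hex dump decode to 12345 with le32 and to a much larger number with be32, so little-endian would be the likelier reading for that field.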
