Solved

Need help ASAP with a binary file created in cobol under as/400 using pack maybe unpack using PHP?

Posted on 2008-06-16
Last Modified: 2011-10-19
Forgive me if I sound naive. I'm trying to convert a data file for a program written in COBOL into readable text. Half of the file is readable, but parts of it show up as weird characters, which from my programming experience (C++) look like binary data.

A programmer in my group told me that I may be able to see it if I log in and view the file directly on the AS/400, since I am currently viewing it in Notepad, but I don't think that's the case since most of the file is readable. He also suggested that it may be packed data.

My only programming experience is with C++, PHP, and some Visual Basic. I don't have enough knowledge to determine what the problem is, but I think that if I could write a script to unpack it, for example using PHP's unpack, then maybe I could rewrite the file. Any ideas would be appreciated. I can't upload the file as it contains sensitive information.
0239AAA  LA SP L 350102     Sñë"023902     Sñë"3501000043512969 kSEU9004                   06.18.00E¬   
0239AARREST2E5R. 350102     `023902     `3501000557638172 kSEU9004                   06.18.00E¬

Question by:MeridianManagement
11 Comments
 
LVL 29

Expert Comment

by:rdivilbiss
ID: 21794367
Get an EBCDIC chart and check if that is the encoding used.  Use a binary editor to read the files.



0
 
LVL 2

Author Comment

by:MeridianManagement
ID: 21794686
Any recommendations on a binary editor?
0
 
LVL 29

Expert Comment

by:rdivilbiss
ID: 21794764
Look around MajorGeeks for one that does side-by-side binary and EBCDIC. Most do side-by-side binary and ASCII, but IBM loved EBCDIC.

If you don't find one that does side by side, you're stuck interpreting the characters by hand with a chart. Either way, they are probably control characters of some sort.
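If no suitable editor turns up, a side-by-side view can be sketched in a few lines of Python. This is only an illustration: it assumes the file is still raw EBCDIC and that code page 037 (Python's `cp037` codec) is close enough to the real CCSID, which the thread does not confirm. The function name `ebcdic_dump` is made up for this example.

```python
def ebcdic_dump(data: bytes, width: int = 16) -> str:
    """Return a hex dump with an EBCDIC (assumed code page 037) text column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Left column: each byte as two hex digits
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Right column: the same bytes decoded as EBCDIC, dots for non-printables
        text = chunk.decode("cp037")
        text_part = "".join(c if c.isprintable() else "." for c in text)
        lines.append(f"{offset:06X}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)


# Synthetic sample: the EBCDIC bytes for the characters "0239AAA"
sample = bytes.fromhex("F0F2F3F9C1C1C1")
print(ebcdic_dump(sample))
```

To inspect a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the function. Bytes that show up as dots or odd letters in the text column are the candidates for packed or binary fields.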
0
 
LVL 16

Expert Comment

by:theo kouwenhoven
ID: 21795336
I don't know how you got it into Notepad, but if you didn't download it the right way,
I think it's a CCSID 65535 problem. You'd better check the file on the AS/400,
or get it via the IFS for a good translation of EBCDIC to ASCII.

To view the contents on the AS/400, just log in, go to a command line and type:
RUNQRY *N LibrName/FileName

If viewing it on a PC is a must, get it into Excel with the "Data > Transfer from iSeries" option,
and activate the check box in the properties (I think the 4th window).

Regards,
Murph
transfer.jpg
0
 
LVL 27

Expert Comment

by:tliotta
ID: 21797064
MeridianManagement:

Absolute first step is determining what format the odd data is in. If it's packed-decimal, you'll get one result. But if it's one of the COBOL binary formats (or even floating point), you'll get totally different results.
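To illustrate why the format matters, here is a small sketch that reads the same four bytes once as packed decimal (COMP-3) and once as a big-endian binary integer. The byte values are invented for the example and do not come from the poster's file; `unpack_packed_decimal` is a hypothetical helper name.

```python
def unpack_packed_decimal(raw: bytes) -> int:
    """Decode an IBM packed-decimal field: two digits per byte, sign in the last nibble."""
    nibbles = [n for b in raw for n in (b >> 4, b & 0x0F)]
    *digits, sign = nibbles
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError(f"not valid packed decimal: {raw.hex()}")
    value = int("".join(str(d) for d in digits))
    # x'B' and x'D' are the negative sign nibbles; the others are positive/unsigned
    return -value if sign in (0x0B, 0x0D) else value


raw = bytes.fromhex("0012345F")  # invented sample bytes

as_packed = unpack_packed_decimal(raw)                # digits 0012345, sign F
as_binary = int.from_bytes(raw, "big", signed=True)   # same bytes as a 4-byte integer

print("packed:", as_packed)
print("binary:", as_binary)
```

The two interpretations give unrelated numbers, so guessing the wrong format produces plausible-looking but wrong values rather than an obvious error.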

And the required second step is determining any network-transport transformations. If you're receiving an EBCDIC file that has been translated to ASCII, then the packed/binary 'characters' will also have been mostly translated. (Whether a translation has actually been done for any given byte, or 'character', will depend on the particular CCSID or language involved on both ends of the network. Character translations can have surprises in some character positions for various code pages.)
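The effect of a character translation on packed bytes can be shown with a short sketch. It assumes, purely for illustration, that the transfer translated code page 037 to Latin-1; the real pair of code pages in the poster's transfer is unknown.

```python
# Invented packed-decimal field: digits 12345 with sign nibble C
packed = bytes.fromhex("12345C")

# Simulate a text-mode transfer: every byte is treated as an EBCDIC character
# (assumed code page 037) and re-encoded as Latin-1.
translated = packed.decode("cp037").encode("latin-1")

print("on the AS/400:", packed.hex())
print("after transfer:", translated.hex())
```

The last byte x'5C' is the EBCDIC asterisk, so it arrives as x'2A' and no longer carries a valid sign nibble. Unpacking the translated bytes directly would fail or give a wrong number, which is why the transfer method has to be known before any unpacking is attempted.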

Tom
0
 
LVL 16

Expert Comment

by:theo kouwenhoven
ID: 21801467
Hi MeridianManagement,

I checked your example and can give you my idea about it:

Pos  1  to 17 is Character data
Pos  18 to 18 is Packed data ?
Pos  19 to 29 is Character data
Pos  30 to 33 is Packed data
Pos  34 to 44 is Character data
Pos  45 to 48 is Packed data

Pos  49 to 65 is Character data
Pos  66 to 66 is Packed data ?
Pos  67 to 68 is Packed data
Pos  69 to 72 is Packed data
Pos  73 to 85 is Character data
Pos  86 to 86 is Packed data ?
Pos  87 to 89 is Character data
Pos  90 to 90 is Packed data ?
Pos  91 to 109 is Character data
Pos  110 to 113 is Possible Binary

The fields marked with ? could also start at a lower position, but in that case they are wrongly initialized packed fields filled with spaces (x'4040404040') :)
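A position list like the one above can be applied mechanically once the raw bytes are available. The sketch below slices a record by 1-based inclusive positions and flags fields that contain only EBCDIC spaces (x'40'). The layout covers just the first four guessed fields, and the record is synthetic; the names `LAYOUT`, `split_record` and `is_blank_filled` are hypothetical.

```python
# First four guessed fields: (start, end, kind), 1-based and inclusive
LAYOUT = [
    (1, 17, "char"),
    (18, 18, "packed?"),
    (19, 29, "char"),
    (30, 33, "packed"),
]


def split_record(record: bytes, layout):
    """Slice a raw record into (start, end, kind, raw_bytes) tuples."""
    return [(start, end, kind, record[start - 1:end]) for start, end, kind in layout]


def is_blank_filled(raw: bytes) -> bool:
    """True when a field holds nothing but EBCDIC spaces (x'40')."""
    return len(raw) > 0 and all(b == 0x40 for b in raw)


# Synthetic record built to match the layout (code page 037 is an assumption)
record = (
    "0239AAA  LA SP L ".encode("cp037")   # positions 1-17
    + b"\x40"                              # position 18: a space, not a packed value
    + "350102     ".encode("cp037")       # positions 19-29
    + bytes.fromhex("0012345F")            # positions 30-33: invented packed value
)

fields = split_record(record, LAYOUT)
for start, end, kind, raw in fields:
    note = " (blank-filled)" if is_blank_filled(raw) else ""
    print(f"{start:>3}-{end:<3} {kind:<8} {raw.hex().upper()}{note}")
```

A field reported as blank-filled in a position guessed to be packed suggests either an uninitialized packed field or a character field, which is the ambiguity the ? marks point at.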

Regards,
Murph
 
(For anybody who would like to help, see the example in the attachment for the real contents)

EBCDIC.bmp
0
 
LVL 16

Expert Comment

by:theo kouwenhoven
ID: 21801483
Sorry MeridianManagement,

I forgot to say... Listen to your programmer, who is telling you:
"A programmer in my group told me that I may be able to see it if I log in and view the file directly in as/400"

Probably an AS/400 programmer, so in any case a very wise (and older) man :) lol
0
 
LVL 27

Expert Comment

by:tliotta
ID: 21807022
Murph's analysis is pretty good. There are a few positions where I disagree: it looks to me like pos. 30 is a 1-byte field rather than packed, because it contains an invalid packed character (x'E2') in the first example, and pos. 45 is the same. Pos. 66 is hard to tell; it _might_ be packed or might be character or might be binary.
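The "invalid packed character" test can be written down as a nibble check: every nibble except the last must be a decimal digit, and the last nibble must be a sign (x'A' through x'F'). This is a minimal sketch with a hypothetical helper name, not a full validator for any particular COBOL compiler.

```python
def is_valid_packed(raw: bytes) -> bool:
    """Check whether the bytes could be a packed-decimal field."""
    if not raw:
        return False
    nibbles = [n for b in raw for n in (b >> 4, b & 0x0F)]
    *digits, sign = nibbles
    # Digit nibbles must be 0-9; the final nibble must be a sign, A-F
    return all(d <= 9 for d in digits) and sign >= 0x0A


for candidate in ("E2", "12345F", "4040"):
    raw = bytes.fromhex(candidate)
    print(candidate, "valid packed" if is_valid_packed(raw) else "not packed")
```

x'E2' fails because its high nibble (E) is not a digit and its low nibble (2) is not a sign; in code page 037 the same byte is the letter S. Blank-filled bytes such as x'4040' fail as well, since the final nibble 0 is not a sign.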

Looking out in pos. 110-113, we see x'BA3C143F' in the first example and x'BA3C1440' in the second example. Since pos. 113 appears to have been incremented as a binary number from x'3F' to x'40', we have to consider that there is either a 2-byte or 4-byte binary value that ends in pos. 113. Further, the ending bytes with x'143F' are the same as what we see back in pos. 67-68; possibly a repeated value? A key field that links the second example to the first?
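The increment can be checked by reading the trailing bytes as big-endian integers, once as a 2-byte and once as a 4-byte value. The sketch treats the values as unsigned for simplicity; COBOL binary fields are often signed, but the difference between the two records is the same either way. `tail_value` is a hypothetical helper.

```python
first = bytes.fromhex("BA3C143F")   # pos. 110-113, first example
second = bytes.fromhex("BA3C1440")  # pos. 110-113, second example


def tail_value(raw: bytes, width: int) -> int:
    """Read the last `width` bytes as an unsigned big-endian integer."""
    return int.from_bytes(raw[-width:], "big")


for width in (2, 4):
    a = tail_value(first, width)
    b = tail_value(second, width)
    print(f"{width}-byte: {a} -> {b} (difference {b - a})")
```

Both widths show a difference of exactly 1, so the increment alone cannot tell a 2-byte field from a 4-byte one; more sample records, or the file's field description, would be needed to settle that.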

I'm not sure that analysis of just these two examples is going to get us close enough.

Tom
0
 
LVL 16

Expert Comment

by:theo kouwenhoven
ID: 21807360
Hi Tom and MeridianManagement,

It doesn't matter. As long as MeridianManagement follows the advice of the wise man, "log in and view the file directly in as/400", the layout will be clear.

Regards,
Murph

0
 
LVL 27

Expert Comment

by:tliotta
ID: 21807654
Murph:

Fully agreed. It _probably_ has an external definition that can be viewed. If not, the COBOL itself would be the logical place to look.

However, it only gets a single step closer to the resolution.

Resolution requires knowing about any/all network-transport character translations. It then requires knowing how to extract substrings of both packed bytes and binary bytes and then converting those into a faithful ASCII representation. (PC ASCII, UTF-8, UCS-2... whatever the needed final form is.)
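As a rough sketch of that extract-and-convert step, assuming the file was transferred untranslated (binary mode) and the field layout is already known: the layout, the field kinds and the sample record below are all invented for illustration, and code page 037 stands in for whatever CCSID the file really uses.

```python
def packed_to_int(raw: bytes) -> int:
    """Decode packed decimal: two digits per byte, sign in the final nibble."""
    nibbles = [n for b in raw for n in (b >> 4, b & 0x0F)]
    *digits, sign = nibbles
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError(f"not valid packed decimal: {raw.hex()}")
    value = int("".join(str(d) for d in digits))
    return -value if sign in (0x0B, 0x0D) else value


def convert_record(record: bytes, layout) -> list[str]:
    """Turn one raw record into a list of text values, field by field."""
    out = []
    for start, end, kind in layout:          # positions are 1-based, inclusive
        raw = record[start - 1:end]
        if kind == "char":
            out.append(raw.decode("cp037").rstrip())
        elif kind == "packed":
            out.append(str(packed_to_int(raw)))
        elif kind == "binary":
            out.append(str(int.from_bytes(raw, "big", signed=True)))
        else:
            raise ValueError(f"unknown field kind: {kind}")
    return out


# Hypothetical three-field layout and a matching synthetic record
layout = [(1, 7, "char"), (8, 10, "packed"), (11, 12, "binary")]
record = "AARREST".encode("cp037") + bytes.fromhex("01234C") + (1000).to_bytes(2, "big")

values = convert_record(record, layout)
line = ",".join(values)
print(line)
```

The resulting string can then be written out in whatever final encoding is needed, for example `line.encode("utf-8")`. Implied decimal places are not handled here; they would come from the field description.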

Ideally, the method of transport should be chosen so that it all is done in the _transport itself_. This problem shouldn't even exist.

For example, if this is a direct FTP of a physical file, it would probably be far better to create a VIEW that presents any packed/binary values as characters in the first place. Then FTP the VIEW (or the logical file or whatever the chosen intermediate will be).

Tom
0
 
LVL 16

Accepted Solution

by:
theo kouwenhoven earned 500 total points
ID: 21808586
Hi MeridianManagement,

You asked for a binary editor? Why, to update the file?
Every tool to do that is inside the AS/400 (of course).

To get all the detailed info about the file, you can also log in on the AS/400 and use the DSPFFD command (Display File Field Description):
DSPFFD MyLibrary/MyFile
This will show you the complete description of all fields in the file: type, length, labels, field names, etc.

Regards,
Murph
0
