Converting EBCDIC files and/or ASCII files where fields contain COMP-3 data

I have written code in Visual Studio that converts an EBCDIC flat file to ASCII and then to CSV output, and everything works great except for the COMP-3 translations.  My CSV output file shows strange characters in every column that holds COMP-3 data. After perusing the internet all afternoon, I am still confused about how best to solve this problem correctly.

I am essentially trying to convert COMP-3 data found in an EBCDIC flat file into a CSV format, via ASCII if necessary.  Is there code that I could add to my current Solution that will accurately translate the COMP-3 data lines into CSV format?
gregematthew Asked:

aikimark Commented:
If you do an EBCDIC-to-ASCII conversion, those COMP-3 fields will appear as if they were Unicode characters, especially if the high-order nibble of a byte is 8 or 9.  What is the primary language you are using?
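
To see why a character translation garbles these fields, here is a small sketch in Python (used only for illustration, since the thread has not settled on a language). The sample bytes are made up, and EBCDIC code page 037 is an assumption; the file may use a different code page.

```python
# Hypothetical sample bytes, assuming EBCDIC code page 037 (cp037).
text_field = bytes([0xF0, 0xF2])                      # EBCDIC digits "02"
packed_field = bytes([0x00, 0x12, 0x34, 0x56, 0x7C])  # COMP-3 value +1234567

# Character translation is correct for display (text) fields...
as_text = text_field.decode("cp037")

# ...but it treats each packed byte (two digits per byte) as one character,
# so the result is control characters and symbols, not the digits 1234567.
garbled = packed_field.decode("cp037")

print(repr(as_text))
print(repr(garbled))
```

The packed bytes have to be kept as raw bytes and decoded numerically instead of being passed through the character table.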
pcelba Commented:
How do you distinguish COMP-3 data from the EBCDIC text? Their byte values can collide in some cases. Do you know the input file structure or the field sizes?

Once you know where such data is, it is easy to convert it to numbers in ASCII.

Post the data sample or look here for some basic info: http://www.3480-3590-data-conversion.com/article-packed-fields.html
gregematthew (Author) Commented:
It is a Texas Railroad Commission EBCDIC file, and I have the data breakdown listed in a manual that they offer.  Everything converted except for certain columns, all of which the manual lists as COMP-3 fields.  Also, if you translate the original file to ASCII and then to CSV, the COMP-3 positions come out as weird symbols. It would appear they need to be unpacked from the EBCDIC file first to retain the correct output data.

Will I need to use something other than Visual Studio Community to do this, such as Java or COBOL? I have very little experience with either.
pcelba Commented:
No, you don't need any other development tool. You just need to unpack the columns that are in COMP-3 format, also called packed decimal. Read each of these fields as a byte array and then decode it.

You may look e.g. here for some conversion code sample.

Many more code samples are available if you Google for c# packed decimal conversion
gregematthew (Author) Commented:
Here is what the data output structure looks like in the Texas Railroad Commission text-file manual:

 01  RAILROAD-COMMISSION-TAPE-REC.                                    POS.
          02  RRC-TAPE-RECORD-ID                                PIC X(02). 1
          02  OIL-PRODUCTION-ROOT.
              03  PDROOT-KEY.
                  05 PD-OIL-CODE                PIC X(1)    VALUE ZEROS.   3
                  05 PD-OIL-DISTRICT            PIC 9(2)    VALUE ZEROS.   4
                  05 PD-OIL-LEASE-NBR           PIC 9(06)   VALUE ZEROS.   6
              03  PD-MOVABLE-BALANCE            PIC S9(09)  COMP-3         12
                                                            VALUE ZEROS.
              03  PD-BEGINNING-OIL-STATUS       PIC S9(09)  COMP-3         17
                                                            VALUE ZEROS.
              03  PD-BEGINNING-CSGHD-STATUS     PIC S9(09)  COMP-3         22
                                                            VALUE ZEROS.
              03  PD-OIL-OLDEST-EOM-BALANCE     PIC S9(09)  COMP-3         27
                                                            VALUE ZEROS.
              03  FILLER                        PIC X(21)   VALUE ZEROS.   32
          02  RRC-TAPE-FILLER                                 PIC X(0050). 53
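
One way to read the layout above is as a table of byte offsets: text fields go through the EBCDIC code page, while the four COMP-3 fields are sliced out as raw bytes and unpacked numerically. The Python sketch below is only an illustration of that idea. It assumes code page 037, fixed 102-byte records with no delimiters (52 bytes of root data plus the 50-byte filler), and a made-up sample record; `record_to_row` and `unpack_comp3` are hypothetical helper names.

```python
import csv
import io

# (name, 1-based start position, length in bytes, kind), read off the layout above.
# "text" fields are EBCDIC characters; "comp3" fields are packed decimal.
LAYOUT = [
    ("RRC-TAPE-RECORD-ID", 1, 2, "text"),
    ("PD-OIL-CODE", 3, 1, "text"),
    ("PD-OIL-DISTRICT", 4, 2, "text"),
    ("PD-OIL-LEASE-NBR", 6, 6, "text"),
    ("PD-MOVABLE-BALANCE", 12, 5, "comp3"),
    ("PD-BEGINNING-OIL-STATUS", 17, 5, "comp3"),
    ("PD-BEGINNING-CSGHD-STATUS", 22, 5, "comp3"),
    ("PD-OIL-OLDEST-EOM-BALANCE", 27, 5, "comp3"),
]


def unpack_comp3(raw: bytes) -> int:
    """Decode packed decimal: two digits per byte, sign in the last low nibble."""
    value = 0
    for byte in raw[:-1]:
        value = value * 100 + (byte >> 4) * 10 + (byte & 0x0F)
    value = value * 10 + (raw[-1] >> 4)
    return -value if (raw[-1] & 0x0F) in (0x0B, 0x0D) else value


def record_to_row(record: bytes) -> list:
    """Slice one fixed-length record into CSV-ready values."""
    row = []
    for _name, start, length, kind in LAYOUT:
        chunk = record[start - 1:start - 1 + length]
        if kind == "text":
            row.append(chunk.decode("cp037"))  # assumption: code page 037
        else:
            row.append(unpack_comp3(chunk))    # never character-translate these
    return row


# Hypothetical 102-byte record: key fields, four packed zeros, EBCDIC spaces.
sample = (
    "02O01000123".encode("cp037")
    + bytes.fromhex("000000000C") * 4
    + b"\x40" * 71
)
out = io.StringIO()
csv.writer(out).writerow(record_to_row(sample))
print(out.getvalue(), end="")
```

The important design point is that the file is read in binary mode and split by position before any character conversion happens, so the packed bytes are never altered.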
gregematthew (Author) Commented:
This is what it looks like after translation, except that the comment box strips out the COMP-3 bytes. The runs of spaces in each line mark where the COMP-3 data sits:
021111                        2011110120111019                    00000
021110                        2011100120110927                    00000
021109                        2011090120110927                    00000
13        2011090120110825        00000000000000
021108                        2011080120110927                    00000
13        2011080120110726        00000000000000
021107                        2011070120110927                    00000
13        2011070120110627        00000000000000

gregematthew (Author) Commented:
Ok, sorry for the multiple posts, but attached is the text file which will show you what I am dealing with.
aikimark Commented:
no file attached, Greg
pcelba Commented:
No problem, but you should look at the data in a hexadecimal editor, where you'll see the digits of the COMP-3 numbers directly (in hexadecimal format). It will also show that the decoding should not be a big problem.
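
If no hex editor is at hand, a few lines of code give the same view. This is a minimal Python sketch (illustration only; `hex_dump` is a hypothetical helper and the sample bytes are made up). It needs Python 3.8 or later for the separator argument of `bytes.hex`. For a real file, read the bytes with `open(path, "rb")` so nothing is translated on the way in.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return the bytes as rows of two-digit hex values, prefixed by the offset."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append(f"{offset:06X}  {chunk.hex(' ').upper()}")
    return "\n".join(lines)


# Hypothetical sample: "02" in EBCDIC followed by one packed zero.
sample = bytes.fromhex("F0F2") + bytes.fromhex("000000000C")
print(hex_dump(sample))
```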

All the COMP-3 fields have the same size (5 bytes) in your case, so you could even use a fixed algorithm to decode them. But the principle of any COMP-3 decoding is simple:

1. Read the packed decimal number as a byte string
2. Extract the last 4 bits of the last byte (you may use an AND operation with 0x0F) - this represents the sign of the result (0xC or 0xF is positive, 0xD is negative)
3. Extract the first 4 bits of the last byte (operation:  AND  0xF0, then shift right by 4 bits) - this is the least significant digit of the result
4. Every other byte holds two digits: the high nibble (AND 0xF0, shifted right by 4 bits) and the low nibble (AND 0x0F). Working from the first byte to the last, build the resulting number by multiplying the previous result by 10 and adding each digit in turn, finishing with the digit from step 3
5. Apply the sign (from step 2) to the final result
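
The decoding described above can be sketched in a few lines. The version below is in Python purely for illustration (the same bit operations exist in C# and Java), and `unpack_comp3` is a hypothetical helper name, not something from the manual or a library. It walks the bytes from first to last and rejects nibbles that cannot be part of a packed number.

```python
def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3) field into a Python int."""
    if not raw:
        raise ValueError("empty COMP-3 field")

    value = 0
    # Every byte except the last holds two digits: high nibble, then low nibble.
    for byte in raw[:-1]:
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError(f"invalid digit nibble in byte 0x{byte:02X}")
        value = value * 100 + high * 10 + low

    # The last byte holds the final digit (high nibble) and the sign (low nibble).
    last_digit, sign = raw[-1] >> 4, raw[-1] & 0x0F
    if last_digit > 9 or sign < 0x0A:
        raise ValueError(f"invalid final byte 0x{raw[-1]:02X}")
    value = value * 10 + last_digit

    # 0xD (and the less common 0xB) mean negative; 0xC, 0xF, 0xA, 0xE do not.
    return -value if sign in (0x0D, 0x0B) else value


print(unpack_comp3(bytes.fromhex("000000000C")))
print(unpack_comp3(bytes.fromhex("001234567D")))
```

The validation is optional, but it catches the common mistake of feeding the function bytes that have already been through an EBCDIC-to-ASCII translation.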

COMP-3 numbers can also have an implied decimal point, but let's suppose we are working with whole numbers.
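
For completeness, an implied decimal point shows up in the picture clause as a V, for example PIC S9(7)V99 COMP-3 (an illustrative clause, not one from this file). The point is not stored in the data, so the unpacked integer just needs to be scaled afterwards. A minimal Python sketch, with `apply_implied_decimal` as a hypothetical helper name:

```python
from decimal import Decimal


def apply_implied_decimal(unscaled: int, decimal_places: int) -> Decimal:
    """Place the implied decimal point in an already unpacked COMP-3 integer."""
    # Decimal avoids the rounding surprises of binary floating point.
    return Decimal(unscaled).scaleb(-decimal_places)


# An unpacked value of 1234567 with two implied decimal places.
print(apply_implied_decimal(1234567, 2))
```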
gregematthew (Author) Commented:
Attached now.
test.txt
pcelba Commented:
The documentation says:
              03  PD-MOVABLE-BALANCE            PIC S9(09)  COMP-3         12
                                                            VALUE ZEROS.
              03  PD-BEGINNING-OIL-STATUS       PIC S9(09)  COMP-3         17
                                                            VALUE ZEROS.
              03  PD-BEGINNING-CSGHD-STATUS     PIC S9(09)  COMP-3         22
                                                            VALUE ZEROS.
              03  PD-OIL-OLDEST-EOM-BALANCE     PIC S9(09)  COMP-3         27
                                                            VALUE ZEROS.
and the attached file really does contain zeros in the COMP-3 field positions (in hex format: 000000000C).

So what would you like to convert, then? Simply write a zero to the output and skip any conversion.
Or replace every occurrence of the above byte string (0x000000000C) with an ASCII zero.
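
As a cross-check on that reading, a COMP-3 field with n digits occupies n/2 (rounded down) plus 1 bytes, so PIC S9(09) takes 5 bytes, which matches the positions 12, 17, 22 and 27 in the layout. Packing a zero into such a field gives exactly the bytes seen in the file. A small Python sketch (illustration only; `comp3_length` and `pack_comp3` are hypothetical helper names):

```python
def comp3_length(digits: int) -> int:
    """Bytes occupied by a COMP-3 field with the given number of digits."""
    return digits // 2 + 1


def pack_comp3(value: int, digits: int) -> bytes:
    """Encode a signed integer as packed decimal (sign nibble 0xC or 0xD)."""
    text = str(abs(value))
    if len(text) > digits:
        raise ValueError("value does not fit in the field")
    sign = "D" if value < 0 else "C"
    nibbles = comp3_length(digits) * 2 - 1  # digit nibbles before the sign
    return bytes.fromhex(text.rjust(nibbles, "0") + sign)


print(comp3_length(9))
print(pack_comp3(0, 9).hex().upper())
```

Replacing that fixed byte string only works while every field really is zero; records with non-zero balances still need a proper unpack.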
pcelba Commented:
Attached is the hex output, where you can see all the COMP-3 numbers shown as the female symbol: Hex view
aikimark Commented:
It would be helpful to post some lines that contain non-zero values in those Comp-3 fields.
aikimark Commented:
I answered a similar question about this data last year
http://www.experts-exchange.com/Q_28485331.html#a40225280

I assume the drilling permit data is defined in this (or later) document:
http://www.rrc.state.tx.us/media/1250/drillingpermitmasterpluslatitudeslongitudes_oga049m_july1.pdf
Bill Prew Commented:
@gregematthew,

If you are looking for Java code, here is an example that you could build off of.

http://www.coderanch.com/t/279125/java-io/java/ASCII-EBCDIC-conversion-preserving-COMP

If you are just trying to fully understand the format of the COMP-3 data, you can take a look at:

http://www.3480-3590-data-conversion.com/article-packed-fields.html

http://documentation.microfocus.com/help/topic/com.microfocus.eclipse.infocenter.studee60ux/HRLHLHCLANU942.html

~bp
Martin Liss Commented:
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.