Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

Convert ASCII file with several rows of data that are inconsistent to a column format in Excel

Posted on 2014-07-28
11
390 Views
Last Modified: 2014-07-29
I have an ASCII file that was generated in Cobol.  Trying to convert the rows of data into column format in Excel 2013.  The problem is that there are different number of rows for each record.  Some records may be 15 lines while another may have 20.
0
Comment
Question by:dman19691
  • 4
  • 4
  • 3
11 Comments
 
LVL 8

Expert Comment

by:5teveo
ID: 40224350
Can you submit a sample file with some records to review?

Are there any patterns? Record row tags / headings?

You will need to append multi-row records into 1 row for excel which will require a macro - in my opinion.
0
 

Author Comment

by:dman19691
ID: 40224438
5Teveo,

I will have to check in see if the data is confidential.  But I was able to put every record that started with an 01 and changed it to a unique identifier of XX.  So that the start of a new record what be identified with XX.  I don't know if that makes sense?
0
 
LVL 8

Expert Comment

by:5teveo
ID: 40224634
Just make a pattern similar to what you need so enable a better sample script... that's my thought
0
Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

 

Author Comment

by:dman19691
ID: 40224845
Here is a sample of the data.  The first two positions in the lines of data represent a record.  For example 01, 02, 03 up to 15 represent one record.   I will attach a sample of the data.
sample.txt
0
 
LVL 8

Expert Comment

by:5teveo
ID: 40224904
Upon quick review... it appears a data definition may be needed...
I can guess row types 01-?? thru next 01 are all associated with 1 record. I can also see multiple record rows concatenation for a text type description but it appears some other array type work is occurring within record.
Do you have a data definition describing how data is defined in variable length record in cobol? that's may be what you need to get better answer/script?

A vb script to join records will not help you unless script knows how to handle each type.

make sense?
0
 

Author Comment

by:dman19691
ID: 40225024
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40225029
since you have very different layouts for these various record types, please do a manual parsing of the data and post that workbook.
0
 
LVL 8

Assisted Solution

by:5teveo
5teveo earned 250 total points
ID: 40225063
yup - that's it....

Well that's a lot analysis for me right now for free (or only points). We can pull records all records into 1 row but you would still have a mess unless you walk thru column definitions by record type from Data definition for better excel alignment.
0
 
LVL 45

Accepted Solution

by:
aikimark earned 250 total points
ID: 40225280
This is not a trivial question.  Look at some of the work researchers in this field had to do:
http://assets.conferencespot.org/fileserver/file/42340/filename/2vdsq5.pdf


Given the size of the task, I might suggest looking for pre-parsed versions of this data, such as
http://archive.today/bomAY
http://www.capcog.org/documents/airquality/reports/2013/Task_3.1-2012_and_2018_Emissions_Modeling_for_CAPCOG_Region_and_Milam_Counties_2013-12-02.pdf

==============
I suggest you look for some software or professional help on this problem.
* Sourceforge projects:
    http://cobol-dde.sourceforge.net/
    http://sourceforge.net/projects/record-editor/
    http://record-editor.sourceforge.net/Cobol.html

* github (python) project: https://github.com/bpeterso2000/pycobol

* It wouldn't surprise me if there weren't some other open source applications that can interpret file content, based on COBOL layout.

* A (pricey) software solution example:  http://www.cobolproducts.com/datafile/pricing.html
* MicroFocus -- a COBOL developer tool
* If you already have Oracle in place: http://www.oracle.com/technetwork/testcontent/flat-file-cobol-copybook-sources-100800.html

* Contact environmental or open data groups to help you.
* Look at the Excel, VB, or COBOL top experts to see if they have enabled their Hire Me button.
0
 

Author Closing Comment

by:dman19691
ID: 40227418
Thanks 5teveo and aikimark.  I figured this would not be easy since the data is in Cobol and has several lines, but you pointed me in the right direction.  Thanks again.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40227618
The better formats for this data would be:
* mark-up (XML, JSON, YAML)
* relational database with separate tables for each kind of record (01-15)
* NoSQL database (most likely JSON documents)

If you want to save this in Excel, you should have separate tabs for each kind of record (01-15)

Note: When parsing the data, the primary key of the 01 (or 02) record must be added to the 03-15 records as a foreign key, allowing you to join the tables for your queries.
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Convert between Excel file formats (.XLS, .XLSX, .XLSM) with/without macro option David Miller (dlmille) Intro Over this past Fall, I've had the opportunity to see several similar requests and have developed a couple related solutions associate…
When you see single cell contains number and text, and you have to get any date out of it seems like cracking our heads.
The viewer will learn how to create two correlated normally distributed random variables in Excel, use a normal distribution to simulate the return on different levels of investment in each of the two funds over a period of ten years, and, create a …
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…

829 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question