asked on

Python convert long row into multiple columns

I have a CSV file that has just one column but needs to be multiple columns.

Example only one column

row 1 aaaaaaa aaaa aaaaaaaaa aaaaa a aaaaaaa

I want it be with multiple columns

column(fixed length) column2(fixed length) column3(fixed length) column4(fixed length)

and the rows will be parsed into multiple fixed column lengths.

Anybody have an idea to do this in Python?

pepr

Firstly, CSV originally means "Comma Separated Values". These days, also tab or semicolon is used sometimes. Definitely, the fixed width of columns is something very different in principle. There is some ambiguity in your question. The example seems to be fixed width. But it is not clear whether you want the fixed width values after reading the resulting file (i.e. memory representation) while the file itself should be stored as CSV, or whether the wanted file should be stored using fixed width. In other words... Are you asking how to read the fixed-width columns file or to write the read values back as CSV?

Try the following code. The reading part shows how to extract the values from the fixed-width columns, the writing part shows how to write tha values to a CSV file.

#!python2

import csv

fname_in = 'a.txt'
fname_out = 'a.csv'

cols = [0, 9, 24, 32, 46, 52, 57 ]  # positions of the columns (zero based)
with open(fname_in) as fin, open(fname_out, 'wb') as fout:  
                            # for the csv file, must be open in binary mode
    writer = csv.writer(fout)
    for line in fin:
        line = line.rstrip()  # removing the '\n' and other trailing whitespaces
        # if needed, you can remove only the '\n' char like this:
        # if line[-1] == '\n':
        #     line = line[:-1]

        # Get the row (the list) of the column values.        
        row = []                     # init -- empty list
        pos1 = cols[0]
        for pos2 in cols[1:]:
            value = line[pos1:pos2]  # slice the column value
            value = value.rstrip()   # if you want to remove trailing whitespaces
            row.append(value)        # append to the row of values    
            pos1 = pos2              # get ready for the next column
            
        value = line[pos1:]          # last column value
        value = value.rstrip()       # if you want to remove trailing whitespaces
        row.append(value)            # append to the row of values    
        
        # Process the row -- here print and writing to the CSV file.
        print row
        writer.writerow(row)

Open in new window

The sample of 3 rows was saved to a.txt. It prints

c:\tmp\___python\sonofstimpy1\Q_28125767>py a.py
['row 1', 'aaaaaaa', 'aaaa', 'aaaaaaaaa', 'aaaaa', 'a', 'aaaaaaa']
['row 2', 'aaaaaaa', 'aaaa', 'aaaaaaaaa', 'aaaaa', 'a', 'aaaaaaa']
['row 3', 'aaaaaaa', 'aaaa', 'aaaaaaaaa', 'aaaaa', 'a', 'aaaaaaa']

Open in new window

and it writes the following content to the a.csv:

row 1,aaaaaaa,aaaa,aaaaaaaaa,aaaaa,a,aaaaaaa
row 2,aaaaaaa,aaaa,aaaaaaaaa,aaaaa,a,aaaaaaa
row 3,aaaaaaa,aaaa,aaaaaaaaa,aaaaa,a,aaaaaaa

Open in new window

sonofstimpy1

ASKER

Thanks for replying. Lemmie be more specific.

I have a CSV file with 1,000+ rows.

Each row is just a single column, looking something like this...

Here is an example Here is a second example blah blah

I need this row should be cut into multiple columns. I know the fixed length of each column...

column 1 is the first 8 characters
column 2 is the next 4 characters
column 3 is the next 5 characters
column 4 is the next 6 characters

So I want to open the file and read, then write to CSV with the revised rows with columns.

ASKER CERTIFIED SOLUTION

pepr

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

sonofstimpy1

ASKER

Thanks Pepr. I was re-checking your original post and it works exactly how I want it. I've noted your point about CSV files in general. Thanks for taking the time to help.

pepr

I am glad if it helped. Anyway, there is another part of the problem when working with fixed-width columns -- the formatting of the value to the width. Feel free to continue in discussion just below if you have the related questions.