CalmSoul

Python: export content to a text file

I have more than 1 million files with the extension *.olu (they are really plain text files) in one directory.

Read each name within [] and make it a column in the text file, as shown in the example below (element-type). The value of that column is the text after the = sign (for example, "OU Document with type" in the sample provided).

[element]
Type=OU Document with type

[Docs]
1=\ft\ou\ds\comb\5623589556536.TIF
2=\ft\ou\ds\comb\5623589556586.TIF

[Indices.1]
account_number=9999999
first_name=test user
address=0
sub_class=OUT
linux_date=20000630
bentch_date=  /  /  
mod_acct_num=



[Indices.2]
account_number=9999999
first_name=test user
address=0
sub_class=OUT
linux_date=20000630
bentch_date=  /  /  
mod_acct_num=222




I am looking for Python code to parse each *.olu file in the directory and export the following information to a pipe-delimited text file:

+----------+-----------------------+----------------------------------+----------------------------------+--------------------------+----------------------+-------------------+---------------------+----------------------+-----------------------+------------------------+-----+
| Filename | element-type          | Docs.1                           | Docs.2                           | Indices.1-account_number | Indices.1-first_name | Indices.1-address | Indices.1-sub_class | Indices.1-linux_date | Indices.1-bentch_date | Indices.2-mod_acct_num | *** |
+----------+-----------------------+----------------------------------+----------------------------------+--------------------------+----------------------+-------------------+---------------------+----------------------+-----------------------+------------------------+-----+
| *.olu    | OU Document with type | \ft\ou\ds\comb\5623589556536.TIF | \ft\ou\ds\comb\5623589556586.TIF | 9999999                  | test user            | 0                 | OUT                 | 20000630             | /  /                  |                        |     |
+----------+-----------------------+----------------------------------+----------------------------------+--------------------------+----------------------+-------------------+---------------------+----------------------+-----------------------+------------------------+-----+
|          |                       |                                  |                                  |                          |                      |                   |                     |                      |                       |                        |     |
+----------+-----------------------+----------------------------------+----------------------------------+--------------------------+----------------------+-------------------+---------------------+----------------------+-----------------------+------------------------+-----+
|          |                       |                                  |                                  |                          |                      |                   |                     |                      |                       |                        |     |
+----------+-----------------------+----------------------------------+----------------------------------+--------------------------+----------------------+-------------------+---------------------+----------------------+-----------------------+------------------------+-----+


pepr

I suggest using configparser, a standard-library module (for Python 3, see https://docs.python.org/3/library/configparser.html). I also suggest focusing on getting the information out first and playing with the formatting later.
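
One caveat is worth checking before running configparser over a million files: with default settings, ConfigParser treats a % in a value as interpolation syntax and raises an error when that value is read. A minimal sketch of a loader that switches interpolation off follows. The helper name load_olu and the sample text (including the "100% test" value) are assumptions made up for illustration, not taken from the real files.

```python
import configparser

def load_olu(text):
    """Parse OLU-style text and return {section: {key: value}}."""
    # interpolation=None keeps a literal '%' in a value from raising an error.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(text)
    return {name: dict(cfg[name]) for name in cfg.sections()}

# Hypothetical sample; only the layout follows the file in the question.
sample = """\
[element]
Type=OU Document with type

[Indices.1]
first_name=100% test
mod_acct_num=
"""

data = load_olu(sample)
# Key names are lowercased by configparser, so 'Type' is stored as 'type'.
print(data['element']['type'])
print(data['Indices.1'])
```

Note that the key lookup uses 'type': configparser lowercases key names by default, which happens to match the element-type column name in the question.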

Here is the code that gets the info from the file (read the comments):
#! python3

import configparser

cfg = configparser.ConfigParser()

# The OLU file name
fname = 'test.olu'
cfg.read(fname)         # parse the file content

print(cfg.sections())   # we know these [sections] now
print('------------------------------------')

print('Filename:', fname)
print('Element type:', cfg['element']['Type'])

# Demo how to get the values
for k in cfg['Docs']:
    print()
    print('Docs.{}: {}'.format(k, cfg['Docs'][k]))
    sec_key = 'Indices.{}'.format(k)
    print(sec_key)              # to know what is the section id
    section = cfg[sec_key]      # this is a shortcut
    for x in section:
        print('{}: {}'.format(x, section[x]))
        
print('------------------------------------')
        
# The alternative processing to feed the info into a list.
for k in cfg['Docs']:
    lst = [fname]
    lst.append(cfg['Docs'][k])
    # Look up the [Indices.N] section that belongs to this document.
    section = cfg['Indices.{}'.format(k)]
    for x in section:
        lst.append(section[x])

    # Join by the bar and print.
    print(' | '.join(lst))


The code gets the attributes in the order they are used in the file. It would be better to use a list of the attribute names that you need and process them in that order. The same list can also be used to create a header, if you really need one.

It is a question why you would need that pretty formatting, but it is possible if you want it. The alternative could be to use the standard csv module to write the rows to a csv file.

The code prints the following for your example:
['element', 'Docs', 'Indices.1', 'Indices.2']
------------------------------------
Filename: test.olu
Element type: OU Document with type

Docs.1: \ft\ou\ds\comb\5623589556536.TIF
Indices.1
account_number: 9999999
first_name: test user
address: 0
sub_class: OUT
linux_date: 20000630
bentch_date: /  /
mod_acct_num:

Docs.2: \ft\ou\ds\comb\5623589556586.TIF
Indices.2
account_number: 9999999
first_name: test user
address: 0
sub_class: OUT
linux_date: 20000630
bentch_date: /  /
mod_acct_num: 222
------------------------------------
test.olu | \ft\ou\ds\comb\5623589556536.TIF | 9999999 | test user | 0 | OUT | 20000630 | /  / | 
test.olu | \ft\ou\ds\comb\5623589556586.TIF | 9999999 | test user | 0 | OUT | 20000630 | /  / | 222
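
To get from the demo to the pipe-delimited file, the ideas above (a fixed list of attribute names, a header built from that list, and the csv module) can be combined. What follows is only a sketch under assumptions: the column list COLUMNS, the helper names flatten and export_dir, and the choice of which columns to include are made up for illustration, and the "Section-key" naming simply follows the header shown in the question. os.scandir is used so that a directory with a million entries is iterated lazily instead of being loaded as one list.

```python
import configparser
import csv
import os

# Assumed column order, named as in the question's header; extend as needed.
COLUMNS = ['element-type', 'Docs.1', 'Docs.2',
           'Indices.1-account_number', 'Indices.1-first_name',
           'Indices.2-mod_acct_num']

def flatten(path):
    """Return {column name: value} for one OLU file."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read(path)
    row = {}
    for sec in cfg.sections():
        # [Docs] keys become 'Docs.1'; other sections become 'section-key'.
        sep = '.' if sec == 'Docs' else '-'
        for key, value in cfg[sec].items():
            row[sec + sep + key] = value
    return row

def export_dir(src_dir, out_path):
    """Write one pipe-delimited row per *.olu file found in src_dir."""
    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f, delimiter='|')
        writer.writerow(['Filename'] + COLUMNS)
        with os.scandir(src_dir) as entries:
            for entry in entries:
                if entry.is_file() and entry.name.lower().endswith('.olu'):
                    row = flatten(entry.path)
                    # Attributes missing from a file are written as empty fields.
                    writer.writerow([entry.name] + [row.get(c, '') for c in COLUMNS])
```

A call such as export_dir('.', 'olu_export.txt') would then write the header followed by one line per file. Because the rows go through csv.writer, a value that itself contains the | character is quoted rather than silently breaking the column layout.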


CalmSoul

ASKER

thanks pepr - let me test this ...