Anne H
asked on
How to transpose a wide file to a skinny file with key-value pairs using python
Hi all,
I am new to python and would appreciate your help in transpose a wide file to a key value pairs to be loaded to database. Here is the input and output file format. Thanks!
Input file
visitor_id param_1 param_2 param_3
1 vp1 vp2 vp3
2 vp12 vp22 vp23
Output file
vistor_id converted_key converted_value
1 param_1 vp1
1 param_2 vp2
1 param_3 vp3
2 param_1 vp12
2 param_2 vp22
2 param_3 vp23
I am new to python and would appreciate your help in transpose a wide file to a key value pairs to be loaded to database. Here is the input and output file format. Thanks!
Input file
visitor_id param_1 param_2 param_3
1 vp1 vp2 vp3
2 vp12 vp22 vp23
Output file
vistor_id converted_key converted_value
1 param_1 vp1
1 param_2 vp2
1 param_3 vp3
2 param_1 vp12
2 param_2 vp22
2 param_3 vp23
which Db are you using?
ASKER
redshift
ASKER
we have over 13k fields in the file and the order could change. that's why we design the table this way.
Just to be clear, your input and output files are tab-delimited text files? Comma-delimited text files?
ASKER
thanks for responding. the input and output files can be any delimited file. And the order of the output lines do not matter either since I can sort them in the database.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thank you so much Flabio!
I am using python3. After slightly change I got this to work.
import csv
if __name__ == '__main__':
inpath = 'input.txt'
outpath = 'output.txt'
with open(inpath, 'r') as inf, open(outpath, 'w') as outf:
reader = csv.reader(inf, delimiter='|')
writer = csv.writer(outf, delimiter='|')
header=next(reader)
h0=header.pop(0)
print (h0)
writer.writerow((h0, 'converted_key', 'converted_value'))
for row in reader:
vi = row.pop(0)
print (vi)
for values in zip(header, row):
print (values)
writer.writerow((vi,)+valu es)
another question, since the order of original fields in the raw file could change, I might need to dynamically look for the visitor_id in the header and write the value of visitor id in the first column. How could I do that?
Thanks a lot!
I am using python3. After slightly change I got this to work.
import csv
if __name__ == '__main__':
inpath = 'input.txt'
outpath = 'output.txt'
with open(inpath, 'r') as inf, open(outpath, 'w') as outf:
reader = csv.reader(inf, delimiter='|')
writer = csv.writer(outf, delimiter='|')
header=next(reader)
h0=header.pop(0)
print (h0)
writer.writerow((h0, 'converted_key', 'converted_value'))
for row in reader:
vi = row.pop(0)
print (vi)
for values in zip(header, row):
print (values)
writer.writerow((vi,)+valu
another question, since the order of original fields in the raw file could change, I might need to dynamically look for the visitor_id in the header and write the value of visitor id in the first column. How could I do that?
Thanks a lot!
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thank you so much! It worked. :)
Question inactive