How to convert this to CSV

Hi,

can you guide on how to convert this to bellow format  there is I have uploaded two sample records

Id | ADate etc
499| 2017-08-25 20:43:35

i'm open to shell script or php thanks in advance for your time and help
sample.txt
LVL 1
hi4pplAsked:
Who is Participating?
 
Walter RitzelSenior Software EngineerCommented:
I was able to do it in Python.
It should work on your linux system.
just call:
python converttocsv.py

Open in new window

Save the code below in a converttocsv.py
import sys
import csv
import traceback

fileToParse = open("sample.txt","r")

fileCsv = open("file_to_write.csv","wb")

validKeys = ['ADate','FDate','Is','Isked','LTH','s_PId','Index','eID','Period','Amount','LDate','OntID_FU','s_Bonus_FU','s_Active','ExpiryDate','CreditMax']
printKeys = ['Id','ADate','FDate','Is','Isked','LTH','s_PId','Index','eID','Period','Amount','LDate','OntID_FU','s_Bonus_FU','s_Active','ExpiryDate','CreditMax']

textToParse = fileToParse.readlines()

parsedData = []

for row in textToParse:
#	print row.replace('\n','')
	textParsed = row.replace('Read:Customer (','(Start":"True, ').replace('Read:ROP ','').replace('Read:RPP ','').replace('Read:CDP ','').replace('Read:PSP ','').replace('Read:RPP_s_Reward ','').replace('Read:RODP_s_Counter ','').replace('\n','')
#	print textParsed
	textParsed = textParsed.replace('=["','-[').replace(']',']"').replace('=','":"').replace('{','{"').replace('}','"}').replace('["','[').replace(', ','", "')
#	print textParsed
	textParsed = textParsed.replace('" s_A074"','" s_A074":"0"').replace('"s_A074"','"s_A074":"0"').replace('"s_A000"','"s_A000":"0"')
#	print textParsed
	textParsed = textParsed.replace('(','{"').replace(');','"}').replace('""','"')
#	print textParsed
	textParsed = textParsed.replace('"{"','{"').replace('"}"','"}').replace('-[','":"')
#	print textParsed
	textParsed = textParsed.replace('":",','":"",').replace('":"},','":""},')
#	print textParsed
	parsedData.append(textParsed)
#	print ''

lineToSave=dict()
firstTime = True
for item in parsedData:
	try:
		parsedItem =eval(item)
		if 'Start' in parsedItem:
			if firstTime:
				lineToSave['Id'] = parsedItem['Id']
				firstTime = False
			else:
				writeList = []
				for key in printKeys:
					if key in lineToSave:
						writeList.append(lineToSave[key])
				fileCsv.write(str(writeList).replace('[','').replace(']','') + '\n')
				lineToSave = dict()
				lineToSave['Id'] = parsedItem['Id']
		else:
			for key in validKeys:
				if (key in parsedItem) and (key not in lineToSave):
					lineToSave[key] = str(parsedItem[key])
	except:
		print traceback.format_exc()
		print item

writeList = []
for key in printKeys:
	if key in lineToSave:
		writeList.append(lineToSave[key])
fileCsv.write(str(writeList).replace('[','').replace(']','') + '\n')
fileCsv.close()
fileToParse.close()

Open in new window


The idea was to transform each row of your original file in a data structure called dictionary. To do that, I have used  a series of replaces on the string.
After that, we just need to build one row per id, based on all the rows with the same id.
One of the issues that you may find is that you have repeated fields on the same id, and that would lead to have always the last value of that id on the csv column.
But the script certainly do the conversion for you.

[UPDATE 1]:
I think it is possible to convert this to PHP, but I dont have the environment for that. So, maybe another expert can do it.
[UPDATE 2]:
I would suggest that you have better requirements defined if you want to really use this script.
0
 
Radek BaranowskiFull-stack Java DeveloperCommented:
sample code for writing csv in PHP:

<?php
$list = array
(
"Peter,Griffin,Oslo,Norway",
"Glenn,Quagmire,Oslo,Norway",
);

$file = fopen("contacts.csv","w");

foreach ($list as $line)
  {
  fputcsv($file,explode(',',$line));
  }

fclose($file); ?>

Open in new window

0
 
hi4pplAuthor Commented:
But that sample that I uploaded will that work? can this array parse the attached
each have keypair and value like id=499 etc
0
Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

 
Radek BaranowskiFull-stack Java DeveloperCommented:
the way of writing the array will work, but you need to prepare array from the data you have in the sample yourself.
0
 
Walter RitzelSenior Software EngineerCommented:
Can you please tell from your sample data, what are the columns that you want on your csv file?
I have observed the information on each row is not the same, so it is better if you determine which columns needs to be on the file.
Also, let me understand: are you going to run that in a linux or unix machine?
0
 
hi4pplAuthor Commented:
Hi,

the sample output I posted is two records actually there is around 1 million record like this... so the field I want to out is all but the fields that says key=XX

and I run this in linux.
0
 
Walter RitzelSenior Software EngineerCommented:
Ok, but can you provide a list of the fields you want, just to make it clear from the start and we dont have to go back and forth adjusting the code?
0
 
hi4pplAuthor Commented:
Hi,

thanks, here are the fields samples

Id
ADate
FDate
Is
Isked
LTH
s_PId
Index
eID
Period
Amount
LDate
OntID_FU
s_Bonus_FU
s_Active
ExpiryDate
CreditMax

Open in new window


regards
0
 
Walter RitzelSenior Software EngineerCommented:
Ok, so fields like these:
s_icBonus_FU
stive
s_hargeModifyTasks
OfferProfileKey
and others... 

Open in new window


could be ignored, right?
0
 
hi4pplAuthor Commented:
yes those are the fields... the fields will always be there if there is value it will show up their and if not it will be blank... and I want to be able to ignore or add fields in array.... thanks
0
 
hi4pplAuthor Commented:
Hi,

how flelxible is it... I try to add more fields in $validKeys  but could not make it work it thourgh error all over the place... can you guide if I change or add fields what the key points to take in consideration.

regards
0
 
Walter RitzelSenior Software EngineerCommented:
Can you please post the modified code?
0
 
hi4pplAuthor Commented:
Hi ,

here is the modified code... it through alot of error but give result also and the result is like

bellow is how the result will look like...

'45645'
'45644'
'45658'
'45689',  '2014-09-14', '2015-04-12'
'45684'
'45681'

Open in new window


code is here

import sys
import csv
import traceback

fileToParse = open("*_1.dat","r")

fileCsv = open("file_to_write.csv","wb")

validKeys = ['ActiveDate','FirstDate']
printKeys = ['CustomerId','ActiveDate','FirstDate']

textToParse = fileToParse.readlines()

parsedData = []

for row in textToParse:
#	print row.replace('\n','')
	textParsed = row.replace('Read:Customer (','(Start":"True, ').replace('Read:ROP ','').replace('Read:RPP ','').replace('Read:CDP ','').replace('Read:PSP ','').replace('Read:RPP_s_Reward ','').replace('Read:RODP_s_Counter ','').replace('\n','')
#	print textParsed
	textParsed = textParsed.replace('=["','-[').replace(']',']"').replace('=','":"').replace('{','{"').replace('}','"}').replace('["','[').replace(', ','", "')
#	print textParsed
	textParsed = textParsed.replace('" s_A074"','" s_A074":"0"').replace('"s_A074"','"s_A074":"0"').replace('"s_A000"','"s_A000":"0"')
#	print textParsed
	textParsed = textParsed.replace('(','{"').replace(');','"}').replace('""','"')
#	print textParsed
	textParsed = textParsed.replace('"{"','{"').replace('"}"','"}').replace('-[','":"')
#	print textParsed
	textParsed = textParsed.replace('":",','":"",').replace('":"},','":""},')
#	print textParsed
	parsedData.append(textParsed)
#	print ''

lineToSave=dict()
firstTime = True
for item in parsedData:
	try:
		parsedItem =eval(item)
		if 'Start' in parsedItem:
			if firstTime:
				lineToSave['CustomerId'] = parsedItem['CustomerId']
				firstTime = False
			else:
				writeList = []
				for key in printKeys:
					if key in lineToSave:
						writeList.append(lineToSave[key])
				fileCsv.write(str(writeList).replace('[','').replace(']','') + '\n')
				lineToSave = dict()
				lineToSave['CustomerId'] = parsedItem['CustomerId']
		else:
			for key in validKeys:
				if (key in parsedItem) and (key not in lineToSave):
					lineToSave[key] = str(parsedItem[key])
	except:
		print traceback.format_exc()
		print item

writeList = []
for key in printKeys:
	if key in lineToSave:
		writeList.append(lineToSave[key])
fileCsv.write(str(writeList).replace('[','').replace(']','') + '\n')
fileCsv.close()
fileToParse.close()

Open in new window

0
 
Walter RitzelSenior Software EngineerCommented:
First problem on your code: you are trying to open many files at a time. You cannot use the open command with * on the name.
Second, you have introduced a field called CustomerId. On the sample file you have provided, there is no CustomerId field inside. So this will give error.
If the files you are using now are different than the sample, please provide a new sample file accordingly.
0
 
hi4pplAuthor Commented:
Hi Walter,

thanks for replay... yeah I removed that * for files, the file is the same I only annoymized the fields name in sample it was "Id" but on real it was "CustomerId" so I replace that only rest is the same... can you tell if I have to add or remove fields where should I add or remove them...

regards
omer
0
 
Walter RitzelSenior Software EngineerCommented:
Send me the "real" file (just a few rows) so I can run and check the problems.
0
 
hi4pplAuthor Commented:
Hi,

I have even tried with sample it didn't work out... can you check the same as the sample records I provided when I try that it gives me error :

syntaxtError: unexpected EOF while parsing

Open in new window

0
 
Walter RitzelSenior Software EngineerCommented:
Send me the "real" file (just a few rows) so I can run and check the problems.
0
 
hi4pplAuthor Commented:
sorry I was away for couple of months will send you the example soon
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.