CSV export from databse parsing and manipulation

I have a CSV file that I am creating with SQLPlus from an oracle databse.  The layout of the file is as follows:

"number","idType","state","country","IDnumber"

What I need to do is create a VB script or batch file to read the file and do the following:

if (idType == "1") { output to a new file "number","DL""state""IDnumber" }
if (idType == "2") { output to a new file "number","PA""country""IDnumber" }

So if my orignal file looked liek this:

123456789,1,TX,,12345
987654321,2,,US,54321

The new file would look like this:

123456789,DLTX12345
987654321,PAUS54321

Any help would be greatly appreciated
LVL 5
rkeith2412Asked:
Who is Participating?

[Webinar] Streamline your web hosting managementRegister Today

x
 
TakedaTConnect With a Mentor Commented:
I guess I took the stating of "a new file" twice as creating a new file twice.  Either way, vbscript would be easier given the fact that there are double commas.

Here is the code to work with or without quotes around the idType.
Const ForReading = 1, ForWriting = 2, ForAppending = 8

set objFSO=createobject("scripting.filesystemobject")
strSourceFile="test.csv"
strOutputFile1="newfile.csv"

Set objOutPutFile1 = objFSO.CreateTextFile(strOutputFile1,True)

Set objFile=objFSO.opentextfile(strSourceFile,ForReading)
Do While objFile.atendofstream<>true
	strLineRead=objFile.Readline
	arrLine=split(strLineRead,",")
	If arrLine(1)="1" or arrLine(1)=chr(34)&"1"&chr(34) then
		objOutputFile1.WriteLine arrLine(0)&","&"DL"&arrLine(2)&arrLine(3)&arrLine(4)
	Elseif arrLine(1)="2" or arrLine(1)=chr(34)&"2"&chr(34) then
		objOutputFile1.WriteLine arrLine(0)&","&"PA"&arrLine(2)&arrLine(3)&arrLine(4)
	End If
Loop
objOutputFile1.close

Open in new window

0
 
TakedaTCommented:
Something like this?
Const ForReadIng = 1, ForWriting = 2, ForAppending = 8

set objFSO=createobject("scripting.filesystemobject")
strSourceFile="test.csv"
strOutputFile1="1.csv"
strOutputFile2="2.csv"
Set objOutPutFile1 = objFSO.CreateTextFile(strOutputFile1,True)
Set objOutPutFile2 = objFSO.CreateTextFile(strOutputFile2,True)

Set objFile=objFSO.opentextfile(strSourceFile,ForReading)
Do While objFile.atendofstream<>true
	strLineRead=objFile.Readline
	arrLine=split(strLineRead,",")
	If arrLine(1)=chr(34)&"1"&chr(34) then
		objOutputFile1.WriteLine strLineRead
	Elseif arrLine(1)=chr(34)&"2"&chr(34) then
		objOutputFile2.WriteLine strLineRead
	End If
Loop
objOutputFile1.close
objOutputFile2.close

Open in new window

0
 
TakedaTCommented:
Actually, after looking again.  Does your file have quotes around each value or not?  I changed the code for without quotes.
Const ForReadIng = 1, ForWriting = 2, ForAppending = 8

set objFSO=createobject(scripting.filesystemobject)
strSourceFile=test.csv
strOutputFile1=1.csv
strOutputFile2=2.csv
Set objOutPutFile1 = objFSO.CreateTextFile(strOutputFile1,True)
Set objOutPutFile2 = objFSO.CreateTextFile(strOutputFile2,True)

Set objFile=objFSO.opentextfile(strSourceFile,ForReading)
Do While objFile.atendofstream<>true
	strLineRead=objFile.Readline
	arrLine=split(strLineRead,,)
	If arrLine(1)=1 then
		objOutputFile1.WriteLine arrLine(0)&,&DL&arrLine(2)&arrLine(3)&arrLine(4)
	Elseif arrLine(1)=2 then
		objOutputFile2.WriteLine arrLine(0)&,&PA&arrLine(2)&arrLine(3)&arrLine(4)
	End If
Loop
objOutputFile1.close
objOutputFile2.close

Open in new window

0
The new generation of project management tools

With monday.com’s project management tool, you can see what everyone on your team is working in a single glance. Its intuitive dashboards are customizable, so you can create systems that work for you.

 
Steve KnightIT ConsultancyCommented:
If you wanted a batch solution" then this will do it with the data as it stands.... the potential issue being that multiple delimiters with no data are seen as one :

@Echo off
set sourcefile="C:\input.csv"
set outdir=c:\
set outfile=outfile.csv

for /f "tokens=1-5 delims=," %%a in ('type %sourcefile%') do (
  if "%%b"=="1" (echo %%a,DL%%c%%d)>>"%outdir%\1_%outfile%"
  if "%%b"=="2" (echo %%a,PA%%c%%d)>>"%outdir%\2_%outfile%"
)
0
 
t0t0Commented:
Here is MY solution....

NOTE: Be sure to change the names: FILE.CSV and NEWFILE.CSV to whatever your own filenames are named.


@echo off
(for /f "tokens=1-5 usebackq delims=," %%a in ("file.csv") do (
   if "%%b"=="1" echo %%a,DL%%c%%d
   if "%%b"=="2" echo %%a,PA%%c%%d
))>"newfile.csv"
0
 
t0t0Commented:
Here's an interactive version:

NOTE: Name the batch file PARSE.BAT and run it in DOS like this:

   PARSE file.csv newfile.csv

(Use double-quotes around filenames if they contain spaces)


@echo off
(for /f "tokens=1-5 usebackq delims=," %%a in ("%~1") do (
   if "%%b"=="1" echo %%a,DL%%c%%d
   if "%%b"=="2" echo %%a,PA%%c%%d
))>"%~2"

0
 
Steve KnightIT ConsultancyCommented:
Bizarre, I could have sworn that said earlier that he wanted them split into two different files for the "1" and "2" bits... clearly not, and the only reason why I split it all up like that!!

oh well!


I couldn't think of an easy way around the multiple seperator issue in batch unless you know of a way t0t0, i.e. if there is data in the state and country fields at present it will throw the data into %%e etc.
Steve
0
 
Steve KnightIT ConsultancyCommented:
Ahh maybe I am not going (quite so) mad, I read it in the VBScript code answers from TakedaT, doh!  Without the split into two files it effectively comes back down to what t0t0 has posted so I'll leave you with him.  Must RTFQ.
0
 
t0t0Commented:
dragon-it

I'm aware of the blank csv field issue (see below) and it's one which I had yet to address - like a million other things.

The %%c%%d works here providing, and ONLY providing, either a State is specified OR a Country is specified - but NOT both.

Otherwise, it can be parsed using the following 3 constructs:

   tokens=*

   set line="!line:,=","!"

   tokens=1-5 delims=,

as in the code below - which correctly assigns each field including blank ones:



@echo off
setlocal enabledelayedexpansion

(for /f "tokens=* usebackq" %%A in ("file.csv") do (
   set Line=%%A
   set Line="!Line:,=","!"

   for /f "tokens=1-5 delims=," %%a in ("!Line!") do (
      if "%%~b"=="1" echo %%~a,DL%%~c%%~e
      if "%%~b"=="2" echo %%~a,PA%%~d%%~e
   )

))>newfile.csv
exit /b
0
 
t0t0Commented:
rkeith2412

Please try my code (as described in my previous comment)

NOTE: As before, don't forget to change FILE.CSV and NEWFILE.CSV to filenames of your own choice.



@echo off
setlocal enabledelayedexpansion

(for /f "tokens=* usebackq" %%A in ("file.csv") do (
   set Line=%%A
   set Line="!Line:,=","!"

   for /f "tokens=1-5 delims=," %%a in ("!Line!") do (
      if "%%~b"=="1" echo %%~a,DL%%~c%%~e
      if "%%~b"=="2" echo %%~a,PA%%~d%%~e
   )

))>newfile.csv
0
 
AmazingTechCommented:
I would tend to setup the line variable this way in this case.

    Set Line=!Line:,,=,"",!

This is so if the csv has or doesn't have quotes the script would work either way.
0
 
Steve KnightIT ConsultancyCommented:
Good idea there AmazingTech, i.e. just add "" inbetween a double comma.  Was intending sorting out something out like t0t0/your suggestions here in my original script but it MAY not be needed anyway if the data is always as per the demo data and b) ran out of time frankly.

Steve
0
 
Bill PrewCommented:
Just be aware that if the input file had more than 2 commas in a row the above replacement of ,, to ,"", will only replace the first matching pair of commas.  For example:

[c:\] set var1=123456789,1,,,12345
[c:\] echo %var1%
123456789,1,,,12345
[c:\] set var2=%var1:,,=,"",%
[c:\] echo %var2%
123456789,1,"",,12345

To get them all you would need multiple replacements, either in a FOR loop, or just one after the other.

~bp
0
 
Bill PrewCommented:
Here's a BAT version that will handle quoted or non quoted input fields.  Currently it always writes the output unquoted, since that's what the sample given showed.  The ParseIt routine could be modified if a different format is desired.

Defaults for the input and output file name can be coded into the BAT script, but it also allows passing this in to the BAT file when it is run, right on the command line.  So if you saved this script as EE25485303.BAT, then you can run it like this:

EE25485303 "c:\mydir1\myinput.txt" "c:\mydir2\myoutput.txt"

Hope this helps,

~bp
@echo off
setlocal EnableDelayedExpansion
 
REM If input file passed on command line use it, else use default
set InputFile=c:\temp\EE25485303\input.txt
if not [%1] == [] set InputFile=%~1
 
REM If output file passed on command line use it, else use default
set OutputFile=c:\temp\EE25485303\output.txt
if not [%2] == [] set OutputFile=%~2
if exist "%OutputFile%" del "%OutputFile%"
 
REM Loop reading each line from input file
for /f "usebackq tokens=*" %%A in ("%InputFile%") do (
  REM Replace any comma pairs with a comma, two double qoutes, and a comma.
  REM This is because BAT files parse repeating delims as a single delim
  set LineIn=%%A
  for /l %%B in (1,1,5) do set LineIn=!LineIn:,,=,"",!
 
  REM Call parse routine to inspect line and reformat
  (call :ParseIt !LineIn!) >> "%OutputFile%"
)
exit /b
 
:ParseIt
  REM Reformat input line pased on value of second field
  if "%~2" == "1" echo %~1,DL%~3%~5
  if "%~2" == "2" echo %~1,PA%~4%~5
  exit /b

Open in new window

0
 
rkeith2412Author Commented:
Works perfectly.  Thank you.  Sorry it took so long to get back to this.
0
 
t0t0Commented:
Did my code in 28408701 not work then?

Curious after so much effort as to why i didn't receive at least a share of the points. Afterall, you were presented with two working options - a vb solution as well as a DOS bach file solution (which is what you asked for).

Please clarify whether my code met or didn't meet your requirements and why as it addressed (and solved) and age old issue regarding blank fields.
0
All Courses

From novice to tech pro — start learning today.