• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 719
  • Last Modified:

batch file to read csv files, invert fields

I have a batch file which currently reads all the csv files in a folder, strips certain characters within the files, and saves them with their original names. Now, I also need to look in the file and see if field 7 has no numbers in it, but field 8 does. If so, I need to clear the contents of field 7 and replace it with the contents of field 8. Field 8 can remain unchanged. This is on a W2008 Server 64 bit. Thank you.
0
pkromer
Asked:
pkromer
  • 8
  • 7
  • 5
  • +1
1 Solution
 
Steve KnightIT ConsultancyCommented:
OK.  Before you gave an example of the data as:

085856,,170545,30537,F/C,1,0000000            
085674,,232455,4356,F/C,1,0000000    

So that makes 7 fields in total that I can see, not 8, or is this different data?

Also can you explain about what you mean by being no numbers in field 7, does that mean if it is 0000000 ?  Is there always a field 8?

Steve

We can read a file and look at those entries using another for command:

@echo off
for /f "tokens=1-8* delims=," %%A in ('type afile.csv') do (
  echo Fields are %%a %%b %%c %%d %%e %%f
  echo Field 7 is %%g
  echo Field 8 is %%h
  echo and the rest of the line is %%i
  echo We can put the rest of the line back togather with
  echo %%a,%%b,%%c,%%d,%%e,%%f,%%g,%%h,%%i
)

0
 
pkromerAuthor Commented:
That data i gave before was not complete lines, i shortened it for ease of viewing. there are about 30 fields in each line. This would be an example of what I am talking about...

085856,,170545,30537,F/C,1,0000000,12344,billy,etc            
085674,,232455,4356,F/C,1,WordsThatShouldntBeHere,65463,fred,etc  
0
 
pkromerAuthor Commented:
So field 7 may either have words OR blank, but either way, if there are numbers in field 8, erase what's in 7 and put what's in 8 there. IF 7 has words AND numbers, erase all words leaving only the number.

This seems compliacted, I don't even know if a batch file can do this. But I'll bet you do :-)
0
Get your Conversational Ransomware Defense e‑book

This e-book gives you an insight into the ransomware threat and reviews the fundamentals of top-notch ransomware preparedness and recovery. To help you protect yourself and your organization. The initial infection may be inevitable, so the best protection is to be fully prepared.

 
Steve KnightIT ConsultancyCommented:
I can do but it is 11:30pm and currently have a baby on my lap who decided to wake up so will look tomorrow if no-one else gets there first!

Is a vbscript ok to go with the batch?
0
 
pkromerAuthor Commented:
Yes, i can just run it as a .vbs file instead of .bat, right? And will that run on a W2008 Server 64 bit?
0
 
pbolintxCommented:
it seems like this could be easily done with an awk script.  You could download gawk.exe for windows from
http://gnuwin32.sourceforge.net/packages/gawk.htm

Then create a file called awkfile.awk as:
BEGIN { FS = "," }
{if ($7 ~ /[0-9]/) {
      $7=$8
      }
for(i=1;i<=NF;i++) {
      if (i < NF)
            printf $i FS
      else
            printf $i RS
      }
}


and then run
\path\to\gawk -f \path\to\awkfile.awk inputfilename > outputfilename

The only part this doesn't satisfy is that the outputfile name will be different form the input file name, but thats easily renamed.  I tested with the below input:

1111,111,111,111,111,111,555,666
1111,111,111,111,111,111,nonumbers,888

and output is:
1111,111,111,111,111,111,666,666
1111,111,111,111,111,111,nonumbers,888
0
 
pbolintxCommented:
Whoops. just realized thats the opposite of what you asked for :)
heres the corrected awkfile (also a slightly faster version)

BEGIN { FS = "," }
{if ($7 !~ /[0-9]/) {
      $7=$8
      for(i=1;i<=NF;i++) {
            if (i < NF)
                  printf $i FS
            else
                  printf $i RS
      }
}
else
      print $0
}

I expanded my input test this time also:
1111,111,111,111,111,111,555,666
1111,111,111,111,111,111,nonumbershere,888
1111,111,111,111,111,111,555a,777
1111,111,111,111,111,111,stillnonumbers,randomstring

output:
1111,111,111,111,111,111,555,666
1111,111,111,111,111,111,888,888
1111,111,111,111,111,111,555a,777
1111,111,111,111,111,111,randomstring,randomstring
0
 
pkromerAuthor Commented:
I may try that, but i really don't want to add any programs to the server to do it. Is awk even 64 bit compatible?
0
 
pbolintxCommented:
I could understand that, I just figured I would offer it as a possibility.  Gawk is an unobtrusive, relatively small program (its a 350kb executable) with no setup or installation, you just drop the executable somewhere.  And when it comes to text processing, it's generally very good at that.  

The 64 bit part, well, I didn't really think about that part.  Another plan foiled by a simple fact. :)
0
 
pkromerAuthor Commented:
All good, like I said I may try it if nothing else comes up. Thanks very much for your efforts.
0
 
t0t0Commented:
Interesting question. Listening...
0
 
pbolintxCommented:
I'm going to go with, this is not the prettiest code ive ever written, but it gets the job done.. it might need some slight modifications to work in your script, but this is basically it.
I really don't like having to use the for loops to write the output, but the ado CSV text driver is read-only, so its about the only option.

The wscript.Arguments(0) would grab the file name if it was passed as a command line parameter.  I dont know exactly how your script works, so I dont know if you need that or not.
0
 
pbolintxCommented:
On Error Resume Next
Const adOpenDynamic =  2
Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")
Set objFSO = CreateObject("Scripting.FileSystemObject")

strFileName = Wscript.Arguments(0)
strFileName = "testdata.csv"
strPathtoTextFile = "C:\cygwin\home\pbolin\"
strFileName2 = strFileName & ".output"

Set objFile = objFSO.CreateTextFile(strPathtoTextFile & strFileName2)

objConnetion.Mode = 3
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
          "Data Source=" & strPathtoTextFile & ";" & _
          "Extended Properties=""text;HDR=YES;FMT=Delimited"""

objRecordset.Open "SELECT * FROM " & strFileName, _
          objConnection, adOpenDynamic, adLockOptimistic, adCmdText

Do Until objRecordset.EOF
    found = 0
    For i = 0 To 9
          if instr(objRecordset.Fields.Item("f7"),cStr(i)) then
                found = 1
                For j = 0 to objRecordSet.Fields.count
                      if j = 6 then
                            objFile.Write(objRecordset.Fields(7).value & ",")
                      else
                            if j < objRecordSet.Fields.count - 1 then
                                  objFile.Write(objRecordset.Fields(j).value & ",")
                            else
                                  objFile.Write(objRecordset.Fields(j).value)
                            end if
                      End If
                Next
            objFile.Writeline("")
            break
          End If
    Next
    if found <> 1 then
          For j = 0 to objRecordSet.Fields.count
              if j < (objRecordSet.Fields.count - 1) then
                         objFile.Write(objRecordset.Fields(j).value & ",")
              else
                      objFile.Write(objRecordset.Fields(j).value)
              End if
          Next
    objFile.Writeline("")
    end if
   
    objRecordset.MoveNext
Loop
0
 
pbolintxCommented:
ok.. that doesnt work either.. i wont post again until i have something that is guaranteed to work.. sorry
0
 
pbolintxCommented:
allright, this one is well tested.  important things of note:

1.  it handles null values.
2.  It handles a quoted string with a comma in it.
3. It really, really wants a schema.ini defined.  I dont see any reason why they cant all be defined as text fields for our purposes, so I did not test with other data  types.  I do not anticipate any problems if there were though.
4.  The files must have the first row as header values.  I tried setting the connection string to HDR=NO, but it still treated the first row as headers.

schema.ini:
[testdata.csv]
Col1=A Text Width 100
Col2=A Text Width 100
Col3=A Text Width 100
Col4=A Text Width 100
Col5=A Text Width 100
Col6=A Text Width 100
Col7=A Text Width 100
Col8=A Text Width 100
Col9=A Text Width 100

test input:
f1,f2,f3,f4,f5,f6,f7,f8,f9
1111,111,111,111,111,111,"fdfh555",666
1111,111,111,111,111,111,"stillnonumbers","5452","blah"
1111,111,111,111,111,111,"nonumbershere",888
222,222,222,222,222,222,"noth,ing","nsddd,saff"
222,222,222,222,222,222,"noth,ing4","nsddd,saff"
1111,111,111,111,111,111,555a,777
aaaa,bbbb,cccc,dddd,eeee,fffff,ggggg,hhhhh

test output:
1111,111,111,111,111,111,666,666,""
1111,111,111,111,111,111,stillnonumbers,5452,blah
1111,111,111,111,111,111,nonumbershere,888,""
222,222,222,222,222,222,"noth,ing","nsddd,saff",""
222,222,222,222,222,222,"nsddd,saff","nsddd,saff",""
1111,111,111,111,111,111,777,777,""
aaaa,bbbb,cccc,dddd,eeee,fffff,ggggg,hhhhh,""
tester.vbs
0
 
pkromerAuthor Commented:
I altered it to this...

strFileName = "*.csv"
strPathtoTextFile = ""

because it needs to go through all the csv files and deal with them, but it didn't work.

And, a process called wscript.exe runs continuously after i run the .vbs file, using 100% of my resources.

Any ideas?
0
 
Steve KnightIT ConsultancyCommented:
pkromer - have been busy on other things since looking at this initially.  If you don't get anywhere with the methods here I might have some time tomorrow to look.

Steve
0
 
pkromerAuthor Commented:
Still no luck with that script, pbolintx. dragon-it, do you think this could be done along the lines of that last thing you did, which deals with the same files?

http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Batch/Q_25025469.html
0
 
Steve KnightIT ConsultancyCommented:
Hadn't forgot about this, have it bookmarked but have been busy sorry.  Think in pure batch it would be unnecessarily complex but can do a hybrid batch / vbs or maybe just vbs.  Will take a look tomorrow if I can

Steve
0
 
pkromerAuthor Commented:
Cool, thanks much.
0
 
Steve KnightIT ConsultancyCommented:
pkromer - sorry this one seemed to have got lost along the way and missed that it needed some feedback.  Did you get anywhere, and if not let us know if you want some more help with it.
Steve
0
 
t0t0Commented:
Please allow me to make a late contribution to this question.

I notice this question has not been answered. I do not want to tread on peoples' toes however, such an interesting problem merited a full resolve.

Some of the complexities of this problem required a thorough work-around to enable DOS to successfully process CSV files containing blank fields (especially at the end of lines) as well as spaces.

Having spent a whole day writing, testing and debugging this code, I wonder if it's possible to bag the points with this late offering.

I have verified this code works 100% with the supplied data as well as test data which includes spaces.



@echo off
setlocal enabledelayedexpansion

if "%~1"=="?" (
   call :help
   exit /b 1
)

if "%~1"=="/?" (
   call :help
   exit /b 1
)

if not "%~1"=="" (
   set csvfile=%~1
) else (
   set csvfile=
   set /p csvfile=Enter name of CSV file:

   if "!csvfile!"=="" (
      echo Name of CSV file required. Aborted.
      exit /b 1
   )
)

if not exist "!csvfile!" (
   if exist "!csvfile!.csv" (
      set csvfile=!csvfile!.csv
   ) else (
      echo.
      echo Cannot find CSV file !csvfile!. Aborted.
      exit /b 1
   )
)



echo Processing file: !csvfile!

del "%~0.tmp" 2>nul

for /f "tokens=* usebackq" %%a in ("!csvfile!") do (
   set string=%%a
   set string="[!string:,=]","[!]"
   call :parse !string!
   echo !string!>>"%~0.tmp"
)
move "%~0.tmp" "!csvfile!"
exit /b 0



:parse
   set count=0

   :loop
      set /a count+=1
      set token=%~1
      set token[!count!]=!token:~1,-1!
      shift
   if not "%~1"=="" goto :loop

   echo.!token[7]! | findstr "[0-9]" >nul
   if !errorlevel!==0 (
      call :RemoveNonNumerics
   ) else (
      set token[7]=
      echo.!token[8]! | findstr "[0-9]" >nul

      if !errorlevel!==0 set token[7]=!token[8]!
   )

   set string=!token[1]!
   for /l %%a in (2,1,!count!) do set string=!string!,!token[%%a]!
goto :eof



:RemoveNonNumerics
   set numeric=

   echo !token[7]!>len
   for %%a in (len) do set /a len=%%~za - 3

   for /l %%a in (0,1,!len!) do (
      if "!token[7]:~%%a,1!" geq "0" (
         if "!token[7]:~%%a,1!" leq "9" set numeric=!numeric!!token[7]:~%%a,1!
      )
   )

   set token[7]=!numeric!
goto :eof



:help
   echo.
   echo Call using: %~n0 "filename"
   echo.
   echo Where filename is the name of your CSV file.
   echo.
   echo Can be CALLed from another batch feil as a function
   echo when processing multiple CSV files.
   echo.
goto :eof

0
 
t0t0Commented:
Oops! - Just noticed an editing error. This has been corrected below:


@echo off
setlocal enabledelayedexpansion

if "%~1"=="?" (
   call :help
   exit /b 1
)

if "%~1"=="/?" (
   call :help
   exit /b 1
)

if not "%~1"=="" (
   set csvfile=%~1
) else (
   set csvfile=
   set /p csvfile=Enter name of CSV file:

   if "!csvfile!"=="" (
      echo Name of CSV file required. Aborted.
      exit /b 1
   )
)

if not exist "!csvfile!" (
   if exist "!csvfile!.csv" (
      set csvfile=!csvfile!.csv
   ) else (
      echo.
      echo Cannot find CSV file !csvfile!. Aborted.
      exit /b 1
   )
)



echo Processing file: !csvfile!

del "%~0.tmp" 2>nul

for /f "tokens=* usebackq" %%a in ("!csvfile!") do (
   set string=%%a
   set string="[!string:,=]","[!]"
   call :parse !string!
   echo !string!>>"%~0.tmp"
)
move "%~0.tmp" "!csvfile!"
exit /b 0



:parse
   set count=0

   :loop
      set /a count+=1
      set token=%~1
      set token[!count!]=!token:~1,-1!
      shift
   if not "%~1"=="" goto :loop

   echo.!token[7]! | findstr "[0-9]" >nul
   if !errorlevel!==0 (
      call :RemoveNonNumerics
   ) else (
      set token[7]=
      echo.!token[8]! | findstr "[0-9]" >nul
      if !errorlevel!==0 (
         set token[7]=!token[8]!
         call :RemoveNonNumerics
      )
   )

   set string=!token[1]!
   for /l %%a in (2,1,!count!) do (
      set string=!string!,!token[%%a]!
   )
goto :eof



:RemoveNonNumerics
   set numeric=

   echo !token[7]!>len
   for %%a in (len) do set /a len=%%~za - 3

   for /l %%a in (0,1,!len!) do (
      if "!token[7]:~%%a,1!" geq "0" (
         if "!token[7]:~%%a,1!" leq "9" (
            set numeric=!numeric!!token[7]:~%%a,1!
         )
      )
   )

   set token[7]=!numeric!
goto :eof



:help
   echo.
   echo Call using: %~n0 "filename"
   echo.
   echo Where filename is the name of your CSV file.
   echo.
   echo Can be CALLed from another batch feil as a function
   echo when processing multiple CSV files.
   echo.
goto :eof
0
 
t0t0Commented:
Finally!! Below is the FULL version of my code. It is identical to the code immediately above except for the added coments.




:: -----------------------------------
:: EE ID 25028780 - Written by Paul Tomasi 2010
::
:: Process CSV file. See comments for
:: description. Special design
:: consideration given to:
::
:: (1) possible blank fields,
:: (2) possible spaces in data,
:: (3) possibly more than 30 fields/line,
:: (4) passing line as parameter,
:: (5) preserving correct positions of data.
:: -----------------------------------
@echo off
setlocal enabledelayedexpansion

:: -----------------------------------
:: If command line parameter is '?' or '/?'
:: then display help and exit with error 1.
:: -----------------------------------
if "%~1"=="?" (
   call :help
   exit /b 1
)

if "%~1"=="/?" (
   call :help
   exit /b 1
)

:: -----------------------------------
:: Get CSV filename from command line
:: or from user. Or, exit.
:: -----------------------------------
if not "%~1"=="" (
   set csvfile=%~1
) else (
   set csvfile=
   set /p csvfile=Enter name of CSV file:
   if "!csvfile!"=="" (
      echo Name of CSV file required. Aborted.
      exit /b 1
   )
)

:: -----------------------------------
:: Verify CSV file exists or attempt to match
:: filename to CSV extension otherwise, exit.
:: -----------------------------------
if not exist "!csvfile!" (
   if exist "!csvfile!.csv" (
      set csvfile=!csvfile!.csv
   ) else (
      echo.
      echo Cannot find CSV file !csvfile!. Aborted.
      exit /b 1
   )
)




:: -----------------------------------
:: Start of process. Delete tempprary file.
:: -----------------------------------
echo Processing file: !csvfile!
del "%~0.tmp" 2>nul

:: -----------------------------------
:: Get lines from CSV file. 'tokens=*' is the
:: only way to reliably preserve lines due to
:: blank fields. Convert lines to "[..]","[..]"
:: where '..' is the actual data (or blank).
:: ".." required for passing line as parameter.
:: [..] required when testing "%~1"=="" to
:: determine if %1 exists when determining the
:: end loop condition. Commas preserves field
:: position when line passed as parameter.
:: Pass formatted line as parameter, then
:: append processed line to temporary file.
:: When all lines processed, replace CSV file
:: with contents of temporary file and exit.
:: -----------------------------------
for /f "tokens=* usebackq" %%a in ("!csvfile!") do (
   set line=%%a
   set line="[!line:,=]","[!]"
   call :parse !line!
   echo !line!>>"%~0.tmp"
)
move "%~0.tmp" "!csvfile!"
exit /b 0



:: -----------------------------------
:: Process comma-delimited CSV lines.
:: -----------------------------------
:parse
   set count=0

   :: --------------------------------
   :: Break line down into seperate tokens
   :: stored in token[] array until there
   :: are no more SHIFTed tokens signalled
   :: by "[..]" == token / ""=="" == no
   :: more tokens. Remove " "s when reading
   :: in each token also, remove [ ]s when
   :: storing tokens in array.  
   :: --------------------------------
   :loop
      set /a count+=1
      set token=%~1
      set token[!count!]=!token:~1,-1!
      shift
   if not "%~1"=="" goto :loop

   :: --------------------------------
   :: If token[7] contains numerals then
   :: remove any non-numeric data otherwise,
   :: set it to "" and if token[8] contains
   :: numeric data then copy token[8] to
   :: token[7] and remove any non-numeric data.
   :: --------------------------------
   echo.!token[7]! | findstr "[0-9]" >nul
   if !errorlevel!==0 (
      call :RemoveNonNumerics
   ) else (
      set token[7]=
      echo.!token[8]! | findstr "[0-9]" >nul
      if !errorlevel!==0 (
         set token[7]=!token[8]!
         call :RemoveNonNumerics
      )
   )

   :: --------------------------------
   :: Re-form the line by concatonating
   :: all the tokens seperated by commas.
   :: --------------------------------
   set line=!token[1]!
   for /l %%a in (2,1,!count!) do (
      set line=!line!,!token[%%a]!
   )
goto :eof



:: -----------------------------------
:: Function. Remove non-numeric data.
:: -----------------------------------
:RemoveNonNumerics
   set numeric=

   :: --------------------------------
   :: Get length of token[7]
   :: --------------------------------
   echo !token[7]!>len
   for %%a in (len) do set /a len=%%~za - 3

   :: --------------------------------
   :: Compare each character. If it is
   :: numeric, add it to newly built token.
   :: --------------------------------
   for /l %%a in (0,1,!len!) do (
      if "!token[7]:~%%a,1!" geq "0" (
         if "!token[7]:~%%a,1!" leq "9" (
            set numeric=!numeric!!token[7]:~%%a,1!
         )
      )
   )

   :: --------------------------------
   :: Set token[7] to numeric-only token.
   :: --------------------------------
   set token[7]=!numeric!
goto :eof




:: -----------------------------------
:: Function. Display user guidance.
:: Exit with error code 1.
:: -----------------------------------
:help
   echo.
   echo Call using: %~n0 "filename"
   echo.
   echo Where filename is the name of your CSV file.
   echo.
   echo Can be CALLed from another batch feil as a function
   echo when processing multiple CSV files.
   echo.
exit /b 1
0

Featured Post

Prep for the ITIL® Foundation Certification Exam

December’s Course of the Month is now available! Enroll to learn ITIL® Foundation best practices for delivering IT services effectively and efficiently.

  • 8
  • 7
  • 5
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now