Extracting Elements from a line in a file in a Windows Batch Scirpt

Julie Kurpa
Julie Kurpa used Ask the Experts™
on
Using Windows Batch Scripting, I'm working to pull certain strings from an error log so I can build a SQL Statement.  

So far, I've gotten what I need into a file on one continuous line.  It should be that way each time I pull in an error.  I'm interested in pulling out only the UPDATE statements and certain elements related to each one.

The line may contain one or several INSERT, UPDATE or DELETE statements peppered throughout so it's very tough to decide what to search for.  

As I pull in the strings I want for each UPDATE statement, I'll build the SQL and call a stored procedure to do what I need to do.  Then return to the Batch Script to get the next UPDATE statement (if there is one).

I've already managed to get a Count of the number of UPDATE statements so I can use that for a loop.

The line below is an example of what I've managed to place in a file (all one continuous line).

} SQL Exception (state )(using SQL: "[UPDATE owner.tableName SET i_unwanted_column1=yyyy,i_unwanted_column2=yyy,i_unwanted_column3=yyyyy,d_unwanted_column4=to_timestamp('1900-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column = yyyyy , UPDATE owner.tableName SET i_unwanted_column1=xxxx,i_unwanted_column2=xxxx,i_unwanted_column3=xxxxx,d_unwanted_column4=to_timestamp('2018-10-29 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column = xxxxxx ]"):

I need to extract the "owner", "tableName" (found after the UPDATE), the primary key column name and it's value (located after the WHERE) for each UPDATE statement.

I've been looking at many websites but none have helped me put together something that I can get to work.

I'm a novice at Windows Batch scripting and only now exploring Powershell.  In fact, it's with powershell that I've been able to extract the line I need from the error log into a file all on one line and count the number of UPDATE statements.  :)

Thank you.
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Julie KurpaSr. Systems Programmer

Author

Commented:
Here's what I've gotten so far if that helps:

@echo off

setlocal enabledelayedexpansion


REM Set up Variables
set PSscriptsPath=C:\Powershell_Scripts
set ISMCurrentLogPath=E:\Error_Logs\cdc*.log
set ISMWorkLog=E:\work_area\cdc_Work_log.txt
set ISMErrMsg=E:\work_area\cdc_Error_Msg.txt


REM Clean Up
if exist %ISMWorkLog% del %ISMWorkLog%
if exist %ISMErrMsg% del %ISMErrMsg%


REM Get name of most current Error log and copy into a work area.
for /f "tokens=*" %%a in (' powershell "Get-ChildItem -Path %ISMCurrentLogPath% | Sort-Object LastAccessTime -Descending | Select-Object -First 1 | Select-Object FullName" ') do set "ISMCurrentLogName=%%a"

copy %ISMCurrentLogName% %ISMWorkLog%

REM Execute powershell script that extracts all errors from the work file and writes it to an error file.
REM The error is constantly repeating so we only want the first one we find.
REM I could only figure out how to do this using a separate PS script.  

powershell -ExecutionPolicy remotesigned -command %PSscriptsPath%\SearchBetweenStrings2.ps1 %ISMWorkLog% %ISMErrMsg%


REM Test size of error messages file. If zero then no errors were found so exit script.  
for /f %%i in ("%ISMErrMsg%") do set size=%%~zi
  if %size% leq 6 (
     echo No Errors found.  Exiting
     goto :EOF
  )


REM Count how many UPDATE statements we have in the error message.  This will help us with the loop to parse each into SQL statements.
for /f %%b in (' powershell "(Get-Content -Path %ISMErrMsg% | Select-String -pattern 'UPDATE owner.' -AllMatches).matches.count" ') do set "Update_Count=%%b"

REM Now parse out the elements we want so we can build our SQL statement.  Ugh how do I do this???
Julie KurpaSr. Systems Programmer

Author

Commented:
I've managed to extract the string between the brackets in the text file by calling a powershell script (I pass the input file name and output file name to the PS script):

$importPath=$args[0]
$outputPath=$args[1]

$string1="["
$string2="]"
$pattern = [regex] "\[([^\[]*)\]"

$file = Get-Content $importPath

[regex]::match($file, $pattern).Groups[1].Value > $outputPath


Below is what I now have to parse out.  

UPDATE owner.tableName SET i_unwanted_column1=yyyy,i_unwanted_column2=yyy,i_unwanted_column3=yyyyy,d_unwanted_column4=to_timestamp('1900-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column = yyyyy , UPDATE owner.tableName2 SET i_unwanted_column1=xxxx,i_unwanted_column2=xxxx,i_unwanted_column3=xxxxx,d_unwanted_column4=to_timestamp('2018-10-29 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column2 = xxxxxx

Need to get the owner, tableName,  i_pkey_column and value for i_pkey_column for each UPDATE.

In this case, the first loop would end up with:

OWNER=owner
TABLE=tableName
COLUMN=i_pkey_column
VALUE=yyyyy

(I pass this to my stored procedure and return...this I know how to do)
(next loop gets the next UPDATE statement contents):

OWNER=owner
TABLE=tableName2
COLUMN=i_pkey_column2
VALUE=xxxxxx

(run my stored procedure again)
(No more to read, exit loop).  

I'll off for the weekend and won't look at this again until Monday when I start hacking at it again.

Commented:
Aside from doing it in a higher level language, even just VB script if you want it commanding  or the whole thing in powershell from a quick read there I think you most easily would read the whole line using for using delims of space and then loop through, when you get to word update then get the next value and keep looping through until you get WHERE then take next value again and write them out where you need to or put in a variable.

Steve
Bootstrap 4: Exploring New Features

Learn how to use and navigate the new features included in Bootstrap 4, the most popular HTML, CSS, and JavaScript framework for developing responsive, mobile-first websites.

Julie KurpaSr. Systems Programmer

Author

Commented:
hmmmm Ok so I just go ahead and make the space the delimiter and just loop through each element in the line?  Then test for the UPDATE or WHERE and do a subroutine to do a minor loop until I pull the values I want, then return to the big loop.  I can see myself getting confused by it but I will try on Monday.

I wrote this same routine in Unix ksh.  To summarize, I had it doing  the following:
(1) find on UPDATE
(2) pull everything following the UPDATE  using the dot as the delimiter  
(3) awk {print $1, $2}
(4) Find the WHERE
(5) get everything following the WHERE the space is the delimiter
(6) awk {print $1, $3)
(7) Send my information to the stored procedure.
(8) Test if I've reached the last UPDATE, if not, loop around and get the next one.  

I need to be able to do this in Windows.  Batch scripting has been my only experience in this environment and I'm a novice.   I took a class on Powershell and want to use it but I'm a novice there as well.    

After trying several things and reviewing various websites on Batch scripting, I have not been able to successfully put anything together for what I need to do here.   Very frustrating.

Commented:
I'll show you how, just not typing that out on phone, will do on pc later!

Commented:
I think this will work from your example entry above.... lots of questionables using batch when ' " <> and other characters could be involved in the data your parse though:

@echo off
setlocal enabledelayedexpansion
set file=info.txt
set updateflag=0
set whereflag=0
set wherecolumn=
for /f "tokens=* delims=" %%a in (%file%) do (
   for %%A in (%%a) do (
      IF !updateflag!==1 (
        set updateflag=0
        ECHO UPDATE %%~A
      )
      IF "%%~A"=="UPDATE" set updateflag=1
      IF !whereflag!==1 (
        set whereflag=2
        set wherecolumn=%%~A
      ) ELSE (
        IF !whereflag!==2 (
          set whereflag=0
          ECHO WHERE !wherecolumn! = %%~A
        )
      )
      IF "%%~A"=="WHERE" set whereflag=1
   )
)

Open in new window


Based on this text in info.txt:

UPDATE owner.tableName SET i_unwanted_column1=yyyy,i_unwanted_column2=yyy,i_unwanted_column3=yyyyy,d_unwanted_column4=to_timestamp('1900-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column = yyyyy , UPDATE owner.tableName2 SET i_unwanted_column1=xxxx,i_unwanted_column2=xxxx,i_unwanted_column3=xxxxx,d_unwanted_column4=to_timestamp('2018-10-29 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column2 = xxxxxx

Open in new window


it produces this output:

D:\temp\EE29127933>findinfo
UPDATE owner.tableName
WHERE i_pkey_column = yyyyy
UPDATE owner.tableName2
WHERE i_pkey_column2 = xxxxxx

Open in new window

Commented:
That loops through lines in file info.txt - you could do away with that loop if you just have one line and can put the data straight into variable for parsing with the second loop.

The inner loop splits the line up and stores a flag to say whether the previous item found was "UPDATE" or "WHERE" and then writes out the results.  You can obviously write out what you want to at that stage:

For Update %%~A is the bit after the UPDATE command
For Where !wherecolumn! is the first bit after the WHERE command and %%~A the next bit.

Steve
"Batchelor", Developer and EE Topic Advisor
Top Expert 2015
Commented:
In PowerShell you can use something similar to this test code:
$line = @'
} SQL Exception (state )(using SQL: "[UPDATE owner.tableName SET i_unwanted_column1=yyyy,i_unwanted_column2=yyy,i_unwanted_column3=yyyyy,d_unwanted_column4=to_timestamp('1900-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column = yyyyy , UPDATE owner.tableName SET i_unwanted_column1=xxxx,i_unwanted_column2=xxxx,i_unwanted_column3=xxxxx,d_unwanted_column4=to_timestamp('2018-10-29 00:00:00','yyyy-mm-dd hh24:mi:ss.ff3') WHERE i_pkey_column = xxxxxx ]"): 
'@

$line -replace '.*"\[(.*)\]".*', '$1' -split "UPDATE " |
  select -Skip 1 | # $_ now is "owner.table SET blabla WHERE column = value blabla"
  ? { $_ -match '(?<owner>[^.]+)\.(?<table>\S*) SET .* WHERE (?<column>\S+) = (?<value>\S+)' } |
  % {
    write-host $matches.owner, $matches.table, $matches.column, $matches.value
  }

Open in new window

Note: Please do not try too hard to mix cmd and PowerShell batch scripting. You should stay inside of PowerShell as much as possible, otherwise you get all kind of issues with special characters like parens, braces, percent aso.

Commented:
Agreed there Qlemo!
Julie KurpaSr. Systems Programmer

Author

Commented:
Thanks Steve Knight. :)

I'm trying to test your code and for some reason it's not getting into the first FOR loop. Is there something wrong with my FILE variable?  

@echo off
setlocal enabledelayedexpansion
set file=E:\myTestDir\Error_Code_wrk1.txt
set updateflag=0
set whereflag=0
set wherecolumn=
echo checking for loop  <-- I see this OK
for /f "tokens=* delims=" %%a in (%file%) do (
   echo inside first for loop  <-- but not getting this echo.
  for %%A in (%%a) do (
      IF !updateflag!==1 (
        set updateflag=0
        ECHO UPDATE %%~A
      )
      IF "%%~A"=="UPDATE" set updateflag=1
      IF !whereflag!==1 (
        set whereflag=2
        set wherecolumn=%%~A
      ) ELSE (
        IF !whereflag!==2 (
          set whereflag=0
          ECHO WHERE !wherecolumn! = %%~A
        )
      )
      IF "%%~A"=="WHERE" set whereflag=1
   )
)
Julie KurpaSr. Systems Programmer

Author

Commented:
I just put double quotes around the file path and now I'm getting inside the loop.

set file="E:\myTestDir\Error_Code_wrk1.txt"
Qlemo"Batchelor", Developer and EE Topic Advisor
Top Expert 2015

Commented:
That would mean that E:\myTestDir\Error_Code_wrk1.txt is empty.
Qlemo"Batchelor", Developer and EE Topic Advisor
Top Expert 2015

Commented:
Double quotes inside of for /f make it a string, not a file name anymore. Your loop is running thru the text "E:\myTestDir\Error_Code_wrk1.txt" now ;-).
Julie KurpaSr. Systems Programmer

Author

Commented:
OK I understand now about the quotes.  

I just tripple checked and the file is not empty...it just has one line...but it's a long line though.... 1,389 characters.    It's using real data instead of the one I posted.   Some lines will be much, much longer.  

Is there some limitation on how much can be pulled in from one line?
Qlemo"Batchelor", Developer and EE Topic Advisor
Top Expert 2015

Commented:
AFAIK there is no limitation of the line size. Try if adding a line break at the end of the file helps. I can't recall, but there might be a bug involved with single-line files as input.

Commented:
Run it from an open CMD prompt rather than launching the batch from explorer and remove/comment out the @echo off might give you a clue.  Is there anything masty batch file wise in the line you are processing.  As we both said if there is anything potentially batch-file breaking in there might be better to use just powershell or VBScript with none of those issues.

Using < character in your ECHO statement will cause an error BTW as it is trying to pipe file name into your echo command...

What file are you are running this over, I suspect you might have " " characters in it perhaps?  if so might be able to get around it

  set data=%%~a
  set data=!data:"=!
  for %%A in (!data!) do (

Open in new window



Steve

Commented:
I get this then using your same filename and a copy of the line from your original question in that text file...

checking for loop  -- I see this OK
inside first for loop  -- but not getting this echo.
WHERE i_pkey_column = yyyyy
UPDATE owner.tableName
WHERE i_pkey_column = xxxxxx
Julie KurpaSr. Systems Programmer

Author

Commented:
It works when I use the test data I uploaded.   I just tweaked my real data to protect the innocent so that you get the full format instead of the shortened one and uploading the file.

There aren't any double quotes in the data but there's a 100% chance that there could be.
Julie KurpaSr. Systems Programmer

Author

Commented:
Qlemo,
Your code works and gives me exactly what I need.  How do I have it read the content of the file instead of specifying the data in the code?
Does @ mean it's an array?
Julie KurpaSr. Systems Programmer

Author

Commented:
Thanks Steve,
I am indeed running this from the CMD window.  The <-- was not included in the code when it ran.  I put it there to direct your attention to those lines when I posted it in EE.  :)

Commented:
I'd go with Powershell for this one.... Batch is great and can be cobbled into doing most things but for parsing specific data with potential unknown characters in go for that PS or VBS option!
Julie KurpaSr. Systems Programmer

Author

Commented:
Qlemo,
I got your PS code working with my batch script using my real data!   Here's my output:

myname my_table1 pkey_col 88888888
myname my_table2 pkey_col 88888889

It seems there are 3 options to decide on from here (not sure which is best practice):

1. Write the output from the PS script to a file and read the file from the batch script to pass each line to my Oracle stored procedure?
2. Or can can the PS script execute the oracle stored procedure and pass each line to the procedure?
3. Can the PS script return the values back to the batch script so that the batch script calls the Oracle stored procedure?
Qlemo"Batchelor", Developer and EE Topic Advisor
Top Expert 2015

Commented:
Option 2 is best. If Oracle*Net is installed on the machine, you can use the .NET adapters directly.
Or send the prepared SQL cmd strings to sqlplus as you would do with a cmd batch.
But that is now a completely different topic ;-) - please post as a new question.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial