Solved

Scan files for strings of text and log into another file

Posted on 2007-04-11
Last Modified: 2010-04-16
Problem: I have archives of files stored in directories by date.  For example, "z:\journal\fhodict\200702\25" is one directory with 800 files/reports in it.  They are all in HL7 format and can be viewed or modified as plain text.  I need a way to scan all of the reports in a directory such as "z:\journal\fhodict\200702\25".  The scan needs to look at the top 4 lines of each file, which will look like:

MSH|^~\`|DICT|020-01||FHIS|20070206235659||ORU^R01|
PID|1||001248568||BURKS^LENTON^E||
PV1|| .........................(NOT IMPORTANT)
OBR||RA070370369600|5459057|RA020006^CHEST,PORT SINGLE VW^01

What I need is a way to scan these files, grab specific parts of this header info, and log it into another file so it looks like the below:

20070224CARL BETH: PRE OP|RA070540600
20070224MILDRED NEOL: LINE PLACEMENT|RA07059100
20070224LAUTERIA GAURIDO J: CVA|RA070540400
20070224RODRIGUEZ YENELISA: SIZE & DATES|RA07056700
20070224KOBOW BELLA: COPD|RA0705400700

Each directory will have its own log file, saved as a .dat.  This will be done for many directories, but I don't mind doing them one by one.  Also, this has to get rid of duplicate entries.  Is this possible?

thanks!!!
0
Comment
Question by:rortiz77
73 Comments
 
LVL 11

Expert Comment

by:alexcohn
ID: 18891289
There is no Windows built-in utility that will do the work for you. Install Perl, or Python, or maybe even cygwin with awk.
0
 

Author Comment

by:rortiz77
ID: 18891403
Well, I was thinking more along the lines of a VB script or Batch that might be able to do this.  
0
 
LVL 11

Expert Comment

by:alexcohn
ID: 18891456
There are no suitable batch commands; VBScript may be used, but it's not well suited for the task.
0
 
LVL 38

Expert Comment

by:Shift-3
ID: 18891548
Are the first elements in each header line (MSH, PID, PV1, OBR) always the same?  Are these items unique to the header lines?
0
 

Author Comment

by:rortiz77
ID: 18891576
Yes, they are always in the same order and positioning.  They also have the same number of "bars" in between.  The bars are the delimiter.  
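
As a rough illustration of the delimiter layout described here, the sketch below splits one segment into fields on the bars and each field into components on the carets. It is written in Python (one of the tools suggested earlier in the thread) rather than batch, and `split_segment` is a hypothetical helper name, not something from the posted scripts. Note that Python's `split` keeps empty fields, whereas `for /F` treats consecutive delimiters as one, so the index numbers here differ from the token numbers in the batch files.

```python
def split_segment(line):
    """Split one HL7 segment into fields ('|') and each field into components ('^')."""
    return [field.split("^") for field in line.rstrip("\r\n").split("|")]


# Sample PID line taken from the question.
pid = split_segment("PID|1||001248568||BURKS^LENTON^E||")

print(pid[0][0])  # segment name
print(pid[5])     # patient name components (index 5 because empty fields are kept)
```

The MSH segment is a special case: its second field holds the encoding characters themselves (including `^`), so splitting that particular field on carets is not meaningful.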
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18891675
Given the following input:

MSH|^~\`|DICT|020-01||FHIS|20070206235659||ORU^R01|
PID|1||001248568||BURKS^LENTON^E||
PV1|| .........................(NOT IMPORTANT)
OBR||RA070370369600|5459057|RA020006^CHEST,PORT SINGLE VW^01

What would be the output?
0
 

Author Comment

by:rortiz77
ID: 18891733
Output would be...

20070206Burks Lenton E: CHEST, PORT SINGLE | RA070370369600

The 20070206 can come from the file system date somehow, or from the 1st line (MSH), where it has that long string of 20070206235659.
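
A minimal Python sketch of the mapping described in this comment, assuming the date is taken from the MSH timestamp and that the four header lines always arrive in the order MSH, PID, PV1, OBR. The function name `header_to_log_line` is hypothetical, and the sketch keeps the name and report text exactly as they appear in the file (no case changes or trimming).

```python
def header_to_log_line(lines):
    """Build one log line from the first four segments (MSH, PID, PV1, OBR)."""
    msh = lines[0].split("|")
    pid = lines[1].split("|")
    obr = lines[3].split("|")

    date = msh[6][:8]                              # 20070206235659 -> 20070206
    name = " ".join(part for part in pid[5].split("^") if part)
    report = obr[4].split("^")[1]                  # second component of the OBR service field
    accession = obr[2]                             # the RA... number
    return f"{date}{name}: {report}|{accession}"


# Header lines from the question (raw string so the backslash is kept as-is).
sample = [
    r"MSH|^~\`|DICT|020-01||FHIS|20070206235659||ORU^R01|",
    "PID|1||001248568||BURKS^LENTON^E||",
    "PV1|| .........................(NOT IMPORTANT)",
    "OBR||RA070370369600|5459057|RA020006^CHEST,PORT SINGLE VW^01",
]

line = header_to_log_line(sample)
print(line)
```

A name with only two components (for example `RONO^JENERS`) goes through the same code, because empty or missing components are simply skipped when the name is joined.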
0
 

Author Comment

by:rortiz77
ID: 18891819
Or the date, 20070206, can be something that is manually coded to be tagged onto the front of the output, and I'll modify it for each directory it scans.  It would make that part easier :-)
0
 
LVL 38

Expert Comment

by:Shift-3
ID: 18891984
@echo off
setlocal

set targetdir=z:\journal\fhodict\200702\25

set outputfile=output.txt

for /F "tokens=* usebackq" %%G in (`dir "%targetdir%\*.hl7" /B`) do (
 for /F "tokens=1,2,3,4,5,6 delims=|^" %%H in (%%G) do call :_process "%%H" "%%I" "%%J" "%%K" "%%L" "%%M"
)

goto :_end

:_process
if /I [%~1] EQU [MSH] set element1=%~6

if /I [%~1] EQU [PID] (
 set element2=%~4
 set element3=%~5
 if [%~6] NEQ [] set element4=%~6
)

if /I [%~1] EQU [OBR] (
 set element5=%~5
 set element6=%~2
)

if defined element6 (
 if defined element4 (
  echo %element1:~0,8%%element2% %element3% %element4%: %element5% ^| %element6% >> %outputfile%
 ) else (
  echo %element1:~0,8%%element2% %element3%:%element5% ^| %element6% >> %outputfile%
 )
)

goto :eof


:_end
endlocal
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18892035
Here's the best I could come up with. The batch file accepts two optional parameters. The first is the directory where the files exist. The second is the output file name:

@echo off

setlocal enabledelayedexpansion

set fileMask=*.txt
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

set /a lineCnt=0

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

:PROCDONE

(echo %fld1%%fld2% %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

set fld1=%~1
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

for /f "tokens=2 delims=^" %%a in ('echo %~2') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18892119
SteveGTR,

ok, so is this correct?

set fileMask= 'output.txt'
set workDir= 'C:\Test2'

Is that what you meant?
0
 
LVL 38

Expert Comment

by:Shift-3
ID: 18892161
Ah, my script creates a bunch of duplicate entries.  I'll see if I can correct it.
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18892162
No, if you named the batch file PROCTEXT.BAT you'd say:

PROCTEXT c:\test2 output.dat

I'd avoid naming the output file with the .txt extension because the file is created in the same directory where the *.txt files are located.
0
 
LVL 38

Expert Comment

by:Shift-3
ID: 18892279
This is a little better.

@echo off
setlocal

set targetdir=z:\journal\fhodict\200702\25

set outputfile=output.dat

for /F "tokens=* usebackq" %%G in (`dir "%targetdir%\*.hl7" /B`) do (
 for /F "tokens=1,2,3,4,5,6 delims=|^" %%H in (%%G) do call :_process "%%H" "%%I" "%%J" "%%K" "%%L" "%%M"
)

goto :_end

:_process
if /I [%~1] EQU [MSH] set element1=%~6

if /I [%~1] EQU [PID] (
 set element2=%~4
 set element3=%~5
 if [%~6] NEQ [] set element4=%~6
)

if /I [%~1] EQU [OBR] (
 set element5=%~5
 set element6=%~2
)

if defined element6 (
 if defined element4 (
  echo %element1:~0,8%%element2% %element3% %element4%: %element5% ^| %element6% >> "%outputfile%"
 ) else (
  echo %element1:~0,8%%element2% %element3%:%element5% ^| %element6% >> "%outputfile%"
 )
 set element6=
)

goto :eof

:_end
endlocal
0
 

Author Comment

by:rortiz77
ID: 18892288
Shift-3,

When I tried your batch it formatted things correctly, but it created duplicates and only wrote the same person 4 times.

20060801BLANKENSHIP HOLLY R: 200608011019 | 4714683
20060801BLANKENSHIP HOLLY R: 200608011019 | 4714683
20060801BLANKENSHIP HOLLY R: 200608011019 | 4714683
20060801BLANKENSHIP HOLLY R: 200608011019 | 4714683
0
 

Author Comment

by:rortiz77
ID: 18892341
Shift-3,

This next one created only two duplicates but it's only looking at the same person.  It's not reading down the list of files.

20060801BLANKENSHIP HOLLY R: 200608011019 | 4714683
20060801BLANKENSHIP HOLLY R: 200608011019 | 4714683
0
 

Author Comment

by:rortiz77
ID: 18892508
SteveGTR,

It still makes no sense to me.  Can you use your example and fill in the source directory as an example?  What file format is it looking for?  The destination already looks like output.dat so that part is fine.
0
 
LVL 38

Expert Comment

by:Shift-3
ID: 18892566
Ok, one more try.  This should process all the files and should remove all duplicates.  For the last part I borrowed a command from here:
http://www.jsifaq.com/SF/Tips/Tip.aspx?id=3530


@echo off
setlocal

set targetdir=z:\journal\fhodict\200702\25

set outputfile=output.dat

for /F "tokens=* usebackq" %%G in (`dir "%targetdir%\*.hl7" /B`) do (
 for /F "tokens=1,2,3,4,5,6 delims=|^ usebackq" %%H in ("%%G") do call :_process "%%H" "%%I" "%%J" "%%K" "%%L" "%%M"
)

for /F "tokens=* usebackq" %%S in (`Sort^<"%outputfile%"`) do @Find "%%S" tempoutput.txt||@echo %%S>>tempoutput.txt
move /Y tempoutput.txt "%outputfile%"

goto :_end

:_process
if /I [%~1] EQU [MSH] set element1=%~6

if /I [%~1] EQU [PID] (
 set element2=%~4
 set element3=%~5
 if [%~6] NEQ [] set element4=%~6
)

if /I [%~1] EQU [OBR] (
 set element5=%~5
 set element6=%~2
)

if defined element6 (
 if defined element4 (
  echo %element1:~0,8%%element2% %element3% %element4%: %element5% ^| %element6% >> "%outputfile%"
 ) else (
  echo %element1:~0,8%%element2% %element3%:%element5% ^| %element6% >> "%outputfile%"
 )
 set element1=
 set element2=
 set element3=
 set element4=
 set element5=
 set element6=
)

goto :eof

:_end
endlocal
0
 

Author Comment

by:rortiz77
ID: 18892568
Steve,

Ok, I get it.  At the DOS prompt I type in "PROCTEXT c:\test2 output.dat".  I ran it and it only has 3 problems:

1. It only managed to extract 3 unique reports
2. It didn't put in a ":" right after the person's name
3. It has a "2-3 V" right after the report name.
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18892585
It appears my mask is incorrect. I was using *.txt; it should be *.hl7. Just run this batch file from the directory where the files exist (z:\journal\fhodict\200702\25). The processing doesn't produce exactly what you want, but it's close.

@echo off

setlocal enabledelayedexpansion

set fileMask=*.hl7
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

set /a lineCnt=0

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

:PROCDONE

(echo %fld1%%fld2% %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

set fld1=%~1
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

for /f "tokens=2 delims=^" %%a in ('echo %~2') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18892607
Here are some adjustments with your latest comments taken into consideration:

@echo off

setlocal enabledelayedexpansion

set fileMask=*.hl7
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

set /a lineCnt=0

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

:PROCDONE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

set fld1=%~1
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

for /f "tokens=2 delims=^" %%a in ('echo %~2') do set fld3=%%a

set fld3=%fld3:~0,-3%

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18892795
Steve,

Well, looks good...it's just reading one file :-)  MAN we're close hahaha!!!!!

Shift-3,

Yours has the perfect format but it's also just reading one file and not the entire thing...not sure why but both are soooo close!!!
0
 

Author Comment

by:rortiz77
ID: 18892825
Not sure if it's related, but I've been renaming the source files to .txt.  Yes, they are HL7-formatted data, but the file extensions are all random: .234, .342, .830 and so on.  That's why I renamed them to .txt.
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18892912
My file looks for files with a specific mask (*.hl7). Is there a specific mask or should it just process all files?
0
 

Author Comment

by:rortiz77
ID: 18892918
All files in the directory would be best.
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18893003
@echo off

setlocal enabledelayedexpansion

set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

set /a lineCnt=0

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

:PROCDONE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

set fld1=%~1
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

for /f "tokens=2 delims=^" %%a in ('echo %~2') do set fld3=%%a

set fld3=%fld3:~0,-3%

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18893070
We may want to place a sanity check in place in case other files are in the directory. Is it true that the first field on the first line of each file will be equal to 'MSH'?

If so, then this should do the trick:

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

for /f "tokens=2 delims=^" %%a in ('echo %~2') do set fld3=%%a

set fld3=%fld3:~0,-3%

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
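
The sanity check described in this comment (first field of the first line equal to MSH, at least four lines per file) can also be sketched in Python. This is only an illustration under assumptions: `read_headers` is a hypothetical name, and the `latin-1` encoding is a guess chosen because it never fails on arbitrary bytes.

```python
from itertools import islice
from pathlib import Path


def read_headers(directory, check_field="MSH"):
    """Yield (file name, first four lines) for every file that looks like an HL7 report."""
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories, like dir /a-d in the batch file
        with path.open("r", encoding="latin-1") as handle:
            head = [line.rstrip("\r\n") for line in islice(handle, 4)]
        # Skip files that are too short or whose first field is not MSH.
        if len(head) < 4 or head[0].split("|")[0] != check_field:
            continue
        yield path.name, head
```

Because every file in the directory is examined, a log file written into the same directory is skipped by the same check: its first line does not begin with the MSH field.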
0
 
LVL 38

Expert Comment

by:Shift-3
ID: 18893091
@echo off
setlocal

set targetdir=z:\journal\fhodict\200702\25

set outputfile=output.dat

for /F "tokens=* usebackq" %%G in (`dir "%targetdir%\*.*" /B`) do (
 for /F "tokens=1,2,3,4,5,6 delims=|^ usebackq" %%H in ("%%G") do call :_process "%%H" "%%I" "%%J" "%%K" "%%L" "%%M"
)

for /F "tokens=* usebackq" %%S in (`Sort^<"%outputfile%"`) do @Find "%%S" tempoutput.txt||@echo %%S>>tempoutput.txt
move /Y tempoutput.txt "%outputfile%"

goto :_end

:_process
if /I [%~1] EQU [MSH] set element1=%~6

if /I [%~1] EQU [PID] (
 set element2=%~4
 set element3=%~5
 if [%~6] NEQ [] set element4=%~6
)

if /I [%~1] EQU [OBR] (
 set element5=%~5
 set element6=%~2
)

if defined element6 (
 if defined element4 (
  echo %element1:~0,8%%element2% %element3% %element4%: %element5% ^| %element6% >> "%outputfile%"
 ) else (
  echo %element1:~0,8%%element2% %element3%:%element5% ^| %element6% >> "%outputfile%"
 )
 set element1=
 set element2=
 set element3=
 set element4=
 set element5=
 set element6=
)

goto :eof

:_end
endlocal
0
 

Author Comment

by:rortiz77
ID: 18893095
Steve,

I keep getting:

C:\Test2>PROCTEXT c:\test2 output.dat
> was unexpected at this time.
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18893127
Could be something in one of the files. Try this; it echoes each file name as it is processed:

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

for /f "tokens=2 delims=^" %%a in ('echo %~2') do set fld3=%%a

set fld3=%fld3:~0,-3%

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18893145
Steve,

Got this:

C:\Test2>PROCTEXT c:\test2 output.dat
Processing dt102050.479...
> was unexpected at this time.
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18893173
Can you post the first 5 lines of dt102050.479?
0
 

Author Comment

by:rortiz77
ID: 18893249
MSH|^~\`|DICT|020-06||FHIS|20060801102049||ORU^R01|20060801102049000|T|2.2|1|L||||||
PID|1||002000049||RONO^JENERS||1002000||||||(407)517-7377|(407)540-4919||||16678092||||||||||
PV1||E|^+||||^^^^^^FRID, VICKI KY DO DR^Y9||||||||||^^^^^^FRID, VICKI KY DO DR^Y9|O|
OBR||RA062120313100|4700009|RA420137^KNEE,=>4 VWS-RT*^06|||
OBX|1|TX|xxxxxxxR^^^913||||||||F|||||||
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18893332
It's that > sign in the OBR line 4. What do you expect this output to look like?
0
 

Author Comment

by:rortiz77
ID: 18893351
From that line all that is needed is

RA062120313100    and
KNEE,=>4 VWS-RT*

If special characters are messing it up then I'd just not include it as part of the export.  
0
 

Author Comment

by:rortiz77
ID: 18893382
Ok, when removed manually it begins to process until it hits another one with that character....at least now we know the problem!
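
For comparison, here is a small Python sketch of the same extraction on the OBR line posted above. The likely reason the batch file stops is that cmd.exe treats an unquoted `>` as a redirection operator once the variable has been expanded; in Python the line is ordinary string data, so no escaping is needed. The variable names are illustrative only.

```python
# OBR line from dt102050.479, as posted in the thread.
obr_line = "OBR||RA062120313100|4700009|RA420137^KNEE,=>4 VWS-RT*^06|||"

fields = obr_line.split("|")
accession = fields[2]                 # the RA... number
report = fields[4].split("^")[1]      # report name, special characters intact

print(accession)
print(report)
```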
0

 
LVL 30

Expert Comment

by:SteveGTR
ID: 18895192
Well, give this a try and cross your fingers :)

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

set fld3=%fld3:~0,-4%

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18897763
Steve,

This one got really far before it bombed out.  Not sure what caused it this time but here's the top 4 lines again and the output message.

MSH|^~\`|DICT|020-01||FHIS|20060801103218||
PID|1||000170002||NEWRI^TARIT^J||100000080000
PV1||E|7S^7329^01^732901||||
OBR||RA062120300000|4715572|RA490046^HUMERUS (R) 1-2 VWS^01|||

Processing dt103043.216...
Processing dt103053.154...
Processing dt103055.873...
Processing dt103116.780...
Processing dt103119.296...
Processing dt103121.702...
Processing dt103129.452...
Processing dt103153.78...
Processing dt103204.672...
Processing dt103219.657...
1-2|RA062120300000) was unexpected at this time.

Not sure if this is due to another special character but it almost looks like this issue came in the output process.
0
 

Author Comment

by:rortiz77
ID: 18897793
Actually, it seems to have been the "()" in the OBR line that threw it off.  When removed it worked fine.  Is there a way to add a line to the batch that automatically removes certain types of characters before it processes?
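
If removing characters up front is the preferred route, the idea can be sketched in Python as below. The set of characters in `UNWANTED` and the name `strip_special` are assumptions made for illustration; adjust the set to whatever should not reach the log.

```python
# Hypothetical set of characters to drop before logging.
UNWANTED = "()<>&|^"
_STRIP_TABLE = str.maketrans("", "", UNWANTED)


def strip_special(text):
    """Remove the unwanted characters and collapse any doubled spaces left behind."""
    return " ".join(text.translate(_STRIP_TABLE).split())


cleaned = strip_special("HUMERUS (R) 1-2 VWS")
print(cleaned)
```

Stripping changes the report name that ends up in the log, so it matters whether later duplicate checks compare the cleaned or the original text.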
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18897814
We could handle it like the other special characters:

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

set fld3=%fld3:~0,-4%

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18897911
Another weird issue...not sure on this one.  

Processing fd150433.820...
Processing fd151501.540...
Processing fd151858.962...
Processing fd151952.712...
'RA070570180000' is not recognized as an internal or external command,
operable program or batch file.

MSH|^~\`|DICT|020-84||FHIS|20070226141951||
PID|1||000580016||MAULI^CHARLES^E||1
PV1||O|^+||||0000014008
OBR||RA070570180000|5540269|RA140026^ABD PAIN,DIARRHEA, R^84|||



0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18897990
Looks like it's taking too much off the end of fld3. I just changed the code to use the whole part.

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18898315
Nice!!!!  It ran with no problems!!!  The only thing I noticed was "duplicate" entries.  It's not a literal duplicate because the RA# is different but the Report Type is the same. For example on the below, "MAL NEOPL BREAST-CEN"  and “TEMILL DOKIS” together would be looked at by the system as a duplicate.  

20070226TEMILL DOKIS: MAL NEOPL BREAST-CEN|RA070000083400
20070226TEMILL DOKIS: MAL NEOPL BREAST-CEN|RA070000083500

Is there a way to have it not log another entry for the same person and report type if one exists?
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18898428
If the lines are the same we could do a comparison prior to saving.

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

del _temp.dat 2>NUL

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

if not exist "%outFile%" goto WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>_temp.dat

findstr /G:_temp.dat "%outFile%" >NUL
if ERRORLEVEL 1 goto WRITELINE

goto :EOF

:WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18898535
It ran but still got some duplicates…

20070226BIKER MARK A: HX CANCER|RA000550069700
20070226BIKER MARK A: HX CANCER|RA000550069800
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18898545
That's because the lines are not the same. What is the unique portion that I can check?
0
 

Author Comment

by:rortiz77
ID: 18898577
Basically, if the name and report type are the same it's a duplicate even though the RA# is different.  So it has to ignore the RA part.

So if this part is the same it's a duplicate:
BIKER MARK A: HX CANCER
BIKER MARK A: HX CANCER
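
The rule stated here (same name and report type means duplicate, RA number ignored) can be sketched in Python as follows. It assumes the log lines have the layout shown in this thread, an 8-digit date followed by `NAME: REPORT|RA...`, and it also leaves the date out of the comparison, which is an assumption rather than something stated above. `dedupe` is a hypothetical name.

```python
def dedupe(log_lines):
    """Keep the first line for each (name, report type) pair, ignoring the RA number."""
    seen = set()
    kept = []
    for line in log_lines:
        # Drop the trailing "|RA..." part, then the leading 8-digit date.
        key = line.rsplit("|", 1)[0][8:]
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


# Lines taken from earlier comments in this thread.
lines = [
    "20070226BIKER MARK A: HX CANCER|RA000550069700",
    "20070226BIKER MARK A: HX CANCER|RA000550069800",
    "20070226TEMILL DOKIS: MAL NEOPL BREAST-CEN|RA070000083400",
]
unique = dedupe(lines)
print(len(unique))
```

The key is compared as a whole, so two different patients with the same report type remain separate entries.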
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18898677
@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

del _temp.dat 2>NUL

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

if not exist "%outFile%" goto WRITELINE

(echo %fld3%)>_temp.dat

findstr /G:_temp.dat "%outFile%" >NUL
if ERRORLEVEL 1 goto WRITELINE

goto :EOF

:WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
0
 

Author Comment

by:rortiz77
ID: 18898790
Hmm...ok.  It got rid of nearly half the logs.  It went from 266 to 164.  Trying to verify if it's accurate on a spreadsheet.
0
 

Author Comment

by:rortiz77
ID: 18900619
Ok, after taking a look, it should only be removing about 6% of the total logs instead of 90%.  When looking at how it was done in the past, there were 2600 files and after the job the output had about 2500 entries.  Now if I run this job on 1600 files I get only 162 entries in the output file, when really I should expect maybe 1500.

What's causing this?
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18900734
I'd say there are more rows matching than you expect. Maybe if we dumped out the duplicates you could get a better idea.

One possibility is that a person has the same name as another person with the same symptom. Another is that a person has multiple entries on different dates for the same symptom.

What do you think?
0
 

Author Comment

by:rortiz77
ID: 18900740
Maybe if you can take off the compare before writing and then I'll see how many it logs after that.  
0
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18900755
Sure we can bypass that processing:

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

del _temp.dat 2>NUL

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

REM ** Bypass duplicate processing
goto WRITELINE

if not exist "%outFile%" goto WRITELINE

(echo %fld3%)>_temp.dat

findstr /G:_temp.dat "%outFile%" >NUL
if ERRORLEVEL 1 goto WRITELINE

goto :EOF

:WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
 

Author Comment

by:rortiz77
ID: 18900881
Hmm...ok.  When I run it without checking for duplicates it gives me the correct number of 1690 logs.  Also, I looked at the data on a spreadsheet and it's definitely not a case of different people with the same names and reports, or of a person with multiple entries on different dates for the same symptom.

There's something else we're missing....
 

Author Comment

by:rortiz77
ID: 18900921
OK, I just did a sort for unique records in Excel and it shows me the correct amount: 1570 unique records from 1690 files.
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18900923
It could be the switches I'm using for findstr. I didn't include the case-insensitive (/I) or literal (/L) switches, and it should have them. Try this:

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

del _temp.dat 2>NUL

popd

if exist "%outFile%" (echo Output in %outFile%)&goto :EOF

echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

goto :EOF

:PROCDONE

if not exist "%outFile%" goto WRITELINE

(echo %fld3%)>_temp.dat

findstr /I /L /G:_temp.dat "%outFile%" >NUL
if ERRORLEVEL 1 goto WRITELINE

goto :EOF

:WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if not "%~1"=="%checkFld%" set abort=Y&goto :EOF

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
 

Author Comment

by:rortiz77
ID: 18900981
Nope, back down to 250 logs.
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18901016
Could be that lines are being excluded because they don't start with the initial tag of MSH.
 

Author Comment

by:rortiz77
ID: 18901116
All the files there start with MSH.  
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18901202
Time to debug...

Check out skiprec.log after processing completes:

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat
set foundFiles=

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL
del skiprec.log 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

del _temp.dat 2>NUL
del _temp2.dat 2>NUL

if exist "%outFile%" (echo Output in %outFile%)&set foundFiles=Y
if exist skiprec.log (echo Skipped record log in skiprec.log)

popd

if "%foundFiles%"=="" echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set fileName=%~1

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

(echo %~1: There were not at least 4 lines in the file)>>skiprec.log

goto :EOF

:PROCDONE

if not exist "%outFile%" goto WRITELINE

(echo %fld3%)>_temp.dat

findstr /I /L /G:_temp.dat "%outFile%" >_temp2.dat
if ERRORLEVEL 1 goto WRITELINE

(echo %~1: Matched found in output file on %fld3%)>>skiprec.log

goto :EOF

:WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if "%~1"=="%checkFld%" goto PROC0_CNT

set abort=Y
(echo %fileName%: %~1 does not equal %checkFld% on 1st field in 1st line of file)>>skiprec.log
goto :EOF

:PROC0_CNT

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
 

Author Comment

by:rortiz77
ID: 18901337
90% of the files were in this log, skiprec.log:
dt001821.652: Matched found in output file on CT PELVIS W
dt002316.857: Matched found in output file on CHEST^,PORT SINGLE VW
dt002441.61: Matched found in output file on CTA CHEST
dt002553.577: Matched found in output file on CT BRAIN WO
dt002723.390: Matched found in output file on CT BRAIN WO
dt002845.625: Matched found in output file on CT BRAIN WO

what is the logic being used for the comparison?
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18901416
Can you post back the first 4 lines of:

dt002553.577
dt002723.390
dt002845.625
 

Author Comment

by:rortiz77
ID: 18901531

dt002553.577
MSH|^~\`|DICT|020-02||FHIS|20070105232551||
PID|1||000071463||SWER^MION^K||190008130000||
PV1||E|^+||||^^^^^^CHAN, HAANG L. MD DR^Y9|
OBR||RA070000304900|5329834|RA420070^CT BRAIN WO^02||

dt002723.390
MSH|^~\`|DICT|020-01||FHIS|20070105232722||
PID|1||000050753||LEMON^DORTHY^R||
PV1||E|5ETW^5209^01^520901|||
OBR||RA070050002200|5329835|RA140009^CT BRAIN WO^01||

dt002845.625
MSH|^~\`|DICT|020-03||FHIS|20070105232844||
PID|1||000500426||CREY^JAMES^B||
PV1||E|^+||||^^^^^^BALER, STEVEN J. MD DR^Y9||
OBR||RA070050307000|5329837|RA420070^CT BRAIN WO^03||
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18901567
My comparison is being done on "CT BRAIN WO". What should it be done on? What makes each of these unique?
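
To illustrate why so many records get skipped: the findstr /I /L /G:_temp.dat check behaves roughly like a case-insensitive substring test against every line already in the output file. Below is a minimal Python sketch of that behaviour, not the batch script itself. The function name already_logged is hypothetical, and the sample output line is built from the headers posted above.

```python
def already_logged(pattern, output_lines):
    """Approximate findstr /I /L: True if any line contains pattern, ignoring case."""
    needle = pattern.lower()
    return any(needle in line.lower() for line in output_lines)


# One record already written to the output file (from dt002553.577).
output_lines = ["20070105SWER MION K: CT BRAIN WO|RA070000304900"]

# Key used so far: report name only. A different patient still matches.
report_only = already_logged("CT BRAIN WO", output_lines)

# Key that includes the person: the different patient no longer matches.
person_and_report = already_logged("LEMON DORTHY R: CT BRAIN WO", output_lines)

print(report_only, person_and_report)  # prints: True False
```

With the report name as the only search string, every later "CT BRAIN WO" file looks like a duplicate no matter whose report it is.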
 

Author Comment

by:rortiz77
ID: 18901615
These are all different people.  Only when you have:

John Doe: Head injury
John Doe: Head injury

should it be considered a duplicate and only log one version to say:

John Doe: Head injury

So if John Doe with Head injury is on the next line it would skip over it and continue to the next file.
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18901735
Yes, I see. So we should match on the person and the line 4 item?
 

Author Comment

by:rortiz77
ID: 18901745
Correct, on person and report type.
 
LVL 30

Accepted Solution

by:
SteveGTR earned 500 total points
ID: 18901771
@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat
set foundFiles=

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL
del skiprec.log 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

del _temp.dat 2>NUL
del _temp2.dat 2>NUL

if exist "%outFile%" (echo Output in %outFile%)&set foundFiles=Y
if exist skiprec.log (echo Skipped record log in skiprec.log)

popd

if "%foundFiles%"=="" echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set fileName=%~1

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

(echo %~1: There were not at least 4 lines in the file)>>skiprec.log

goto :EOF

:PROCDONE

if not exist "%outFile%" goto WRITELINE

(echo %fld2%: %fld3%)>_temp.dat

findstr /I /L /G:_temp.dat "%outFile%" >_temp2.dat
if ERRORLEVEL 1 goto WRITELINE

(echo %~1: Matched found in output file on %fld2%: %fld3%)>>skiprec.log

goto :EOF

:WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if "%~1"=="%checkFld%" goto PROC0_CNT

set abort=Y
(echo %fileName%: %~1 does not equal %checkFld% on 1st field in 1st line of file)>>skiprec.log
goto :EOF

:PROC0_CNT

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

for /f "tokens=2-3 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
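
The change in this version is the search string written to _temp.dat: it is now the person plus the report type rather than the report type alone. For readers who find the batch hard to follow, here is a rough Python sketch of the same idea. It is an illustration under assumptions, not a port of the script: parse_header and dedupe are hypothetical names, the field positions are taken from the sample headers posted earlier in the thread, and it compares exact keys where findstr does a substring search.

```python
def parse_header(lines):
    """Return (date, person, report, accession) from the first four HL7 lines."""
    msh, pid, _pv1, obr = (line.rstrip("\r\n").split("|") for line in lines[:4])
    date = msh[6][:8]                    # YYYYMMDD from the MSH timestamp
    person = pid[5].replace("^", " ")    # SWER^MION^K -> SWER MION K
    accession = obr[2]
    components = obr[4].split("^")       # e.g. RA420070^CT BRAIN WO^02
    report = components[1] if len(components) > 1 else ""
    return date, person, report, accession


def dedupe(records):
    """Keep the first record for each (person, report) pair, ignoring case."""
    seen = set()
    kept = []
    for date, person, report, accession in records:
        key = (person.lower(), report.lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append(f"{date}{person}: {report}|{accession}")
    return kept


swer = [
    "MSH|^~\\`|DICT|020-02||FHIS|20070105232551||",
    "PID|1||000071463||SWER^MION^K||190008130000||",
    "PV1||E|^+||||^^^^^^CHAN, HAANG L. MD DR^Y9|",
    "OBR||RA070000304900|5329834|RA420070^CT BRAIN WO^02||",
]
lemon = [
    "MSH|^~\\`|DICT|020-01||FHIS|20070105232722||",
    "PID|1||000050753||LEMON^DORTHY^R||",
    "PV1||E|5ETW^5209^01^520901|||",
    "OBR||RA070050002200|5329835|RA140009^CT BRAIN WO^01||",
]

# The third record repeats the first, so only two lines are kept.
records = [parse_header(swer), parse_header(lemon), parse_header(swer)]
for line in dedupe(records):
    print(line)
```

Two different patients with the same report type both survive, while the repeated record is dropped.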
 

Author Comment

by:rortiz77
ID: 18901848
You got it!!!!!!!!!  
 

Author Comment

by:rortiz77
ID: 18901865
Wow, this was a tough one...I'm no programmer so this works great!  Thanks for your patience...you rock!!!
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18902867
Glad it works :)
 

Author Comment

by:rortiz77
ID: 18905733
Steve,

Just noticed another problem :-(  
It crashes when looking for the Report Name and finds ^^ (nothing) in its place.  

OBR||RA070000075100|5500040|RA400107^^01|

where a normal one looks like:
OBR||RA070000181000|5007126|RA000001^INFILTRATE^01|

Can it be set to ignore ones where it is left blank?
 
LVL 30

Expert Comment

by:SteveGTR
ID: 18918317
Give this a try:

@echo off

setlocal enabledelayedexpansion

set checkFld=MSH
set fileMask=*.*
set workDir=.

if not "%~1"=="" set workDir=%~1

set outFile=output.dat
set foundFiles=

if not "%~2"=="" set outFile=%~2

pushd "%workDir%"

del "%outFile%" 2>NUL
del skiprec.log 2>NUL

for /f "tokens=*" %%a in ('dir /b /a-d "%fileMask%" 2^>NUL') do call :PROCESS "%%a"

del _temp.dat 2>NUL
del _temp2.dat 2>NUL

if exist "%outFile%" (echo Output in %outFile%)&set foundFiles=Y
if exist skiprec.log (echo Skipped record log in skiprec.log)

popd

if "%foundFiles%"=="" echo No files found

goto :EOF

:PROCESS

echo Processing %~1...

set fileName=%~1

set /a lineCnt=0
set abort=

for /f "tokens=1-9 delims=|" %%a in ('type "%~1"') do (
  call :PROCLINE "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i"
  if "!abort!"=="Y" goto :EOF
  set /a lineCnt+=1
  if /i !lineCnt! EQU 4 goto :PROCDONE
)

REM ** File must have at least 4 lines

(echo %~1: There were not at least 4 lines in the file)>>skiprec.log

goto :EOF

:PROCDONE

if not exist "%outFile%" goto WRITELINE

(echo %fld2%: %fld3%)>_temp.dat

findstr /I /L /G:_temp.dat "%outFile%" >_temp2.dat
if ERRORLEVEL 1 goto WRITELINE

(echo %~1: Matched found in output file on %fld2%: %fld3%)>>skiprec.log

goto :EOF

:WRITELINE

(echo %fld1%%fld2%: %fld3%^|%fld4%)>>"%outFile%"

goto :EOF

:PROCLINE

if /i %lineCnt% EQU 0 call :PROC0 "%~1" "%~6"
if /i %lineCnt% EQU 1 call :PROC1 "%~4"
if /i %lineCnt% EQU 3 call :PROC3 "%~2" "%~4"

goto :EOF

:PROC0

if "%~1"=="%checkFld%" goto PROC0_CNT

set abort=Y
(echo %fileName%: %~1 does not equal %checkFld% on 1st field in 1st line of file)>>skiprec.log
goto :EOF

:PROC0_CNT

set fld1=%~2
set fld1=%fld1:~0,8%

goto :EOF

:PROC1

call :FIXCARET fld2 "%~1"

goto :EOF

:PROC3

set fld3=
set fld4=%~1

set _esc="%~2"

set _esc=%_esc:^=@%
set _esc=%_esc:&=^^^&%
set _esc=%_esc:,=^^^,%
set _esc=%_esc:\=^^^\%
set _esc=%_esc:|=^^^|%
set _esc=%_esc:<=^^^<%
set _esc=%_esc:>=^^^>%
set _esc=%_esc:(=^^^(%
set _esc=%_esc:)=^^^)%

set _t=%_esc:@@@@@@@@=%

REM Test for a blank field
if not %_t%==%_esc% goto :EOF

for /f "tokens=2 delims=@" %%a in ('echo %_esc%') do set fld3=%%a

goto :EOF

:FIXCARET

set _fc=%~2
set _fc=%_fc:^^^^= %

set %~1=%_fc%

goto :EOF
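
As a side note on why the blank case needs its own test: for /f treats consecutive delimiters as one, so an empty ^^ component cannot be picked out by token number alone. A language whose split keeps empty fields makes the check more direct. The Python sketch below is only an illustration of that check; report_name is a hypothetical helper and the field position is assumed from the two OBR lines quoted above.

```python
def report_name(obr_line):
    """Return the second ^-component of OBR field 4, or None if it is blank."""
    fields = obr_line.split("|")
    if len(fields) <= 4:
        return None
    components = fields[4].split("^")    # split keeps the empty component
    if len(components) < 2 or not components[1].strip():
        return None
    return components[1]


blank = "OBR||RA070000075100|5500040|RA400107^^01|"
normal = "OBR||RA070000181000|5007126|RA000001^INFILTRATE^01|"

print(report_name(blank), report_name(normal))  # prints: None INFILTRATE
```

A caller would skip or log any file for which the helper returns None, which is one way to read "ignore ones where it is left blank".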