Lionel MM
Find and Replace in Batch File
I had an earlier question answered, which basically removed everything before the @ sign in an email address (including the @ sign). Now, instead of simply removing the text before the @ sign, I need something more complicated: I need to turn a line like this
"queued message from onlinedoctorate@tikritnews.net was deleted."
into a line like this
"from .*\@tikritnews.\net"
And yes, that is a dot before the * (.*\) and a dot after the word deleted (deleted.). I would prefer a batch file, but if VB or PowerShell is best then OK. Thank you.
ASKER
There is a text file with a whole bunch of lines similar to the one example I gave, so it needs to go through the text file and do this to each line.
And the (only) lines you're looking for all have the "queued message from ..." in them?
And I assume you want to do something with each line, other than echoing them to the console?
Try this; add your own processing code in the :Process subroutine, in place of the "REM ..." placeholder:
@echo off
setlocal enabledelayedexpansion
set File=D:\Temp\test.txt
set Search=queued message from
for /f "tokens=1* delims=@" %%a in ('type "%File%" ^| find.exe /i "%Search%"') do (
    REM *** Everything before the "@" is removed in %%b:
    set From=%%b
    REM *** Remove everything after the email address by keeping only the first space-delimited token:
    for /f "tokens=1" %%f in ("!From!") do set From=%%f
    REM *** Now add the ".*\@" prefix and put a backslash after each dot:
    set From=.*\@!From:.=.\!
    call :Process "!From!"
)
echo Done.
goto :eof
:Process
set From=%~1
echo Processing '%From%' ...
REM ...
goto :eof
ASKER
OK, it seems to be working, but I only get this echoed to the screen and not written to a file:
Processing '.*\@tikritnews.\net' ...
Processing '.*\@ma-tesol.\com' ...
Processing '.*\@shaggybike.\com' ...
PLUS
The result needs to have "from .*\@" at the start and not just ".*\@", so would you change line 11 to
set from= from .*\@!from:.=.\!
?
And in answer to your earlier questions: yes, all these lines are similar, and these are the only types of lines in this file.
So you want the output in a file (you didn't mention that before)?
@echo off
setlocal enabledelayedexpansion
set InFile=D:\Temp\test.txt
set OutFile=D:\Temp\testout.txt
set Search=queued message from
if exist "%OutFile%" del "%OutFile%"
echo Processing '%InFile%' ...
for /f "tokens=1* delims=@" %%a in ('type "%InFile%" ^| find.exe /i "%Search%"') do (
    REM *** Everything before the "@" is removed in %%b:
    set From=%%b
    REM *** Remove everything after the email address by keeping only the first space-delimited token:
    for /f "tokens=1" %%f in ("!From!") do set From=%%f
    REM *** Now add the "from .*\@" prefix and put a backslash after each dot:
    set From=from .*\@!From:.=.\!
    >>"%OutFile%" echo !From!
)
echo Done.
ASKER
I did not, sorry. I assumed you would have looked at my earlier question, where I had the file sorted and deduped from one file into another. My apologies, I should have restated it.
ASKER
Can you help me dedupe the results? I thought I could use my earlier solution to do it on my own, but I can't. I need to remove the duplicates in the resulting file; I can't do it at the start because they may not be exact duplicates until after the text has been removed. Thank you.
ASKER
That works perfectly, thank you. I am trying to better understand how all these commands work: not so much what each line is doing, but how, so that I can learn and apply it to other situations. Do you think you could explain how each command does what it does?
The main part is done by "for /f" loops. A "for /f" loop splits lines from an input stream into single tokens based on the delimiters passed.
In line 10, the beginning of the main loop that reads the input file, each line read is split at the first "@" character ("delims=@"). The resulting two tokens ("tokens=1*") end up in %%a (explicitly defined) and %%b (implicitly defined, because "tokens=1*" asks for the first token plus the rest of the line). So %%b now contains everything after the "@".
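To see that split in isolation, here is a minimal sketch that runs the same "tokens=1* delims=@" options against a single string instead of a file. The variable name Sample is just for illustration, and the sample text is the example line from the question.

```batch
@echo off
REM Sample line from the question; the real script reads its lines from the input file.
set "Sample=queued message from onlinedoctorate@tikritnews.net was deleted."
REM Split at the "@": %%a gets the first token, %%b gets the rest of the line.
for /f "tokens=1* delims=@" %%a in ("%Sample%") do (
    echo Token 1 [%%a]
    echo Rest    [%%b]
)
```

This should print "queued message from onlinedoctorate" for the first token and "tikritnews.net was deleted." for the rest.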
In line 15, it's basically the same, only it's now splitting at space (and tab), the default if "delims=" is not defined, and this time the only interesting token is the first one. So "From" now contains the domain part of the email address.
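The same idea as a small standalone sketch: with no "delims=" given, the string is split at spaces and tabs, and "tokens=1" keeps only the first piece. The starting value below is a stand-in for what %%b would hold after the first split.

```batch
@echo off
REM Stand-in for the contents of %%b after the split at "@".
set "From=tikritnews.net was deleted."
REM Default delimiters are space and tab; only the first token is kept.
for /f "tokens=1" %%f in ("%From%") do set "From=%%f"
echo [%From%]
```

This should print [tikritnews.net].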
Line 17 adds the new prefix and replaces every "." in "From" with ".\" (a dot followed by a backslash, which is the format asked for in the question).
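The replacement uses the variable substitution syntax !Var:search=replace! (or %Var:search=replace% outside of a block). A minimal sketch, with a hard-coded domain standing in for the value extracted by the loops:

```batch
@echo off
setlocal enabledelayedexpansion
REM Stand-in for the domain extracted by the two "for /f" loops.
set "From=tikritnews.net"
REM Prepend the prefix and replace each "." with ".\" in one step.
set "From=from .*\@!From:.=.\!"
echo !From!
```

This should print: from .*\@tikritnews.\net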
To remove the duplicates, a new "for /f" loop reads the sorted results of the extracted addresses (here the full line is read because no delims are defined: "delims="). Because the list is sorted, identical lines end up next to each other, so the loop only has to compare each line with the last one it wrote, and it writes the current line to the output file only if the two differ.
Which brings me to a minor correction: there should be a "/i" after "if" in line 24 to make a case insensitive comparison (so that tikritnews would be considered the same as TikritNews).
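Here is a small self-contained sketch of that dedupe step, including the "/i". The file name dedupe-demo.tmp and the three sample lines are made up for the demonstration; the real script uses its own temp file.

```batch
@echo off
setlocal enabledelayedexpansion
REM Hypothetical demo file with two entries that differ only in case.
set "List=%Temp%\dedupe-demo.tmp"
>"%List%" echo from .*\@tikritnews.\net
>>"%List%" echo from .*\@TikritNews.\net
>>"%List%" echo from .*\@ma-tesol.\com
REM Line remembers the last line that was printed.
set "Line="
for /f "delims=" %%a in ('type "%List%" ^| sort.exe') do (
    REM Print only if this line differs, ignoring case, from the previous one.
    if /i not "!Line!"=="%%a" (
        echo %%a
        set "Line=%%a"
    )
)
del "%List%"
```

This should print two lines: the ma-tesol entry and one of the two tikritnews entries. Without the "/i", both tikritnews spellings would be printed.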
@echo off
setlocal enabledelayedexpansion
set InFile=D:\Temp\test2.txt
set OutFile=D:\Temp\testout.txt
set Search=queued message from
set TempFile=%Temp%\%~n0.tmp
if exist "%OutFile%" del "%OutFile%"
if exist "%TempFile%" del "%TempFile%"
echo Processing '%InFile%' ...
for /f "tokens=1* delims=@" %%a in ('type "%InFile%" ^| find.exe /i "%Search%"') do (
    REM *** Everything before the "@" is removed in %%b:
    set From=%%b
    REM *** Remove everything after the email address by keeping only the first space-delimited token:
    for /f "tokens=1" %%f in ("!From!") do set From=%%f
    REM *** Now add the "from .*\@" prefix and put a backslash after each dot:
    set From=from .*\@!From:.=.\!
    REM *** Write the output to a temp file
    >>"%TempFile%" echo !From!
)
REM Remove duplicates from the list in the temp file, and create the output file
set Line=
for /f "delims=" %%a in ('type "%TempFile%" ^| sort.exe') do (
    if /i not "!Line!"=="%%a" (
        >>"%OutFile%" echo %%a
        set Line=%%a
    )
)
del "%TempFile%"
echo Done; results are in '%OutFile%'.
ASKER
Works great. Thanks so much for all your help, time and education. I really appreciate it. Have a great day