Solved

Find text at end of line

Posted on 2012-09-06
I generate a csv file with lines that vary in content and length, but have a consistent construction.  Example:

I want the samid at the end after the last comma and place it in an output text file.

I'd prefer using FOR or Findstr or some other DOS based command for accomplishing this.
Question by:BigmacMc

LVL 51

Expert Comment

Should be able to just do:

@echo off
for /F "usebackq tokens=6 delims=," %%A in ("c:\temp\inputfile.txt") do (
echo %%6>>"c:\temp\outputfile.txt"
)

Author Comment

1st --  I had to change echo %%6>>"C:\temp\outputfile.txt" to echo %%A>>"c:\temp\outputfile.txt".  Otherwise it just output %%6.  This works, but it depends on the number of delimiters being the same on every line, and they're not.

2nd--  The number of delims may vary by line, some may have five, some may have six, others 7...all in the same file.

Is there a way to either ignore or delete what is in double quotes, or to just capture what's after the last delimiter, no matter how many there are?

BigmacMC

LVL 43

Expert Comment

Bill - I was trying to work out if you could use " as a delimiter, i.e. "delims=""", but couldn't get it to work.
Next option was to pick up each line, then replace everything up to ", with blank, i.e.

@echo off
setlocal enabledelayedexpansion
for /f "tokens=*" %%a in (test.txt) do (
set line=%%a
echo !line:*",=!
)

Steve

Author Comment

Steve,

Almost there.

This works, however I need the result to output to a file; i.e. outputfile.txt.  I tried just adding >>outputfile.txt after echo !line:*",=!   but that just echoes the addition, it doesn't write it to a file.

BigmacMC

LVL 43

Assisted Solution

Sure, either do:

mybatchfile.cmd > output.txt

or

@echo off
setlocal enabledelayedexpansion
(for /f "tokens=*" %%a in (test.txt) do (
set line=%%a
echo !line:*",=!
)) > output.txt

or you can do it a line a time using >> to append:

@echo off
setlocal enabledelayedexpansion
for /f "tokens=*" %%a in (test.txt) do (
set line=%%a
(echo !line:*",=!)>>output.txt
)

Off now for a bit but sure Bill can fill you in if you need more.

Steve

LVL 43

Expert Comment

BTW to explain this...

setlocal enabledelayedexpansion -- a subject in itself, but in summary it means you can read variables inside a loop using !variable! instead of %variable%.  Without it, %variable% is expanded once when the whole loop is parsed, so it keeps its pre-loop value until the loop has finished.

( -- start of what gets exported to file

for /f "tokens=*" %%a in (test.txt) do (  )  -- runs the code between ( and ) for each line in the file with the line in "%%a"

set line=%%a - assign the line to a variable so we can edit it

echo !line:*",=!   -- replace everything up to and including the first ", with nothing and display what is left

) > output.txt -- write everything between that and matching ( to the file output.txt
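To see the two key pieces of the explanation above in isolation, here is a minimal sketch.  The sample line and the counter variable are made up for the demo (the real CSV data is not shown in this thread), so treat the line's shape as an assumption rather than the asker's actual file.

```batch
@echo off
setlocal enabledelayedexpansion

rem Hypothetical sample line: a quoted first field containing commas, then the id.
set "line="Smith, John, Sales",jsmith"

rem Remove everything from the start through the first quote-comma pair.
echo !line:*",=!

rem Inside a parenthesised block, the percent form is fixed when the block is parsed.
set count=0
for %%n in (a b c) do (
    set /a count+=1
    echo percent form: %count%   delayed form: !count!
)
```

Run as written, this should print jsmith, followed by three lines in which the percent form stays at 0 while the delayed form counts 1, 2, 3.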

Really off now!

Steve

Author Comment

Here's what I ended up using and it works perfectly.

@echo off

set appdir=c:\scripts\test

setlocal enabledelayedexpansion
for /f "skip=1 tokens=*" %%a in (%APPDIR%\nausers.csv) do (
set line=%%a
echo !line:*",=!
)>>%APPDIR%\outputfile.txt

Steve & Bill....thanks so much for your help.

BigmacMC

LVL 10

Assisted Solution

Hey Steve & Bill :)

Try this:
@ECHO OFF
SETLOCAL EnableDelayedExpansion
SET InputFile=1.csv
SET OutputFile=1.txt
FOR /F "usebackq delims=" %%A in ("%InputFile%") DO (
SET Line=%%A
SET Line=!Line:"=!
FOR %%A in (!Line!) DO SET Val=%%A
ECHO [!Val!]
ECHO !Val!>>"%OutputFile%"
)
PAUSE
EXIT

LVL 10

Expert Comment

@BigmacMc

According to your supplied data sample, the script you say works in 38373458 actually should not work.  This is because column lengths are not the same.  My version of the batch file corrects this issue.

Cheers,
Rene

LVL 43

Expert Comment

Glad it worked for you, certainly did on my two line sample, though who knows when you throw thousands of lines of dodgy Microsoft data at it.

Rene - not sure what you are saying here.  It seems there is one field, "qualified name", with quotes around it and commas in it, then a comma and the field he is interested in.  So the number of commas in the first field is not an issue, as my script removes everything up to the first , that follows a ".

Steve

Author Comment

Rene,

I tried yours and it does in fact work perfectly.  Not exactly sure what the difference is, because the one I posted also worked with different line lengths/delimiters.

Thanks

BigmacMC

LVL 10

Expert Comment

@Steve
You got me by surprise with: !Line:*",=!
Thanks pal, I now know something new... :)

@BigmacMc
Sorry for the confusion. The last script you posted works indeed

Cheers,
Rene

Author Comment

So both solutions work.

Guys...thanks very much.

BigmacMC

LVL 10

Expert Comment

@Steve
Is there a way to do the opposite, i.e. keep everything before ", and discard what comes after?
Something like: !Line:",*=!

Cheers

LVL 10

Expert Comment

@BigmacMc
You're welcome

LVL 51

Accepted Solution

Sorry for the silence after my first contribution (complete with typo), been a busy day on other fronts.  I'll review this more fully later and see if I have anything to add, but looks like you made some good progress, happy for that.

~bp

Author Closing Comment

Thanks to Steve and Rene for speedy replies that worked.  Solved my problem.

LVL 43

Expert Comment

Rene - I don't think you can match a wildcard at the end of the string.  I suppose you would have to do the opposite, i.e. find the bit after what you want then remove it with substitution, something like:

@echo off
setlocal enabledelayedexpansion
for /f "tokens=*" %%a in (test.txt) do (
set line=%%a
set delete=!line:*",=!
echo   Need to delete !delete!
call set line=%%line:,!delete!=%%
echo !line!
)

which is one of the ones I have on a page of mine here:  http://scripts.dragon-it.co.uk/links/batch-search-replace-substitute  I've used it in a few cases, but only when the string has been an ID number or similar that won't appear elsewhere on the line.

Steve

LVL 51

Expert Comment

Not a ton different, but this seems to get the job done as well, and fairly easy to understand.

@echo off
setlocal EnableDelayedExpansion
set AppDir=d:\ee
for /f "usebackq skip=1 tokens=*" %%A in ("%AppDir%\in.txt") do (
for %%B in (%%A) do set Account=%%B
echo !Account!
)>>"%AppDir%\out.txt"

~bp

LVL 11

Expert Comment

The proper way to have done it is like this:

@echo off
for /f "tokens=*" %%a in (file.csv) do call :processline %%a
exit /b

:processline
echo.%2
goto :eof

LVL 43

Expert Comment

We can always debate "proper" Paul... but agreed that is a nice way to deal with the quoted string with commas in.

Steve

LVL 11

Expert Comment

Steve

It just makes sense. And sometimes as programmers, we overlook the obvious.

LVL 51

Expert Comment

I had actually considered the subroutine approach for parsing, but then decided on the FOR approach.  My thinking was that I was always after the LAST token on the line, and if the format of the line changed in the future where there were more tokens before the last one, I still wanted it to work.  So for example, all of the following should work in the FOR approach:

Etc.  So I decided I liked always getting the last token rather than a particular numbered token.

Yes, I know, my approach doesn't protect against extra tokens being added AFTER the account id, in that case the FOR approach breaks down, whereas the CALL approach will still get the second token.

So both approaches have merit I believe.  It really depends on coding style, and on which case(s) you most want to protect against, since future changes are pretty hard to predict with absolute certainty.
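To make that trade-off concrete, here is a small sketch that runs both approaches on one made-up line with an extra field added after the account id.  The sample data, the :second label and the variable names are all hypothetical; only the two techniques themselves come from the scripts earlier in the thread.

```batch
@echo off
setlocal enabledelayedexpansion

rem Hypothetical line with one extra field after the account id.
set "line="Smith, John, Sales",jsmith,extra"

rem FOR approach: walk every token and keep whichever one comes last.
for %%B in (!line!) do set "last=%%B"

rem CALL approach: the quoted field arrives as argument 1, the id as argument 2.
call :second !line!

echo last token: !last!
echo second token: !second!
exit /b

:second
set "second=%~2"
goto :eof
```

With this sample the FOR approach should report extra and the CALL approach jsmith.  If the extra field were instead added before the id, the results would swap in usefulness, which is the point being made above.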

~bp

LVL 43

Expert Comment

There are LOTS of approaches to this and as long as they work for the particular circumstances so be it.  Another way would be simply to count back characters from the end of the string until hitting a comma etc., or in a decent language use a function like @rightback(string;",").

A few thoughts that would work in batch:

Search back from the right one char at a time.
Search left to right for commas until there are no more
Replace *, until it replaces no more
Replace *", like I did
Call subroutine to split off first and second token or similar with FOR
etc.
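
As one example from that list, "Replace *, until it replaces no more" might be sketched as below.  This is only an illustration on a made-up line, not something posted by anyone in the thread, and it assumes the last field is never empty.

```batch
@echo off
setlocal enabledelayedexpansion

rem Hypothetical sample line; assumes the last field is not empty.
set "line="Smith, John, Sales",jsmith"

:strip
rem Drop everything through the first remaining comma.
set "rest=!line:*,=!"
if not "!rest!"=="!line!" (
    set "line=!rest!"
    goto strip
)
echo !line!
```

When no comma is left the substitution changes nothing, so the loop stops and only the text after the last comma is echoed (jsmith for this sample).  To use this on a whole file, the stripping loop would need to sit in a subroutine called once per line, because a GOTO inside a FOR block ends the FOR loop.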

All of these can get potentially stumped by data in there, e.g. it would be valid to have & ^ % ! chars in the full name for instance...

Anyway better things to do!

Steve

LVL 10

Expert Comment

Thanks Steve :)

LVL 11

Expert Comment

I think treating the first data field as a single data item (because of the double-quotes) is the proper way to approach this problem.

Token-counting, comma-counting or string-substitution are not warranted here. It's just a simple matter of 'grab-the-second-data-item-and-run' and DOS provides us with a mechanism to do just that by how it treats double-quoted data on the command-line.

Sure, there are many ways in which we could have arrived at the same result. It reinforces the fact that programming is a creative process.