BigmacMc

Find text at end of line

I generate a csv file with lines that vary in content and length, but have a consistent construction.  Example:

     "CN=username1,ou=ou1,ou=ou2,dc=path1,dc=path2",samid1
     "CN=username2,ou=ou1,ou=ou2,ou=ou3,dc=path1,dc=path2",samid2

I want to extract the samid that follows the last comma on each line and write it to an output text file.

I'd prefer to use FOR, Findstr or some other DOS-based command to accomplish this.
Bill Prew

Should be able to just do:

@echo off
for /F "usebackq tokens=6 delims=," %%A in ("c:\temp\inputfile.txt") do (
  echo %%6>>"c:\temp\outputfile.txt"
)

BigmacMc

ASKER

1st --  I had to change echo %%6>>"c:\temp\outputfile.txt" to echo %%A>>"c:\temp\outputfile.txt".  Otherwise it just output %%6.  This works, but it depends on the number of delimiters being the same on every line, and they're not.

2nd--  The number of delims may vary by line: some lines may have five, some six, others seven, all in the same file.

Is there a way either to ignore or delete what is in double quotes, or to just capture what's after the last delimiter, no matter how many there are?

BigmacMC
Bill - I was trying to work out if you could use " as a delimiter, i.e. "delims=""", but couldn't get it to work.
The next option was to take each line, then replace everything up to ", with blank, i.e.

@echo off
setlocal enabledelayedexpansion
for /f "tokens=*" %%a in (test.txt) do (
  set line=%%a
  echo !line:*",=!
)

Steve
Steve,

Almost there.

This works; however, I need the result to go to a file, i.e. outputfile.txt.  I tried just adding >>outputfile.txt after echo !line:*",=!   but that just echoes the addition; it doesn't write anything to a file.

BigmacMC
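
A note on why that attempt only echoes the text: the lone " inside !line:*",=! opens a quoted region as far as the command parser is concerned, so a >> placed after it on the same line is read as ordinary text rather than as a redirection. One possible workaround, offered as a sketch and not necessarily the fix posted in this thread, is to put the redirection in front of the echo. It assumes the same test.txt input and an outputfile.txt in the current directory.

```batch
@echo off
setlocal enabledelayedexpansion
rem Sketch only. The names test.txt and outputfile.txt are assumptions.
for /f "tokens=*" %%a in (test.txt) do (
  set line=%%a
  rem The redirection comes first so it sits outside the quoted region
  rem that the substitution on the same line opens.
  >>outputfile.txt echo !line:*",=!
)
```

Because this uses >>, running it twice appends the results a second time; delete outputfile.txt first if a fresh file is wanted.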
SOLUTION
Steve Knight
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
BTW to explain this...

setlocal enabledelayedexpansion -- a subject in itself, but in summary it means you can read variables inside a loop using !variable! instead of %variable%.  Without it, %variable% is expanded once when the whole loop is parsed, so it shows the same value on every pass until after the loop.

( -- start of what gets exported to file

for /f "tokens=*" %%a in (test.txt) do (  )  -- runs the code between ( and ) for each line in the file with the line in "%%a"

set line=%%a - assign the line to a variable so we can edit it

echo !line:*",=!   -- replace everything up to and including the first ", with nothing and display it

) > output.txt -- write everything between that and matching ( to the file output.txt

Really off now!

Steve
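
Putting the pieces described above back together gives something like the following. This is a reconstruction from the explanation, not a copy of the hidden post, and it assumes the file names test.txt and output.txt used in that explanation.

```batch
@echo off
setlocal enabledelayedexpansion
rem Reconstruction only. The outer parentheses group the whole loop
rem so that a single redirection covers everything it prints.
(
  for /f "tokens=*" %%a in (test.txt) do (
    rem Copy the line into a variable so it can be edited.
    set line=%%a
    rem Drop everything up to and including the first quote-comma pair.
    echo !line:*",=!
  )
) > output.txt
```

With the two sample lines from the question in test.txt, output.txt should end up containing samid1 and samid2, one per line. Note that > replaces output.txt on each run, whereas >> would append to it.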
Here's what I ended up using and it works perfectly.

@echo off

set appdir=c:\scripts\test

setlocal enabledelayedexpansion
for /f "skip=1 tokens=*" %%a in (%APPDIR%\nausers.csv) do (
  set line=%%a
  echo !line:*",=!
)>>%APPDIR%\outputfile.txt

Steve & Bill....thanks so much for your help.

BigmacMC
SOLUTION
@BigmacMc

According to your supplied data sample, the script you say works in 38373458 should not actually work.  This is because the column lengths are not the same.  My version of the batch file corrects this issue.

Cheers,
Rene
Glad it worked for you, certainly did on my two line sample, though who knows when you throw thousands of lines of dodgy Microsoft data at it.

Rene - not sure what you are saying here.  It seems there is one field, the "qualified name", with quotes around it and commas in it, then a comma and the field he is interested in.  So the number of commas in the first field is not an issue, as my script removes everything up to the first , after a ".

Steve
Rene,

I tried yours and it does in fact work perfectly.  Not exactly sure what the difference is, because the one I posted also worked with different line lengths/delimiters.

Thanks

BigmacMC
@Steve
You got me by surprise with: !Line:*",=!
Thanks pal, I now know something new... :)

@BigmacMc
Sorry for the confusion. The last script you posted does indeed work.

Cheers,
Rene
So both solutions work.

Guys...thanks very much.

BigmacMC
@Steve
Is there an equivalent that keeps everything before ", and discards what comes after?
Something like: !Line:",*=!

Cheers
@BigmacMc
You're welcome
ASKER CERTIFIED SOLUTION
Thanks to Steve and Rene for speedy replies that worked.  Solved my problem.
Rene - I don't think you can match a wildcard at the end of the string.  I suppose you would have to do the opposite, i.e. find the bit after what you want then remove it with substitution, something like:

@echo off
setlocal enabledelayedexpansion
for /f "tokens=*" %%a in (test.txt) do (
  set line=%%a
  set delete=!line:*",=!
  echo   Need to delete !delete!
  call set line=%%line:,!delete!=%%
  echo !line!
)

which is one of the ones I have on a page of mine (http://scripts.dragon-it.co.uk/links/batch-search-replace-substitute).  I've used it in a few cases, but only when the string has been an ID number or similar that won't appear elsewhere on the line.

Steve
Not a ton different, but this seems to get the job done as well, and fairly easy to understand.

@echo off
setlocal EnableDelayedExpansion
set AppDir=d:\ee
for /f "usebackq skip=1 tokens=*" %%A in ("%AppDir%\in.txt") do (
  for %%B in (%%A) do set Account=%%B
  echo !Account!
)>>"%AppDir%\out.txt"

~bp
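
For anyone wondering why the inner FOR in the script above works: a plain FOR (no /F) splits its list on commas, spaces, semicolons and equals signs, but leaves a double-quoted string as a single item, so the last item it visits is the account name. A small self-contained illustration, with one of the sample lines typed in literally rather than read from a file:

```batch
@echo off
setlocal
rem Illustration only. The sample line is hard-coded here.
for %%B in ("CN=username1,ou=ou1,ou=ou2,dc=path1,dc=path2",samid1) do (
  echo item: %%B
  set Account=%%B
)
rem After the loop the variable holds the last item that was visited.
echo last item: %Account%
```

This should print the quoted distinguished name as the first item and samid1 as the second, then samid1 again as the last item. One caveat: a plain FOR treats items containing * or ? as file name patterns, so data with those characters would need a different approach.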
The proper way to have done it is like this:

@echo off
for /f "tokens=*" %%a in (file.csv) do call :processline %%a
exit /b

:processline
  echo.%2
goto :eof

We can always debate "proper" Paul... but agreed that is a nice way to deal with the quoted string with commas in.

Steve
Steve,

It just makes sense. And sometimes as programmers, we overlook the obvious.
I had actually considered the subroutine approach for parsing, but then decided on the FOR approach.  My thinking was that I was always after the LAST token on the line, and if the format of the line changed in the future where there were more tokens before the last one, I still wanted it to work.  So for example, all of the following should work in the FOR approach:

"CN=username1,ou=ou1,ou=ou2,dc=path1,dc=path2",samid1
"CN=username2,ou=ou1,ou=ou2,ou=ou3,dc=path1,dc=path2",samid2
"CN=username1,ou=ou1,ou=ou2,dc=path1,dc=path2","new token",samid1
new token,"CN=username2,ou=ou1,ou=ou2,ou=ou3,dc=path1,dc=path2",samid2

Etc.  So I decided I liked always getting the last token rather than a particular numbered token.

Yes, I know, my approach doesn't protect against extra tokens being added AFTER the account id, in that case the FOR approach breaks down, whereas the CALL approach will still get the second token.

So both approaches have merit, I believe.  It really depends on coding style and on which strategy makes the most sense for the case(s) you want to protect against; future changes are pretty hard to predict with absolute certainty.

~bp
There are LOTS of approaches to this, and as long as they work for the particular circumstances, so be it.  Another way would be simply to count back characters from the end of the string until hitting a comma, etc., or in a decent language use a function like @rightback(string;",").

A few thoughts that would work in batch:

Search back from the right one char at a time.
Search left to right for commas until there are no more.
Replace *, until it replaces no more
Replace *", like I did
Call subroutine to split off first and second token or similar with FOR
etc.

All of these can get potentially stumped by data in there, e.g. it would be valid to have & ^ % ! chars in the full name for instance...
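
As a rough sketch of the first idea in that list (searching back from the right one character at a time), something along these lines should do it. The sample line is hard-coded purely for illustration, and the variable names line, ch and tail are made up for this example.

```batch
@echo off
setlocal enabledelayedexpansion
rem Illustration only. The sample line is hard-coded here.
set line="CN=username1,ou=ou1,ou=ou2,dc=path1,dc=path2",samid1
set "tail="

:scan
rem Stop when nothing is left or when the last character is a comma.
if not defined line goto done
set "ch=!line:~-1!"
if "!ch!"=="," goto done
rem Move the character to the front of the result and shorten the line.
set "tail=!ch!!tail!"
set "line=!line:~0,-1!"
goto scan

:done
echo !tail!
```

For the sample line this should print samid1. The same caveat applies as for the other approaches: a ! in the data would be swallowed by delayed expansion.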

Anyway better things to do!

Steve
Thanks Steve :)
I think treating the first data field as a single data item (because of the double-quotes) is the proper way to approach this problem.

Token-counting, comma-counting or string-substitution are not warranted here. It's just a simple matter of 'grab-the-second-data-item-and-run' and DOS provides us with a mechanism to do just that by how it treats double-quoted data on the command-line.

Sure, there are many ways in which we could have arrived at the same result. It reinforces the fact that programming is a creative process.
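
To make the point about double-quoted data concrete, here is a tiny illustration of how CALL hands arguments to a subroutine. The label name :show and the literal sample line are just for this example.

```batch
@echo off
rem Illustration only. The sample line is passed literally.
call :show "CN=username1,ou=ou1,ou=ou2,dc=path1,dc=path2",samid1
exit /b

:show
  rem The first argument is the whole quoted field, and the tilde form
  rem prints it without its surrounding quotes.
  echo first field: %~1
  echo second field: %2
goto :eof
```

The commas inside the quotes do not split the first argument, so the second argument is samid1 regardless of how many ou= or dc= parts the first field has.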