Solved

how to get the patterns in a file with perl one-liner

Posted on 2009-05-17
Last Modified: 2013-11-29
type test.txt
AA01=
(svr = ABCQRS.C.D.E)
AA02=
(svr = BBB1.C.D.E)
BB01=
(svr = ABCXYZ.C.D.E)
CC02=
(svr = ABCQRS.C.D.E)
BB02=
(svr = AAA0.C.D.E)
....
1) How to get the following patterns out:
ABCQRS
ABCXYZ
ABCQRS
with a Perl one-liner? I can get them out with the DOS shell, but it is impractically slow.
2) Is it possible to remove the duplicates with the same one-liner or another one-liner?
Question by:jl66
17 Comments
 
LVL 11

Expert Comment

by:climbgunks
ID: 24408700

yes.. it's possible... but it's a pretty ugly line...


Please be more specific. Do you want only those 3 patterns pulled out? What do you mean by "out" (do you want the line it's on plus the line before it printed out, or just the string if it's found)?

When removing duplicates, do you want both the duplicate line and the one before it removed? I'm assuming these are pairs of lines.

And any real reason for a one liner, rather than a short script?
0
 
LVL 84

Expert Comment

by:ozo
ID: 24408703
How do you choose
ABCQRS
ABCXYZ
ABCQRS
and not
BBB1
AAA0
?
Is it because they start with ABC?
Is it because they have 6 letters?
0
 

Author Comment

by:jl66
ID: 24408790
1) Not really ABC nor 6 letters long, but the pattern is similar.
2) Can I select the following lines
(svr = ABCQRS.C.D.E)
(svr = ABCXYZ.C.D.E)
(svr = ABCQRS.C.D.E)
with a Perl one-liner?
0
 
LVL 84

Expert Comment

by:ozo
ID: 24408816
Again, by what criterion do you select those lines and not
(svr = BBB1.C.D.E)
(svr = AAA0.C.D.E)
we can always select lines 2, 6, and 8, but I'm guessing that may not be the pattern you want either.
0
 
LVL 11

Assisted Solution

by:climbgunks
climbgunks earned 180 total points
ID: 24408851

This line prints out the queries you want... obviously if you change what you're looking for you'll have to change the search pattern... but you're going to have to understand perl regex's to do so...

And while I think writing one-liners is "cute", it's really unmaintainable, and prone to error.  While being able to write code like this is a nice party trick, I'd never use anything like it in the real world.

perl -ne 'print $_=~/ = ([\S\.]+)\)/?$1=~/^ABCQRS|^ABCXYZ/?"$a$_":"":scalar($a=$_,"")' test.txt

and this removes duplicates as well (assuming that by duplicates you mean the 2nd and subsequent occurrences of your strings):

perl -ne 'print $_=~/ = ([\S\.]+)\)/?$1=~/(^ABCQRS|^ABCXYZ)/?$b{$1}++?"":"$a$_":"":scalar($a=$_,"")' test.txt

Note above, there is no reason to enter the same search string more than once.    And you didn't answer the question...  By out, do you mean 1) remove them from the output, or 2) display only them?   My code does the 2nd.. though it's easily changed to do the former.  
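
For comparison, here is a minimal sketch of the same idea written as a short script instead of a one-liner. It is not climbgunks's code: the sub name matching_pairs and the sample data layout are made up for illustration, and duplicates are keyed on the full server name (for example ABCQRS.C.D.E) rather than on the matched prefix.

```perl
use strict;
use warnings;

# Returns each label line plus its "(svr = ...)" line when the server name
# starts with one of the wanted prefixes. When $unique is true, only the
# first occurrence of each server name is kept.
sub matching_pairs {
    my ($lines, $unique) = @_;
    my $label = '';
    my (%seen, @out);
    for my $line (@$lines) {
        if ($line =~ / = (\S+)\)/) {
            my $server = $1;
            next unless $server =~ /^(?:ABCQRS|ABCXYZ)/;
            next if $unique && $seen{$server}++;
            push @out, $label, $line;
        }
        else {
            $label = $line;    # remember the line that precedes the svr line
        }
    }
    return @out;
}

# Sample data copied from the question.
my @sample = map { "$_\n" } split /\n/, <<'END';
AA01=
(svr = ABCQRS.C.D.E)
AA02=
(svr = BBB1.C.D.E)
BB01=
(svr = ABCXYZ.C.D.E)
CC02=
(svr = ABCQRS.C.D.E)
BB02=
(svr = AAA0.C.D.E)
END

print matching_pairs(\@sample, 1);
```

Passing 0 as the second argument keeps the repeated ABCQRS pair; passing 1 drops it.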


0
 
LVL 84

Accepted Solution

by:ozo
ozo earned 320 total points
ID: 24408914
perl -lne 'print $1 if /=\s*([A-Z]+)\./'  test.txt

prints

ABCQRS
ABCXYZ
ABCQRS

and

perl -lne 'print $1 if /=\s*([A-Z]+)\./ && !$seen{$1}++'  test.txt

prints

ABCQRS
ABCXYZ

but I have no idea if those patterns generalize to other cases where you have not been clear about what you would want
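
The two one-liners above can also be spelled out as a script, which may make the logic easier to follow. This is only a sketch: the sub name server_names and the inline sample lines are assumptions for illustration, while the regex is the one from the one-liners.

```perl
use strict;
use warnings;

# Captures the run of capital letters between "=" and the first dot,
# using the same pattern as the one-liners. When $unique is true, each
# name is returned only the first time it is seen.
sub server_names {
    my ($unique, @lines) = @_;
    my (%seen, @names);
    for my $line (@lines) {
        next unless $line =~ /=\s*([A-Z]+)\./;
        next if $unique && $seen{$1}++;
        push @names, $1;
    }
    return @names;
}

# Sample lines copied from the question.
my @sample = (
    'AA01=', '(svr = ABCQRS.C.D.E)',
    'AA02=', '(svr = BBB1.C.D.E)',
    'BB01=', '(svr = ABCXYZ.C.D.E)',
    'CC02=', '(svr = ABCQRS.C.D.E)',
    'BB02=', '(svr = AAA0.C.D.E)',
);

print "$_\n" for server_names(0, @sample);    # all matches, in file order
print "--\n";
print "$_\n" for server_names(1, @sample);    # first occurrences only
```

BBB1 and AAA0 are skipped because [A-Z]+ must be followed directly by a dot, and in those names a digit comes first.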
0
 

Author Comment

by:jl66
ID: 24409376
Thanks for the code and opinions.
ozo, I got an error when I ran your code. Is there something wrong on my side?
D:\perl -lne 'print $1 if /=\s*([A-Z]+)\./'  test1.txt
Can't find string terminator "'" anywhere before EOF at -e line 1.
------
climbgunks, thanks for the input. I got an error when I ran your code:
perl -ne 'print $_=~/ = ([\S\.]+)\)/?$1=~/^ABCQRS|^ABCXYZ/?"$a$_":"":scalar($a=$_,"")' D:\test1.txt
'ABCXYZ' is not recognized as an internal or external command,
operable program or batch file.
------
The patterns I am looking for are
ABC*
where * can be XYZ or QRS or something else (alphabetic only), but the first 3 characters are ABC.
0
 
LVL 84

Assisted Solution

by:ozo
ozo earned 320 total points
ID: 24409395
A DOS shell uses " instead of ' to quote arguments.
I thought you said "Not really ABC".
If you are looking for words starting with ABC, that could be
perl -lne "print $1 if /\b(ABC\w*)/" test1.txt
or
perl -lne "print $1 if /\b(ABC\w*)/ && !$b{$1}++" test1.txt
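
If the prefix is likely to change, a script can take it as a parameter instead of hard-coding ABC in the regex. The sketch below assumes a made-up sub name, words_with_prefix, and like the one-liner it looks only at the first match on each line.

```perl
use strict;
use warnings;

# Returns the distinct words that start with $prefix, in first-seen order.
# \Q...\E quotes the prefix so that characters such as "." are matched
# literally instead of being treated as regex syntax.
sub words_with_prefix {
    my ($prefix, @lines) = @_;
    my (%seen, @words);
    for my $line (@lines) {
        next unless $line =~ /\b(\Q$prefix\E\w*)/;
        push @words, $1 unless $seen{$1}++;
    }
    return @words;
}

# Sample lines copied from the question.
my @sample = (
    '(svr = ABCQRS.C.D.E)',
    '(svr = BBB1.C.D.E)',
    '(svr = ABCXYZ.C.D.E)',
    '(svr = ABCQRS.C.D.E)',
);

print "$_\n" for words_with_prefix('ABC', @sample);
```

Because the pattern lives in a file, the difference between ' and " quoting in the shell no longer matters.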

0
 

Author Comment

by:jl66
ID: 24409498
Ozo, it works. If the patterns are *ABC (not ABC*), meaning the leading characters vary, how should I revise your code?
0
 
LVL 84

Assisted Solution

by:ozo
ozo earned 320 total points
ID: 24409514
perl -lne "print $1 if /(\w*ABC)\b/ && !$b{$1}++" test1.txt
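
One thing to keep in mind: \w also matches digits and underscores, while the question asked for alphabetic characters only. If that matters, a stricter variant might look like the sketch below. The sub name alpha_words_ending_in and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Returns the distinct purely alphabetic words that end in $suffix.
# The leading \b anchors the match at the start of a word, so a word such
# as X1ABC is rejected entirely rather than matched from the middle.
sub alpha_words_ending_in {
    my ($suffix, @lines) = @_;
    my (%seen, @words);
    for my $line (@lines) {
        next unless $line =~ /\b([A-Za-z]*\Q$suffix\E)\b/;
        push @words, $1 unless $seen{$1}++;
    }
    return @words;
}

# Hypothetical sample lines in the same layout as the question's file.
my @sample = (
    '(svr = QRSABC.C.D.E)',
    '(svr = X1ABC.C.D.E)',
    '(svr = QRSABC.C.D.E)',
    '(svr = ABCQRS.C.D.E)',
);

print "$_\n" for alpha_words_ending_in('ABC', @sample);
```

With \w* in place of [A-Za-z]*, the second sample line would also match, as X1ABC.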
0
 

Author Comment

by:jl66
ID: 24412082
Ozo,
Thanks for the input.
When I ran the command line directly at the DOS prompt, I got the right result, but when I ran it in a script, I got only one record:
QRSABC

Do you know why?
0
 
LVL 84

Expert Comment

by:ozo
ID: 24414257
How did you run it in a script?
0
 

Author Comment

by:jl66
ID: 24416988
In the script, setup perl5lib and path to perl.exe
perl -lne "print $1 if /\b(ABC\w*)/ && !$b{$1}++" test1.txt > testout.txt

I checked testout.txt and there is only one record in it. Without > testout.txt, only one record shows on the screen.
0
 

Author Comment

by:jl66
ID: 24417010
Sorry, that's the wrong line. The code line should be
perl -lne "print $1 if /(\w*ABC)\b/" test1.txt > testout.txt
0
 

Author Comment

by:jl66
ID: 24417036
Still wrong. It should be
perl -lne "print $1 if /(\w*ABC)\b/ && !$b{$1}++" test1.txt > testout.txt
0
 

Author Closing Comment

by:jl66
ID: 31582466
Excellent, except for the last piece (running it in a script).
0
 
LVL 39

Expert Comment

by:Adam314
ID: 24426533
By script... is this a .bat file?  A .cmd file?  Some other language?
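
The thread never establishes why the command behaved differently inside the script. One way to take the command interpreter's quoting and special-character rules out of the picture is to move the logic into a Perl file and call that from the script. (For instance, ! is special in batch files when delayed expansion is enabled; whether that applies here is unknown.) A minimal sketch, with a hypothetical file name uniq_suffix.pl and sub name unique_suffix_words:

```perl
use strict;
use warnings;

# Reads lines from a filehandle and returns each distinct word ending in
# "ABC" once, in first-seen order (first match per line, as in the one-liner).
sub unique_suffix_words {
    my ($fh) = @_;
    my (%seen, @words);
    while (my $line = <$fh>) {
        next unless $line =~ /(\w*ABC)\b/;
        push @words, $1 unless $seen{$1}++;
    }
    return @words;
}

# Hypothetical usage from a batch file:
#   perl uniq_suffix.pl test1.txt > testout.txt
# Each file named on the command line is processed separately.
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    print "$_\n" for unique_suffix_words($fh);
    close $fh;
}
```

The batch file then contains only a plain perl call with file names, with no regex characters for the shell to interpret.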
0

Question has a verified solution.
