Solved

Perl Parse String Question

Posted on 2009-07-13
9
274 Views
Last Modified: 2012-05-07
Hi,
I have a question on string regex in Perl. I have strings like

user/112312312789071820883/label/qwe
user/112312312789071820883/state/asdas/cvbv
user/-/state/asdas/cvbv

I would like to get third substrin, i.e. label or state in the above case. Hoe can I get that in Perl?

Thanks.
0
Comment
Question by:Purdue_Pete
  • 4
  • 2
  • 2
  • +1
9 Comments
 
LVL 13

Accepted Solution

by:
marchent earned 25 total points
ID: 24842695

$string = "user/112312312789071820883/state/asdas/cvbv";
print $string =~ m|^[^/]*/[^/]*/([^/]*)|;
 
 
$string = "user/112312312789071820883/label/qwe
user/112312312789071820883/state/asdas/cvbv
user/-/state/asdas/cvbv";
$\ = $/;
print for $string =~ m|^[^/]*/[^/]*/([^/]*)|msg;

Open in new window

0
 

Author Comment

by:Purdue_Pete
ID: 24842776
marchent,
Since it is a single string for each "user/....", I guess i will just replace m by s, right?

Can you explain what this line does in your code in detail?
print $string =~ m|^[^/]*/[^/]*/([^/]*)|;

K
0
 
LVL 39

Expert Comment

by:Adam314
ID: 24842881

while(<DATA>) {
	my $third1 = (split/\//)[2];
	my ($third2) = m|.*?/.*?/(.*?)/|;
	
	print "third=$third1, $third2\n";
}
 
__DATA__
user/112312312789071820883/label/qwe
user/112312312789071820883/state/asdas/cvbv
user/-/state/asdas/cvbv

Open in new window

0
Master Your Team's Linux and Cloud Stack

Come see why top tech companies like Mailchimp and Media Temple use Linux Academy to build their employee training programs.

 
LVL 84

Expert Comment

by:ozo
ID: 24843029
(split'/')[2]
0
 
LVL 84

Assisted Solution

by:ozo
ozo earned 25 total points
ID: 24843063
perl -MYAPE::Regex::Explain -e "print YAPE::Regex::Explain->new(qr|^[^/]*/[^/]*/([^/]*)|)->explain"
The regular expression:

(?-imsx:^[^/]*/[^/]*/([^/]*))

matches as follows:
 
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  [^/]*                    any character except: '/' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  /                        '/'
----------------------------------------------------------------------
  [^/]*                    any character except: '/' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  /                        '/'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^/]*                    any character except: '/' (0 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
0
 
LVL 84

Expert Comment

by:ozo
ID: 24847166
neither /m nor /s is necessary
m just allows ^ and $ to match at the beginning and end of lines in addition to the beginning and end of the string,
s just allows . to match "\n"

my ($third2) = m|/.*?/(.*?)/|;
0
 

Author Comment

by:Purdue_Pete
ID: 24869883
marchent / ozo,
From this line,
print $string =~ m|^[^/]*/[^/]*/([^/]*)|;

I do get 'state' or 'label'. How can I store that in a variable?

K
0
 
LVL 39

Expert Comment

by:Adam314
ID: 24870457

my $third = $string =~ m|^[^/]*/[^/]*/([^/]*)|;

Open in new window

0
 
LVL 84

Expert Comment

by:ozo
ID: 24872137
my ($third) = $string =~ m|/[^/]*/([^/]*)|;
0

Featured Post

Is Your AD Toolbox Looking More Like a Toybox?

Managing Active Directory can get complicated.  Often, the native tools for managing AD are just not up to the task.  The largest Active Directory installations in the world have relied on one tool to manage their day-to-day administration tasks: Hyena. Start your trial today.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Do you hate spam? I do, and I am willing to bet you do as well. I often wonder, though, "if people hate spam so much, why do they still post their email addresses on the web?" I'm not talking about a plain-text posting here. I am referring to the fa…
Background Still having to process all these year-end "csv" files received from all these sources (including Government entities), sometimes we have the need to examine the contents due to data error, etc... As a "Unix" shop, our only readily …
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question