Solved

Get part of the text from HTML source

Posted on 2008-10-01
Last Modified: 2012-05-05
The source is either

<strong>No:</strong> <ins>59520</ins>(f)<br /> or
<strong>No:</strong> <ins>59520</ins>(m)<br />

How can I get the "m" or "f"? The number 59520 is not a constant: it is always a number, but its value changes.

And another one:

<strong>@O:>@:</strong> <ins class="female">BeBcHo0o0o__</ins><em>

Here I need the text between the "ins" tags, in this case "BeBcHo0o0o__".

Question by:dupetata
 
LVL 84

Expert Comment

by:ozo
ID: 22613349
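for ( '<strong>No:</strong> <ins>59520</ins>(f)<br />', '<strong>No:</strong> <ins>59520</ins>(m)<br />' ) {
    # skip the non-word characters after </ins>, then capture one word character
    print m{</ins>\W*(\w)}, "\n";
}

for ( '<strong>@O:>@:</strong> <ins class="female">BeBcHo0o0o__</ins><em>' ) {
    # capture the run of word characters that follows the opening <ins ...> tag
    print m{<ins\b[^>]*>(\w+)}, "\n";
}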
 

Author Comment

by:dupetata
ID: 22613454
ozo, the number 59520 is not a constant; it is always changing.
 

Author Comment

by:dupetata
ID: 22613469
And BeBcHo0o0o__ changes too.

 
LVL 84

Expert Comment

by:ozo
ID: 22613516
That's why the patterns </ins>\W*(\w) and <ins\b[^>]*>(\w+) match on the ins tags, not on 59520 or BeBcHo0o0o__.
If the ins tags also change, then I'm not sure how you want to determine which part to get.
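
To illustrate, here is a minimal sketch that runs the same two patterns over lines whose number and name differ. The values 101, 7734 and AnotherName_1 are made up for the demonstration; only the first line of each list comes from the question.

```perl
use strict;
use warnings;

# Sample lines: the number differs each time, the surrounding tags do not.
# (101 and 7734 are hypothetical values.)
my @lines = (
    '<strong>No:</strong> <ins>59520</ins>(f)<br />',
    '<strong>No:</strong> <ins>101</ins>(m)<br />',
    '<strong>No:</strong> <ins>7734</ins>(f)<br />',
);

# Anchor on the closing </ins> tag, skip "(" and capture one letter.
my @sexes = map { m{</ins>\W*(\w)} ? $1 : () } @lines;
print "$_\n" for @sexes;

# Sample names: the pattern anchors on the <ins ...> tag, not on the text.
# (AnotherName_1 is a hypothetical value.)
my @tags = (
    '<ins class="female">BeBcHo0o0o__</ins><em>',
    '<ins class="female">AnotherName_1</ins><em>',
);

my @names = map { m{<ins\b[^>]*>(\w+)} ? $1 : () } @tags;
print "$_\n" for @names;
```

Because neither pattern mentions the number or the name, both keep matching when those values change.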
 

Author Comment

by:dupetata
ID: 22613714
OK, I didn't explain it well. I have a for loop:

for my $ids ($start..$end) {
        my $res=$www->get("http://site.com/u:$ids");
        unless($res->is_success) {
                warn "Could not get id $ids: " . $res->code . "\n";
                next;
        }
        # the matching described below goes here
}

Inside the loop I need to do something like this (pseudocode, where * stands for the part I want):

if($res->content =~ /<strong>No:</strong> <ins>some number</ins>(*)<br />/)

and get the value of *. Then, in the same loop:

($value) = $res->content =~ /<ins class="female">***</ins><em>/

and get the value of ***.
 
LVL 84

Expert Comment

by:ozo
ID: 22613899
if( $res->content =~ /<strong>No:<\/strong> <ins>\d+<\/ins>(.*?)<br \/>/ ){
    print $1;
}
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 22613956
if( $res->content =~ /<strong>No:<\/strong> <ins>\d+<\/ins>(.*?)<br \/>/ ){
    print $1;
}

($value) = $res->content =~ /<ins class="female">(.*?)<\/ins><em>/;
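
As a rough sketch of how the two matches might be combined and tried without a network request: extract_fields is a hypothetical helper name, and the sample content is assembled from the two snippets in the question. Two details differ from the patterns above and are assumptions: \((\w)\) captures only the letter, whereas (.*?) captures "(f)" including the parentheses, and [^"]* accepts any class value in case other pages use a class other than "female".

```perl
use strict;
use warnings;

# extract_fields is a hypothetical helper: it takes the page HTML as a string
# and returns the letter in parentheses and the text of the classed <ins> tag.
sub extract_fields {
    my ($html) = @_;

    # \d+ matches whatever number appears; \((\w)\) captures only the letter
    # between the literal parentheses.
    my ($sex) = $html =~ m{<strong>No:</strong> <ins>\d+</ins>\((\w)\)<br />};

    # [^"]* accepts any class value (assumption); (.*?) captures the tag text
    # up to the closing </ins>.
    my ($name) = $html =~ m{<ins class="[^"]*">(.*?)</ins><em>};

    return ($sex, $name);
}

# Sample content assembled from the two snippets in the question.
my $content = join "\n",
    '<strong>No:</strong> <ins>59520</ins>(f)<br />',
    '<strong>@O:>@:</strong> <ins class="female">BeBcHo0o0o__</ins><em>';

my ($sex, $name) = extract_fields($content);
print "sex=$sex name=$name\n";
```

Inside the loop from the question this could be called as extract_fields($res->content). Either return value is undef when its pattern does not match, so check both before using them.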

Question has a verified solution.
