How to extract certain rows of a table from HTML using regex

I need to extract certain records from an HTML page. The records are in a table, and I need only the rows that have a "Review" button in them. The web page is http://campbellcollaboration.org/lib/index.php?noframe&go=browse_small&sort=title&view=all
Alternatively, I have pasted the HTML of three rows below, from which I need to filter the records.

Get only those rows which have <span style="padding: 3px;">Review</span>

thanks
<tr>
    <td style="padding-left: 5px; width: 26px;">
        7.
    </td>
    <td style="width: 120px; padding-left: 0px; padding-right: 15px;">
        <a class="badge_new badge_new_title_proposal ui-corner-all" rel="nofollow" href="download/61/">
            <span style="padding: 3px;">Title proposal</span> </a><a class="badge_new badge_new_protocol ui-corner-all"
                rel="nofollow" href="download/62/"><span style="padding: 3px;">Protocol</span>
        </a><a class="badge_new badge_new_review ui-corner-all" rel="nofollow" href="download/63/">
            <span style="padding: 3px;">Review</span> </a>
    </td>
    <td>
        <table cellspacing="0" cellpadding="0" style="width: 100%;">
            <tbody>
                <tr>
                    <td style="width: 70px; color: rgb(0, 102, 153);">
                        Title:
                    </td>
                    <td colspan="3" style="font-weight: bold; color: rgb(0, 0, 0);">
                        Approaches to parent involvement for improving the academic performance of elementary
                        school age children
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Authors:
                    </td>
                    <td colspan="3">
                        Chad Nye, Jamie Schwartz, Herbert Turner
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Published:
                    </td>
                    <td colspan="3">
                        21.06.2006
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Group:
                    </td>
                    <td style="text-align: left;">
                        Education
                    </td>
                </tr>
            </tbody>
        </table>
    </td>
    <td style="width: 50px; text-align: right; padding-right: 20px;">
        <input type="checkbox" value="1" name="export_ris_checkbox[13]" class="checkbox browse_checkbox">
    </td>
</tr>
<tr>
    <td style="padding-left: 5px; width: 26px;">
        10.
    </td>
    <td style="width: 120px; padding-left: 0px; padding-right: 15px;">
        <a class="badge_new badge_new_title_proposal ui-corner-all" rel="nofollow" href="download/351/">
            <span style="padding: 3px;">Title proposal</span> </a><a class="badge_new badge_new_protocol ui-corner-all"
                rel="nofollow" href="download/352/"><span style="padding: 3px;">Protocol</span>
            </a>
    </td>
    <td>
        <table cellspacing="0" cellpadding="0" style="width: 100%;">
            <tbody>
                <tr>
                    <td style="width: 70px; color: rgb(0, 102, 153);">
                        Title:
                    </td>
                    <td colspan="3" style="font-weight: bold; color: rgb(0, 0, 0);">
                        Behavioral stuttering interventions in school age children 4-18 years of age
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Authors:
                    </td>
                    <td colspan="3">
                        Carl Herder, Courtney Howard, Chad Nye, Jamie Schwartz, Herbert Turner, Martine
                        Vanryckeghem
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Published:
                    </td>
                    <td colspan="3">
                        12.04.2007
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Group:
                    </td>
                    <td style="text-align: left;">
                        Education
                    </td>
                </tr>
            </tbody>
        </table>
    </td>
    <td style="width: 50px; text-align: right; padding-right: 20px;">
        <input type="checkbox" value="1" name="export_ris_checkbox[71]" class="checkbox browse_checkbox">
    </td>
</tr>
<tr>
    <td style="padding-left: 5px; width: 26px;">
        21.
    </td>
    <td style="width: 120px; padding-left: 0px; padding-right: 15px;">
        <a class="badge_new badge_new_protocol ui-corner-all" rel="nofollow" href="download/92/">
            <span style="padding: 3px;">Protocol</span> </a><a class="badge_new badge_new_review ui-corner-all"
                rel="nofollow" href="download/93/"><span style="padding: 3px;">Review</span>
        </a><a class="badge_new badge_new_abstract ui-corner-all" rel="nofollow" href="download/94/">
            <span style="padding: 3px;">User abstract</span> </a>
    </td>
    <td>
        <table cellspacing="0" cellpadding="0" style="width: 100%;">
            <tbody>
                <tr>
                    <td style="width: 70px; color: rgb(0, 102, 153);">
                        Title:
                    </td>
                    <td colspan="3" style="font-weight: bold; color: rgb(0, 0, 0);">
                        Cognitive-behavioural interventions for children who have been sexually abused
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Authors:
                    </td>
                    <td colspan="3">
                        Geraldine Macdonald, Julian Higgins, Paul Ramchandani
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Published:
                    </td>
                    <td colspan="3">
                        06.11.2006
                    </td>
                </tr>
                <tr>
                    <td style="color: rgb(0, 102, 153);">
                        Group:
                    </td>
                    <td style="text-align: left;">
                        Social Welfare
                    </td>
                </tr>
            </tbody>
        </table>
    </td>
    <td style="width: 50px; text-align: right; padding-right: 20px;">
        <input type="checkbox" value="1" name="export_ris_checkbox[19]" class="checkbox browse_checkbox">
    </td>
</tr>


mmalik15 Asked:
 
käµfm³d 👽 Commented:
Provided your HTML is properly structured (in the XML sense, that is, tags are properly nested and properly closed), you should be able to use the following:
(?s)<tr(?:>| [^>]*>)(?:.(?!</tr>))*?<span(?:>| [^>]*>)Review</span>.*?</tr>
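
As a rough illustration, the pattern above can be tried from Python's `re` module (the thread itself does not name a language, so Python is an assumption here, and the sample rows are a shortened, hypothetical version of the posted HTML). One caveat to keep in mind: the trailing `.*?</tr>` stops at the first `</tr>` it finds, so in the original markup, where each row contains a nested table, the match would end at the first inner row rather than at the end of the outer row.

```python
import re

# The pattern from the comment above, unchanged. (?s) lets "." match newlines.
ROW_WITH_REVIEW = re.compile(
    r"(?s)<tr(?:>| [^>]*>)(?:.(?!</tr>))*?<span(?:>| [^>]*>)Review</span>.*?</tr>"
)

# Hypothetical, shortened rows modelled on the posted HTML (no nested tables).
SAMPLE = """
<tr>
    <td>7.</td>
    <td><a href="download/63/"><span style="padding: 3px;">Review</span> </a></td>
</tr>
<tr>
    <td>10.</td>
    <td><a href="download/352/"><span style="padding: 3px;">Protocol</span> </a></td>
</tr>
<tr>
    <td>21.</td>
    <td><a href="download/93/"><span style="padding: 3px;">Review</span> </a></td>
</tr>
"""

# The pattern has no capturing groups, so findall returns the full matched rows.
matches = ROW_WITH_REVIEW.findall(SAMPLE)

for row in matches:
    print(row)
    print("-" * 20)
```

With this sample, only the rows numbered 7 and 21 are returned; the row numbered 10 has no "Review" span before its closing `</tr>`, so it is skipped.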

 
graye Commented:
Another technique would be to use an XPath query. Recall the HTML is really just XML, so the normal XML filtering capabilities of the .NET Framework apply.

Using XPath, you can get the entire <tr> contents for any row that has a <span> element whose text is equal to "Review". In this example, where each <span> sits inside an <a> within the cell, it'd be something like this:

      table/tr[td/a/span = 'Review']


http://msdn.microsoft.com/en-us/library/ms256086.aspx
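
For comparison, here is a minimal sketch of the same idea in Python rather than .NET, using the limited XPath subset that the standard library's `xml.etree.ElementTree` supports. It assumes the markup is well-formed XML; the sample table and the `review_rows` helper are hypothetical and only modelled on the posted rows. ElementTree cannot evaluate a nested-path predicate in a single expression, so the filter is split into a path lookup plus a text comparison.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample modelled on the posted rows.
SAMPLE = """
<table>
  <tr>
    <td>7.</td>
    <td><a href="download/63/"><span style="padding: 3px;">Review</span></a></td>
  </tr>
  <tr>
    <td>10.</td>
    <td><a href="download/352/"><span style="padding: 3px;">Protocol</span></a></td>
  </tr>
</table>
"""

def review_rows(xml_text):
    """Return the <tr> elements that contain a td/a/span whose text is 'Review'."""
    root = ET.fromstring(xml_text)
    rows = []
    # "tr" selects rows that are direct children of the root <table>.
    for row in root.findall("tr"):
        spans = row.findall("td/a/span")
        if any((span.text or "").strip() == "Review" for span in spans):
            rows.append(row)
    return rows

rows = review_rows(SAMPLE)
# The first cell of each row holds the record number.
numbers = [row.find("td").text.strip() for row in rows]
print(numbers)
```

Because each element is a whole row, nested tables inside the row come along with it, which sidesteps the question of where a text match should end.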
 
käµfm³d 👽 Commented:
"Recall the HTML is really just XML,"

That's only true if the HTML follows strict structuring, like XML does. HTML can still contain opening tags without matching closing tags, which would violate XML's well-formedness requirement. HTML is still in transition, AFAIK, toward a more rigid structure.
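
The posted rows happen to show this: each one ends with an `<input ...>` tag that is never closed. A small Python sketch (Python is an assumption, and the helper names are hypothetical) suggests how a strict XML parser would reject such a fragment, while a lenient HTML parser reads it without complaint.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Fragment shaped like the last cell of the posted rows: <input> has no closing tag.
FRAGMENT = (
    '<tr><td><input type="checkbox" value="1" '
    'name="export_ris_checkbox[13]" class="checkbox browse_checkbox"></td></tr>'
)

def is_well_formed_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

class TagCollector(HTMLParser):
    """Lenient HTML parser that records every start tag it sees."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

collector = TagCollector()
collector.feed(FRAGMENT)
collector.close()

xml_ok = is_well_formed_xml(FRAGMENT)
print("accepted as XML:", xml_ok)
print("tags seen by the HTML parser:", collector.tags)
```

So an XML-based approach would need the page to be cleaned up or parsed with an HTML-aware parser first.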
Question has a verified solution.
