Scraping HTML using DOMDocument + PHP

Member_2_5230414
I would like to scrape the following HTML:
     <div class="venue-event-list " rel="GB">
                                <div class="tracks-list">
    <div class="single-track">
                <a href="//livevideo.betfair.com/Default.do?mi=119408124" target="_blank" class="live-video-link"><div class="bf-icon-live-video tag-i13n i13n-ltxt-LVid i13n-sec-GB i13n-tab-today" title="Watch now on Betfair Live Video"></div></a>
        <div class="info-container">
            <span class="track-name">
                <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408124">Lingfield</a>
            </span>
            <div class="races-list">
                    
                    
    <div class="single-race" id="m-1_119408124">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408124"
                title="5f Nursery | 7 Runners">14:10</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408128">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408128"
                title="6f Mdn Stks | 11 Runners">14:40</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408132">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408132"
                title="7f Mdn Stks | 6 Runners">15:10</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408136">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408136"
                title="2m Hcap | 12 Runners">15:40</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408140">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408140"
                title="1m2f Sell Stks | 6 Runners">16:10</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408144">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408144"
                title="1m3f Hcap | 8 Runners">16:40</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408148">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408148"
                title="1m1f Hcap | 14 Runners">17:10</a>
        </span>
    </div>
            </div>
        </div>
    </div>
                        </div>
                                <div class="tracks-list">
    <div class="single-track">
                <a href="//livevideo.betfair.com/Default.do?mi=119408153" target="_blank" class="live-video-link"><div class="bf-icon-live-video tag-i13n i13n-ltxt-LVid i13n-sec-GB i13n-tab-today" title="Watch now on Betfair Live Video"></div></a>
        <div class="info-container">
            <span class="track-name">
                <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408153">Wolverhampton</a>
            </span>
            <div class="races-list">
                    
                    
    <div class="single-race" id="m-1_119408153">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408153"
                title="5f Mdn Stks | 7 Runners">14:20</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408157">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408157"
                title="1m6f Hcap | 7 Runners">14:50</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408161">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408161"
                title="1m4f Sell Stks | 5 Runners">15:20</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408165">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408165"
                title="1m1f Hcap | 13 Runners">15:50</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408169">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408169"
                title="1m1f Hcap | 11 Runners">16:20</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408173">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408173"
                title="1m Mdn Stks | 11 Runners">16:50</a>
        </span>
            <span class="separator">|</span>
    </div>
                    
                    
    <div class="single-race" id="m-1_119408177">
        <span class="race-time link-text">
            <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
                href="/exchange/plus/#/horse-racing/market/1.119408177"
                title="1m Hcap | 13 Runners">17:20</a>
        </span>
    </div>
            </div>
        </div>
    </div>
                        </div>


I have used the following code to pull the race name and the time of the race:

   
    $url   = "";
    $html  = file_get_contents($url);
    $dom   = new DOMDocument();
    $dom->preserveWhiteSpace = false;
    // Suppress warnings from malformed markup while parsing
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    // Select each track (meeting) block for the day
    $getdropdown  = '//div[contains(@class, "tracks-list")]';
    $getdropdown2 = $xpath->query($getdropdown);
    // Loop through each track block and print all of its text
    foreach ($getdropdown2 as $dropresults) {
        echo $dropresults->textContent . "<br />";
    }



What I would like to do is pull the meeting name only if the link (shown below) contains "GB" and "today" in its class attribute:

    <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today"
       href="/exchange/plus/#/horse-racing/market/1.119408124">Lingfield</a>



So the outcome would be Lingfield. If that condition is true, I would then like to pull the time of the race and the market ID from the following:
 

    <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today"
            href="/exchange/plus/#/horse-racing/market/1.119408124"
            title="5f Nursery | 7 Runners">14:10</a>



So the outcome would be:

   
    Lingfield 14:10 1.119408124
    Lingfield 14:40 1.119408128
    .............................
    Wolverhampton 14:20 1.119408153


Most Valuable Expert 2011
Top Expert 2016

Commented:
Please show us the true URL you are trying to scrape.  We would need to be able to use PHP file_get_contents() (the same way you do) to read the HTML in order to get the test data.
Expert of the Year 2008
Top Expert 2008
Commented:
Have you considered using "PHP Simple HTML DOM Parser"?
http://simplehtmldom.sourceforge.net/
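
Staying with DOMDocument and DOMXPath as in the question, the filtering described there could be sketched roughly as follows. This is only a sketch under some assumptions: the function name `scrapeRaces` is made up for illustration, the `$html` sample is a trimmed copy of the posted markup rather than the live page, and the last track in the sample (class `i13n-sec-XX`) is invented purely to show a non-GB meeting being skipped. The real page may differ from the posted snippet.

```php
<?php
// Trimmed sample of the markup from the question. The "Elsewhere" track is
// hypothetical test data, added only to demonstrate the GB/today filter.
$html = <<<'HTML'
<div class="tracks-list"><div class="single-track">
  <span class="track-name">
    <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408124">Lingfield</a>
  </span>
  <div class="races-list">
    <div class="single-race"><span class="race-time link-text">
      <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408124" title="5f Nursery | 7 Runners">14:10</a>
    </span></div>
    <div class="single-race"><span class="race-time link-text">
      <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408128" title="6f Mdn Stks | 11 Runners">14:40</a>
    </span></div>
  </div>
</div></div>
<div class="tracks-list"><div class="single-track">
  <span class="track-name">
    <a class="tag-i13n i13n-ltxt-meeting i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408153">Wolverhampton</a>
  </span>
  <div class="races-list">
    <div class="single-race"><span class="race-time link-text">
      <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-GB i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.119408153" title="5f Mdn Stks | 7 Runners">14:20</a>
    </span></div>
  </div>
</div></div>
<div class="tracks-list"><div class="single-track">
  <span class="track-name">
    <a class="tag-i13n i13n-ltxt-meeting i13n-sec-XX i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.000000000">Elsewhere</a>
  </span>
  <div class="races-list">
    <div class="single-race"><span class="race-time link-text">
      <a class="race-link  tag-i13n i13n-ltxt-race i13n-sec-XX i13n-tab-today" href="/exchange/plus/#/horse-racing/market/1.000000000" title="hypothetical">13:00</a>
    </span></div>
  </div>
</div></div>
HTML;

// Hypothetical helper: returns lines of the form "Meeting HH:MM marketId".
function scrapeRaces(string $html): array
{
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Both class fragments must be present on the link.
    $filter       = 'contains(@class, "i13n-sec-GB") and contains(@class, "i13n-tab-today")';
    $meetingQuery = './/span[contains(@class, "track-name")]/a[' . $filter . ']';
    $raceQuery    = './/a[contains(@class, "race-link") and ' . $filter . ']';

    $rows = [];
    foreach ($xpath->query('//div[contains(@class, "single-track")]') as $track) {
        // Queries starting with "./" are evaluated relative to this track.
        $meeting = $xpath->query($meetingQuery, $track)->item(0);
        if ($meeting === null) {
            continue; // not a GB/today meeting
        }
        $name = trim($meeting->textContent);

        foreach ($xpath->query($raceQuery, $track) as $race) {
            // The market ID is assumed to be the segment after "/market/".
            if (!preg_match('~/market/([\d.]+)$~', $race->getAttribute('href'), $m)) {
                continue;
            }
            $rows[] = $name . ' ' . trim($race->textContent) . ' ' . $m[1];
        }
    }
    return $rows;
}

foreach (scrapeRaces($html) as $row) {
    echo $row, PHP_EOL;
}
```

With the sample above this prints `Lingfield 14:10 1.119408124`, `Lingfield 14:40 1.119408128` and `Wolverhampton 14:20 1.119408153`, one per line. Note that `contains(@class, ...)` is a plain substring test, so it would also match a longer class name that merely starts with the same text; that seems acceptable for this markup but is worth keeping in mind.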
