Solved

PHP - extracting values from HTML

Posted on 2014-02-09
Last Modified: 2014-02-13
Hi,

I have the following HTML block in a variable called $innerHTML.
<a class="name" href="link.html">Link Text</a>
<span class="phone">
555 123 123
</span>
<p class="address secondary">1407 Mayfair St, City West</p>


I need to extract the following data:

$name = "Link Text"
$link = "link.html"
$phone = "555 123 123"
$address = "1407 Mayfair St, City West"

What is the best way?

I've attempted to use XPath queries, but that doesn't seem efficient because it requires a foreach loop to go through the query result.

Any suggestions are much appreciated.

Thank you
Question by:mhdi
LVL 108

Expert Comment

by:Ray Paseur
ID: 39846437
... it doesn't seem efficient as it requires a foreach loop ...
How have you concluded this is not efficient?  Have you measured the elapsed time?  If you haven't, you might want to do that before you dismiss this out of hand.  In my experience the cost of the loop is too small to notice by any reasonable human measure.  You may be talking microseconds, but you're not anywhere near milliseconds with something like this.

Please post the code you've tried and I'll show you how to measure the elapsed time.
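
For readers who want to try the measurement themselves, here is a minimal sketch (not Ray's code). It assumes the fragment from the question and times an XPath query plus the foreach loop with microtime(true). The iteration count and variable names are illustrative assumptions, and the numbers will vary by machine.

```php
<?php
// Assumed sample input: the fragment from the question.
$html = '<a class="name" href="link.html">Link Text</a>'
      . '<span class="phone"> 555 123 123 </span>'
      . '<p class="address secondary">1407 Mayfair St, City West</p>';

// Parse once; keep HTML parser notices out of the output.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Repeat the query + foreach many times so the total is measurable.
$iterations = 1000; // arbitrary, illustrative value
$name  = null;
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    foreach ($xpath->query('//a[@class="name"]') as $node) {
        $name = $node->nodeValue;
    }
}
$elapsed = microtime(true) - $start;

// Average cost of one query + loop, in microseconds.
$perQueryMicroseconds = $elapsed / $iterations * 1e6;
printf("NAME: %s, average per query: %.2f microseconds\n", $name, $perQueryMicroseconds);
```

Averaging over many iterations smooths out timer resolution; a single run is usually too short to time reliably.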

Author Comment

by:mhdi
ID: 39846449
The reason I don't like using XPath queries is that they return a list of matching nodes, which I then need to go over with a foreach loop to extract the nodeValue.  As my HTML block has only one occurrence of each item, it doesn't seem like a good way to do it.

Maybe I should have said easier rather than inefficient.
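
One possible way to avoid the loop, offered as a sketch rather than as the accepted answer: DOMXPath::evaluate() can return a scalar when the expression itself yields a string, for example via the XPath functions string() or normalize-space(). Applied to a node set, these use the first matching node in document order and give an empty string when nothing matches. The class-matching expression for the address is an assumption about how the real pages are marked up.

```php
<?php
// The fragment from the question, in the same variable name.
$innerHTML = <<<EOD
<a class="name" href="link.html">Link Text</a>
<span class="phone">
555 123 123
</span>
<p class="address secondary">1407 Mayfair St, City West</p>
EOD;

// Parse the fragment; suppress HTML parser notices.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($innerHTML);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Each evaluate() call returns a plain PHP string, so no foreach is needed.
$name  = $xpath->evaluate('string(//a[@class="name"])');
$link  = $xpath->evaluate('string(//a[@class="name"]/@href)');
// normalize-space() also trims the newlines around the phone number.
$phone = $xpath->evaluate('normalize-space(//span[@class="phone"])');
// The class attribute holds two tokens ("address secondary"), so match one token.
$address = $xpath->evaluate(
    'string(//p[contains(concat(" ", normalize-space(@class), " "), " address ")])'
);

echo "NAME: $name, LINK: $link, PHONE: $phone, ADDRESS: $address", PHP_EOL;
```

Because a missing element produces an empty string rather than an error, it is worth checking for empty values before using the results.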
LVL 108

Accepted Solution

by:
Ray Paseur earned 250 total points
ID: 39846456
As with most programming questions, the quality of the answer to this question is highly dependent on the quality of the test data set.  This is a rather unusual question because PHP is normally used to create HTML, not to parse HTML.  Nevertheless, it can be done.  It's just a matter of coming up with a representative test data set so that the parsing process takes into account sufficient detail about the inputs and outputs.

Please see http://www.laprbass.com/RAY_temp_mhdi.php

<?php // RAY_temp_mhdi.php
error_reporting(E_ALL);


// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28360576.html


// TEST DATA FROM THE POST AT EE
$htm = <<<EOD
<a class="name" href="link.html">Link Text</a>
<span class="phone">
555 123 123
</span>
<p class="address secondary">1407 Mayfair St, City West</p>
EOD;

/**
 * DESIRED VARIABLES FROM THE TEST DATA
 *
$name = "Link Text"
$link = "link.html"
$phone = "555 123 123"
$address = "1407 Mayfair St, City West"
 */

// MAKE AN OBJECT FROM THE STRING OF HTML STATEMENTS
// (XML NEEDS A SINGLE ROOT ELEMENT, HENCE THE WRAPPER)
$xml = '<wrap>' . $htm . '</wrap>';
$obj = simplexml_load_string($xml);

// ACTIVATE THIS TO SEE THE OBJECT
// var_dump($obj);

// ASSIGN THE OBJECT PROPERTIES TO LOCAL VARIABLES, CAST TO STRINGS
$name    = (string)$obj->a;
$link    = (string)$obj->a['href'];
$phone   = trim((string)$obj->span);
$address = (string)$obj->p;

// SHOW THE WORK PRODUCT
echo PHP_EOL
. "NAME: $name, "
. "LINK: $link, "
. "PHONE: $phone, "
. "ADDRESS: $address"
;


Best regards, ~Ray
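
A note on the SimpleXML approach: it only works when the fragment is well-formed XML. Ordinary HTML such as an unclosed <br> makes simplexml_load_string() return false. The sketch below wraps the same idea in a helper that reports failure instead of raising warnings. The function name extract_contact() and its array return shape are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper (name and return shape are illustrative assumptions).
// Returns the four fields, or null when the fragment is not well-formed XML.
function extract_contact(string $htm): ?array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $obj = simplexml_load_string('<wrap>' . $htm . '</wrap>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($obj === false) {
        return null; // parsing failed; the caller decides what to do next
    }

    return [
        'name'    => (string)$obj->a,
        'link'    => (string)$obj->a['href'],
        'phone'   => trim((string)$obj->span),
        'address' => (string)$obj->p,
    ];
}

// Well-formed fragment, as in the question.
$good = extract_contact(
    '<a class="name" href="link.html">Link Text</a>'
    . '<span class="phone"> 555 123 123 </span>'
    . '<p class="address secondary">1407 Mayfair St, City West</p>'
);

// Typical HTML that is not well-formed XML: <br> is never closed.
$bad = extract_contact('<p>Line one<br>Line two</p>');

var_dump($good !== null, $bad === null);
```

When the real input is not guaranteed to be well-formed, an HTML-aware parser such as DOMDocument::loadHTML() may be the safer choice.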
LVL 15

Assisted Solution

by:Insoftservice
Insoftservice earned 250 total points
ID: 39846473
You could try the link below; it might help you resolve your issue.

http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_27981504.html
