Solved

Grabbing HTML included in XML elements

Posted on 2004-08-19
Last Modified: 2013-11-19
Hi,

I have the following XML code that I want to parse using PHP 4:

<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE GlossarXML [
  <!ELEMENT EXPLANATION (SHORT, DETAILED, BIBLIOGRAPHY, HLINKS )>
    <!ELEMENT SHORT (#PCDATA)>
    <!ELEMENT DETAILED (#PCDATA)>
    <!ELEMENT BIBLIOGRAPHY (SOURCE)>
      <!ELEMENT SOURCE (#PCDATA)>
    <!ELEMENT HLINKS (HLINK)>
      <!ELEMENT HLINK (HLINKTITLE, HLINKREF)>
        <!ELEMENT HLINKTITLE (#PCDATA)>
        <!ELEMENT HLINKREF (#PCDATA)>  
]>
 
<EXPLANATION>
  <SHORT>
     <p>this is a <b><font color="red">short</font></b> explanation showed on WAP devices.</p>
  </SHORT>
  <DETAILED>
     <p>this is the <b><font color="green">detailed</font></b> text displayed on non-WAP devices.</p>
  </DETAILED>
  <BIBLIOGRAPHY>
     <SOURCE>
        Berliner Abendblatt, 3. Ausgabe, Seite 3, Absatz 4
     </SOURCE>
  </BIBLIOGRAPHY>
  <HLINKS>
     <HLINK>
        <HLINKTITLE>Heise Verlag</HLINKTITLE>
        <HLINKREF>http://www.heise.de</HLINKREF>
     </HLINK>
  </HLINKS>
</EXPLANATION>


So, what I need is a PHP script that parses the XML code (assume it is contained in a string $xml_content).

The parser has to grab the content of the XML elements and sub-elements and put it into an associative array.

BUT for the elements SHORT and DETAILED it should return everything between the surrounding XML tags as a whole, not the content of each included HTML tag!

Who can help?

Question by:WebFerret
 
LVL 48

Expert Comment

by:hernst42
ID: 11846720
Try using this regular expression for that case:

preg_match_all('/<(SHORT|DETAILED)>(.*)<\/\1>/iUs', $xml_content, $m);
var_dump($m);

$m has the following structure:
count($m[1]) : number of found short and detailed tags
$m[1][$i] = type of $i th tag (short or detailed) found
$m[2][$i] = content of the $i th tag
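
As a rough sketch of how that structure could be consumed (assuming $xml_content holds a shortened version of the document from the question; $result is only an illustrative name), the matches can be folded into an array keyed by tag name:

```php
<?php
// Shortened sample of the XML from the question (assumption: the real input is larger).
$xml_content = '<EXPLANATION>'
    . '<SHORT><p>this is a <b>short</b> explanation.</p></SHORT>'
    . '<DETAILED><p>this is the <b>detailed</b> text.</p></DETAILED>'
    . '</EXPLANATION>';

// Same pattern as above: U makes .* ungreedy, s lets . match newlines.
preg_match_all('/<(SHORT|DETAILED)>(.*)<\/\1>/iUs', $xml_content, $m);

// $result is a hypothetical name: tag name => list of raw HTML strings.
$result = array();
for ($i = 0; $i < count($m[1]); $i++) {
    $result[strtoupper($m[1][$i])][] = trim($m[2][$i]);
}

print_r($result);
```

This only covers SHORT and DETAILED; the other elements would still need a parser or further patterns.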
 

Author Comment

by:WebFerret
ID: 11846774
LOL, okay, I forgot to mention that I don't want a RegEx solution but a generic solution that recognizes HTML code and returns it as a single object rather than split into the separate HTML tags...

 
LVL 6

Expert Comment

by:merwetta1
ID: 11848123
I think your solution will require some RegEx. What's wrong with a RegEx solution, and what is a "generic solution"?
 
LVL 48

Accepted Solution

by:
hernst42 earned 330 total points
ID: 11849343
With an XML parser it would look like the following; you just have to implode the tags back into text while the parser is inside SHORT or DETAILED:


function MENUstartElement($parser, $name, $attrs) {
    switch (strtoupper($name)) {
        case 'SHORT':
            $GLOBALS['inTag']=true;
            $GLOBALS['TagType']='s';
            break;

        case 'DETAILED':
            $GLOBALS['inTag']=true;
            $GLOBALS['TagType']='d';
            break;

        default:
            if ($GLOBALS['inTag']) {
                $GLOBALS['tmpTagContent'] .= "<$name";
                if ( count($attrs) >0) {
                    foreach ($attrs as $n => $v) {
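                        // Escape the value so quotes or ampersands cannot break the rebuilt tag.
                        $GLOBALS['tmpTagContent'] .= " $n=\"" . htmlspecialchars($v) . "\"";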
                    }
                }
                $GLOBALS['tmpTagContent'] .= '>';
            }
    }
}

function MENUendElement($parser, $name) {
    switch (strtoupper($name)) {
        case 'SHORT':
        case 'DETAILED':
            $GLOBALS['inTag']=false;
            $GLOBALS['extractedText'][$GLOBALS['TagType']][] = $GLOBALS['tmpTagContent'];
            $GLOBALS['tmpTagContent'] = '';
            break;

        default:
            if ($GLOBALS['inTag']) {
                $GLOBALS['tmpTagContent'] .= "</$name>";
            }
    }
}

function MENUcharacterData($parser, $data) {
    if ($GLOBALS['inTag']) {
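        // The parser hands over decoded entities, so re-encode them for the HTML output.
        $GLOBALS['tmpTagContent'] .= htmlspecialchars($data);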
    }
}

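// Initialise the state shared by the handlers to avoid undefined-index notices.
$GLOBALS['inTag'] = false;
$GLOBALS['TagType'] = '';
$GLOBALS['tmpTagContent'] = '';
$GLOBALS['extractedText'] = array();

$xml_parser = xml_parser_create();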
xml_parser_set_option($xml_parser,XML_OPTION_CASE_FOLDING,0);
xml_set_element_handler($xml_parser, "MENUstartElement", "MENUendElement");
xml_set_character_data_handler($xml_parser, "MENUcharacterData");

// Report an error only when parsing actually failed.
if (!xml_parse($xml_parser, $xml_content, true)) {
    printf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($xml_parser)),
        xml_get_current_line_number($xml_parser));
}

xml_parser_free($xml_parser);

var_dump($GLOBALS['extractedText']);
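
The code above only collects SHORT and DETAILED. A minimal sketch of extending the same idea to the associative array asked for in the question might look like this. It is not part of the accepted answer: the names $state, startEl, endEl and charData are hypothetical, the input is shortened, and text that sits next to child elements outside SHORT/DETAILED is simply dropped.

```php
<?php
// Shortened sample input; element names follow the question.
$xml_content = '<EXPLANATION>'
    . '<SHORT><p>a <b>short</b> text</p></SHORT>'
    . '<BIBLIOGRAPHY><SOURCE>Berliner Abendblatt</SOURCE></BIBLIOGRAPHY>'
    . '<HLINKS><HLINK><HLINKTITLE>Heise Verlag</HLINKTITLE>'
    . '<HLINKREF>http://www.heise.de</HLINKREF></HLINK></HLINKS>'
    . '</EXPLANATION>';

// Hypothetical shared state: 'raw' lists the elements kept as HTML strings,
// 'depth' is the nesting level inside such an element (0 = outside).
$state = array(
    'raw'   => array('SHORT', 'DETAILED'),
    'depth' => 0,
    'buf'   => '',
    'out'   => array(),
);

function startEl($parser, $name, $attrs) {
    global $state;
    if ($state['depth'] > 0) {
        // Inside SHORT/DETAILED: rebuild the HTML tag as text.
        $state['buf'] .= "<$name";
        foreach ($attrs as $n => $v) {
            $state['buf'] .= " $n=\"" . htmlspecialchars($v) . "\"";
        }
        $state['buf'] .= '>';
        $state['depth']++;
    } else {
        // Any other element starts with an empty text buffer.
        $state['buf'] = '';
        if (in_array($name, $state['raw'])) {
            $state['depth'] = 1;
        }
    }
}

function endEl($parser, $name) {
    global $state;
    if ($state['depth'] > 1) {
        // Closing an HTML tag nested inside SHORT/DETAILED.
        $state['buf'] .= "</$name>";
        $state['depth']--;
        return;
    }
    $state['depth'] = 0;
    $text = trim($state['buf']);
    if ($text !== '') {
        // Element name => list of contents; pure container elements are skipped.
        $state['out'][$name][] = $text;
    }
    $state['buf'] = '';
}

function charData($parser, $data) {
    global $state;
    // Re-encode entities only where the text becomes part of an HTML string.
    $state['buf'] .= ($state['depth'] > 0) ? htmlspecialchars($data) : $data;
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'startEl', 'endEl');
xml_set_character_data_handler($parser, 'charData');

if (!xml_parse($parser, $xml_content, true)) {
    printf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser));
}

print_r($state['out']);
```

Keeping a depth counter instead of a boolean flag means the end of SHORT or DETAILED is detected by nesting level rather than by name, so the list in 'raw' can be changed without touching the handlers.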

 
LVL 48

Expert Comment

by:hernst42
ID: 12865329
   Accept: hernst42 {http:#11849343}