How to sort an XML file by "title" using PHP

Asked by webmanager

Hi there,

I have an XML file with about 1000 entries in it.

The XML is formatted similar to this:




<?xml version="1.0" encoding="UTF-8"?>
<urlset xmls="http://www.sitemaps.org/schemas/sitemap/0.9">

<url>
<title>Programs</title>
<loc>programs.php</loc>
<changefreq>always</changefreq>
<priority>0.9</priority>
</url>

<url>
<title>Training</title>
<loc>training.php</loc>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>

<url>
<title>College</title>
<loc>college.php</loc>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>

<url>
<title>Academic</title>
<loc>academic.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

<url>
<title>University</title>
<loc>University.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

<url>
<title>Education</title>
<loc>education.php</loc>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>

<url>
<title>Apply now</title>
<loc>apply.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

<url>
<title>Policies</title>
<loc>policies.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

</urlset>



My current PHP code looks like this:

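<?php

// Load sitemap.xml from disk (third argument true = first argument is a path or URL)
$xml_file = new SimpleXMLElement('sitemap.xml', 0, true);

echo '<ul>';

foreach ($xml_file->url as $itemChild) {
      // Escape the values before placing them in HTML
      echo '<li><a href="' . htmlspecialchars((string)$itemChild->loc) . '">' . htmlspecialchars((string)$itemChild->title) . "</a></li>\n";
}

echo '</ul>';
?>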


I want to sort the XML by the title attribute, and at each new letter in the alphabet, have a new list started.  Basically, I want to create an automated A-Z index listing.

I'm new to XML, so I'm not sure what to do here.
gr8gonzo:

PHP's SimpleXML has no built-in way to sort elements, so it's easier to convert the XML to an array and sort that. Then you can just loop through the array like normal. The snippet below converts the XML to an array and sorts the <url>s by <title> in descending order (pass SORT_ASC instead of SORT_DESC for A-Z):


<?php

$xmlString = '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmls="http://www.sitemaps.org/schemas/sitemap/0.9">

<url>
<title>Programs</title>
<loc>programs.php</loc>
<changefreq>always</changefreq>
<priority>0.9</priority>
</url>

<url>
<title>Training</title>
<loc>training.php</loc>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>

<url>
<title>College</title>
<loc>college.php</loc>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>

<url>
<title>Academic</title>
<loc>academic.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

<url>
<title>University</title>
<loc>University.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

<url>
<title>Education</title>
<loc>education.php</loc>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>

<url>
<title>Apply now</title>
<loc>apply.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

<url>
<title>Policies</title>
<loc>policies.php</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>

</urlset>
';

function xml2array($xml, $options = 0, $isURL = false){
  // $options is a bitmask of LIBXML_* constants; $isURL = true means $xml is a path or URL
  $sxi = new SimpleXmlIterator($xml, $options, $isURL);
  return sxiToArray($sxi);
}

function sxiToArray($sxi){
  $a = array();
  for( $sxi->rewind(); $sxi->valid(); $sxi->next() ) {
    if(!array_key_exists($sxi->key(), $a)){
      $a[$sxi->key()] = array();
    }
    if($sxi->hasChildren()){
      $a[$sxi->key()][] = sxiToArray($sxi->current());
    }
    else{
      $a[$sxi->key()] = strval($sxi->current());
    }
  }
  return $a;
}

function array_csort() {        //coded by Ichier2003 
    $args = func_get_args(); 
    $marray = array_shift($args);  
    $msortline = 'return(array_multisort('; 
    $i = 0;
    foreach ($args as $arg) { 
        $i++; 
        if (is_string($arg)) { 
            foreach ($marray as $row) { 
                $sortarr[$i][] = $row[$arg]; 
            } 
        } else { 
            $sortarr[$i] = $arg; 
        } 
        $msortline .= '$sortarr['.$i.'],'; 
    } 
    $msortline .= '$marray));'; 
    eval($msortline); 
    return $marray; 
} 

// Parse the XML string and print the sorted results:
$catArray = xml2array($xmlString,0,false);

$catArray["url"] = array_csort($catArray["url"],"title",SORT_DESC);

print_r($catArray);
?>


By the way, the sxiToArray / xml2array code is a modified version of the code found on the manual page for SimpleXMLIterator.
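
For what it's worth, on current PHP versions the eval()-based array_csort helper above could probably be swapped for a plain usort() call. This is only a minimal sketch, assuming each row is an array with a string "title" key, as produced by the conversion above; sortByKey is a made-up helper name, not something from the manual, and the sample rows are cut down from the question's data.

```php
<?php
// Hypothetical helper (not part of the original snippet): sorts rows by one
// string column, case-insensitively, without building code for eval().
function sortByKey(array $rows, string $key, bool $descending = false): array
{
    usort($rows, function (array $a, array $b) use ($key, $descending): int {
        $cmp = strcasecmp($a[$key], $b[$key]);
        return $descending ? -$cmp : $cmp;
    });
    return $rows;
}

// Sample rows, shaped like the entries in $catArray["url"].
$urls = [
    ['title' => 'Programs', 'loc' => 'programs.php'],
    ['title' => 'Academic', 'loc' => 'academic.php'],
    ['title' => 'Training', 'loc' => 'training.php'],
];

$sorted = sortByKey($urls, 'title');          // A-Z
$reversed = sortByKey($urls, 'title', true);  // Z-A

print_r(array_column($sorted, 'title'));
```

A closure keeps the comparison readable and avoids eval(), at the cost of handling only one sort column per call.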
webmanager (ASKER):

How do I pull the XML in from an external file? I need to keep it outside the PHP script. I tried this, but no luck:

$xmlString = 'include("sitemap.xml")';

Yes, the php and xml files are in the same directory

gr8gonzo:
Just change $xmlString to the URL:

$xmlString = "http://www.somedomain.com/sitemap.xml";

webmanager (ASKER):
gr8gonzo, that doesn't work...

this does though.

$xmlString = file_get_contents("sitemap.xml");

Having said that, the content is just being dumped as an array...  I need it in ULs, with a new UL for each letter of the alphabet.  So all the titles starting with A are in the same UL, then another for the Bs, etc.
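
A note on why the two attempts above behave differently: the snippet calls xml2array() with $isURL set to false, so the first argument is parsed as XML text rather than treated as a location. That is why a string returned by file_get_contents() works while a bare URL or filename does not. Below is a rough sketch of the two ways to load from a file; the temporary file and its contents are made up so the example runs on its own.

```php
<?php
// Write a tiny sample file so the sketch is self-contained (hypothetical data).
$path = tempnam(sys_get_temp_dir(), 'sitemap');
file_put_contents($path, '<?xml version="1.0" encoding="UTF-8"?>
<urlset><url><title>Programs</title><loc>programs.php</loc></url></urlset>');

// Option 1: read the file into a string, then parse the string.
$fromString = new SimpleXMLIterator(file_get_contents($path));

// Option 2: pass the path itself and set the third argument to true,
// which tells SimpleXML that the first argument is a path or URL.
$fromFile = new SimpleXMLIterator($path, 0, true);

echo $fromString->url[0]->title, "\n";
echo $fromFile->url[0]->title, "\n";

unlink($path); // clean up the sample file
```

With the xml2array() function from the earlier post, option 2 would correspond to passing true as the third argument.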

Ray:
Here is an example of how I "sorted" some XML.  It's not pretty, but you may be able to apply the concepts to your needs.

Grouping the output into HTML <ul> blocks really should be posted as a separate question.  It's not hard - you just keep the "old letter" and compare it to the "new letter" - when that changes, you are in a new <ul> block and you create the appropriate tags.

best to all, ~Ray
<?php // RAY_sort_XML_2.php
error_reporting(E_ALL);
echo "<pre>\n"; // READABILITY

// CONSUME XML AND REPORT IT OUT IN A SORTED ORDER

// TEST DATA WRAPPED IN A REASONABLE XML PACKAGE
$xml = '<?xml version="1.0" encoding="UTF-8"?>
<Package>
<ALPHA ID="1">
  <NAME>Boston</NAME>
  <DATE>10/30/2009 3:45:00 PM</DATE>
</ALPHA>

<ALPHA ID="1">
  <NAME>LA</NAME>
  <DATE>10/29/2009 3:45:00 PM</DATE>
</ALPHA>

<ALPHA ID="1">
  <NAME>Miami</NAME>
  <DATE>10/31/2009 3:45:00 PM</DATE>
</ALPHA>

<ALPHA ID="2">
  <NAME>Paris</NAME>
  <DATE>10/27/2009 3:45:00 PM</DATE>
</ALPHA>

<ALPHA ID="2">
  <NAME>London</NAME>
  <DATE>10/24/2009 3:45:00 PM</DATE>
</ALPHA>

<ALPHA ID="2">
  <NAME>Madrid</NAME>
  <DATE>10/30/2009 3:45:00 PM</DATE>
</ALPHA>
</Package>';


// MAKE AN OBJECT
$obj = SimpleXML_Load_String($xml);
// VISUALIZE THE OBJECT
// var_dump($obj);

// PREPARE THE ARRAYS THAT WE WILL SORT
$attr_array = array();
$date_array = array();

// ITERATE OVER THE OBJECT TO INJECT A SORT CODE
foreach ($obj->ALPHA as $thing)
{
// CREATE AN IDENTITY FOR THE OBJECT
// (SimpleXMLElement CANNOT BE SERIALIZED IN CURRENT PHP, SO HASH ITS XML INSTEAD)
   $object_id = md5($thing->asXML());

// INJECT THE ID INTO THE OBJECT
   $thing->ObjectID = $object_id;

// GET THE ATTRIBUTE ID
   $sort_attr = (string)$thing["ID"];

// PRODUCE A SORTABLE ISO8601 DATE
   $sort_date = date('c', strtotime((string)$thing->DATE));

// CREATE ARRAYS THAT WE CAN SORT
   $attr_array[$object_id] = $sort_attr;
   $date_array[$object_id] = $sort_date;
}

// SORT ASCENDING BY ATTR ID AND DESCENDING BY DATE (ONCE, AFTER THE LOOP)
// MAN PAGE: http://us2.php.net/manual/en/function.arsort.php
asort($attr_array);
arsort($date_array);

var_dump($obj);
var_dump($attr_array);
var_dump($date_array);
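
The "old letter" versus "new letter" comparison described at the top of this post could look roughly like the sketch below. It assumes the rows have already been converted to arrays with "title" and "loc" keys and that titles start with a plain ASCII character; renderAzIndex and the <h2> letter headings are illustrative choices, not something from the original code.

```php
<?php
// Hypothetical helper: sorts rows by title and emits one <ul> per first letter.
function renderAzIndex(array $rows): string
{
    usort($rows, function (array $a, array $b): int {
        return strcasecmp($a['title'], $b['title']);
    });

    $html = '';
    $oldLetter = null;
    foreach ($rows as $row) {
        // Assumes single-byte (ASCII) first characters.
        $newLetter = strtoupper(substr($row['title'], 0, 1));
        if ($newLetter !== $oldLetter) {
            if ($oldLetter !== null) {
                $html .= "</ul>\n";            // close the previous letter's list
            }
            $html .= '<h2>' . htmlspecialchars($newLetter) . "</h2>\n<ul>\n";
            $oldLetter = $newLetter;
        }
        $html .= '<li><a href="' . htmlspecialchars($row['loc']) . '">'
               . htmlspecialchars($row['title']) . "</a></li>\n";
    }
    if ($oldLetter !== null) {
        $html .= "</ul>\n";                    // close the last list
    }
    return $html;
}

$rows = [
    ['title' => 'Programs',  'loc' => 'programs.php'],
    ['title' => 'Academic',  'loc' => 'academic.php'],
    ['title' => 'Apply now', 'loc' => 'apply.php'],
    ['title' => 'Policies',  'loc' => 'policies.php'],
];
$html = renderAzIndex($rows);
echo $html;
```

Tracking only the previous letter means the list is built in a single pass over the sorted rows, with no need to pre-group them into a nested array.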


ASKER CERTIFIED SOLUTION by gr8gonzo (members-only; the solution text is not included here)

webmanager (ASKER):
Hi Wizard,

I'll take a look at the code you suggested.

As for me asking something that wasn't in the original...  I did ask in the original (at the end):

"I want to sort the XML by the title attribute, and at each new letter in the alphabet, have a new list started.  Basically, I want to create an automated A-Z index listing."

:-)

That's true. It's just that usually that type of information is provided as context / the end goal (which helps us answer the question listed in the title). :-)

Let me know how the code works.

webmanager (ASKER):
Thanks.

I was able to create a way to put in the section titles fairly easily... I just needed to stop looking at the code for a day.  :-)

Thanks!