jello32 asked:

Reading in a list of files in a directory and checking the description tag

I have a directory of SVG objects at www.anysite.com/svg.  I want to write PHP code that reads each object and checks whether the phrase "QX greater than PX. QY equals PY." appears in the description.  How would I do this?  Here's an example of the SVG code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg"
  xmlns:xlink="http://www.w3.org/1999/xlink"  x="0px" y="0px" width="256px"
  height="256px"  viewBox="0 0 256 256"  enable-background="new   0 0 256 256"
  xml:space="preserve">

<g id="Background_xA0_Image_1_">
  <image overflow="visible" width="256"  height="256" id="Background_xA0_Image">
  </image>
</g>

<g id="Shape_1_1_" enable-background="new    "> <g id="Shape_1"> <g> <ellipse fill-rule="evenodd" clip-rule="evenodd" fill="blue" cx="11" cy="62" rx="8" ry="8"/> </g> </g> </g>

<g id="Shape_2_1_" enable-background="new    "> <g id="Shape_2"> <g> <rect x="165" y="28" fill-rule="evenodd" clip-rule="evenodd" fill="yellow" width="34" height="216"/> </g> </g> </g>

<metadata>
 <rdf:RDF
      xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs = "http://www.w3.org/2000/01/rdf-schema#"
      xmlns:ont = "http://www.anywebsite.com/ontology/svg_ont.owl#"
      xmlns:dc = "http://purl.org/dc/elements/1.1/" >

   <rdf:Description rdf:about="http://www.anywebsite.com/svg"
        dc:title="image 187"
          dc:description="QX greater than PX. QY equals PY. PX less than QX. PY equals QY. S2, yellow, rectangle. S1, blue, circle. S2. S1. S2, yellow, rectangle horizontally aligns with S1, blue, circle. S2 horizontally aligns with S1. S1, blue, circle horizontally aligns with S2, yellow, rectangle. S1 horizontally aligns with S2."
          dc:publisher="Mathis Webs"
          dc:date="2012-11-18"
          dc:format="image/svg+xml"
          dc:language="en" >
       <dc:creator>Regina Mathis</dc:creator>
     </rdf:Description>

   <rdf:Description rdf:about="http://www.anywebsite.com/svg/S1">
     <ont:type>circle</ont:type>
     <ont:color>blue</ont:color>
     <ont:horizontallyAligns>S2</ont:horizontallyAligns>
     <ont:PX>11</ont:PX>
     <ont:PY>57</ont:PY>
   </rdf:Description>

   <rdf:Description rdf:about="http://www.anywebsite.com/svg/S2">
     <ont:type>rectangle</ont:type>
     <ont:color>yellow</ont:color>
     <ont:horizontallyAligns>S1</ont:horizontallyAligns>
     <ont:QX>165</ont:QX>
     <ont:QY>57</ont:QY>
   </rdf:Description>

   </rdf:RDF>
  </metadata>
</svg>


Julian Hansen

Not xpath but it should work.
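<?php
// Read the file into an array of lines.
$lines = file('input.xml');
foreach ($lines as $l) {
   // Look for the start of the dc:description attribute value.
   if (strstr($l, 'description="QX greater than PX. QY equals PY')) {
      echo "found one";
      break; // stop at the first match
   }
}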
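
For comparison with the string search above, here is a rough sketch of the XPath route. It assumes the files look like the sample in the question (an rdf:Description element carrying a dc:description attribute) and that PHP's DOM extension is available. The function name descriptionContains() and the inline $sample string are made up for illustration.

```php
<?php
// Hypothetical helper: returns true if any rdf:Description element in the
// SVG text has a dc:description attribute that contains $needle.
function descriptionContains(string $svgText, string $needle): bool
{
    $doc = new DOMDocument();
    // LIBXML_NONET stops the parser from going to the network for the DTD.
    if (!@$doc->loadXML($svgText, LIBXML_NONET)) {
        return false; // not well-formed XML
    }

    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
    $xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

    foreach ($xpath->query('//rdf:Description/@dc:description') as $attr) {
        if (strpos($attr->nodeValue, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Cut-down stand-in for one of the SVG files (assumption, not a real file).
$sample = '<svg xmlns="http://www.w3.org/2000/svg"><metadata>'
    . '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    . ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
    . '<rdf:Description dc:description="QX greater than PX. QY equals PY."/>'
    . '</rdf:RDF></metadata></svg>';

echo descriptionContains($sample, 'QX greater than PX. QY equals PY.')
    ? "found one" : "not found";
```

Unlike the line-by-line search, this does not depend on the attribute sitting on a single line or on the exact spelling of the dc: prefix, but it will reject files that are not well-formed XML.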


If you will post the true URL of the directory (and if your server does not block external scripts), I will show you how to read it and find all of the desired objects.  If the directory is on your own web server, you can often use scandir() to get the list of names into an array, then read each of the documents with file().
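
A minimal sketch of the scandir() idea, assuming the SVG files sit in a local directory the script can read (scandir() is meant for filesystem paths, not for a plain http URL). The helper name findMatchingSvgFiles() and the ./svg path are placeholders, and file_get_contents() is used in place of file() so each document is searched as one string.

```php
<?php
// Hypothetical helper: returns the names of the .svg files in $dir whose
// text contains $needle. $dir is a local filesystem path.
function findMatchingSvgFiles(string $dir, string $needle): array
{
    $matches = [];
    if (!is_dir($dir)) {
        return $matches; // directory missing: nothing to report
    }
    $names = scandir($dir);
    if ($names === false) {
        return $matches; // directory unreadable
    }
    foreach ($names as $name) {
        // scandir() also returns "." and ".."; the extension check skips them.
        if (strtolower(pathinfo($name, PATHINFO_EXTENSION)) !== 'svg') {
            continue;
        }
        $text = file_get_contents($dir . '/' . $name);
        if ($text !== false && strpos($text, $needle) !== false) {
            $matches[] = $name;
        }
    }
    return $matches;
}

// Assumed location; adjust to wherever the svg directory lives on the server.
print_r(findMatchingSvgFiles(__DIR__ . '/svg', 'description="QX greater than PX. QY equals PY.'));
```
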
jello32 (ASKER)

The true URL of the directory is mathiswebs.com/svg
Unfortunately that URL returns a web page instead of a directory list.  You could write a script to parse the web page and extract the URLs or you could just assume that there would be sequentially numbered files and try to read all of them, ignoring the errors.

This script shows how to read one of the files and display its contents.

<?php // RAY_temp_jello32.php
error_reporting(E_ALL);
echo '<pre>';

// SEE http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_28175592.html#a39300174

// READ ONE FILE
$url = 'http://mathiswebs.com/svg/image100.svg';
$doc = file_get_contents($url);
echo htmlentities($doc);
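
The other option mentioned above, trying sequentially numbered files and ignoring the ones that fail, might look roughly like this. It is only a sketch: the function name probeNumberedSvgs(), the image{N}.svg pattern without zero padding, and the 1..200 range are guesses based on the two file names seen in this thread. The fetcher is passed in as a callable so the loop can be tried without touching the network.

```php
<?php
// Hypothetical helper: tries image{N}.svg for N in [$first, $last] and returns
// the URLs whose text contains $needle. $fetch is any callable that returns
// the document text, or false when the document cannot be read.
function probeNumberedSvgs(string $baseUrl, int $first, int $last, string $needle, callable $fetch): array
{
    $found = [];
    for ($n = $first; $n <= $last; $n++) {
        $url = $baseUrl . '/image' . $n . '.svg';
        $text = $fetch($url);
        if ($text === false) {
            continue; // missing file or network error: skip it
        }
        if (strpos($text, $needle) !== false) {
            $found[] = $url;
        }
    }
    return $found;
}

// The @ hides the warning file_get_contents() raises on a 404; the false
// return value is what signals the miss.
$fetch = function (string $url) {
    return @file_get_contents($url);
};

// Uncomment to run against the real site (the range is an assumption):
// print_r(probeNumberedSvgs('http://mathiswebs.com/svg', 1, 200,
//     'description="QX greater than PX. QY equals PY', $fetch));
```
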


jello32 (ASKER)

But your script doesn't show how I would match "QX greater than PX. QY equals PY" against the description.
$lines = file('input.xml');
foreach($lines as $l) {
   if (strstr($l, 'description="QX greater than PX. QY equals PY')) {
      echo "found one";
      break;
   }
}
                                            


jello32 (ASKER)

that doesn't work.
"doesn't work"
We might need a little more to go on here.  What happened, exactly?  Parse error?  Incorrect output?  Halt and catch fire?  

Please post the script you have tried to put together from the suggestions we have offered here.  I am fairly sure that we can help you shake the bugs out.
jello32 (ASKER)

Nothing happened.  Here's the script
<?php
$lines = file('http://www.mathiswebs.com/svg/image187.svg');
foreach($lines as $l) {
   if (strstr($l, 'description="QX greater than PX. QY equals PY"')) {
      echo "found one";
      break;
   }
}
?>
Try this: http://www.laprbass.com/RAY_temp_jello32a.php

If that seems to make sense to you, then we can go on to the other part which will be to scrape the web page and extract an array of URLs that point to the imageXXX.svg files.  Then we can repeat the process for each of the URLs.  But first make sure you're comfortable with this script.

<?php // RAY_temp_jello32a.php
error_reporting(E_ALL);
echo '<pre>';

// URL OF THE DOCUMENT
$url = 'http://www.mathiswebs.com/svg/image187.svg';

// TEXT OF THE SIGNAL STRING
$sig = 'description="QX greater than PX. QY equals PY';

// READ THE DOCUMENT
$lines = file_get_contents($url);

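// LOOK FOR THE SIGNAL STRING
// strpos() RETURNS false WHEN NOTHING IS FOUND AND CAN RETURN 0 FOR A MATCH AT THE VERY START, SO COMPARE STRICTLY
if (strpos($lines, $sig) !== false)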
{
    echo PHP_EOL . "************************** FOUND ONE" . PHP_EOL;
}

// SHOW THE DOCUMENT
echo htmlentities($lines);


Best regards, ~Ray
That's because there was an extra double quote on the end. This works.

<?php
$lines = file('http://www.mathiswebs.com/svg/image187.svg');
foreach($lines as $l) {
   if (strstr($l, 'description="QX greater than PX. QY equals PY. PX less than QX. PY equals QY')) {
      echo "found one";
      break;
   }
}
?> 
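
To see why the trailing double quote matters, here is a small standalone check. The $line string is a shortened stand-in for the dc:description line in the file, not the full text. The attribute value carries on after "QY equals PY", so a search string that ends with a closing quote at that point can never match.

```php
<?php
// Shortened stand-in for the dc:description line of image187.svg.
$line = '  dc:description="QX greater than PX. QY equals PY. PX less than QX. PY equals QY. S2, yellow, rectangle."';

// With the extra closing quote: the text after "PY" in the file is a period, not a quote.
var_dump(strpos($line, 'description="QX greater than PX. QY equals PY"') !== false); // bool(false)

// Without it: the search string is a genuine prefix of the attribute.
var_dump(strpos($line, 'description="QX greater than PX. QY equals PY') !== false);  // bool(true)
```
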


jello32 (ASKER)

OK, your previous comment seems to work.  Now how do we scrape the web page and extract an array of URLs that point to the imageXXX.svg files?
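
One possible shape for that scraping step, offered only as a sketch and not as the accepted answer: it assumes the page at mathiswebs.com/svg links to the files through ordinary href attributes ending in imageNNN.svg, which has not been confirmed in this thread. The function name extractSvgUrls() is invented, and the relative-link handling is deliberately simplistic.

```php
<?php
// Hypothetical helper: pulls every href that ends in image<digits>.svg out of
// an HTML page and returns the links as a de-duplicated array of absolute URLs.
function extractSvgUrls(string $html, string $baseUrl): array
{
    $urls = [];
    $pattern = '/href\s*=\s*["\']([^"\']*image\d+\.svg)["\']/i';
    if (preg_match_all($pattern, $html, $m)) {
        foreach ($m[1] as $href) {
            // Simplistic: anything that is not already absolute is treated
            // as relative to $baseUrl.
            if (!preg_match('#^https?://#i', $href)) {
                $href = rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
            }
            $urls[$href] = true; // array keys drop repeated links
        }
    }
    return array_keys($urls);
}

// Uncomment to run against the real site, then reuse the earlier search:
// $page = file_get_contents('http://mathiswebs.com/svg');
// foreach (extractSvgUrls($page, 'http://mathiswebs.com/svg') as $url) {
//     $doc = @file_get_contents($url);
//     if ($doc !== false && strpos($doc, 'description="QX greater than PX. QY equals PY') !== false) {
//         echo $url . PHP_EOL;
//     }
// }
```
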
ASKER CERTIFIED SOLUTION
Ray Paseur

jello32 (ASKER)

Thanks!