Commandline xpath query

mreuring
I'm writing an AppleScript action for iTunes that needs to query some information from an XML file, which isn't available natively :(. Currently I'm using 'xmllib' from http://www.satimage.fr/software/en/downloads/downloads_companion_osaxen.html to run xpath queries in AppleScript. But I want to write this script without any 3rd-party dependencies, if at all possible, and for this part of the process that I'm automating, XMLLib is the only 3rd-party dependency.

So, now I was wondering if I could use any of the built-in (pre-compiled) tools that will run from commandline to query an xml-file using xpath. Considering an xpath that returns a string, '//Series[1]/id[1]/text()', this would be quite acceptable to me and would negate the need for installing the above-mentioned scripting plugin. For instance, I was wondering if Perl or PHP could be abused to do this for me. I've seen similar tricks somewhere before, but I have hardly any experience with PHP and none with Perl, nor is Bash scripting among my strengths :)

Requirements would be for this commandline tool to be available on OS X 10.4 running on a PowerPC, since my media server is an old G4 that doesn't run any later version :)

FYI, the AppleScript action, when 'finished', will be open-sourced.

Top Expert 2009

Commented:
If you post a sample of the XML file, and what you'd like from it, we may be able to get you something.

Author

Commented:
Alright, I've posted a partial of one of the xml-files I'm working with.

From this file I'm pulling:
"//Episode[DVD_episodenumber=" & (episode of myEpisodeInfo) & "][SeasonNumber=" & (season of myEpisodeInfo) & "]"
And from the resulting node I pull:
"/Overview/text()"
"/EpisodeName/text()"
"/FirstAired/text()"
Then for genre I jump back to the root for this:
"//Series[1]/Genre[1]/text()"

I was thinking of grabbing all the info in individual queries, instead of grabbing the node first. This would be less efficient, but I cannot conceive of how to return multiple results from a shell-command back to AppleScript... Anyways, I have a reasonable grasp of xpath, it's getting the result without 3rd-party patching that eludes me :)

<?xml version="1.0" encoding="UTF-8" ?>
<Data><Series>
  <id>75682</id>
  <Actors>|Emily Deschanel|David Boreanaz|T.J. Thyne|John Francis Daley|Michaela Conlin|Tamara Taylor|Eric Millegan|Jonathan Adams|</Actors>
  <Genre>|Drama|</Genre>
  <IMDB_ID>tt0460627</IMDB_ID>
  <Language>en</Language>
  <Network>FOX</Network>
  <NetworkID></NetworkID>
  <Overview>Lorem Ipsum</Overview>
  <Rating>8.6</Rating>
  <SeriesID>33332</SeriesID>
  <fanart>fanart/original/75682-2.jpg</fanart>
</Series>
<Episode>
  <id>298561</id>
  <Combined_episodenumber>2</Combined_episodenumber>
  <Combined_season>1</Combined_season>
  <DVD_chapter></DVD_chapter>
  <DVD_discid></DVD_discid>
  <DVD_episodenumber></DVD_episodenumber>
  <DVD_season></DVD_season>
  <Director>Allan Kroeker</Director>
  <EpImgFlag></EpImgFlag>
  <EpisodeName>The Man In The SUV</EpisodeName>
  <EpisodeNumber>2</EpisodeNumber>
  <FirstAired>2005-09-20</FirstAired>
  <GuestStars>|Jose Zuniga|Nicholas Massouh|Ane Dudek|Bahar Soomekh|Federico Dordei|Said Faraj|Dave Roberson|Tracy Howe|</GuestStars>
  <IMDB_ID></IMDB_ID>
  <Language>en</Language>
  <Overview>Lorum Ipsum</Overview>
  <ProductionCode>1AKY02</ProductionCode>
  <Rating>7.5</Rating>
  <SeasonNumber>1</SeasonNumber>
  <Writer>Stephen Nathan</Writer>
  <absolute_number></absolute_number>
  <filename>episodes/75682/298561.jpg</filename>
  <lastupdated>1231598855</lastupdated>
  <seasonid>9191</seasonid>
  <seriesid>75682</seriesid>
</Episode>
</Data>
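
The lookups described above (find one Episode by season and episode number, read a few of its child elements, then go back to the root for the Genre) can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration and not something used in this thread. It assumes Python 2.7 or later, which is newer than the Python bundled with OS X 10.4. ElementTree implements just a subset of xpath: there is no text() step, and paths are relative to an element, hence the leading ".//". The function name episode_info and the cut-down SAMPLE document are made up for the example, and it matches on EpisodeNumber because DVD_episodenumber is empty in the posted file.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the sample document posted above.
SAMPLE = """<Data>
<Series><id>75682</id><Genre>|Drama|</Genre></Series>
<Episode>
  <EpisodeName>The Man In The SUV</EpisodeName>
  <EpisodeNumber>2</EpisodeNumber>
  <FirstAired>2005-09-20</FirstAired>
  <Overview>Lorum Ipsum</Overview>
  <SeasonNumber>1</SeasonNumber>
</Episode>
</Data>"""


def episode_info(root, season, episode):
    """Return a dict of fields for one episode, or None if it is not found."""
    # Two chained predicates: both child elements must have the given text.
    query = ".//Episode[EpisodeNumber='%d'][SeasonNumber='%d']" % (episode, season)
    node = root.find(query)
    if node is None:
        return None
    info = {}
    for field in ("EpisodeName", "Overview", "FirstAired"):
        info[field] = node.findtext(field, default="")
    # Genre lives on the Series element, not on the Episode.
    info["Genre"] = root.findtext("./Series/Genre", default="")
    return info


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    info = episode_info(root, season=1, episode=2)
    # One key=value pair per line keeps the output easy to split afterwards.
    for key in sorted(info):
        print("%s=%s" % (key, info[key]))
```

Printing one key=value pair per line is one possible way to hand several results back from a single shell command: the caller can split the returned text on line breaks instead of running one query per field.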


Author

Commented:
I just noticed that I posted the less common query for getting the 'episode' node, most of the time I'd be searching based on EpisodeNumber instead:
"//Episode[EpisodeNumber=" & (episode of myEpisodeInfo) & "][SeasonNumber=" & (season of myEpisodeInfo) & "]"

Top Expert 2009

Commented:
I'm not familiar with xpath, or how it relates to the XML file.  I was planning on looking into this so I could help some more, but haven't had time.

If you summarize what you want, I can help you get it.

Author

Commented:
Alright, I'll try to start with a quick summary of what xpath does:
xpath can be used to search for, and retrieve, information from an XML file in a way that is quite similar to a 'path' in any shell (DOS, Unix, etc.). With it you can return entire nodes/elements or go as detailed as a single attribute. Also, by using more advanced xpath constructions, you can select elements based on their attributes.
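
To make that description concrete, here is a small sketch in Python using the standard library's xml.etree.ElementTree, which supports a subset of xpath. The document in DOC and all element and attribute names in it are hypothetical and only serve the illustration; they have nothing to do with the episode files discussed here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, only for illustration.
DOC = """<library>
  <shelf name="fiction">
    <book lang="en"><title>First</title></book>
    <book lang="nl"><title>Tweede</title></book>
  </shelf>
</library>"""

root = ET.fromstring(DOC)

# Path-like navigation: every title of every book on every shelf.
titles = [t.text for t in root.findall("./shelf/book/title")]

# Selecting on an attribute value: only books with lang="nl".
dutch = [t.text for t in root.findall("./shelf/book[@lang='nl']/title")]

# Going as detailed as a single attribute of one element.
shelf_name = root.find("./shelf").get("name")

print(titles, dutch, shelf_name)
```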

In my particular case I have an XML document which is filled with episodes for a particular (TV) series, and I am querying it for one particular episode. After retrieving the episode I need to query the resulting element for information regarding its title, description and possibly some other info when I want to expand the script.
Also I'm retrieving some information from the 'Series' node regarding the Genre of the series, which in iTunes needs to be stored for every single 'track'.

The reason I was suggesting Perl is because I know it's possible to write commandline 'tools' with it, it's installed by default since OS X 10.4 and it has an xml/xpath API installed. The problem is, I've never written anything in Perl so I don't know how to do it myself, while it sounds like it could be a rather easy thing to write (in Java it wouldn't take me very long to write it, and I've been considering doing so). All the tool needs to do is receive the path to an XML file and an xpath query, process it and then return the result as a string.
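
The tool described above (take a file path and a query, print the result as a string) could look roughly like the following Python sketch. It is not the Perl tool asked for, and it assumes a Python with xml.etree.ElementTree in its standard library (2.7 or later), which is newer than what OS X 10.4 bundles. ElementTree only understands a subset of xpath, so the sketch drops a trailing /text() and turns a leading // into .//; absolute paths starting with a single / and queries that return several nodes are not handled. The function name xpath_string is made up for the example.

```python
import sys
import xml.etree.ElementTree as ET


def xpath_string(xml_path, query):
    """Return the text of the first element matching query, or '' if none."""
    # ElementTree has no text() step; the element's text is read directly,
    # so a trailing /text() is simply dropped.
    if query.endswith("/text()"):
        query = query[:-len("/text()")]
    # ElementTree paths are relative to the root element, so
    # "//Series/id" is rewritten to ".//Series/id".
    if query.startswith("//"):
        query = "." + query
    root = ET.parse(xml_path).getroot()
    node = root.find(query)
    if node is None or node.text is None:
        return ""
    return node.text.strip()


# Command-line use: two arguments, the XML file and the query.
if __name__ == "__main__" and len(sys.argv) == 3:
    print(xpath_string(sys.argv[1], sys.argv[2]))
```

Saved under a hypothetical name such as xpath.py, this would be called with the XML file as the first argument and the query (in single quotes) as the second, and whatever it prints becomes the result of the shell command.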

Author

Commented:
Adam314 still seems interested in bringing this issue to a conclusion. Since this part of my project is not under any time pressure, it has not been resolved through other means, and I would like to await Adam's reply.

Author

Commented:
Ok, so I had another short inspirational session of googling and came across 'http://uszla.me.uk/space/blog/2007/09/21', which uses xmllint to run an xpath query against an XML file. I adapted it a little to more closely match my needs and have something that works. It ain't pretty, if you ask me, but it works. Also, xmllint comes installed on an OS X 10.4 box, at least one that has been regularly updated.

Since this still ain't the pretty solution I was hoping for, I'm still holding out for a custom script using Perl or Ruby or whatever can be used to make something more elegant. But in case that doesn't happen, I have something that works well enough.
-- *
-- xpath query using xmllint shell mode. This method throws the xpath query at xmllint using an echo
-- statement. Since this tool lives on tiger as well as leopard it is a relatively 'safe' method of
-- running an xpath query from applescript. Currently this method only works on queries that select
-- a single node!
-- The result of the shellCommand is trimmed and set to an empty string if the result started with the
-- query (which indicates it failed or multiple nodes were part of the selection).
-- For the idea of creating this method I would refer to: http://uszla.me.uk/space/blog/2007/09/21
--
-- @parameter query A String containing the xpath query you want to use
-- @parameter source A String containing the xml, or a POSIX path to the xml-file you want to query
-- @parameter fromFile A class set to 'file' if the source is a file, otherwise a file is created and source is stored in that file.
--
-- Note: NL, TEMP_FILE and the trim handler are not defined in this snippet; they are presumably defined elsewhere in the script.
-- */
on xpath for query against source from fromFile
	set shellPipe to "|"
	set shellEcho to "echo \"cd " & query & NL & "cat\""
	set shellCommand to "/usr/bin/xmllint --noent --shell "
	set shellCleanup to "sed 's/^[/a-zA-Z][^<]*>//g'"
	set shellCaptureError to " 2>&1" -- For capturing the errorout
	
	-- If source is not from a file, create a temp file...	
	if fromFile is file then
		set filePath to quoted form of POSIX path of source
	else
		set theFilePath to TEMP_FILE
		set filePath to quoted form of POSIX path of theFilePath
		try
			do shell script "/bin/rm " & filePath
		on error
			--Ignore it, no need to worry...
		end try
		set theFileReference to open for access theFilePath with write permission
		write source to theFileReference
		close access theFileReference
	end if
	
	set shellScript to (shellEcho & shellPipe & shellCommand & filePath & shellCaptureError & shellPipe & shellCleanup)
	set shellResult to (do shell script shellScript)
	
	--If the result starts with the query, something went wrong...
	if (shellResult contains query) then
		-- display dialog (shellScript & NL & "'" & trim(true, shellResult) & "'")
		set shellResult to ""
	else
		set shellResult to trim(true, shellResult)
	end if
	
	return shellResult
end xpath


Top Expert 2009
Commented:
Here is a Perl script that will display the requested parameters.  If you'd prefer the output in a different format, let me know.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use XML::Simple;
 
my $xml = XMLin('file1.xml');
 
foreach (qw(EpisodeNumber SeasonNumber Overview EpisodeName FirstAired)) {
	print "$_=$xml->{Episode}->{$_}\n";
}
print "Series Genre=$xml->{Series}->{Genre}\n";


Author

Commented:
I'll test it out and let you know. I'll probably be able to tweak the output; I just didn't know anything about building a Perl script. Like I said, I'll get back to you!
Top Expert 2009

Commented:
No problem.  The XML::Simple module does all of the reading of the XML file.  To see the structure of the XML once it has been read, you can use:
    print Dumper($xml);

If you have any more questions, let me know.

Author

Commented:
Since this Perl script is by far more tweakable and simpler than what I have, I'm gonna go with it and will probably adapt it to more closely match my needs as I go along :) Thanx!
