Solved

REXML only gives me nil responses when parsing tree.

Posted on 2007-08-09
Last Modified: 2013-11-05
Good afternoon,

I'm trying to parse the results I get from Google's code search. The response is XML like this (I hope it comes out all right):

<feed>
<entry xmlns:gcs="http://schemas.google.com/codesearch/2006">
  <id>http://www.google.com/codesearch?q=malloc+show:fLwrFGa3hx</id>
  <updated>2006-10-02T15:08:42Z</updated>
  <author>
    <name>Code owned by external author.</name>
  </author>
  <title type="text">w3c-libwww-5.4.0/Library/src/wwwsys.h</title>
  <link rel="alternate" type="text/html"
      href="http://www.google.com/codesearch?q=malloc+show:fLwrFGa3hxs"/>
  <gcs:package name="http://www.w3.org/Library/Distribution/w3c-libwww-5.4.0.zip"
      uri="http://www.w3.org/Library/Distribution/w3c-libwww-5.4.0.zip"/>
  <gcs:file name="w3c-libwww-5.4.0/Library/src/wwwsys.h"/>
  <gcs:match lineNumber="706" type="text/html">
    &lt;pre&gt;Memory Module for how to handle &lt;b&gt;malloc&lt;/b&gt; and &lt;/pre&gt;
  </gcs:match>
  <gcs:match lineNumber="716" type="text/html">
    &lt;pre&gt;#define &lt;b&gt;malloc&lt;/b&gt;      VAXC$&lt;b&gt;MALLOC&lt;/b&gt;_OPT&lt;/pre&gt;
  </gcs:match>
  <gcs:match lineNumber="1032" type="text/html">
    &lt;pre&gt;/* &lt;b&gt;malloc&lt;/b&gt;.h */&lt;/pre&gt;
  </gcs:match>
  <gcs:match lineNumber="1033" type="text/html">
    &lt;pre&gt;#ifdef HAVE_&lt;b&gt;MALLOC&lt;/b&gt;_H&lt;/pre&gt;
  </gcs:match>
  <gcs:match lineNumber="1034" type="text/html">
    &lt;pre&gt;#include &lt;&lt;b&gt;malloc&lt;/b&gt;.h&gt;&lt;/pre&gt;
  </gcs:match>
</entry>
</feed>

Of course, there are multiple entries.

When I use REXML and do

doc.elements.each("//entry") { |element| puts element.attributes["file name"] }

I only get 'nil' back instead of the file name attribute I expect.

What I would like is to get result like

filename
 code
filename
 code

etc. What am I doing wrong?

The API endpoint I'm using as an example is

http://www.google.com/codesearch/feeds/search?q=malloc

Hope someone can help

thank you very much
kind regards,
Marco Kotrotsos







Question by:EnolaKotrotsos
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 19661147
The reason is the namespace
xmlns:gcs="http://schemas.google.com/codesearch/2006"
makes the entry element by default in the gcs namespace

you could compare with the local name

doc.elements.each("//*[local-name() = 'entry']") { |element| puts element.attributes["file name"] }

or pass the namespace mapping along with the query.
I'll look for an example and let you know in a minute.

cheers

Geert
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 19661161
Sorry, the above is nonsense;
I missed the prefix.
The XML as you posted it does not put entry in a namespace.
Are you showing the entire XML, or is there something higher up the tree?

cheers

Geert
 
LVL 1

Author Comment

by:EnolaKotrotsos
ID: 19661166
Thanks,

I tried your code and it still gave me all nils.
Looking forward to your example.

Thanks again!
Marco
 
LVL 1

Author Comment

by:EnolaKotrotsos
ID: 19661176
This is the code I'm using now. You can puts doc to see the entire XML it's fetching.

 require 'open-uri'
 require 'rexml/document'
 include REXML
 
 open("http://www.google.com/codesearch/feeds/search?q=malloc") {|f|   @req = f.read }

 
 doc = Document.new(@req)
 
 doc.elements.each("//*[local-name() = 'entry']") { |element| puts element.attributes["file name"] }

 
LVL 60

Expert Comment

by:Geert Bormans
ID: 19661207
Note that the problem is in fetching the attributes: the entry element has no such attribute, so the lookup returns nil.
The XPath for entry itself works.

try this
 doc.elements.each("//entry") { |element| puts element }

and you will see the entries

what exactly do you want to achieve?
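
To illustrate the point about the missing attribute, here is a minimal sketch. It assumes a trimmed-down copy of the entry posted in the question; the constant `SAMPLE_ENTRY` and the helper `name_on_entry_and_file` are made up for this illustration and are not part of REXML. `attributes[...]` only looks at the element it is called on, so asking the entry for a name yields nil, while the gcs:file child does carry it.

```ruby
require 'rexml/document'

# Trimmed-down sample modelled on the XML in the question (an assumption,
# not the live feed).
SAMPLE_ENTRY = <<~'XML'
  <feed>
    <entry xmlns:gcs="http://schemas.google.com/codesearch/2006">
      <title type="text">w3c-libwww-5.4.0/Library/src/wwwsys.h</title>
      <gcs:file name="w3c-libwww-5.4.0/Library/src/wwwsys.h"/>
    </entry>
  </feed>
XML

# Hypothetical helper: returns the "name" attribute as seen on <entry>
# and as seen on its gcs:file child.
def name_on_entry_and_file(xml)
  doc   = REXML::Document.new(xml)
  entry = REXML::XPath.first(doc, "//entry")
  file  = REXML::XPath.first(entry, "*[local-name() = 'file']")
  [entry.attributes["name"], file.attributes["name"]]
end

on_entry, on_file = name_on_entry_and_file(SAMPLE_ENTRY)
puts "entry: #{on_entry.inspect}"  # the entry has no name attribute
puts "file:  #{on_file}"           # the child element does
```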
 
LVL 1

Author Comment

by:EnolaKotrotsos
ID: 19661241
I'm trying to get a listing of filenames with their source code.

so after fetching http://www.google.com/codesearch/feeds/search?q=include 'REXML'

it returns (or should) all the entries containing

The filename (and package preferably)
the sourcecode that is posted with the return xml





 
LVL 60

Expert Comment

by:Geert Bormans
ID: 19661347
I think this is what you are after

XPath.each(doc, "//entry/*[local-name() = 'file']"){|element| puts element.attribute("name")}

cheers

Geert
 
LVL 1

Author Comment

by:EnolaKotrotsos
ID: 19661368
That works, but how do I get the code (gcs:match) underneath each name?
 
LVL 60

Accepted Solution

by:
Geert Bormans earned 2000 total points
ID: 19661392
Work down the tree, like this:

XPath.each(doc, "//entry") do |element|
  # print the file name stored on the gcs:file child
  XPath.each(element, "*[local-name() = 'file']") do |fname|
    puts fname.attribute("name")
  end
  # print every gcs:match element under the same entry
  XPath.each(element, "*[local-name() = 'match']") do |cmatch|
    puts cmatch
  end
end
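
For readers who want the "filename, then code" listing as data rather than printed output, the same walk can be sketched as a small method. This is only an illustration under assumptions: the inline `SAMPLE_FEED` stands in for the fetched feed, `files_with_matches` is a hypothetical helper name, and `text` is used instead of printing the whole element so that the escaped HTML snippet comes back decoded.

```ruby
require 'rexml/document'

# Stand-in for the fetched feed, cut down from the XML in the question.
SAMPLE_FEED = <<~'XML'
  <feed>
    <entry xmlns:gcs="http://schemas.google.com/codesearch/2006">
      <gcs:file name="w3c-libwww-5.4.0/Library/src/wwwsys.h"/>
      <gcs:match lineNumber="1032" type="text/html">
        &lt;pre&gt;/* &lt;b&gt;malloc&lt;/b&gt;.h */&lt;/pre&gt;
      </gcs:match>
      <gcs:match lineNumber="1033" type="text/html">
        &lt;pre&gt;#ifdef HAVE_&lt;b&gt;MALLOC&lt;/b&gt;_H&lt;/pre&gt;
      </gcs:match>
    </entry>
  </feed>
XML

# Hypothetical helper: returns [file_name, [snippet, ...]] per entry.
def files_with_matches(xml)
  doc = REXML::Document.new(xml)
  results = []
  REXML::XPath.each(doc, "//entry") do |entry|
    file = REXML::XPath.first(entry, "*[local-name() = 'file']")
    next unless file # skip entries without a file element
    snippets = REXML::XPath.match(entry, "*[local-name() = 'match']").map do |m|
      m.text.to_s.strip # text decodes &lt; and &gt; back to < and >
    end
    results << [file.attributes["name"], snippets]
  end
  results
end

files_with_matches(SAMPLE_FEED).each do |name, snippets|
  puts name
  snippets.each { |s| puts " #{s}" }
end
```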
 
LVL 1

Author Comment

by:EnolaKotrotsos
ID: 19661414

That's what I needed! You've also given me a lot more insight into how this stuff works.
Thanks a lot!
Marco
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 19661443
you are welcome

There is one thing I could not get to work, and that is the namespace stuff:
          XPath.each(element, "gcs:file", {"gcs"=>"http://schemas.google.com/codesearch/2006"}) do |fname|
should work the same as
          XPath.each(element, "*[local-name() = 'file']") do |fname|
and is of course a lot nicer.
I think the problem is with the selection of entry, where I seem to have lost the namespace node
... but hey, you have a workaround.
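
For comparison, a namespace-aware query can be sketched as below. Treat it as a hedged illustration: it assumes a REXML release in which the namespaces hash passed to `REXML::XPath.match` is honoured, it runs against an inline sample rather than the live feed, and `GCS_NAMESPACES`, `SAMPLE_NS_FEED` and `gcs_file_names` are names invented here. It searches for gcs:file directly from the document, which sidesteps the separate selection of entry mentioned above.

```ruby
require 'rexml/document'

# Prefix-to-URI mapping handed to the XPath call; the prefix used in the
# query does not have to match the one used in the document.
GCS_NAMESPACES = { "gcs" => "http://schemas.google.com/codesearch/2006" }

# Inline sample modelled on the question's XML (an assumption).
SAMPLE_NS_FEED = <<~'XML'
  <feed>
    <entry xmlns:gcs="http://schemas.google.com/codesearch/2006">
      <gcs:file name="w3c-libwww-5.4.0/Library/src/wwwsys.h"/>
    </entry>
  </feed>
XML

# Hypothetical helper: collects the name attribute of every gcs:file.
def gcs_file_names(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//gcs:file", GCS_NAMESPACES).map do |file|
    file.attributes["name"]
  end
end

puts gcs_file_names(SAMPLE_NS_FEED)
```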