aviator21114

XML query in Javascript

Greetings,

I have a simple xml file of airport information (sample shown below).  

<airports>
<airport id="HEF" latitude="38.72140121" longitude="-77.51540375">Manassas</airport>
<airport id="IAD" latitude="38.94449997" longitude="-77.45580292">Dulles</airport>
<airport id="BWI" latitude="39.17539978" longitude="-76.66829681">Baltimore</airport>
</airports>

I'm new to XML so I'm not even sure if this is constructed properly.  I'm creating a simple application that will calculate the distance between two airports.  I have JavaScript code to perform the calculations, and an input field each for the departure and destination IDs.  I need to query this XML file based on the departure/destination input and return the lat/long fields to variables in the application.

Ideally (not mandatory) I would like to have the query act like a soundex type of search so if you type H for example it would return those id's containing H.  If you typed HE it would further refine the airports returned to only show those beginning with HE... and so on...

Any thoughts?

Thank you in advance for your input.
Alexandre Simões

Does it really need to be XML?
This in JSON format would make your life much easier.

On the other hand, to JavaScript, XML is just text with no built-in meaning, so you have to parse it before you can query it.

As I understand this is data that is on the client side and you need to perform search operations on it. Questions:
1. How does this data arrive there? From the server?
2. Can't it come already in a form of javascript array or JSON?

Make sure you don't need to traverse the whole XML each time you need to search for something. Depending on the size it might take too much time.

Here I leave some references on parsing XML in javascript:
http://www.w3schools.com/xml/xml_parser.asp
http://api.jquery.com/jQuery.parseXML/

I'll still reinforce the idea that arrays of JSON objects will make your search operations much faster.
aviator21114

ASKER

Honestly, I don't know what format would be best... I'm shooting from the hip a bit here.  

The actual file will contain about 13,000 entries (all the airports in the U.S.)

The app will be run on an iPad and I was thinking to load the file locally vs externally connecting to a file (I'm not sure how to do this).

Any short answers describing JS arrays and JSON?  I'm not familiar with these..

Again, thanks in advance...
Well, I'll dump my thoughts on this:

Size Matters!

XML is much more verbose than JSON, so switching will save you a lot of space (number of chars on the client). Your sample data in JSON format, indented just for readability, looks like this:
{
    "airports": [
        {
            "id": "HEF",
            "latitude": "38.72140121",
            "longitude": "-77.51540375",
            "name": "Manassas"
        },
        {
            "id": "IAD",
            "latitude": "38.94449997",
            "longitude": "-77.45580292",
            "name": "Dulles"
        },
        {
            "id": "BWI",
            "latitude": "39.17539978",
            "longitude": "-76.66829681",
            "name": "Baltimore"
        }
    ]
}


If you remove all the spaces and do a char count on both versions, you'll see the real difference even with 3 records...
{"airports":[{"id":"HEF","latitude":"38.72140121","longitude":"-77.51540375","name":"Manassas"},{"id":"IAD","latitude":"38.94449997","longitude":"-77.45580292","name":"Dulles"},{"id":"BWI","latitude":"39.17539978","longitude":"-76.66829681","name":"Baltimore"}]}


but you say you have around 13.000!
You can also replace all those explicit property names with single-letter names, like:
{"a":[{"i":"HEF","t":"38.72140121","g":"-77.51540375","n":"Manassas"},{"i":"IAD","t":"38.94449997","g":"-77.45580292","n":"Dulles"},{"i":"BWI","t":"39.17539978","g":"-76.66829681","n":"Baltimore"}]}


Basically, for each record I removed 19 chars (1 + 7 + 8 + 3 across the four property names)... times 13.000 = 247.000 chars in total
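A quick way to check the savings is to serialize both shapes and compare lengths. This is a minimal sketch using the three sample records; the helper name `shorten` is hypothetical.

```javascript
// Verbose records, as in the indented sample above.
var verbose = {"airports": [
    {"id": "HEF", "latitude": "38.72140121", "longitude": "-77.51540375", "name": "Manassas"},
    {"id": "IAD", "latitude": "38.94449997", "longitude": "-77.45580292", "name": "Dulles"},
    {"id": "BWI", "latitude": "39.17539978", "longitude": "-76.66829681", "name": "Baltimore"}
]};

// Hypothetical helper: copy each record using single-letter property names.
function shorten(records) {
    return records.map(function (r) {
        return {"i": r.id, "t": r.latitude, "g": r.longitude, "n": r.name};
    });
}

var compact = {"a": shorten(verbose.airports)};

// JSON.stringify emits no whitespace by default, so .length is the char count.
var verboseLength = JSON.stringify(verbose).length;
var compactLength = JSON.stringify(compact).length;
var savedPerRecord = JSON.stringify(verbose.airports[0]).length -
    JSON.stringify(compact.a[0]).length;

console.log(verboseLength, compactLength, savedPerRecord);
```

Running the same comparison against the real 13.000-record file would show the actual saving for your data.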

Performance

Your client side data will be around 1 MB and each search loop will, in the worst case, run through all 13000 records.
An idea to improve this would be to make sure the file is ordered by the search property.
You can also create an index array that only holds the first position of each letter on the array like:
var index = [0,105, 3470, 4500, 6000, ...]



Each position in this array corresponds to a letter's place in the alphabet (0 for 'A', 1 for 'B', and so on). Knowing that character code 65 corresponds to the letter 'A', you could do:
var firstIndex = index["C".charCodeAt(0) - 65];


That would give you the information that 'C' begins in position 3470 in the array.
While in the loop, the first time you find a non C you can exit and return the result.

Make sure all letters appear in this index array and set -1 values for the ones that are not in the data array. This way you don't even need to loop as you know no match will be found.
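The index described above could be built along these lines. This is only a sketch under assumptions: the records are already sorted by the search property, every value starts with a letter A to Z, and the names `buildLetterIndex` and `records` are made up for illustration.

```javascript
// Sample records, sorted by "i" (the property we search on).
var records = [
    {"i": "BWI", "n": "Baltimore"},
    {"i": "HEF", "n": "Manassas"},
    {"i": "IAD", "n": "Dulles"}
];

// Returns an array of 26 slots: slot 0 is 'A', slot 1 is 'B', and so on.
// Each slot holds the position of the first record starting with that
// letter, or -1 when no record starts with it.
function buildLetterIndex(sortedRecords, propertyName) {
    var index = [];
    for (var slot = 0; slot < 26; slot++) {
        index.push(-1);
    }
    for (var pos = 0; pos < sortedRecords.length; pos++) {
        var letter = sortedRecords[pos][propertyName].charAt(0).toUpperCase();
        var slotForLetter = letter.charCodeAt(0) - 65; // 'A' is char code 65
        if (slotForLetter >= 0 && slotForLetter < 26 && index[slotForLetter] === -1) {
            index[slotForLetter] = pos;
        }
    }
    return index;
}

var letterIndex = buildLetterIndex(records, "i");
console.log(letterIndex["H".charCodeAt(0) - 65]); // position of the first "H" record
console.log(letterIndex["K".charCodeAt(0) - 65]); // -1: nothing starts with "K"
```

Using -1 rather than leaving gaps means a lookup can test `=== -1` and return early without touching the data array.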

Connectivity

As you say this will run on mobile devices, I would keep the data in the device.
Although these kinds of searches are faster and more flexible on the server, connection problems will make the response much slower.

Final

So I would give the smallest possible json data to the client along with an index array of that data for the first alphabet letters and use it to make an optimized search loop.
that's wonderful input!   Any chance you could share a script to perform the look-up using my small sample data?  I need to get jump-started on this..
I'll give it to you tomorrow.
Just tell me something: is the search by name or by ID?

This is important because of the sorting of the data.

Cheers!
I think by id for starters.. Ultimately , perhaps I will have two copies of the data one sorted for ID the other by Name.   I can make the change if I decide to go with name..  

Thank you for taking time to assist me... BIG POINTS!
Man, at the end I had time to amuse myself a bit with this :)
I prepared a working sample on jsFiddle, test it there directly:
http://jsfiddle.net/ZbNAT/4/

So basically I ended up creating the index on javascript also, so all you have to worry about is to have all data sorted by the search property.

searchPropertyName variable sets the property name that you want to use for the search.
It's also used of course for the index creation and the search result is an array of this column matched values.
If you need the full object as a result of the search just use the return commented line instead.

Have fun! :)

I'll post the code here anyway:
var data = {"a":[
    {"i":"BWI", "t":"39.17539978","g":"-76.66829681",n:"Baltimore"},
    {"i":"IAD","t":"38.94449997","g":"-77.45580292",n:"Dulles"},
    {"i":"HEF","t":"38.72140121","g":"-77.51540375",n:"Manassas"},
]};

// searchPropertyName is the name of the property the search will be performed on
// make sure ALL the data is also sorted by this column, the algorithm relies on this for performance
var searchPropertyName = "n";    
var dataIndex = null;

function rebuildIndex(){    
    dataIndex = [];
    var i = 0;
    var indexLetter;
    while(i < data.a.length){
        var currentLetter = data.a[i][searchPropertyName][0].toUpperCase();
        if(!indexLetter || indexLetter != currentLetter){
            indexLetter = currentLetter;
            dataIndex[currentLetter.charCodeAt(0) - 65] = i;
        }
        
        i=i+1;
    }
    //alert(dataIndex.join()); // test index
}

function search(term){
    if(!dataIndex)
        rebuildIndex();

    var searchResult = [];
    var termIndexLetter = term[0];
    var i = dataIndex[term.toUpperCase().charCodeAt(0) - 65];
    
    // if there's no index entry for this letter, there's no match
    if(!i) {
        return null;
    }
  
    while(i < data.a.length && data.a[i][searchPropertyName][0] == termIndexLetter){
        searchResult.push(data.a[i][searchPropertyName]);
        
        //uncomment this if you want the full object in the result
        //searchResult.push(data.a[i]);
        i = i+1;
    }
    
    //alert(searchResult.join());
    return searchResult;
}

// test the search!
alert(search("Man").join());


I performed some more tests and found two problems.
One with an implicit cast... my bad...
And another because I got too focused on the index and forgot to do the actual full search comparison :)

Here's the link to the fixed version:
http://jsfiddle.net/ZbNAT/6/

Also added more tests at the end... should cover everything.
Cheers!

var data = {"a":[
    {"i":"BWI", "t":"39.17539978","g":"-76.66829681",n:"Baltimore"},
    {"i":"IAD","t":"38.94449997","g":"-77.45580292",n:"Dulles"},
    {"i":"HEF","t":"38.72140121","g":"-77.51540375",n:"Manassas"},
]};

// searchPropertyName is the name of the property the search will be performed on
// make sure ALL the data is also sorted by this column, the algorithm relies on this for performance
var searchPropertyName = "n";    
var dataIndex = null;

function rebuildIndex(){    
    dataIndex = [];
    var i = 0;
    var indexLetter;
    while(i < data.a.length){
        var currentLetter = data.a[i][searchPropertyName][0].toUpperCase();
        if(!indexLetter || indexLetter != currentLetter){
            indexLetter = currentLetter;
            dataIndex[currentLetter.charCodeAt(0) - 65] = i;
        }
        
        i=i+1;
    }
    //alert(dataIndex.join()); // test index
}

function search(term){
    if(!dataIndex)
        rebuildIndex();

    var searchResult = [];
    var termIndexLetter = term[0];
    var i = dataIndex[term.toUpperCase().charCodeAt(0) - 65];

    // if there's no index entry for this letter, there's no match
    if(i === undefined) {
        return searchResult;
    }
  
    while(i < data.a.length && data.a[i][searchPropertyName][0] == termIndexLetter){
        var value = data.a[i][searchPropertyName];
        
        if(value.indexOf(term) == 0)
            searchResult.push(value);
        
        //uncomment this if you want the full object in the result
        //searchResult.push(data.a[i]);
        i = i+1;
    }
    
    //alert(searchResult.join());
    return searchResult;
}

// test the search!
alert(search("Bwer").join());
alert(search("D").join());
alert(search("K").join());
alert(search("B").join());
alert(search("M").join());


Super!... thank you...  Is there a way to have the file external to the code (not external to the device), load it and then query?

Also, what reading material would you recommend on this subject?
ASKER CERTIFIED SOLUTION
Alexandre Simões
This solution is only available to members.
Sorry for the delay in rewarding points.   Thank you...
So I have a follow-up question...   While I'm searching on the value "n" (airport name) or "i" (airport identifier), I don't want to return just that one value; I also want to return the values of "t" and "g".   I'm not sure how to do this in the script.

Also how do I define the JSON outside of the script as a separate independent file and then call it from within the script?

Thanks in advance....
Hi mate!

1. So for the first question I actually predicted that you would need that.
If you take a closer look at my last sample you see a commented line exactly for that.
Basically the main idea is that instead of pushing only the value to the search result you push the whole object.
if(value.indexOf(term) == 0)
     searchResult.push(data.a[i]);



2. For the second question, just declare the data script file before the one with the search algorithm. JavaScript doesn't care about files; in the end everything is loaded and joined together. We just have to make sure we load the files before we need their content.

Hope I manage to explain it clearly.
Feel free to get back to me if you have any doubts.

Cheers!
Alex,

Thank you for the prompt response... I tried using the commented line of code but it doesn't return the whole value of the data line... I modified the script using "i" vs "n" as the search argument for my purposes... However, If for example I enter H or HE or HEF with the line of code uncommented it returns the value of HEF.objectObject  in the alert versus the whole string....

Could you show me explicitly how to define the json data externally and reference it in the script?  I'm not sure I understand your explanation...

Thank you again...
Without looking really deep into what you did, I would have done

var data = {
    "BWI":{ "t":"39.17539978","g":"-76.66829681",n:"Baltimore"},
    "IAD":{"t":"38.94449997","g":"-77.45580292",n:"Dulles"},
    "HEF":{"t":"38.72140121","g":"-77.51540375",n:"Manassas"}
};
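With the data keyed by ID like this, an exact lookup is a plain property access, and a prefix search can filter the keys. A small sketch; `findByPrefix` is a hypothetical helper, and it scans every key, so it does not use the sorted-index optimization discussed earlier in the thread.

```javascript
var data = {
    "BWI": {"t": "39.17539978", "g": "-76.66829681", "n": "Baltimore"},
    "IAD": {"t": "38.94449997", "g": "-77.45580292", "n": "Dulles"},
    "HEF": {"t": "38.72140121", "g": "-77.51540375", "n": "Manassas"}
};

// Exact lookup: no loop needed.
var dulles = data["IAD"];

// Hypothetical helper: IDs that start with the typed text.
function findByPrefix(airports, prefix) {
    var wanted = prefix.toUpperCase();
    return Object.keys(airports).filter(function (id) {
        return id.indexOf(wanted) === 0;
    });
}

console.log(dulles.n);                 // "Dulles"
console.log(findByPrefix(data, "he")); // ["HEF"]
```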
Hi mate,
So for the first issue I think the problem is that I forgot to say that you also need to comment the previous line. I updated the sample code: http://jsfiddle.net/ZbNAT/9/

The tests at the end will only return the first result but I think you get the idea.
Now the result is an array of objects:
// gets the 'n' property value out of the first search result
search("D")[0].n



Now for the scripts reference thing.
Basically you need to create a javascript file only with the data like I told you in the accepted answer.
To include javascript files in a page you put, usually inside the <head> block:
<script type="text/javascript" src="/path_to_the_file/myfilename.js"></script>

So in this case you might want to have something like:
<head>
<script type="text/javascript" src="/path_to_the_file/data1.js"></script>
<script type="text/javascript" src="/path_to_the_file/data2.js"></script>
<script type="text/javascript" src="/path_to_the_file/mysearchalgorithm.js"></script>
</head>


Browsers read the HTML in a sequential order.
So in this case when it reaches the head block it will load data1.js, then data2.js and after mysearchalgorithm.js.
Internally it puts the content of all the files together, so the only thing that really matters is the order in which you include them: you can't ask for something that is in data1.js before that file has actually been loaded.

Just to make it clear, data1.js contains, for instance, the data ordered by code, data2.js contains the data ordered by name and mysearchalgorithm.js contains the whole search code.

I don't want to extend this explanation further here just not to make it more confusing but there are some more considerations you might want to take in account here for performance improvements and best practices.
Put it to work like this and after I'll give you some further hints.

Cheers!
Hello Alex,

Once again thank you for your prompt response to my query.  Although I haven't tested it, I believe I now understand the external file loading question.

However, I'm still not sure I presented my question on returning data from the JSON record so let me use an example....

Given the JSON data  "i":"BWI", "t":"39.17539978","g":"-76.66829681",n:"Baltimore"

I set "i" as the variable searchPropertyName = "i"   This lets me search on the 3 character airport ID...   B or BW or BWI will find the row noted above.

What I want returned in the function call to search is the values of "t" "g" and "n" if I get a match in the search on "i"

How do I call the function and return these values back to the calling script?

Thank you,
Hi mate, sorry for the delay.

What you want is exactly what I told you.
Your data source is a list of records like this:
{"i":"BWI", "t":"39.17539978","g":"-76.66829681",n:"Baltimore"}



Initially, on the search result, I was returning a list of strings corresponding to the found airport names. For that I was using:
var value = data.a[i][searchPropertyName];



Let's break this down:
data => our whole data source object
data.a => the property that holds the array of data
data.a[i] => the item on the index i of the array
data.a[i][searchPropertyName] => the property value on the item i of the array



So, now if we actually push the
data.a[i]


to the search result, what we'll have at the end is the whole object, not only the name, and with that we can get any of its properties.

// this will get a list of items
var result = search("B");

// to get the 't' of the first result you do
var latitude = result[0].t;
var longitude = result[0].g;
var name = result[0].n;

// if you ask for something that doesn't exist you'll get undefined
var something_else = result[0].z;   // there's no z property in our datasource

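Since the original goal was a distance between two airports, here is how the returned objects might feed such a calculation. A sketch only: the great-circle (haversine) formula and the names `toRadians` and `distanceKm` are assumptions for illustration, not the asker's actual calculation script, and 6371 km is a commonly used mean Earth radius. Note that `t` and `g` are stored as strings, so they need `parseFloat` first.

```javascript
// Two records shaped like the search results above.
var from = {"i": "HEF", "t": "38.72140121", "g": "-77.51540375", "n": "Manassas"};
var to = {"i": "BWI", "t": "39.17539978", "g": "-76.66829681", "n": "Baltimore"};

function toRadians(degrees) {
    return degrees * Math.PI / 180;
}

// Great-circle distance in kilometres (haversine formula).
// Assumption: 6371 km as the mean Earth radius.
function distanceKm(a, b) {
    var lat1 = toRadians(parseFloat(a.t));
    var lat2 = toRadians(parseFloat(b.t));
    var dLat = lat2 - lat1;
    var dLon = toRadians(parseFloat(b.g) - parseFloat(a.g));
    var h = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.cos(lat1) * Math.cos(lat2) *
            Math.sin(dLon / 2) * Math.sin(dLon / 2);
    return 2 * 6371 * Math.asin(Math.sqrt(h));
}

console.log(distanceKm(from, to).toFixed(1) + " km");
```

In the app, `from` and `to` would be taken from the search results, for example `search("HEF")[0]` and `search("BWI")[0]`, once the search returns whole objects.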

I hope I could make it clear.
Anyway, feel free to come back if you have any doubts.

Cheers!
Very very clear... Thank you for taking time to expand the answer!  you rock!

I'm sure there will be more questions along the way but this jump starts me nicely..
Nice!! :)
I'll be glad to further help if I can.

Have fun mate! :)
Cheers!
I played around a little for you

LIVE DEMO

var data = {
    "BWI": {
        "t": "39.17539978",
        "g": "-76.66829681",
        "n": "Baltimore"
    },
    "IAD": {
        "t": "38.94449997",
        "g": "-77.45580292",
        "n": "Dulles"
    },
    "HEF": {
        "t": "38.72140121",
        "g": "-77.51540375",
        "n": "Manassas"
    }
};

function getAirports() {
    var arr = [];
    for (var airport in data) {
        if (data.hasOwnProperty(airport)) {
            arr.push(airport + " - " + data[airport].n);
        }
    }
    return arr;
}
$("#airp").autocomplete({
    source: getAirports(),
    select: function (event, ui) {
      var value = ui.item.value;
      var airport = data[value.split(" -")[0]];
      $("#output").html(value+"<br>t:"+airport.t+"<br>g:"+airport.g); 
    }
});


Alex,

I've been banging my head against the wall trying to determine why the search routine has particular issue with a new json dataset I created.

User generated image
Whenever I change the value noted in the JSON "i" data value as "SNEEZY" to the word "CHALLENGER601" or change the "i" data value "GOOFY" to "FALCON900" the search fails.   It works just fine on all other searches for the other "i" values and in-fact will return the corresponding values for "i" on GOOFY and SNEEZY.  

I'm sure I'm overlooking something simple but I've been staring at this code too long.

Thoughts?
I think you're missing a fundamental requirement of my code: the data must be ordered alphabetically by the search property.

If you set "i" to be the search column, the whole collection must be sorted by "i" ascending.

I didn't test but it might be the issue.
My apologies... I thought your comment said it had to be ordered alphabetically for performance reasons.   I didn't think it was imperative with such a small dataset.   I will order it and test.  Thanks,
In fact it is for performance reasons.
For performance (in the initial question we were talking about 13.000 rows) I created an index for the list, and that index, like a database index, relies on a specific order.
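If the data cannot be guaranteed to arrive sorted, one option is to sort it once in the browser before the index is built. A sketch, assuming the `data.a` array shape used in this thread; `sortBySearchProperty` is a hypothetical name.

```javascript
var data = {"a": [
    {"i": "IAD", "n": "Dulles"},
    {"i": "BWI", "n": "Baltimore"},
    {"i": "HEF", "n": "Manassas"}
]};

// Sorts the array in place, ascending, by the given property.
// Values are compared upper-cased so the order matches an index
// that is built on upper-cased first letters.
function sortBySearchProperty(records, propertyName) {
    records.sort(function (x, y) {
        var a = x[propertyName].toUpperCase();
        var b = y[propertyName].toUpperCase();
        if (a < b) { return -1; }
        if (a > b) { return 1; }
        return 0;
    });
    return records;
}

sortBySearchProperty(data.a, "i");
console.log(data.a.map(function (r) { return r.i; }).join()); // BWI,HEF,IAD
```

Sorting 13.000 records once at load time is cheap compared with re-scanning them on every keystroke, but pre-sorting the file itself avoids even that cost.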
Ok my friend... That was it.. It's mandatory that the json dataset be ordered alphabetically in the search argument...  Thank you!
I also want to thank mplungjan for their implementation.   Since programming is an 'art' not a science it's nice to see variations on a solution to a problem... Thank you again.
No prob :)
Oh, BTW Alex.... Bonus question to all the bonus questions I've thrown at you....  I noticed in the JSON dataset the last value of the string ("n" in my original, now "s" in this dataset), this variable is not enclosed in quotes as the others are.  Is this significant?

{"i":"BWI", "t":"39.17539978","g":"-76.66829681",n:"Baltimore"}



BTW, I've ordered the suggested reference material you shared but they have not arrived.

Thanks,
So that's a JSON rule and was my fault.
Strictly valid JSON must have the property names enclosed in double quotes. It worked here because the data sits in a script as a JavaScript object literal, where the quotes are optional; a strict parser such as JSON.parse rejects unquoted names.

If you use http://jsonlint.com/ to validate that json line you'll get an error.
If you enclose the n in quotes you'll get valid JSON.

So, to avoid parsing errors in all browsers and flavors, always enclose property names in quotes.
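The difference shows up as soon as the text goes through a strict parser such as `JSON.parse`. A small sketch; `isValidJson` is a hypothetical helper.

```javascript
// Hypothetical helper: true when the text parses as strict JSON.
function isValidJson(text) {
    try {
        JSON.parse(text);
        return true;
    } catch (e) {
        return false;
    }
}

var unquoted = '{"i":"BWI", "t":"39.17539978","g":"-76.66829681",n:"Baltimore"}';
var quoted = '{"i":"BWI", "t":"39.17539978","g":"-76.66829681","n":"Baltimore"}';

console.log(isValidJson(unquoted)); // false: n is not in quotes
console.log(isValidJson(quoted));   // true

// Inside a script, the same text is a JavaScript object literal,
// where quotes around property names are optional.
var literal = {"i": "BWI", n: "Baltimore"};
console.log(literal.n); // "Baltimore"
```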
Perfect... Thank you Alex for clarifying..  Your knowledge and insight is very much appreciated.
Alex

I've adapted the code you provided to search on another json for my program.   It works fine for returning the values of "e" "x" and "s".  However when I try to assign "z" "y" "v" or "w" the value isn't returned.   I've digested quite a bit of the code but I'm wondering if there is some sort of limiting factor in the code (length?) that's prohibiting me from getting the values beyond "s" in the data line.

Here is a single line from the json

var ac_data = {"p":[
    {"i":"BEECHJET400A","e":"2550","x":"2650","s":"2725","z":"700","y":"1050","v":"250","w":"350"},



(yes the var ac_data json is closed properly.  I just put one line here as a reference)

These vars work fine
var EliteRate = progdata[0].e;
var AccessRate = progdata[0].x;
var StarterRate = progdata[0].s;



These don't
var CrewDom = progdata[0].z;
var CrewInt = progdata[0].y;
var CADom = progdata[0].v;
var CAInt = progdata[0].w;



Your thoughts and insight are greatly appreciated on this...

Thank you,
Any thoughts... anyone?
Pretty Please... Help?
Sorry for the delay mate...

Here's a working example with your data: http://jsfiddle.net/ZbNAT/10/
I believe you messed up somewhere adapting my code.
I kept the 'a' as the array name instead of 'p' because 'a' is hard-coded inside everywhere.
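One way to avoid the hard-coded `a` is to pass the array and the property name in as arguments. The following is a hypothetical variant, not the code from the fiddle: `searchIn` is a made-up name, and it does a plain scan over every record instead of using the letter index, which is fine for a small dataset but gives up the performance benefit.

```javascript
var ac_data = {"p": [
    {"i": "BEECHJET400A", "e": "2550", "x": "2650", "s": "2725", "z": "700", "y": "1050", "v": "250", "w": "350"}
]};

// Hypothetical variant: the array and property name are arguments,
// so nothing about "data.a" is hard-coded. It scans every record
// rather than using the letter index.
function searchIn(records, propertyName, term) {
    var results = [];
    for (var i = 0; i < records.length; i++) {
        if (records[i][propertyName].indexOf(term) === 0) {
            results.push(records[i]);
        }
    }
    return results;
}

var progdata = searchIn(ac_data.p, "i", "BEECH");
console.log(progdata[0].s, progdata[0].z, progdata[0].w);
```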

Tell me if it works, if it doesn't send me the code you're using.

Cheers!
Thank you my friend...  I thought I made all the appropriate changes..

You know how it is when you stare at code too long... plus me being a novice with JS it won't surprise me if I did something wrong.    I'll take a closer look to see where I err'd.    Thank you again for taking time to throw me an assist...  Greatly appreciated...