gr8life

Tree Assistance

Fernando,
Here is the new posting for the previous thread. I uploaded the data, which includes the tool and the source data, as an attachment to this question.
https://www.experts-exchange.com/questions/22043310/Fernado-Soto-Please-HELP.html


Thank you for all your help,
Gr8life
Avatar of Fernando Soto
Hi Gr8life;

I ran your program, and every line in the file c:\results\SkeletonKey\SkeletonKey20061121.txt stated that the IP was not in the database. I also noticed that you had combined the original files in the resource directory into one file, with fields 1-4 holding valid data and fields 0 and 5-6 holding the value s, which seemed OK because fields 0 and 5-6 are not used in the code. I commented out the combined.latest file, uncommented the original files, and re-ran the program. Of the 804 entries in the Altered Data.txt file, approximately 154 now came back as Not In Database and 650 as valid entries. I then wrote a small program to convert the 4 files in the resource directory into a single file, re-ran the program, and again got 154 lines with Not In Database and 650 with valid entries.

I suspect that your combined.latest file has errors in it, although the format looks correct. I did see an entry where the IP address was the same in both combined.latest files but the country code and field 4, which is used to find the length of the range of IPs, were different. I have uploaded the following 3 files: NewCombined.latest, my version of the original 4 files; NotInDatabase.txt, the list of all the Not In Database entries; and SkeletonKey20061121.txt.

Please check this out and let me know what you find out with the 150 Not In Database entries.

    https://filedb.experts-exchange.com/incoming/ee-stuff/1459-TestResults.zip

Fernando
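The kind of conflict described above (the same start IP appearing with a different country code or a different field 4) can be searched for mechanically. The following is only a minimal sketch, not the small program mentioned in the post: the function names `split_fields` and `conflicting_starts` are hypothetical, and it assumes every data line uses the pipe-delimited layout seen in this thread, with the country in field 1, the start IP in field 3, and the count in field 4.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Split a pipe-delimited line into its fields.
std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, '|')) fields.push_back(field);
    return fields;
}

// Return the start IPs (field 3) that occur again with a different country
// code (field 1) or a different address count (field 4). A start IP that
// conflicts several times is listed once per conflicting line.
std::vector<std::string> conflicting_starts(const std::vector<std::string>& lines) {
    std::map<std::string, std::string> seen;  // start IP -> "country|count"
    std::vector<std::string> conflicts;
    for (const std::string& line : lines) {
        std::vector<std::string> f = split_fields(line);
        if (f.size() < 5) continue;           // skip headers and short lines
        std::string value = f[1] + "|" + f[4];
        auto it = seen.find(f[3]);
        if (it == seen.end()) {
            seen[f[3]] = value;
        } else if (it->second != value) {
            conflicts.push_back(f[3]);
        }
    }
    return conflicts;
}

// Self-check: the second line is made up to force a conflict.
void check_conflicts() {
    std::vector<std::string> lines = {
        "s|US|ipv4|12.106.65.88|182040|s|s",
        "s|CA|ipv4|12.106.65.88|256|s|s"};
    std::vector<std::string> found = conflicting_starts(lines);
    assert(found.size() == 1 && found[0] == "12.106.65.88");
}
```

To compare two files, read the lines of both into one vector and pass it to `conflicting_starts`; exact duplicate lines are not reported, only entries that disagree.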
Avatar of gr8life

ASKER

The reason I made the combined.latest file was to make it easy to find problems, like IP addresses that are not in the database. I used newer data than what was in the original files, because IP ranges are constantly changing and the database needs to reflect that so the results are as accurate as possible. I looked at the data in the Not In Database file, and it appears some of the IP ranges are not in the NewCombined.latest file. I did find some with ranges that were in the database but were reported as not in the database. I think I found the problem, but I'm not sure whether it is in the building of the tree or in the traversal of the tree. I need to research a little further to be certain.

Thanks,
Gr8life
Avatar of gr8life

ASKER

Is there a way to output the tree structure once it is populated with data to examine it for building errors?

Thanks,
Gr8life
Hi gr8life;

I have found this tool, UltraCompare, to be excellent. It is a stand-alone tool, or it can be added to their other products as an add-in. You can download a trial version to see if you like it.

        http://www.ultraedit.com/index.php?name=Content&pid=34

UltraCompare Professional is a powerful compare/merge application loaded with features to enable users to track differences between files, directories, and .zip archives! File Compare features include text and binary compare of two or three files at a time and users can merge differences between compared files.

Folder Compare supports comparison of local/network directories (and subdirectories with recursive compare) and .zip archives as well and users may merge differences between them. With automatic integration with UltraEdit-32 or UEStudio '05, UltraCompare Professional is a tool no user should be without!

Fernando
Sorry, I posted this one to the wrong thread.
To this question, "Is there a way to output the tree structure once it is populated with data to examine it for building errors?", let me take a look at how this may be done.

Fernando
Avatar of gr8life

ASKER

I was able to locate a C++ recursive code snippet.

http://www.fredosaurus.com/notes-cpp/ds-trees/binarytreetraversal.html

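#include <cstddef>
#include <iostream>
using namespace std;

struct tree_node { int data; tree_node *left, *right; };  // assumed node layout

void print_inorder(tree_node *p) {
    if (p != NULL) {
        print_inorder(p->left);  // print left subtree
        cout << p->data << endl; // print this node
        print_inorder(p->right); // print right subtree
    }
}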

Not sure if this approach could be applied.
Also I will be on vacation until November 28.

Thank you for your time and hope you have a great holiday,
Gr8life
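The recursive idea in the snippet above might carry over to a trie roughly as follows. This is a sketch under stated assumptions, not the program discussed in this thread, whose Trie code is not shown here: `TrieNode`, `trie_insert`, and `dump` are hypothetical names, and the node is assumed to branch on one bit at a time. Writing to a `std::ostream` means the same function can print to the console or to a file for inspection.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical bit-wise trie node: child[0] follows a 0 bit, child[1] a 1 bit.
struct TrieNode {
    std::string label;                   // e.g. a country code; empty if none
    std::unique_ptr<TrieNode> child[2];
};

// Store `label` under the low `depth` bits of `key`, most significant first.
void trie_insert(TrieNode& root, unsigned key, int depth, const std::string& label) {
    TrieNode* node = &root;
    for (int i = depth - 1; i >= 0; --i) {
        int bit = (key >> i) & 1;
        if (!node->child[bit]) node->child[bit].reset(new TrieNode());
        node = node->child[bit].get();
    }
    node->label = label;
}

// Pre-order dump: one line per node, indented by depth, showing the bit path.
void dump(const TrieNode* node, std::ostream& out,
          const std::string& path = "", int depth = 0) {
    if (node == nullptr) return;
    out << std::string(depth * 2, ' ') << (path.empty() ? "(root)" : path);
    if (!node->label.empty()) out << " -> " << node->label;
    out << '\n';
    dump(node->child[0].get(), out, path + "0", depth + 1);
    dump(node->child[1].get(), out, path + "1", depth + 1);
}

// Self-check with two made-up 2-bit keys that share the leading 1 bit.
void check_dump() {
    TrieNode root;
    trie_insert(root, 2, 2, "US");  // bits 10
    trie_insert(root, 3, 2, "CA");  // bits 11
    std::ostringstream out;
    dump(&root, out);
    assert(out.str() == "(root)\n  1\n    10 -> US\n    11 -> CA\n");
}
```

A pre-order dump with the bit path on each line is used here instead of an in-order walk because it shows where each entry was attached, which is what matters when looking for errors made while building the tree.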
ASKER CERTIFIED SOLUTION
Avatar of Fernando Soto
Fernando Soto
Hi Gr8life;

I have looked at the problem with the program after making the last correction, which I posted above, and this is what I found. I single-stepped through the program, looking at how the Trie was being built, using an address that should have been found in the table but was not: 12.106.65.236. The Trie uses the first two parts of the IP to find the index of the main root node that should hold the address; the Trie has 65536 such root nodes. In this case the first two parts are 12.106. Shifting the first part of the address, 12, 8 bits to the left gives a value of 3072, and adding the second part, 106, gives an index of 3178 in the Trie's array of nodes. The range of valid IPs is determined by the IP address plus field 4 of the line from the file, for example:

s|US|ipv4|12.106.65.88|182040|s|s

The above line is from your combined.latest file. Note that field 4 has a value of 182040, which needs 18 bits to represent, but only 16 bits are available. The way the Trie is written, it then uses the base address itself to store the node. Therefore, when you check whether 12.106.65.236 is valid, it will not be found, because the Trie only knows about 12.106.65.88.
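The numbers in this explanation can be re-checked with a short sketch. It is not the program under discussion, whose Trie code does not appear in this thread. The names `RangeRecord`, `parse_ipv4`, `parse_record`, `root_index`, `bits_needed`, and `in_range` are hypothetical, and field 4 is assumed to be a count of addresses starting at the IP in field 3.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line such as "s|US|ipv4|12.106.65.88|182040|s|s".
struct RangeRecord {
    std::string country;  // field 1
    uint32_t start;       // field 3, packed into 32 bits
    uint32_t count;       // field 4, number of addresses in the range
};

// Pack dotted-quad text into a 32-bit value (no validation; sketch only).
uint32_t parse_ipv4(const std::string& text) {
    std::istringstream in(text);
    std::string part;
    uint32_t value = 0;
    while (std::getline(in, part, '.')) {
        value = (value << 8) | static_cast<uint32_t>(std::stoul(part));
    }
    return value;
}

// Split one pipe-delimited line and keep the fields the lookup uses.
RangeRecord parse_record(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, '|')) fields.push_back(field);
    return RangeRecord{fields.at(1), parse_ipv4(fields.at(3)),
                       static_cast<uint32_t>(std::stoul(fields.at(4)))};
}

// Root slot: (first octet << 8) + second octet, i.e. the top 16 bits.
uint32_t root_index(uint32_t ip) { return ip >> 16; }

// Smallest number of bits that can represent n.
int bits_needed(uint32_t n) {
    int bits = 0;
    while (n != 0) { ++bits; n >>= 1; }
    return bits;
}

// True when ip lies in [start, start + count).
bool in_range(const RangeRecord& r, uint32_t ip) {
    return ip >= r.start && ip - r.start < r.count;
}

// Re-checks the figures quoted in the post.
void check_example() {
    RangeRecord r = parse_record("s|US|ipv4|12.106.65.88|182040|s|s");
    uint32_t ip = parse_ipv4("12.106.65.236");
    assert(root_index(ip) == 3178);      // (12 << 8) + 106 = 3072 + 106
    assert(bits_needed(r.count) == 18);  // more than the 16 bits available
    assert(in_range(r, ip));             // so the address belongs to the range
}
```

A plain range test like `in_range` is slower than a trie lookup, but it can serve as an independent cross-check on the entries reported as Not In Database.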

I did not verify that all of the approximately 500 values that were Not In Database failed for the same reason. What I did do was download the most current versions of the files from “The Address Supporting Organizations” web site at http://aso.icann.org/stats/index.html and use them in place of your combined.latest file. The result was that the file your program creates did not contain a single “Not In Database” message.

I hope that this was of some help to you.

Fernando
Avatar of gr8life

ASKER

Thank you for all your help.  

I am going to look at the changes tomorrow and again THANK YOU!!

Gr8life
Avatar of gr8life

ASKER

Works excellent

THANK YOU VERY MUCH!!
Gr8life
No problem, always glad to help. ;=)