Solved

AVL Trees vs. B-Trees - Real-World Optimizations Needed

Posted on 2003-11-14
1,693 Views
Last Modified: 2007-12-19
This has been a question bugging me, but I've never gotten it solved the way I wanted....
I know that in theory an AVL tree will have better performance than a B-tree in the long run.

Problem Set...

There is a database written using an index file and a data file of
fixed-length data (mostly).
Both of these sets of data are sorted using a binary sort tree.
It works by record number and by other sort criteria,
so there is more than one way to sort through the data. The
record pointers (unsigned int) actually form doubly linked lists, which allows
us to traverse the tree in either direction from any point.
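
Roughly, a fixed-length record with per-index forward/backward links might look like this; field names and sizes here are illustrative assumptions, not the actual layout:

/* Illustrative only: a fixed-length record whose per-index links are
 * record numbers (unsigned int), giving a doubly linked structure per
 * sort order. */
typedef unsigned int recno_t;     /* record number: offset / sizeof(Record) */

typedef struct {
    recno_t prev;                 /* previous record in this sort order */
    recno_t next;                 /* next record in this sort order     */
} Link;

typedef struct {
    Link by_recno;                /* primary order: record number           */
    Link by_username;             /* one Link per additional sort criterion */
    char payload[240];            /* fixed-length user data                 */
} Record;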

The primary operations are reads, updates, and adds, in that order, with deletes
being the rarest of all events.

Since this methodology finds one record in at most 22 disk reads in the worst case,
for 5 million users at over 80 TPS, it's VERY fast. Much faster, and far cheaper, than using
a REAL database. For everything else, the filesystem can act as a
database for other interesting variable data that isn't time-critical.

But....
Adds are written to the end of the file, so you have unsorted data trailing down a branch,
creating an unbalanced tree. So performance degrades for all "new records" but not for
existing records. And all "new records" have a higher chance of being hit, since newer
is better.
Currently the clients attached to this database have to be locked out so that the indexes and sort indexes can be rebuilt to rebalance the tree.

Hence my research into AVL trees.
While this may allow clients access 24/7, in most implementations I came up with it adds more
disk reads/writes for adds and deletes.

I would require exclusive locks per record, and that means locking 6 records for a SWAP(A,B)
operation:

| parent A |  ---->  | node A |  ---->  | child A |
|          |  <----  |        |  <----  |         |

| parent B |  ---->  | node B |  ---->  | child B |
|          |  <----  |        |  <----  |         |

SWAP(A,B)
Exclusive Lock Parent A
Exclusive Lock Parent B
Exclusive Lock Node A
Exclusive Lock Node B
Exclusive Lock Child A
Exclusive Lock Child B

// Unlink Node A, link Node B in its place
ParentA.Next = NodeB.Current
NodeB.Previous = ParentA.Current
NodeB.Next = ChildA.Current
ChildA.Previous = NodeB.Current

// Node A is now orphaned
// If the locks were released here, Node A could not be found... not acceptable

// Link Node A into Node B's old position
ParentB.Next = NodeA.Current
NodeA.Previous = ParentB.Current
NodeA.Next = ChildB.Current
ChildB.Previous = NodeA.Current

// Write the 6 records
ReleaseLocks(ParentA, NodeA, ChildA, ParentB, NodeB, ChildB)
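
For reference, per-record exclusive locks like the ones above could be taken as POSIX byte-range locks on the data file. A rough sketch in C, with helper names of my own and fixed-length records assumed:

#include <fcntl.h>
#include <string.h>
#include <sys/types.h>

/* Take an exclusive byte-range lock covering exactly one record. */
static int lock_record(int fd, unsigned int recno, size_t recsize)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type   = F_WRLCK;               /* exclusive lock           */
    fl.l_whence = SEEK_SET;
    fl.l_start  = (off_t)recno * (off_t)recsize;
    fl.l_len    = (off_t)recsize;        /* just this record         */
    return fcntl(fd, F_SETLKW, &fl);     /* block until granted      */
}

static int unlock_record(int fd, unsigned int recno, size_t recsize)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type   = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start  = (off_t)recno * (off_t)recsize;
    fl.l_len    = (off_t)recsize;
    return fcntl(fd, F_SETLK, &fl);
}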

Current Code:
LastRec = EndOfFile - sizeof(rec)
AddRec(new data)             // new record appended at end of file
AddRec.Next = 0
AddRec.Previous = LastRec
Lock LastRec
LastRec.Next = AddRec
Unlock LastRec
Done
// 4 I/O operations
So what I have here with the swap is now 18 file operations, and it blocks
some of the readers.

If I could allow orphaned records, then I could do this with at most 8 I/O operations.

This only took 4 operations before and was non-blocking.

So how do you solve this? Use a real database :) and pay
millions to M$ or to Larry Elli$on?

One theory I had was to create a shared-memory copy that
is used for all reads, perform my writes to disk, and then
at some interval sync the shared-memory copy with what is
on disk, or sync on updates that touch a node list.

The complexity goes up, and I'm not sure I will go this route.
I would prefer a better algorithm.
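
A minimal sketch of that shared-memory idea, assuming the data file is mmap'd read-only and shared for the readers while writes keep going to disk through the file descriptor (names are mine):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Readers walk the records/indexes through this mapping; writers keep
 * using pwrite() on the same file. On a unified page cache the mapping
 * tracks the file; a periodic sync or remap after growth covers the rest. */
static void *map_datafile(int fd, size_t *len_out)
{
    struct stat st;
    if (fstat(fd, &st) == -1)
        return NULL;
    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return base;
}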

(wow that was a long question)

:)
Question by:g0rath
6 Comments
 
Expert Comment by: sunnycoder (LVL 45), ID: 9753335
I read the whole of your question... but your algorithm is not very clear to me, maybe due to a difference in terminology, or maybe due to too many Node A's and Node B's. But your problem is clear...

You must have some function/code segment that builds the index for the database.
Place another call to that function and let it build an index in some temporary buffer, without interfering with the existing indices...
At the first opportunity (no access to records), replace the old index with the new one....

This should allow 24/7 access without much of a performance hit (you can always build indices when you are freer).
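
A rough sketch of that build-aside-and-swap idea, assuming rebuild_index(), quiesce_clients() and resume_clients() stand in for whatever your code base already has:

#include <stdio.h>

extern int  rebuild_index(const char *path);   /* hypothetical: builds a fresh index file */
extern void quiesce_clients(void);             /* hypothetical: briefly hold off lookups  */
extern void resume_clients(void);

/* Build the new index off to the side while readers keep using the old
 * one, then replace it atomically at the first quiet moment. */
int rebuild_and_swap(const char *live_index, const char *tmp_index)
{
    if (rebuild_index(tmp_index) != 0)          /* may take a while; nobody is blocked */
        return -1;
    quiesce_clients();                          /* no index lookups in flight          */
    int rc = rename(tmp_index, live_index);     /* atomic replace on POSIX filesystems */
    resume_clients();
    return rc;
}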
 
Author Comment by: g0rath (LVL 5), ID: 9763348
The problem is that both the index file and the data file have binary trees in them, connected as doubly linked lists. While the index is trivial to rebalance, the data file takes a lot longer to balance.

There isn't a function to rebuild the index on insert, since that would lock too many records... new records are just appended to the end. During the night the database is taken offline, both files are sorted by username, and then we recursively divide the data in half to find the root node; on return it balances the rest of the branches. Any records marked for deletion are not put back into the tree at this point. This can take upwards of 30 minutes to an hour.
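
That nightly pass is basically building a balanced tree from already-sorted records by recursive midpoint. A sketch of the idea as I read it (in-memory, hypothetical names, deleted records assumed already dropped, record number 0 meaning "no node"):

/* sorted_recnos[] holds the record numbers in username order; link()
 * stands in for writing the parent/child pointers back to the files. */
unsigned int build_balanced(const unsigned int *sorted_recnos, long lo, long hi,
                            void (*link)(unsigned int parent, unsigned int child))
{
    if (lo > hi)
        return 0;                              /* empty range: no subtree        */
    long mid = lo + (hi - lo) / 2;             /* middle record becomes the root */
    unsigned int left  = build_balanced(sorted_recnos, lo, mid - 1, link);
    unsigned int right = build_balanced(sorted_recnos, mid + 1, hi, link);
    if (left)  link(sorted_recnos[mid], left);
    if (right) link(sorted_recnos[mid], right);
    return sorted_recnos[mid];
}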

....which is why I'm looking at AVL trees, except for the performance hit for the swaps....

Read/write performance is the most important aspect of this database, which is why it was developed using this method.

Using your method I would have to build an index in memory and, when we don't have any access, write it back out... that suggests I would need some sort of transaction log of all the write transactions that have to be committed. Also, there are more writes than I realized, since we toggle status bits in individual records and update dates... so if this process took 60 seconds, we could easily have 100+ write transactions to apply during a busy period...

I have thought about this, but haven't looked for all the "what ifs" yet, so I haven't gotten too far...
 
Expert Comment by: sunnycoder (LVL 45), ID: 9769019
May I ask why you need doubly connected structures?

Anyway, I gave your problem a good thought, and I think that a B+ tree might be just right...

The locking impact will not be vast; its effect will be absorbed within at most one level above the leaves. Given your circumstances, if you start with internal nodes that are, say, 60% populated, it may be a long time before you see a level split anywhere...
The number of levels in the tree would be reduced too...

The size of your database would be the main consideration in deciding the cardinality (fanout)...

What do you say?
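
To put numbers on the reduced levels: with page-sized nodes the fanout is in the hundreds, so 5 million records need only 3 or so internal levels instead of roughly 22 binary levels. An illustrative node layout (all sizes are assumptions, not a prescription):

/* One node per 4 KB disk page: roughly (4096 - header) / (key + pointer)
 * entries per internal node. With fanout ~340, three internal levels
 * already cover over 39 million records, so a lookup over 5 million
 * records costs 3-4 page reads versus ~22 for a balanced binary tree. */
#define PAGE_SIZE 4096
#define KEY_BYTES 8                     /* e.g. a username hash              */
#define PTR_BYTES 4                     /* unsigned int page / record number */
#define FANOUT    ((PAGE_SIZE - 16) / (KEY_BYTES + PTR_BYTES))   /* ~340 */

typedef struct {
    unsigned short nkeys;               /* keys currently in use           */
    unsigned short is_leaf;
    unsigned int   next_leaf;           /* leaf chaining for range scans   */
    unsigned char  body[PAGE_SIZE - 8]; /* packed keys and child pointers  */
} BPlusPage;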

 
Author Comment by: g0rath (LVL 5), ID: 9818700
Doubly linked lists are so that you can traverse the tree in either direction from some arbitrary point.

Each node is user data, but the data is linked for things such as CreatedTime and other indexes that need to be sorted either ascending or descending....

Do you have a good place that shows the B+ tree algorithm? Pseudo-code or whatever would also be nice.... The B+ tree I never really thought of because it seems that everyone has their own B+ tree modification or something else slightly different. Sorry, I've been really busy these days, so I haven't looked closely enough at this problem.
 
Author Comment by: g0rath (LVL 5), ID: 9818748
http://www.cs.duke.edu/~tavi/papers/tpie_paper.pdf

This looks very promising... a discussion of implementing tree algorithms with both sequential and random I/O by the TPIE (Transparent Parallel I/O Environment) project guys.
 
Accepted Solution by: sunnycoder (LVL 45, earned 500 total points), ID: 9823048
>doubly linked lists are so that you can traverse the tree in either direction from some arbitrary point.
So I guess you need sequential access for range searches... a B+ tree would do just fine.

>the B+ tree I never really thought of because it seems that everyone has their own B+ tree modification or something else
>slightly different.
LOL, all the better: you can customize the algorithm to match your needs perfectly.

>Do you have a good place that shows the B+ tree algorithm?
http://userpages.umbc.edu/~cmason1/code/cs491i/cs491i/tutorial8.ppt

A book on DBMS concepts by Elmasri and Navathe (Addison-Wesley, if I remember correctly) has a very good description with algorithms and examples... If you can find a copy, go through it.
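
Since B+ tree leaves are chained, a range search is one descent plus a sequential walk along the leaf level, which is what your "traverse from an arbitrary point" requirement needs. An in-memory sketch with made-up types, just to show the shape of it:

typedef struct Leaf {
    int          nkeys;
    unsigned int keys[64];              /* sorted within the leaf            */
    unsigned int recnos[64];            /* matching data-file record numbers */
    struct Leaf *next;                  /* right sibling; add a prev pointer for descending scans */
} Leaf;

/* Visit every record whose key lies in [lo, hi], starting from the leaf
 * reached by descending the tree for lo. */
void range_scan(const Leaf *start, unsigned int lo, unsigned int hi,
                void (*visit)(unsigned int recno))
{
    for (const Leaf *leaf = start; leaf != NULL; leaf = leaf->next) {
        for (int i = 0; i < leaf->nkeys; i++) {
            if (leaf->keys[i] < lo)
                continue;               /* still before the range            */
            if (leaf->keys[i] > hi)
                return;                 /* keys are globally sorted: done    */
            visit(leaf->recnos[i]);
        }
    }
}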
