Start Free Trial

asked on

enchancement of 2,3 tree

I am about to start to code an 2,5 tree or you would say 2,3,4,5 tree.. really confused where to start, but I have some idea in mind of using abstract nodes and also sentinel nodes.. any hints suggestions or thought of where to start?

this tree will be used to store concordance of words, just for the info and I would like to know where the location of the word is

2-3 trees and 2-4 trees are well-known data structures. I am not familiar with the 2-5 tree but if it is an extension of the 2-3 tree then it is most likely a special case of the B-tree. You might consider reading up on B-trees. I can direct you to pseudocode for B-tree operations if you like.

Also, it might help to know what your application is since there are other similar data structures (like B+-trees, for ex.) that are better for certain applications.

For B-tree and B+-tree explanations see
http://en.wikipedia.org/wiki/B-tree and
http://en.wikipedia.org/wiki/B%2B_tree

ASKER

Yes, a 2-5 tree is basically similar idea with 2-3 or 2-4, I am just extending the idea of it.. What do you mean by application here?? oh and I am doing this in java and has to do this recursively..

What I am going to use this tree is to store a file that has lots of words in it.. so inside that file I have a paragraph and line number.. I have to keep track of this paragraph where I found the word in file and also what line number is it in the paragraph in the tree..

I wan't to print all the word found in this file alphabetically.. so I guess an in order traversal?

This link is a class assignment for a concordance using 2-3 trees
http://www.cs.arizona.edu/classes/cs345/fall07/prog4.pdf
I believe your 2-5 tree is simply a B-tree with 6-way branching. So, I think a B-tree implementation should solve your problem.
This link is pseudocode for B-tree operations
http://cs-netlab-01.lynchburg.edu/courses/DataStII/BTreeOps.htm
This link is info on a Java implementation of B-trees
http://answers.google.com/answers/threadview?id=510893

Yes, an inorder traversal of the tree wold give the words in alphabetical order if the words are the key. If you were to use a B+-tree then the inorder traversal would be trivial but you would not have 2-5 tree.

ASKER

that link of the class assignment you gave me perfectly describe what I need.. that's 100% what I must do to finish this off

I think I would go with the b-tree.. and I am still confused of insertion in this 2-5 tree though.. could you explain it to me??

the way I am going to do is create an abstract class and have separate class for 2 node, 3 node, 4 node, and a 5 node.. each has a different insertion method

ASKER

oh and by the way I already take a look of the link you posted:

http://answers.google.com/answers/threadview?id=510893

before I posted here

ASKER

and a 2-5 tree is similar to a b-tree of order 5 right?

ASKER

my implementation of the 2-5 tree is going to be similar to this:

http://www.owlnet.rice.edu/~comp212/01-fall/code/234tree/t234/

but I am going to add one more class for the five node

In a B-tree of order m every node has a max of m children so you are correct that it is equiv. to your 2-5 tree.
I think the explanation of insertion is pretty good here - http://en.wikipedia.org/wiki/B-tree
The diagram goes a long way to demonstrate the different cases that come up.
The code for insertion is a little complex due to the different cases. You should first understand how to search for a node in a B-tree. You can find B-tree ops pseudocode here
http://cs-netlab-01.lynchburg.edu/courses/DataStII/BTreeOps.htm

Briefly, you search as you would in a binary search tree except that when you look at a node you must sequence through the array of values (not just the single value in a BST) until you find a value that is larger than the one you are searching for or until you reach the end of the array. Then you follow the pointer to the next level in the tree.

Insertion ensures that the B-tree always remains perfectly balanced with respect to the number of nodes at each levels although some nodes typically contain more data items than others. In a B-tree of order m each node can hold a max of m-1 values (thus, m pointers and m children).
Every node must have a minimum of m/2 children (except for the root). These conditions cause the insertion algorithm to have some special cases.
Once you find where a value should go via search (in a leaf) and if that leaf has room you can simply insert if there. If the leaf is full you may be able to move its middle value into its parent node thus creating one more pointer in the parent and one more child where you may place your new value.
If the parent is full continue up to the grandparent, etc. If you reach the root without finding some place to put you new value then the root node is split into 2 nodes, a new root is created and your new value is placed in one of those 3 nodes.
Note, this is the only case in which the number of levels in a B-tree increases. this is what makes it different from a BST it grows from the top not the bottom.
As you can probably surmise, I cannot give you a complete and cogent explanation in this format but I hope that this helps.

ASKER

okay..thanks for the detailed information.. however you haven't asked my previous question.. I am about to use the implementation :

http://www.owlnet.rice.edu/~comp212/01-fall/code/234tree/t234/

and just add a new class for the five node.. is there any particular thing that I should change?

The author of the code your reference says the following.
"* The classic discussion for 2-3-4 trees is outlined in Sedgewick's
* "Algorithms in C++", Addison-Wesley 1992, pp. 215-219. Other elucidating
* discussions of balanced trees can be found in Stubbs/Webre's
* "Data Structures", and Aho/Hopcroft/Ullman's "Data Structures and Algorithms".
* For the sake of definiteness, we design the tree to store Integer only."

This is good advice.
I glanced through the code and if it works then I don't see any reason why it cannot be extended to a 2-5 tree implementation. Of course you must change integer to string.

Having said that let me say that I am not enthusiastic about the implementation for the following reasons.
A B-tree implementation would be much more flexible.
I don't like use of "helper" functions.

ASKER

hmm..okay that makes sense.. so you think I should do an implementation of a B-tree better than that? and besides the sample I gave you is not recursive right?

From what I see of the code I cannot tell for sure whether it is specifying a recursive implementation or not. If you must use an extension of 2-4 trees then, of course, you have no choice. However, I personally would do a B-tree implementation and simply use the 2-5 tree as a special case.
You can make the branching factor a parameter.
This means that your data will be stored in an array (or possibly linked list) in each node to accommodate a variable number of data values. Some recursion is normally employed in this implementation but, of course, may be avoided if necessary.
Using a B-tree means that you will not have to have several different types of nodes to accommodate 1, 2, 3, or 4 data values since the array will accommodate all up to some max.

ASKER

>>Using a B-tree means that you will not have to have several different types of nodes to accommodate 1, >>2, 3, or 4 data values since the array will accommodate all up to some max.

For this particular assignment, I am asked to use an abstract class that will be extended by the 1, 2, 3, 4, 5 node.. so I should have several class for those node.. and the link that I gave you, suits that well..

How many possible insertion case are there in a 5 node?

I have not done this implementation so I don't know if there are tricks to make it more general but it would appear that there would be many cases. If a node has to split for insertion or nodes combined for deletion then the number of data values in the node will change which I assume will be a special case for each type of node that you have. So, for example, if a parent node hold 2 values and has 3 children when you do an insert if the child where the new value is to be placed is full then the child will have to be split and the parent will have to take one more value. This means you will need to change the type of node the parent is to accommodate one more value, the child will have be split into 2 children both taking a different number of values than the original child.
It seems so cumbersome to me that I am guessing there is a slicker way to do this. Perhaps you should consult Sedgewick's chapter on 2-4 trees. I don't see a nice way to implement this.

I would assume that if you can find a good implementation if a 2-4 tree or even a 2-3 tree you should be able to extrapolate from that. If the implementation uses an array rather than different node types to store date then you are essentially back to a B-tree.

ASKER

I don't know, for me this tyoe of implementation makes more sense and easier for me to understand

ASKER

well could you help me find a better implementation of a 2-4 tree in java?

ASKER

Oh and I also understand by using an abstract node class and all the other nodes to extend from that abstract class it will take a lot of consideration to be performed..

we have to consider every case for insertion and also we will need to change the node type from 2 to 3 node

ASKER

okay so I just give a thought, if I want each node to a be a separate class then each of those node should have different searching methods, correct?

now I am writing search for a 2-node, the easiest one, I've attached the code below:

the problem is when it does return search(leftTree, n).. it does not know if the leftTree is a 2-node, 3-node, 4-node, or even 5-node... how do I redirect to the right search function?

@Override
	public AbstractNode search(TwoFive T, String n) {
		if (data.equals(n))
			return this;
		else{
		   if (data.compareTo(n) < 0)
			  return search(leftTree, n);
		   else
			  return search(rightTree, n);
		}
		
		
	}

Open in new window

ASKER

updated code on the classes I use:

public class TwoFive {

      private AbstractNode root;

      public TwoFive(){
            root = null;
      }

      public TwoFive(String n){
root = new TwoNode(n);
}

      public final boolean isEmpty() {
return root.isEmpty(this);
}


      public void insert(String n) {
root.insert(this,n);
}

      public AbstractNode search(String n){
            return root.search(this, n);
      }


}

ASKER


public class TwoNode extends AbstractNode{
 
	 private TwoFive leftTree = new TwoFive(); 
	 private String data;                      
	 private TwoFive rightTree = new TwoFive();
	 
	 
	 public TwoNode(String word) {
	        data = word;
	 }
	 
	@Override
	public void inOrder() {
		
	}
 
 
	@Override
	public void insert(TwoFive T, String d) {
		AbstractNode node = search(T, d);
	   
	}
 
	@Override
	public boolean isEmpty(TwoFive tree) {
		// TODO Auto-generated method stub
		return false;
	}
 
 
	@Override
	public AbstractNode search(TwoFive T, String n) {
		if (data.equals(n))
			return this;
		else{
		   if (data.compareTo(n) < 0){
			  return leftTree.search(n);
		   }else
			  return rightTree.search(n);
		}
		
		
	}
}

Open in new window

ASKER

do you think the insertion will work?

Your code only appears to be applicable when there is only one value in a node. This is equivalent to a binary tree. One approach is for each node to have a "flag" field which indicates what type of node it is so your comparison function can be overloaded. I.e., you need slightly different code to inspect the values in each type of node. You may have seen a similar technique employed in binary trees where leaves and interior nodes are flagged differently so that much space can be saved by eliminating the pointers in leaf nodes

Data Structures and Algorithms in Java (3rd edition) by Michael T. Goodrich and Roberto Tamassia
has a section on 2-4 trees. It may have code but I am not sure about that.

Finally., if you a student member of the ACM then you should have access to the online books including this one - Data Structures & Algorithms in Java by Robert Lafore
Where you will find code for a 2-3-4 tree
// tree234.java
// demonstrates 234 tree
// to run this program: C>java Tree234App
import java.io.*; // for I/O
import java.lang.Integer; // for parseInt()
////////////////////////////////////////////////////////////////
class DataItem
{
public double dData; // one data item

//--------------------------------------------------------------

public DataItem(double dd) // constructor
{ dData = dd; }

//--------------------------------------------------------------

public void displayItem() // display item, format "/27"
{ System.out.print("/"+dData); }

//--------------------------------------------------------------

} // end class DataItem

////////////////////////////////////////////////////////////////

class Node

{

private static final int ORDER = 4;

private int numItems;

private Node parent;

private Node childArray[] = new Node[ORDER];

private DataItem itemArray[] = new DataItem[ORDER-1];

// -------------------------------------------------------------

// connect child to this node

public void connectChild(int childNum, Node child)

{

childArray[childNum] = child;

if(child != null)

child.parent = this;

}

// -------------------------------------------------------------

// disconnect child from this node, return it

public Node disconnectChild(int childNum)

{

Node tempNode = childArray[childNum];

childArray[childNum] = null;

return tempNode;

}

// -------------------------------------------------------------

public Node getChild(int childNum)

{ return childArray[childNum]; }

// -------------------------------------------------------------

public Node getParent()

{ return parent; }

// -------------------------------------------------------------

public boolean isLeaf()

{ return (childArray[0]==null) ? true : false; }

// -------------------------------------------------------------

public int getNumItems()

{ return numItems; }

// -------------------------------------------------------------

public DataItem getItem(int index) // get DataItem at index

{ return itemArray[index]; }

// -------------------------------------------------------------

public boolean isFull()

{ return (numItems==ORDER-1) ? true : false; }

// -------------------------------------------------------------

public int findItem(double key) // return index of

{ // item (within node)

for(int j=0; j<ORDER-1; j++) // if found,

{ // otherwise,

if(itemArray[j] == null) // return -1

break;

else if(itemArray[j].dData == key)

return j;

}

return -1;

} // end findItem

// -------------------------------------------------------------

public int insertItem(DataItem newItem)

{

// assumes node is not full

numItems++; // will add new item

double newKey = newItem.dData; // key of new item

for(int j=ORDER-2; j>=0; j--) // start on right,

{ // examine items

if(itemArray[j] == null) // if item null,

continue; // go left one cell

else // not null,

{ // get its key

double itsKey = itemArray[j].dData;

if(newKey < itsKey) // if it's bigger

itemArray[j+1] = itemArray[j]; // shift it right

else

{

itemArray[j+1] = newItem; // insert new item

return j+1; // return index to

} // new item

} // end else (not null)

} // end for // shifted all items,

itemArray[0] = newItem; // insert new item

return 0;

} // end insertItem()

// -------------------------------------------------------------

public DataItem removeItem() // remove largest item

{

// assumes node not empty

DataItem temp = itemArray[numItems-1]; // save item

itemArray[numItems-1] = null; // disconnect it

numItems--; // one less item

return temp; // return item

}

// -------------------------------------------------------------

public void displayNode() // format "/24/56/74/"

{

for(int j=0; j<numItems; j++)

itemArray[j].displayItem(); // "/56"

System.out.println("/"); // final "/"

}

// -------------------------------------------------------------

} // end class Node

////////////////////////////////////////////////////////////////

class Tree234

{

private Node root = new Node(); // make root node

// -------------------------------------------------------------

public int find(double key)

{

Node curNode = root;

int childNumber;

while(true)

{

if(( childNumber=curNode.findItem(key) ) != -1)

return childNumber; // found it

else if( curNode.isLeaf() )

return -1; // can't find it

else // search deeper

curNode = getNextChild(curNode, key);

} // end while

}

// -------------------------------------------------------------

// insert a DataItem

public void insert(double dValue)

{

Node curNode = root;

DataItem tempItem = new DataItem(dValue);

while(true)

{

if( curNode.isFull() ) // if node full,

{

split(curNode); // split it

curNode = curNode.getParent(); // back up

// search once

curNode = getNextChild(curNode, dValue);

} // end if(node is full)

else if( curNode.isLeaf() ) // if node is leaf,

break; // go insert

// node is not full, not a leaf; so go to lower level

else

curNode = getNextChild(curNode, dValue);

} // end while

curNode.insertItem(tempItem); // insert new DataItem

} // end insert()

// -------------------------------------------------------------

public void split(Node thisNode) // split the node

{

// assumes node is full

DataItem itemB, itemC;

Node parent, child2, child3;

int itemIndex;

itemC = thisNode.removeItem(); // remove items from

itemB = thisNode.removeItem(); // this node

child2 = thisNode.disconnectChild(2); // remove children

child3 = thisNode.disconnectChild(3); // from this node

Node newRight = new Node(); // make new node

if(thisNode==root) // if this is the root,

{

root = new Node(); // make new root

parent = root; // root is our parent

root.connectChild(0, thisNode); // connect to parent

}

else // this node not the root

parent = thisNode.getParent(); // get parent

// deal with parent

itemIndex = parent.insertItem(itemB); // item B to parent

int n = parent.getNumItems(); // total items?

for(int j=n-1; j>itemIndex; j--) // move parent's

{ // connections

Node temp = parent.disconnectChild(j); // one child

parent.connectChild(j+1, temp); // to the right

}

// connect newRight to parent

parent.connectChild(itemIndex+1, newRight);

// deal with newRight

newRight.insertItem(itemC); // item C to newRight

newRight.connectChild(0, child2); // connect to 0 and 1

newRight.connectChild(1, child3); // on newRight

} // end split()

// -------------------------------------------------------------

// gets appropriate child of node during search for value

public Node getNextChild(Node theNode, double theValue)

{

int j;

// assumes node is not empty, not full, not a leaf

int numItems = theNode.getNumItems();

for(j=0; j<numItems; j++) // for each item in node

{ // are we less?

if( theValue < theNode.getItem(j).dData )

return theNode.getChild(j); // return left child

} // end for // we're greater, so

return theNode.getChild(j); // return right child

}

// -------------------------------------------------------------

public void displayTree()

{

recDisplayTree(root, 0, 0);

}

// -------------------------------------------------------------

private void recDisplayTree(Node thisNode, int level,

int childNumber)

{

System.out.print("level="+level+" child="+childNumber+" ");

thisNode.displayNode(); // display this node

// call ourselves for each child of this node

int numItems = thisNode.getNumItems();

for(int j=0; j<numItems+1; j++)

{

Node nextNode = thisNode.getChild(j);

if(nextNode != null)

recDisplayTree(nextNode, level+1, j);

else

return;

}

} // end recDisplayTree()

// -------------------------------------------------------------\

} // end class Tree234

////////////////////////////////////////////////////////////////

class Tree234App

{

public static void main(String[] args) throws IOException

{

double value;

Tree234 theTree = new Tree234();

theTree.insert(50);

theTree.insert(40);

theTree.insert(60);

theTree.insert(30);

theTree.insert(70);

while(true)

{

putText("Enter first letter of ");

putText("show, insert, or find: ");

char choice = getChar();

switch(choice)

{

case 's':

theTree.displayTree();

break;

case 'i':

putText("Enter value to insert: ");

value = getInt();

theTree.insert(value);

break;

case 'f':

putText("Enter value to find: ");

value = getInt();

int found = theTree.find(value);

if(found != -1)

System.out.println("Found "+value);

else

System.out.println("Could not find "+value);

break;

default:

putText("Invalid entry\n");

} // end switch

} // end while

} // end main()

//--------------------------------------------------------------

public static void putText(String s)

{

System.out.print(s);

System.out.flush();

}

//--------------------------------------------------------------

public static String getString() throws IOException

{

InputStreamReader isr = new InputStreamReader(System.in);

BufferedReader br = new BufferedReader(isr);

String s = br.readLine();

return s;

}

//--------------------------------------------------------------

public static char getChar() throws IOException

{

String s = getString();

return s.charAt(0);

}

//-------------------------------------------------------------

public static int getInt() throws IOException

{

String s = getString();

return Integer.parseInt(s);

}

//-------------------------------------------------------------

} // end class Tree234App

ASKER

well no, actually what I wan't is to do this using the abstract class like I told you before..

and why does it only work for one value in the node?

ASKER

I also have a three node class along with four node and also five node

here's the three node class


public class ThreeNode extends AbstractNode{
 
	private TwoFive leftTree  = new TwoFive (); 
    private String leftDat;	                 
    private TwoFive midTree  = new TwoFive ();  
    private String rightDat;	               
    private TwoFive rightTree  = new TwoFive ();
    
    public ThreeNode(String data1, String data2) {
    	if (data1.compareTo(data2) < 0) {
            leftDat   = data1;
            rightDat  = data2;
        }
        else {
            leftDat   = data2;
            rightDat  = data1;
        }
    }
    
	@Override
	public void inOrder() {
 
	}
 
	@Override
	public void insert(TwoFive tree, String data) {
		AbstractNode node = search(tree, data);
	
	}
 
	@Override
	public boolean isEmpty(TwoFive tree) {
		// TODO Auto-generated method stub
		return false;
	}
 
	@Override
	public AbstractNode search(TwoFive T, String data) {
		if (data.equals(leftDat) || data.equals(rightDat))
			return this;
		else{
			if (data.compareTo(leftDat) < 0)
				return leftTree.search(data);
			else if (data.compareTo(rightDat) < 0)
				return midTree.search(data);
			else
				return rightTree.search(data);
		}
	}
 
 
 
}

Open in new window

ASKER

here's my latest code, please let me know if I made a mistake or which part I am wrong:

I just attached the 3-node class here as a general idea of what I did..


public class ThreeNode extends AbstractNode{
 
	private TwoFive leftTree  = new TwoFive (); 
    private String leftDat;	                 
    private TwoFive midTree  = new TwoFive ();  
    private String rightDat;	               
    private TwoFive rightTree  = new TwoFive ();
    
    public ThreeNode(String data1, String data2) {
    	if (data1.compareTo(data2) < 0) {
            leftDat   = data1;
            rightDat  = data2;
        }
        else {
            leftDat   = data2;
            rightDat  = data1;
        }
    }
    
	@Override
	public void inOrder() {
 
	}
 
	@Override
	public void insert(TwoFive tree, String data) {
		AbstractNode node = search(tree, data);
		if (node.isEmpty(tree))
			tree.changeRoot(new FourNode(leftDat, rightDat, data));
		else{
			if (data.compareTo(leftDat) < 0)
				leftTree.insert(data);
			else if (data.compareTo(rightDat) < 0)
				midTree.insert(data);
			else
				rightTree.insert(data);
		}
	
	}
 
	@Override
	public boolean isEmpty(TwoFive tree) {
		if (leftTree.root == null && rightTree.root == null && midTree.root == null)
			return true;
		else
			return false;
	}
 
	@Override
	public AbstractNode search(TwoFive T, String data) {
		if ((data.equals(leftDat) || data.equals(rightDat)) || this.isEmpty(T))
			return this;
		else{
			if (data.compareTo(leftDat) < 0)
				return leftTree.search(data);
			else if (data.compareTo(rightDat) < 0)
				return midTree.search(data);
			else
				return rightTree.search(data);
		}
	}
 
 
 
}

Open in new window

ASKER


public class TwoFive {
 
	protected AbstractNode root;
	
	public TwoFive(){
		root = null;
	}
	
	public TwoFive(String n){
        root = new TwoNode(n);
    }
	
	public final boolean isEmpty() {
        return root.isEmpty(this);
    }
	
	
	public void insert(String n) {
		root.search(this, n);
        root.insert(this,n);
    }
	
	public AbstractNode search(String n){
		return root.search(this, n);
	}
	
	
	final void changeRoot(AbstractNode newRoot) {
	     root = newRoot;
	}
}

Open in new window

Perhaps I am wrong but it appears that you are assuming that you have a tree full of 2 nodes or 3 nodes, etc. when your tree has, potentially, a mix of all those different nodes. Each time you move to a new child node in the search process you don't know if that node is a 2, 3, 4 or 5 node.
Your code has to accommodate all of them. I don't see how it is doing that.

ASKER

as the matter of fact, yes we do know.. it's because of the class two five, is you see the code that I posted before:

http://www.owlnet.rice.edu/~comp212/01-fall/code/234tree/t234/

it does the same thing as what I did.. it separates different nodes in different classes..

I understand that in this 2-5 tree we have a collection of all 2 nodes, 3 nodes, 4 nodes, etc.. not only one of them..

in the search process, when I do leftTree.search(..), lets say that leftTree is a 3 node then it will use the search that is implemented in the 3 node class, I hope you understand what I mean.. please let me know of any other concern

ASKER

one thing I don't understand about the implementation in:

http://www.owlnet.rice.edu/~comp212/01-fall/code/234tree/t234/

is the method isEmpty always returns false... any reason for this?

by the way this implementation of a 2-5 tree is really tough.. I am going to need some more extra help here

final boolean isEmpty (Tree234 owner) {
return true;
}
If you are asking why this function is coded to return true, I don't know.

ASKER

yes, in the tree234 class, it returns true but in all the other node class, it just returns false..

any other hints or suggestion that you could give me? did you understand what I did by creating several node class?

In rechecking the code I see that isEmpty is overloaded so there is a different function for each node class. The 2, 3, and 4 classes return false since they have data and the empty node class isEmpty returns true since it has no data.
I will look over the code again and see if I have any other comments.

ASKER

okay I got it...

ASKER

please I need some more help here

ASKER

I was thinking to modify that code so I would have a search function in each node, and everytime I call insert the method calls the search function first which returns the leaf node of the tree and where a particular element should be inserted..

but I am confused on the attach method too

I am not certain about the attach method either but here is my guess.
When you add a value to a 2-3 tree or any of its derivations you search until you find the leaf where it belongs. If the leaf is not full then you change the node type to one higher and insert the value - no problem.
However, if the leaf is full then you have to insert a value in the parent, grandparent or some other direct ancestor. This means that a node has to be split, creating a subtree that must be attached to the main tree.
See this to help visualize http://www.cs.ucr.edu/cs14/cs14_06win/slides/2-3_trees_covered.pdf

ASKER

yes roussosc, I get the basic idea of this type of tree... it's just the implementation that confuses me the most

ASKER

here's my code, so far:


public class Node5 extends ANode234{
 
	private Tree234 _leftTree  = new Tree234 ();      // Invariant:  _leftTree     != null
    private String _leftDat = null;	                      // Invariant:  _leftDat      != null
    private Tree234 _leftMidTree  = new Tree234 ();   // Invariant:  _leftMidTree  != null
    private String _leftMidDat = null;	 
    private Tree234 _midTree  = new Tree234 ();  	  // Invariant:  _rightMidTree != null
    private String _rightMidDat = null;	                  // Invariant:  _rightDat     != null
    private Tree234 _rightMidTree  = new Tree234 ();  // Invariant:  _rightTree    != null
    private String _rightDat = null;	   
    private Tree234 _rightTree  = new Tree234 ();
    private String type = "5Node";
 
    final String type(){
    	return type;
    }
 
    final void nodeInfo(){
    	System.out.println("This is a 5 node type");
    	System.out.println("Left data is " + _leftDat);
    	System.out.println("Left mid data is " + _leftMidDat);
    	System.out.println("Right mid data is " + _rightMidDat);
    	System.out.println("Right data is " + _rightDat);
    }
    
    
	Node5(String lo, String mid, String hi, String n) {
        if (hi.compareTo(n) < 0) {
            _leftDat  = lo;
            _leftMidDat = mid;
            _rightMidDat  = hi;
            _rightDat = n;
        } else if (mid.compareTo(n) < 0)  { // nVal < hiVal.
        	 _leftDat  = lo;
             _leftMidDat = mid;
             _rightMidDat  = n;
             _rightDat = hi;
        } else if (lo.compareTo(n) < 0){
        	 _leftDat  = lo;
             _leftMidDat = n;
             _rightMidDat  = mid;
             _rightDat = hi;
        }
          else{
        	  _leftDat  = n;
              _leftMidDat = lo;
              _rightMidDat  = mid;
              _rightDat = hi;
          }
    }
	
	Node5(Tree234 lTree, String lN, Tree234 lmTree, String mN, Tree234 mTree, String mD, Tree234 rmTree, String rN, Tree234 rTree) {
        _leftTree     = lTree;
        _leftDat      = lN;
        _leftMidTree  = lmTree;
        _leftMidDat   = mN;
        _midTree      = mTree;
        _rightMidDat  = mD;
        _rightMidTree = rmTree;
        _rightDat     = rN;
        _rightTree    = rTree;
    }
 
	
	@Override
	void attach(Tree234 owner, Tree234 ltree, String n, Tree234 rtree) {
		 throw new IllegalStateException ("Node5.attach() should never be called.");
	}
 
	@Override
	void drawRootAndSubtrees(int level) {
		  System.out.println (_leftDat + "  " + _leftMidDat + "  " + _rightMidDat + "  " + _rightDat);
	        _leftTree.drawAtLevel (level + 1);
	        System.out.println ();
	        _leftMidTree.drawAtLevel (level + 1);
	        System.out.println ();
	        _midTree.drawAtLevel (level + 1);
	        System.out.println ();
	        _rightMidTree.drawAtLevel (level + 1);
	        System.out.println ();
	        _rightTree.drawAtLevel (level + 1);
	}
 
	@Override
	void insert(Tree234 owner, String n) {
	        if (n.equals(_leftDat) || n.equals(_leftMidDat) || n.equals(_rightMidDat)|| n.equals(_rightDat)) {
	            return;
	        }
	        
	          Tree234 leftTree  = new Tree234(_leftTree, _leftDat, _leftMidTree, _leftMidDat, _midTree);
	          Tree234 rightTree = new Tree234(_rightMidTree, _rightDat, _rightTree);
	          owner.changeRoot (new Node2(leftTree, _rightMidDat, rightTree));
	          owner.insert (n);  
	}
 
	@Override
	void insertHelper(Tree234 ownerPo, Tree234 owner, String n) {
		//System.out.println("N in helper is " + n + " " + _leftTree.type() + " " + _leftMidTree.type() + " " + _midTree.type());
	        if (n.equals(_leftDat) || n.equals(_leftMidDat) || n.equals(_rightMidDat) || n.equals(_rightDat)) {
	            return;
	        }
	        
	        String dataToInsert = n;
		       
	        if (_leftMidDat.compareTo(n) < 0 && _rightMidDat.compareTo(n) > 0){
	        	dataToInsert = _rightMidDat;
	        	_rightMidDat = n;
	        }
	        
	        Tree234 leftTree  = new Tree234(_leftTree, _leftDat, _leftMidTree, _leftMidDat, _midTree);
	        Tree234 rightTree = new Tree234(_rightMidTree, _rightDat, _rightTree);
	       
	        if (dataToInsert.compareTo(_rightMidDat) < 0) {
	        	leftTree.insert(dataToInsert);
	        } else {
	        	rightTree.insert(dataToInsert);
	        }
	        
	        ownerPo.attach (leftTree, _rightMidDat, rightTree);
	        
	        
		
	}
 
	@Override
	boolean isEmpty(Tree234 owner) {
		return false;
	}
 
}

Open in new window

ASKER

problem is when I tried to do this insertion, which was supposed to be like in this webpage:

http://cis.stvincent.edu/carlsond/swdesign/btree/btree.html

it fails to do so when I tried to insert "P" it splits the DGMT node as where it is not necessary to do so and just inserting at the leaf of NQ, can you see any error why this happens?

	Tree234 tree = new Tree234();
		tree.insert("C");
		tree.insert("N");
		tree.insert("G");
		tree.insert("A");
		tree.insert("H");
		tree.insert("E");
		tree.insert("K");
		tree.insert("Q");
		tree.insert("M");
		tree.insert("F");
		tree.insert("W");
		tree.insert("L");
		tree.insert("T");
		tree.insert("Z");
		tree.insert("D");
                tree.insert("P");

Open in new window

I may be a bit confused here. Tell me if this is correct.
Your code implements a 2-3-4-5 tree which means you can have up to 4 values in every node.
You are comparing it to a B-tree of order 5 which also means it can have up to 4 children in every node.
If this is correct then there is one issue we have not addressed.
This issue does not come up in 2-3 or 2-3-4 trees but does in 2-3-4-5 trees.
In B-trees there is a requirement that every node must be at least half full (except the root). This means in a B-tree or order 5 every non-root node must have at least 2 values (3 children). So when you build a B-tree or order 5 you must enforce this restriction.
If you simply extend the implementation of a 2-3-4 tree to a 2-3-4-5 tree then you are not enforcing that restriction and are unlikely to build the same tree as a B-tree adding the same values in the same sequence. I don't know for sure if that accounts for the discrepancy you encountered. However, you need to do one of two things.
1. impose the half-full restriction on your 2-3-4-5 trees or
2. Do not try to compare 2-3-4-5 tree building with B-tree building.

ASKER

I already did this in my 2-3-4-5 tree.. everytime I build the tree and add something it always starts with a 2 node, 3 children..so the starting root is a type of 2Node

I don't think I understand what you are saying.
A 2Node has 2 children which means it contains 1 value.
In a B-tree of order 5 every node must contain at least 2 values and have 3 children; that means, in your teminology, it is a 3Node.

ASKER

Oh okay I got it, I am going to change that then now.. whenever I start a root.. I am going to initialize that to a 3 node type..is there anything else that I should know? I am kind of stuck here.. so it would be really great if you could respond faster

When you say "whenever I start a root" I assume that you mean a root of a subtree; i.e. just an ordinary node in the tree. When you initialize the tree before it has any data, the root is an emptyNode and when you add the first value to that root it will be a 2Node.

Also, it is not just sufficient to create a 3Node when you create a node, you must actually put 2 values in it and it must have 3 children as you see in your reference to the web page on the B-tree.

ASKER

yes..that's what I mean/..

when I start the real root in the tree I still have an empty node with left and right pointing to null and then when I first insert the empty node becomes a 2-node and so on... but when I want to add something to the left of the subtree of that real root then that must be a 3-node? is this what you're trying to say?

That depends upon what definition you have for a 2-3-4-5 tree. Since I have never seen such an object before I know of no standard definition for it. If you want its definition to be compatible with a B-tree of order 5 then you need the restriction that every node must have at least 2 values.

This only comes up when nodes are split. You encounter this initially as follows.
The first 4 data values go into the root since they fit in the root. Then when the 5th value comes the root is split. The median of the 5 values becomes the new root and the 2 parts that were the original root become the left and right subtrees of the new root. These 2 subtrees, of course, have one node with 2 values each in them. As far as I know this is the only way the first 5 values could be inserted without violating the B-tree definition.

ASKER

Hmm..I guess I am going to stay with the definition of a 2-2-4 tree where each node can have a minimum 2 node..

>>. These 2 subtrees, of course, have one node with 2 values each in them.
>>As far as I know this is the only way the first 5 values could be inserted without violating the B-tree definition.

What I want here is so that those 2 subtrees each, have one node with 1 values each in them.. that node has a left tree and a a right tree.. in other words a 2-node, not a 3-node as what you explained..

I know this may seem a little bit weird, but I am implementing it more towards the 2-3-4 side instead of the b-tree side

Then are you going to fill the root with the first 4 nodes?
After doing that when the 5th node comes I suppose you will change one fo the 5 emptyNodes (children of the root) to a 2Node and put that value in there.

I suppose that should work.

ASKER

Yes, I am going to fill the root first with the 4 nodes and then when I try to add another item.. I will choose the median of those 5 values, take it as a root and then divide the remaining value as a 3 and 2 node children... if you understand what I mean?? however I am still causing trouble as I have to do this recursively

Do you think I should have a separate search method and a separate insert method.. or is it better to combine both??

If I were to have the search function separately then I want that to return a leaf node where I should insert it right? but if an element were already present then we just return null as an indicator..

after that the insert function just do an insertion to that node that search just returned, it that node returned from the search is a 2-Node then I just add the new value and change that node type to a 3-node and so on. If that node is a 5-node, then I have to find the median of those 4 values.. and do the rest of the thing..

seems easy to say but hard to put it into code

Re: separate code for search and insert, it is easier to do it that way but probably more elegant for insert to call search. It does not really matter.
I don't understand the comment re: returning null. Are you not allowing duplicates. If you want insert to call search then search will have to be coded so that an unsuccessful search still returns a reference to the leaf where the value should have been and return an additional boolean value indicating the value is not present.

I think you may have misspoken in your first paragraph. If, after the root is full, you try to insert a 5th value and put the median in a new root then you will have 4 values left. If you create a 3Node and a 2Node in which place them then they can only hold a total of 3 values. Perhaps you wanted to create two 3Nodes which would hold 2 values each?

I don't have any advice re: the recursion problem.

I am going to be off-line for a while.

ASKER

Yes, I won't allow duplicates to happen in this tree.. however I would also like to have a counter in which tracks how many occurences a particular object is trying to be inserted into the tree..

oh yes, I mean a 3node and a 3 node... my mistake..

The think is that I should know the parent of each node am I right?

ASKER

The most confusing thing here is when separating the search and the insertion, to insert we should check if there is a parent for that leaf node, is not then it's a root right? so we just change that particular node value.. when we search, we start from the root to the leaf, but when we insert we start from the leaf up to the root.. when traversing down the tree with the search we know the parent of each

ASKER

so that an unsuccessful search still returns a reference to the leaf where the value should have been and return an additional boolean value indicating the value is not present.

as I am doing this in java, there's no way to return two things at the same time.. the easiest way is when it finds an element in the tree it just returns the node where it finds it and later in the insert function I would have to check if the particular value that needs to be inserted in the tree is present or not? what do you think about it?

my confusion is really about going up from the leaf to the root... how do I do this after I already find a leaf to insert? can you please give me some detailed guide here?

If you do your insert separate from the search in your insert you will be searching from the root down to a leaf so you can keep parents/ancestors on a stack as you go down. The only time you should be going from a leaf back up to an ancestor including the root is if the leaf is full and you have to back up to an ancestor that is not full. Rather than use a stack this could be implemented recursively but unless you get the recursion right the first time it will be more difficult to debug that way.

Another approach is to store a parent reference in each node.

ASKER

the only way I go back up is if the node is a 5-node type right? I would really love to do this recursively..

Here's my code for the search in the 2-node class, I don't know where to store the parent at...

I am not really that expert at recursion, so I will need some extra help

@Override
	public AbstractNode search(TwoFive T, String n) {
		if (dat.equals(n)){
			return this;
		}
		else{
		   if (dat.compareTo(n) < 0 && !leftTree.isEmpty()){
			  return leftTree.search(n);
		   }else if (!rightTree.isEmpty()){
			  return rightTree.search(n);		
		   }else{
			  return this;
			}
		}
	}

Open in new window

ASKER

>>If you do your insert separate from the search in your insert you will be searching from the >>root down to a leaf so you can keep parents/ancestors on a stack as you go down.

I don't want to use stack here.. I want to use recursion.... let me know about the code I posted above..

I don't see a problem with your search code for a 2-Node.

If you use recursion then I don't think your insert should have a problem. It will look very similar to your search.
Of course it will need to have code to actually insert the new value if the current node/leaf is not full.
Otherwise it will call recusively with the appropriate subtree. If the recursive call returns failure; i.e., that it could not insert the new value then you know all descendents were full and its time for a split.

ASKER

hmm..I am confused by what you said it will recursively with the appropriate subtree.. don't we already know where we should put the value we want? so what's the point of comparing it with our subtree?

when does call recusively with the appropriate subtree happens? if we can't insert a value?

ASKER

oh and just for the info, we are doing this bottom-up insertion

regarding the search function.. if we call search all we get is just the node/leaf where we are supposed to insert the new value at, we didn't know what the parent is.. I don't want to store the parent in a stack, I just want to do all of this recursively

Well, let me try to be clearer.
Your search is clearly recursive. However, if you call search from insert and search returns a reference to a node then you lose the internal stack info from the search call and do not have info. about the ancestors. I think you pointed this out. I am sure you can make it work calling search from insert I just think it may be more difficult.

Obviously, I have not coded this so cannot say definitively what I am saying is correct.

If insert is coded without calling search then it will have to search for the correct location and insert the new value in the tree.
It seems to me you will need a main insert and an inserthelper which actually does the searching and inserting into a leaf.
If it finds a leaf in which to put the new value then all is well. It will need a base case that actually places the value in the leaf and return true. However, if the leaves are full then the recursive call should return false and your main insert then will need code to split nodes and the splitting could propogate up to the root. This is what happens in B-trees. So, your insert will need to have to create a split and pass some info. about the split back up to the calling function (itself, since its recursive).

An alternative to this complication, I suppose, would be to place the new value in an empty node below the leaf where the value should go. I don't know if this is allowed in your 2-5 tree, though.

Oh, by the way your link to the B-tree info (http://cis.stvincent.edu/carlsond/swdesign/btree/btree.html) is out-dated and the page no longer maintained.
The new page is at http://cis.stvincent.edu/html/tutorials/swd/btree/btree.html
and has more info.

ASKER

>>An alternative to this complication, I suppose, would be to place the new value in an empty node below >>the leaf where the value should go. I don't know if this is allowed in your 2-5 tree, though.

this is clearly not allowed

This is what I was thinking too.. calling search from the insert function will lose all the stack frame.. so what would be an easier way to do this?? are you suggesting doing a search and an insert at the same method? because to me that makes more sense..

the thing is, how do I store the parent??

ASKER

To make such complicated problems easier, I think I am just going to modify this implmentation:

http://www.owlnet.rice.edu/~comp212/01-fall/code/234tree/t234/

so that it will support my 2-3-4-5 tree.. the only problem is that this implementation does a top-bottom insertion, where what I want is a bottom up insertion.. so can you help me to modify it into a bottom up? I tried to think about this for hours, but just don't know where to modify

are you suggesting doing a search and an insert at the same method?
Yes, I am.
You don't actually store the parent. In fact, storing the parent is not enough. You need all the ancestors in case the split propogates upward. One method is for each node to have a reference to its parent as I mentioned before. However, this would mean a change is your node structure and is not necessary.
Consider the insert method.
It calls itself recusively. If the recursive returns false (or some other integer that indicates the new value was not successfully inserted by the recursive call) then at that point it knows it has to do a split of the node where it is. It does that and returns a value to the previous call that it just did a split and the passed up value needs to be placed. The passed up value being the median we discussed earlier that must be placed in the parent.
See, you have access to the parent because afer the recursive call returns you are still at the parent.
It that node is full then it does the same thing; it performs a split and passes info. back up to the previous call.

ASKER

Yes, I know to store all the parents is impossible but having each node store where the parent is, is possible.. the problem is how?

>>if the recursive returns false
you mean it doesn't hit the base case?

so what are you suggesting the return type for this searchInsert method?

The code you reference above does seem to be doing a normal recursive insert so I am not sure you need to alter that framework radically.

If you want each node to store a parent that is not difficult. At the time you create a node you know its parent so you need to add a field called 'parent' (or something like that) to every node. The constructor then just needs to fill in that field with a reference to the parent. However, understand that this approach is not purely recursive as you will need a loop to work your way up the tree when it is time to do a split.

The approach I mentioned in the previous post is purely recursive. When you make the recursive call the return value from the recursive call tells you if you need to split. There is no need to store any parent because you always get back to the parent when you return from the recursive call.

ASKER

When you make the recursive call the return value from the recursive call tells you if you need to split.

so what are you suggesting the return type of this method? just a boolean as an indicator?

if that's so we'll work out the purely recursive way, what is the base case here?? if we found a node which is the leaf?

Yes, the base case is when you find a leaf with room. In that case you place the value and return a value that indicates you were successful in which case the calling method does nothing.

The problem is that if the base case is not reached; i.e., the leaf is full then you need to pass back the median from the split so the calling method can do a split and place that median value in a node. I am not sure how you want to manage that. If the data that is going into the tree are positive integers then a 0 could indicate success, I suppose. Otherwise you may need to return a structure that contains the info. you need to send back. This might be the case if the data were strings, say.

ASKER

The data that I am going to store is String's...

so you mean create another object??

Yes, if you return an object rather than a basic data type then you can return whatever info. you need to in that object. Perhaps you know a better way to do this.

ASKER

Well I think you know better than me, otherwise I won't be asking here all day long.. so what would you suggest me store in that object?

Two things - true or false indicating if the method were successful in storing the value in a leaf, and
the median value needed for the calling method to place after the split and/or to pass back when it returns.

In other words
1. a boolean indicating if the value was stored
2. the median value needed to do the split.

ASKER

okay and what if it's a 5 node where we already know that there's no more possibilities of storing another value in there?

I you are at a 5Node then you call recursively hoping that the value will be placed down below in a leaf.
If your recursive call comes back false (it could not place the value) then a split has been initiated at the level below and the median value fro the split has been passed back to you. Now you need to split your 5Node into 2 3Nodes, recompute the median, return it and return false to your parent (calling node).

I am not saying this is easy. It gets pretty ugly here I think but it is just like the B-tree code.
If you are a 5Node then the split in the child means that you now have 6 children (illegal).
Thus you must split yourself, place the median value and reconnect your children to the proper place in the two nodes you just created. If your parent can hold the median it will insert it and all is done.
If the parent cannot hold the median then it must also split and reconnect its children.
However, there is a problem here, the parent has to have access to the extra child you created with your split so it can reconnect properly.
Let me direct you again to this page
http://cs-netlab-01.lynchburg.edu/courses/DataStII/BTreeOps.htm
which has pseudocode for inserting into a B-tree. I believe this is what you need.
if you read this pseudocode you will see there is a note.
===========================================================
//Note: we split full nodes on the way down the tree so that if a child of that
//node needs to be split its parent will be able to accept the middle key.
set i = index of whichever child of x is root of subtree in which k should be placed.
call B-Tree-Insert-NonFull(ci[x], k) //Note: Recursive call
===========================================================
splitting full nodes on the way down helps remove some of the complexity. However, insert still needs to communicate back (return) a reference to the new child that was created.
I suppose this, too, will have to be passed back in the returned object.

Now that we have had this long discussion I think most of the code in the reference will make more sense and you will see how it applies to your 2-5Tree. The insert is not trivial. The insert in the pseudocode uses 3 functions
B-Tree-Insert(T, k
B-Tree-Split-Child(x, i, y) and
B-Tree-Insert-NonFull(x, k)
Keep in mind that the reference to a root, r in the pseudocode is not necessarily referring to the root of the entire tree. It is most often referring to the root of a subtree.

Finally, because you are not required to keep nodes 1/2 full as in a B-tree then perhaps there is an easier way. Like I said earlier I have not programmed your 2-5tree so do not know what you might encounter.

Keep this in mind too, you need to pass the median up to the parent after a split so that if the parent is not full the parent can use it as a value so it can have the correct number of subtrees. If the parent is full then it needs to split and pass a new median up to its parent for the same reason.

ASKER

seems really complicated and tough! I am getting worried about this

Since this appears now more difficult to you than you initially thought, you might want to repost the question with 500 points rather than 50 and perhaps that will attract some other experts with fresh ideas.

ASKER

as I said, I don't want use the implementation of a b-tree I want to put it as an improvement of 2-3-4 tree

ASKER

on this link again:

http://www.owlnet.rice.edu/~comp212/01-fall/code/234tree/t234/

is it possible to do the insert method without the isEmpty?

I see in ANode234.java the following
"EXERCISE: Remove this isEmpty method and rewrite the insertion algorithm
without checking for emptiness.
</pre>
*/
abstract boolean isEmpty (Tree234 owner);"
I don't really understand the significance of the exercise.

When you do a search or insert you need to know somehow when you have moved or are about to move to an empty node. Whether you look to see if that particular child is an empty node directly or indirectly via a method like isEmpty is probably irrelevant. I suppose when you move to an empty node you could just return failure.
In other words you could eliminate isEmpty but you would need code to accomplish the same purpose.

ASKER

yeah..that's what I was thinking too roussosc.. we should check for emptiness somehow..

hmm.. is there any particular way to make that code more simple or efficient that you can think of?

I cannot say for certain whether this will work for you but here is an approach I have used on several occassions to simplify the code for certain types of search routines.

The following code comes from Node3.java.
This code prevents you from moving to an empty node
if (_leftTree.isEmpty() && _midTree.isEmpty() && _rightTree.isEmpty()) { // leaf node.
This and similar code is what moves you down the tree in an insert operation from a node (in this case a 3Node).
if (key < leftVal) {
_leftTree.insertHelper(owner, n);
} else if (rightVal < key) {
_rightTree.insertHelper(owner, n);
} else {
_midTree.insertHelper(owner, n);
}

An alternative to this would be to let the code move you to an empty node. (i.e. do not do the check that prevents the recursive call to an empty node). The insert method would first check to see if itself was an empty node and, if so, return with that information.
Near the end of insert then (after the recursive call) insert would realize (based on the return info. from the recursive call) that it tried to place a value in an empty node and take apprpriate action depending upon what kind of node it was invoked from. For example a 3Node upon receiving the return info. that says it called recursively on an empty node would place the value into itself, and change to a 4Node.

ASKER

hmm..I don't quite understand what you mean roussosc, could you do a code example using the 2 node..

ASKER CERTIFIED SOLUTION

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

ASKER

I am actually asking if there's any other thing that we can do to this code besides removing the isEmpty? for example removing the attach method..

Well, of course, you cannot simply remove isEmpty as I am sure you realize, you have to put in some other code or structure the program differently to stop descending the tree when looking for a leaf.

Similary you cannot simply just remove attach as you need to be able to connect a node to its subtrees; especially after a split has occurred.

ASKER

okay..so attach could not be removed.. how about insert helper.. is there a way so that I can combine insert helper with the method insert?

It appears that you might be able to restructure to eliminate insertHelper but without actually trying to rewrite the code I cannot say for sure. Oftentimes "helper" functions are used to protect private data elements or to structure code more modularly. However, I don't know enough about that area in Java to say that is the case here.

Is it important to be able to rework this progam to eliminate functions like isEmpty and insertHelper?

ASKER

well not really.. I just think that it would minimize the amount of methods I have in class that seems to me to be kind of useless.. but if that takes a lot of work, in this case rewriting it again in some fashion then I am fine with it

Actually, I think the code is surprisingly compact for a 2-3-4-tree implementation. I would be surprised if it could condensed significantly. I must say, though, it is not the easiest code to understand in my opinion. But then, perhaps, any implementation of such a data structure might be complex.

ASKER

yeah.. a 2-3-4 tree is a complex implementation I would say.. however I am proudly to say that I've finished the 2-3-4-5 implementation of the tree using the help from this implementation..

I just had a question.. in the insert parameter it always says owner.. is owner here actually the same as parent? or owner is the tree itself?

Congratulations on completing the 2-3-4-5-tree implementation!

Unfortunately, the internal documentation of the 2-3-4-tree leaves much to be desired. "owner" appears to be an entire subtree but not necessarily then entire 2-3-4-tree. insertHelper is often called with "owner" as the second parameter and the tree that has owner's parent as its root as the first parameter for the purpose of reconnecting parts of the tree.

ASKER

yes..I understand owner is a parent of a subtree but not all the tree...