Solved

b+ indexing

Posted on 1998-08-29
5
655 Views
Last Modified: 2012-06-21
can you tell what is b tree indexing and b+ tree indexing and full descriptions of them?
0
Comment
Question by:nabila87
  • 3
5 Comments
 

Author Comment

by:nabila87
ID: 1025600
Edited text of question
0
 
LVL 2

Expert Comment

by:odessa
ID: 1025601
Alright here is it.
B-tree is stucture that most of programs use for defining indexes, like "balanced binnary tree" b-tree is balanced too, but b-tree have more than one leave on every root, that helps to do search more fast, B+-tree is an advance in b-tree but it is more good for Index use
0
 
LVL 3

Expert Comment

by:junfeb
ID: 1025602
Here is some Good explanation of Btree indexing
xpert note:
Btree
=====

o       Structure of a Btree Table
o       Associated Data Pages
o       Index Growth
o       Locking and Btree Tables
o       Sorted Order in a Btree Table
o       Deleted Rows: Data Pages
o       When to Use Btree

'Btree' is INGRES's most versatile storage structure. The
btree structure allows for keyed access and supports range
searches and pattern matching. The btree index is dynamic,
growing as the table grows. This eliminates the overflow
problems that static structures like isam and hash present
as they grow.

INGRES's btree's implementation not only provides a dynamic
index but also allows for maximum concurrent use of the
table. The design incorporates a sparse index that points to
pages in a leaf level. The leaf level is a dense index,
containing key and tid pairs pointing to the rows on the
data pages in the table. (For a discussion of tids, please
see the section "Tids" in the INGRES Database
Administrator's Guide.) The benefit of this indexing
approach is that it minimizes splitting cost: when splitting
does occur, the actual data rows need not move. Only the
leaf and index levels require reorganization.


Structure of a Btree Table
--------------------------
INGRES btrees can be viewed as four separate parts. There
is:

   o A free list header page, which is used to keep track of
     allocated pages that are not currently being used

   o One or more index pages

   o One or more leaf pages

   o One or more data pages, where the user data is actually
     stored

Index pages contain the key and a pointer (known as a tid)
to either lower-level index pages or a leaf page where the
key-tid pair will be found. Data pages contain the data rows
and do not contain pointers to other pages. The smallest
btree containing data will have four pages, one of each
type.

If a secondary index is modified to btree, it will not
contain data pages. Instead, the leaf pages of the secondary
index will contain tids referencing the main table's data
pages.

The number of index pages is dependent on the width of the
key and the number of leaf pages, because eventually the
index pages point to a particular leaf page. Usually the
index level is small, since it needs only to point to the
leaf pages. The index level is similar to the isam index,
except that the isam index points to data pages whereas the
btree index level points to leaf pages.

The leaf page level is a dense index. It is considered a
dense index because it holds a key and tid pair for every
row in the table. In dense indexes, rows on data pages do
not move during a split (which would cause their tids to
change). The index level is considered a sparse index,
because it contains only a key value and a pointer to a
page.

The figure below, a portrait of a btree, aids in
understanding the three btree levels: index, leaf and data
page.

The btree index level is similar to isam's index, except
that the index points to index or leaf pages instead of to
data pages. To look for "Kay", the search starts from the
root node and traverses down the left side of the index for
values preceding "McShane". The index holds leaf page
numbers and the range of key values to expect on the leaf
page. The index in this case shows that leaf page 2 is the
appropriate page to look at.

On leaf page 2, a scan is made through the key-tid pairs,
with Kay's record identified as being on page 4, row 0 (page
4 row 0 is actually a "tid"). Going directly to page 4, row
0, Kay's record is located.

Note: This diagram is good pictorially, but it may not be
realistic. In actuality, if the key width "name" were only
30 characters, the row width were 500 bytes, and there were
only 31 employees, this btree would have only a free list
header page, one index page, one leaf page, and 8 data pages
(instead of 4 leaf pages and 3 index pages).

          +-----------------------------------+
          |                 ROOT              |
          |                                   |
          |           <=   McShane            |
INDEX     +-----------------------------------+
LEVEL            /                        \
                /                          \
+---------------------------+ +---------------------------+
| Key             Leaf Page | | Key             Leaf Page |
|                           | |                           |
| <= Giller              1  | | > McShane <= Shigio    3  |
| > Giller <= McShane    2  | | > Shigio               4  |
+---------------------------+ +---------------------------+

LEAF PAGE LEVEL

Leaf Page 1    Leaf Page 2    Leaf Page 3    Leaf Page 4
Aitken    1,0  Gordon    3,0  McTigue   5,0  Smith      7,0
Blumberg  1,1  Green     3,3  Ming      5,1  Stannich   7,1
Brodie    1,3  Gregori   3,2  Ramos     5,2  Stein      7,2
Cameron   1,2  Huber     3,1  Robinson  5,3  Stover     7,3
Clark     2,0  Kay       4,0  Ross      6,0  Sullivan   8,0
Curan     2,1  Kreseski  4,1  Sabel     6,1  Verducci   8,1
Curry     2,2  Mandic    4,2  Saxena    6,2  Zimmerman  8,2
Giller    2,3  McShane   4,3  Shigio    6,3

DATA PAGE LEVEL
          +-------------------------------------+
Page 1  0 |Aitken    |       1|    49| 50000.000|
        1 |Blumberg  |       2|    33| 32000.000|
        2 |Cameron   |       4|    37| 35000.000|
        3 |Brodie    |       3|    42| 40000.000|
          |-------------------------------------|
Page 2  0 |Clark     |       5|    43| 40000.000| Associated
        1 |Curan     |       6|    30| 30000.000| Data Page
        2 |Curry     |       7|    34| 32000.000| for Leaf
        3 |Giller    |       8|    47| 46000.000| Page 1
          |-------------------------------------|
Page 3  0 |Gordon    |       9|    28| 27000.000|
        1 |Huber     |      12|    35| 32000.000|
        2 |Gregori   |      11|    32| 31000.000|
        3 |Green     |      10|    27| 26000.000|
          |-------------------------------------|
Page 4  0 |Kay       |      13|    41| 38000.000| Associated
        1 |Kreseski  |      14|    25| 24000.000| Data Page
        2 |Mandic    |      15|    46| 43000.000| for Leaf
        3 |McShane   |      16|    22| 22000.000| Page 2
          |-------------------------------------|
Page 5  0 +McTigue   |      17|    44| 41000.000+

                FIGURE:  Portrait of a Btree


Associated Data Pages
----------------------
Every leaf page has an associated data page.  This is where
new rows are added.  When an associated data page fills up, a
new associated data page is attached to the leaf page. If you
delete rows that exist on the current associated data page,
the deleted space is reused.

Having one associated data page per leaf page provides a
good chance for rows with similar key ranges to exist on the
same data page, thereby increasing the likelihood that data
references will occur on the same data page.


Index Growth
------------
The major difference between isam and btree is that the
btree index grows as the table grows. Consider what would
happen if you added these five new employees to the isam
"employee" table, keyed on name: Zanadu, Zentura, Zilla,
Zorro, Zumu. These names would all be put on the last page
of the isam table, and since they would not all fit on the
last page, they would be put onto an overflow page attached
to the last page.

If you added these five new employees to the btree table,
you would add the new names to the appropriate leaf page
(page 4, in this case) and their records would go on the
associated data page for leaf page #4. Since the associated
data page would fill up, a new associated data page would be
assigned to page #4. If the leaf page were full, and could
not hold all five names, the leaf page would split into two
leaf pages, and a reference to the new leaf page in the
index would be added. If the index page could no longer hold
a reference to another leaf page, the index would be split
as well.

Splitting occurs fairly frequently while the table is small
and growing. As the table gets larger, splitting occurs less
frequently (unless a sequential key is used) and usually
only in the leaf or lowest index level.


Locking and Btree Tables
------------------------
During normal btree traversal, leaf and data pages are
logically locked until the end of the transaction. Btree
index pages are only temporarily locked during query
execution. The index page lock is released after the page
has been searched.

When searching the btree index, ladder locking is used: a
lock is taken on the first index page, which points to
another index page. The next index page is locked and, once
it is locked, the first lock is dropped; and so on down the
index to the leaf level.

INGRES always locks the leaf and data pages when accessing
btree tables. Because of this, locking pages in a btree
table requires twice as many locks as locking an isam or
hash table. It may be wise to set the maxlocks escalation
factor higher than the default when using the btree storage
structure. For details refer to the 'set lockmode' statement
in your query language reference manual.


Sorted Order in a Btree Table
-----------------------------
Looking back to the figure, note that the rows for "Huber"
and "Green" in the btree diagram are not in sorted order on
the data page. This would happen if Huber's record was
appended before Green's. They both end up on the same data
page, but slightly out of order. This would happen in isam
as well. However, if you tried the following:

     select * from employee
          where employee.name like 'G%';

you would retrieve the rows in sorted order if the employee
table was a btree, since the leaf pages are used to point to
the data rows, and the leaf pages maintain a sorted sequence
table of all the key/tid pairs on the leaf page. The data on
the data pages is not guaranteed to be sorted, but the
access, which is always through the leaf pages, guarantees
that the data is retrieved in sorted order. (This is not
true for isam.)

Although btree data is retrieved in sorted order, currently
the 'max' aggregate does not automatically go to the maximum
key in the btree. The 'max' aggregate scans the entire btree
table.


Deleted Rows: Data Pages
-----------------------
If rows are deleted on the associated data page, the space
is re-used the next time a row is appended to that page. If
rows are deleted from a data page that is no longer
associated, the space is not re-used. If all the rows on a
non-associated data page are deleted, the 'modify to merge'
statement places the page on the free list, and when a new
associated data page is needed, the page is reused. The
'modify to btree' statement is the only statement that
completely frees up unused data pages and disk space.

The reason that deleted space on a non-associated data page
is not automatically reused is to speed the append
operation. Appending to one particular page (the "associated
data page") is faster than tracking and checking all the
available free space on non-associated data pages; appending
to the associated data page also provides better key
clustering when data addition occurs in sorted key order.
Since appends generally occur more frequently than do
deletes, preserving the performance of the append operation
seems wiser than reusing deleted space from non-associated
data pages.


When to Use Btree
-----------------
Btree is the most versatile storage structure, as it
supports both exact match and range retrievals and includes
a dynamic index, so that frequent remodification may not be
necessary.

Btree is a good storage structure to use in any of these
cases:

   o The table is growing at a rapid rate.

   o You use pattern matching.

   o You retrieve ranges of key values.

   o You retrieve using only the leftmost part of a multi-
     column key.

Btree is a poor storage structure to use if:

   o The table is relatively static.

   o The table is small, static and access is heavily
     concurrent.

The following are problems encountered with the btree
storage structure, and their solutions.

                Btree Problems and Solutions

Problem:  You tried to use pattern matching, but did not
          specify the leftmost character.
Solution: Specify the leftmost part of the key; *F* does not
          use the btree index, but F* does. If you cannot
          modify the search condition, the entire table must
          be scanned.

Problem:  You tried to use just part of a multi-column key,
          but did not specify the leftmost column.
Solution: Specify the leftmost column of the multi-column
          key. If you cannot modify the search condition,
          create a secondary index with only the columns on
          which you are searching.

Problem:  You are deleting frequently, as well as adding
          data.
Solution: Use 'modify to merge' or 'modify' periodically to
          reclaim space.

Some variations of Btree indexing including b+tree indexing
--------------------------------------------------------------------------------------------------------------
B-tree variations

As described below, a number of variations to the basic B-tree algorithms can be attempted in order to save time or space.

     B-tree implementations often explicitly store keys in B-tree nodes. The B-TREE-P routines, however, store only Pick item id's in the B-tree, and the actual
     keys are computed on the fly as the B-tree is traversed. Branches #8 discusses the trade-offs of the two approaches.
     B-TREE-P supports secondary sorts by concatenating subsidiary keys to the primary keys using nulls in the BTPKEY routine. Some B-tree implementations
     (such as the MUMPS environment used by many hospitals) instead let keys also act as root nodes for whole subsidiary B-trees, creating a hierarchy of
     B-trees. For example, the root B-tree might have patient numbers as keys. Each patient key can also serve as the root for a second level B-tree containing visit
     dates as keys, each visit date can be the root for a third level B-tree containing medical procedure codes, and so on.
     B-TREE-P performs a binary search inside each node. Sometimes a simple linear search or some other technique can provide better performance.
     Because the value of adjacent keys and pointers within a node are usually very similar, it's generally easy to save space by compressing nodes using algorithms
     that store only the first key or pointer along with the computed differences between subsequent keys and pointers in the node.
     A trick some B-tree implementors use: whenever traveling down a B-tree, split nearly full nodes right away, so splitting never has to propagate back up the
     tree. (This can be particularly valuable in multi-user environments, since only low-level portions of the B-tree have to be locked against simultaneous access
     during updates, instead of the entire tree having to be locked from the root down.)
     So-called B*-trees keep each node at least 2/3 full instead of just 1/2 by redistributing keys until 2 child nodes are full, then splitting the 2 full nodes into 3
     nodes each 2/3 full. As a result, nodes are generally fuller and trees are more shallow (leading to faster searches).
     Each B-TREE-P node has a pointer back to its parent node. Some B-tree algorithms don't store a parent pointer at all, and instead use recursive stack
     techniques to traverse nodes. Or, as in so-called B+-trees, all keys are stored at the lowest level of the B-tree in leaves, which can then be easily linked in one
     "leaf list" for sequential traversal (although managing the "index keys" in the higher intermediate nodes becomes slightly more complicated.)
     In the same way that redistributing keys after an underflow can avoid having to concatenate nodes, redistributing after an overflow can avoid splitting. In other
     words, instead of splitting as soon as a node fills, try redistributing keys with a neighbor and only split when the neighbor is too full.

If you like, I could post this as an answer.
Hope this helps.

0
 
LVL 3

Expert Comment

by:junfeb
ID: 1025603
Here is the full Detailed description of both b and b+tree indexing -
   xpert note:
       Btree
       =====

       o       Structure of a Btree Table
       o       Associated Data Pages
       o       Index Growth
       o       Locking and Btree Tables
       o       Sorted Order in a Btree Table
       o       Deleted Rows: Data Pages
       o       When to Use Btree

       'Btree' is INGRES's most versatile storage structure. The
       btree structure allows for keyed access and supports range
       searches and pattern matching. The btree index is dynamic,
       growing as the table grows. This eliminates the overflow
       problems that static structures like isam and hash present
       as they grow.

       INGRES's btree's implementation not only provides a dynamic
       index but also allows for maximum concurrent use of the
       table. The design incorporates a sparse index that points to
       pages in a leaf level. The leaf level is a dense index,
       containing key and tid pairs pointing to the rows on the
       data pages in the table. (For a discussion of tids, please
       see the section "Tids" in the INGRES Database
       Administrator's Guide.) The benefit of this indexing
       approach is that it minimizes splitting cost: when splitting
       does occur, the actual data rows need not move. Only the
       leaf and index levels require reorganization.


       Structure of a Btree Table
       --------------------------
       INGRES btrees can be viewed as four separate parts. There
       is:

          o A free list header page, which is used to keep track of
            allocated pages that are not currently being used

          o One or more index pages

          o One or more leaf pages

          o One or more data pages, where the user data is actually
            stored

       Index pages contain the key and a pointer (known as a tid)
       to either lower-level index pages or a leaf page where the
       key-tid pair will be found. Data pages contain the data rows
       and do not contain pointers to other pages. The smallest
       btree containing data will have four pages, one of each
       type.

       If a secondary index is modified to btree, it will not
       contain data pages. Instead, the leaf pages of the secondary
       index will contain tids referencing the main table's data
       pages.

       The number of index pages is dependent on the width of the
       key and the number of leaf pages, because eventually the
       index pages point to a particular leaf page. Usually the
       index level is small, since it needs only to point to the
       leaf pages. The index level is similar to the isam index,
       except that the isam index points to data pages whereas the
       btree index level points to leaf pages.

       The leaf page level is a dense index. It is considered a
       dense index because it holds a key and tid pair for every
       row in the table. In dense indexes, rows on data pages do
       not move during a split (which would cause their tids to
       change). The index level is considered a sparse index,
       because it contains only a key value and a pointer to a
       page.

       The figure below, a portrait of a btree, aids in
       understanding the three btree levels: index, leaf and data
       page.

       The btree index level is similar to isam's index, except
       that the index points to index or leaf pages instead of to
       data pages. To look for "Kay", the search starts from the
       root node and traverses down the left side of the index for
       values preceding "McShane". The index holds leaf page
       numbers and the range of key values to expect on the leaf
       page. The index in this case shows that leaf page 2 is the
       appropriate page to look at.

       On leaf page 2, a scan is made through the key-tid pairs,
       with Kay's record identified as being on page 4, row 0 (page
       4 row 0 is actually a "tid"). Going directly to page 4, row
       0, Kay's record is located.

       Note: This diagram is good pictorially, but it may not be
       realistic. In actuality, if the key width "name" were only
       30 characters, the row width were 500 bytes, and there were
       only 31 employees, this btree would have only a free list
       header page, one index page, one leaf page, and 8 data pages
       (instead of 4 leaf pages and 3 index pages).

                 +-----------------------------------+
                 |                 ROOT              |
                 |                                   |
                 |           <=   McShane            |
       INDEX     +-----------------------------------+
       LEVEL            /                        \
                       /                          \
       +---------------------------+ +---------------------------+
       | Key             Leaf Page | | Key             Leaf Page |
       |                           | |                           |
       | <= Giller              1  | | > McShane <= Shigio    3  |
       | > Giller <= McShane    2  | | > Shigio               4  |
       +---------------------------+ +---------------------------+

       LEAF PAGE LEVEL

       Leaf Page 1    Leaf Page 2    Leaf Page 3    Leaf Page 4
       Aitken    1,0  Gordon    3,0  McTigue   5,0  Smith      7,0
       Blumberg  1,1  Green     3,3  Ming      5,1  Stannich   7,1
       Brodie    1,3  Gregori   3,2  Ramos     5,2  Stein      7,2
       Cameron   1,2  Huber     3,1  Robinson  5,3  Stover     7,3
       Clark     2,0  Kay       4,0  Ross      6,0  Sullivan   8,0
       Curan     2,1  Kreseski  4,1  Sabel     6,1  Verducci   8,1
       Curry     2,2  Mandic    4,2  Saxena    6,2  Zimmerman  8,2
       Giller    2,3  McShane   4,3  Shigio    6,3

       DATA PAGE LEVEL
                 +-------------------------------------+
       Page 1  0 |Aitken    |       1|    49| 50000.000|
               1 |Blumberg  |       2|    33| 32000.000|
               2 |Cameron   |       4|    37| 35000.000|
               3 |Brodie    |       3|    42| 40000.000|
                 |-------------------------------------|
       Page 2  0 |Clark     |       5|    43| 40000.000| Associated
               1 |Curan     |       6|    30| 30000.000| Data Page
               2 |Curry     |       7|    34| 32000.000| for Leaf
               3 |Giller    |       8|    47| 46000.000| Page 1
                 |-------------------------------------|
       Page 3  0 |Gordon    |       9|    28| 27000.000|
               1 |Huber     |      12|    35| 32000.000|
               2 |Gregori   |      11|    32| 31000.000|
               3 |Green     |      10|    27| 26000.000|
                 |-------------------------------------|
       Page 4  0 |Kay       |      13|    41| 38000.000| Associated
               1 |Kreseski  |      14|    25| 24000.000| Data Page
               2 |Mandic    |      15|    46| 43000.000| for Leaf
               3 |McShane   |      16|    22| 22000.000| Page 2
                 |-------------------------------------|
       Page 5  0 +McTigue   |      17|    44| 41000.000+

                       FIGURE:  Portrait of a Btree


       Associated Data Pages
       ----------------------
       Every leaf page has an associated data page.  This is where
       new rows are added.  When an associated data page fills up, a
       new associated data page is attached to the leaf page. If you
       delete rows that exist on the current associated data page,
       the deleted space is reused.

       Having one associated data page per leaf page provides a
       good chance for rows with similar key ranges to exist on the
       same data page, thereby increasing the likelihood that data
       references will occur on the same data page.


       Index Growth
       ------------
       The major difference between isam and btree is that the
       btree index grows as the table grows. Consider what would
       happen if you added these five new employees to the isam
       "employee" table, keyed on name: Zanadu, Zentura, Zilla,
       Zorro, Zumu. These names would all be put on the last page
       of the isam table, and since they would not all fit on the
       last page, they would be put onto an overflow page attached
       to the last page.

       If you added these five new employees to the btree table,
       you would add the new names to the appropriate leaf page
       (page 4, in this case) and their records would go on the
       associated data page for leaf page #4. Since the associated
       data page would fill up, a new associated data page would be
       assigned to page #4. If the leaf page were full, and could
       not hold all five names, the leaf page would split into two
       leaf pages, and a reference to the new leaf page in the
       index would be added. If the index page could no longer hold
       a reference to another leaf page, the index would be split
       as well.

       Splitting occurs fairly frequently while the table is small
       and growing. As the table gets larger, splitting occurs less
       frequently (unless a sequential key is used) and usually
       only in the leaf or lowest index level.


       Locking and Btree Tables
       ------------------------
       During normal btree traversal, leaf and data pages are
       logically locked until the end of the transaction. Btree
       index pages are only temporarily locked during query
       execution. The index page lock is released after the page
       has been searched.

       When searching the btree index, ladder locking is used: a
       lock is taken on the first index page, which points to
       another index page. The next index page is locked and, once
       it is locked, the first lock is dropped; and so on down the
       index to the leaf level.

       INGRES always locks the leaf and data pages when accessing
       btree tables. Because of this, locking pages in a btree
       table requires twice as many locks as locking an isam or
       hash table. It may be wise to set the maxlocks escalation
       factor higher than the default when using the btree storage
       structure. For details refer to the 'set lockmode' statement
       in your query language reference manual.


       Sorted Order in a Btree Table
       -----------------------------
       Looking back to the figure, note that the rows for "Huber"
       and "Green" in the btree diagram are not in sorted order on
       the data page. This would happen if Huber's record was
       appended before Green's. They both end up on the same data
       page, but slightly out of order. This would happen in isam
       as well. However, if you tried the following:

            select * from employee
                 where employee.name like 'G%';

       you would retrieve the rows in sorted order if the employee
       table was a btree, since the leaf pages are used to point to
       the data rows, and the leaf pages maintain a sorted sequence
       table of all the key/tid pairs on the leaf page. The data on
       the data pages is not guaranteed to be sorted, but the
       access, which is always through the leaf pages, guarantees
       that the data is retrieved in sorted order. (This is not
       true for isam.)

       Although btree data is retrieved in sorted order, currently
       the 'max' aggregate does not automatically go to the maximum
       key in the btree. The 'max' aggregate scans the entire btree
       table.


       Deleted Rows: Data Pages
       -----------------------
       If rows are deleted on the associated data page, the space
       is re-used the next time a row is appended to that page. If
       rows are deleted from a data page that is no longer
       associated, the space is not re-used. If all the rows on a
       non-associated data page are deleted, the 'modify to merge'
       statement places the page on the free list, and when a new
       associated data page is needed, the page is reused. The
       'modify to btree' statement is the only statement that
       completely frees up unused data pages and disk space.

       The reason that deleted space on a non-associated data page
       is not automatically reused is to speed the append
       operation. Appending to one particular page (the "associated
       data page") is faster than tracking and checking all the
       available free space on non-associated data pages; appending
       to the associated data page also provides better key
       clustering when data addition occurs in sorted key order.
       Since appends generally occur more frequently than do
       deletes, preserving the performance of the append operation
       seems wiser than reusing deleted space from non-associated
       data pages.


       When to Use Btree
       -----------------
       Btree is the most versatile storage structure, as it
       supports both exact match and range retrievals and includes
       a dynamic index, so that frequent remodification may not be
       necessary.

       Btree is a good storage structure to use in any of these
       cases:

          o The table is growing at a rapid rate.

          o You use pattern matching.

          o You retrieve ranges of key values.

          o You retrieve using only the leftmost part of a multi-
            column key.

       Btree is a poor storage structure to use if:

          o The table is relatively static.

          o The table is small, static and access is heavily
            concurrent.

       The following are problems encountered with the btree
       storage structure, and their solutions.

                       Btree Problems and Solutions

       Problem:  You tried to use pattern matching, but did not
                 specify the leftmost character.
       Solution: Specify the leftmost part of the key; *F* does not
                 use the btree index, but F* does. If you cannot
                 modify the search condition, the entire table must
                 be scanned.

       Problem:  You tried to use just part of a multi-column key,
                 but did not specify the leftmost column.
       Solution: Specify the leftmost column of the multi-column
                 key. If you cannot modify the search condition,
                 create a secondary index with only the columns on
                 which you are searching.

       Problem:  You are deleting frequently, as well as adding
                 data.
       Solution: Use 'modify to merge' or 'modify' periodically to
                 reclaim space.

       Some variations of Btree indexing including b+tree indexing
       --------------------------------------------------------------------------------------------------------------
       B-tree variations

       As described below, a number of variations to the basic B-tree algorithms can be attempted in order to save time or
       space.

            B-tree implementations often explicitly store keys in B-tree nodes. The B-TREE-P routines, however, store only Pick
       item id's in the B-tree, and the actual
            keys are computed on the fly as the B-tree is traversed. Branches #8 discusses the trade-offs of the two approaches.
            B-TREE-P supports secondary sorts by concatenating subsidiary keys to the primary keys using nulls in the BTPKEY
       routine. Some B-tree implementations
            (such as the MUMPS environment used by many hospitals) instead let keys also act as root nodes for whole
       subsidiary B-trees, creating a hierarchy of
            B-trees. For example, the root B-tree might have patient numbers as keys. Each patient key can also serve as the
       root for a second level B-tree containing visit
            dates as keys, each visit date can be the root for a third level B-tree containing medical procedure codes, and so on.
            B-TREE-P performs a binary search inside each node. Sometimes a simple linear search or some other technique
       can provide better performance.
            Because the value of adjacent keys and pointers within a node are usually very similar, it's generally easy to save
       space by compressing nodes using algorithms
            that store only the first key or pointer along with the computed differences between subsequent keys and pointers in
       the node.
            A trick some B-tree implementors use: whenever traveling down a B-tree, split nearly full nodes right away, so splitting
       never has to propagate back up the
            tree. (This can be particularly valuable in multi-user environments, since only low-level portions of the B-tree have to be
       locked against simultaneous access
            during updates, instead of the entire tree having to be locked from the root down.)
            So-called B*-trees keep each node at least 2/3 full instead of just 1/2 by redistributing keys until 2 child nodes are full,
       then splitting the 2 full nodes into 3
            nodes each 2/3 full. As a result, nodes are generally fuller and trees are more shallow (leading to faster searches).
            Each B-TREE-P node has a pointer back to its parent node. Some B-tree algorithms don't store a parent pointer at
       all, and instead use recursive stack
            techniques to traverse nodes. Or, as in so-called B+-trees, all keys are stored at the lowest level of the B-tree in
       leaves, which can then be easily linked in one
            "leaf list" for sequential traversal (although managing the "index keys" in the higher intermediate nodes becomes
       slightly more complicated.)
            In the same way that redistributing keys after an underflow can avoid having to concatenate nodes, redistributing after
       an overflow can avoid splitting. In other
            words, instead of splitting as soon as a node fills, try redistributing keys with a neighbor and only split when the
       neighbor is too full.

Here is some more information on the B+tree indexing -  
       Detailing the diff between btree and b+tree.

       As shown in the following figure, a B+Tree index consists of node blocks that refer to either:

            Other node blocks, or
            Data blocks in a subfile.

                         +------+
                   +-----| Node |-----+
                   |     |      |     |
                   |     +------+     |
                   |        |         |
                   |        |         |
                   |        |         |
                +------+ +------+ +------+
          +-----| Node | | Node | | Node |-----+
          |     |      | |      | |      |     |
          |     +------+ +------+ +------+     |
          |        |        |         |        |
          |        |        |         |        |
          |        |        |         |        |
       +------+ +------+ +------+ +------+ +------+
       | Data | | Data | | Data | | Data | | Data |
       |      | |      | |      | |      | |      |
       +------+ +------+ +------+ +------+ +------+

       Each node block contains technical logical records , which include:

            A file address that provides a pointer to the block on the next level
            Keys that match the keys in the first record of the block on the next level.

       For example, the following B+Tree can be used to locate the record with key 'Peterson'. The search would start at the
       highest level (root) node and continue
       down the right side of the figure until the correct data block was located.

                           +----------------------------+
                 +---------| Adams                      |
                 |         |----------------------------|
                 |     +---| Davis                      |
                 |     |   |----------------------------|
                 |     |   | Kelly                      |----+
                 |     |   +----------------------------+    |
                 |     |                                     |
                 |     +---------------+                     |
                 |                     |                     |
                 |                     |                     |
           +-----------------+   +-----------------+   +-----------------+
           | Adams           |   | Davis           |   | Kelly           |----+
           |-----------------|   |-----------------|   |-----------------|    |
           | Baker           |   | Franklin        |   | Roberts         |    |
           |-----------------|   |-----------------|   |-----------------|    |
           | Collins         |   | Jones           |   | Young           |    |
           +-----------------+   +-----------------+   +-----------------+    |

                                                                              |
                                                                +-------------+
                                                                |
       +------------------------------------------------------- | ------------+
       |                                                        |             |
       |                                                        |             |
       |   +-----------------+                         +-----------------+    |
       |   |                 |                         | Kelly           |    |
       |   |-----------------|    Data Blocks          |-----------------|    |
       |   |                 |  .....................  | Lovell          |    |
       |   |-----------------|                         |-----------------|    |
       |   |                 |                         | Peterson        |    |
       |   +-----------------+                         +-----------------+    |
       |                                                                      |
       +----------------------------------------------------------------------+

       Notice that for a B+Tree to work properly, the file must be defined as UP or DOWN organized in the database definition
       (DBDEF). The DBDEF must also
       contain default keys, which are the keys used in the TLRECs of the node blocks. As many as 6 default keys can be
       defined, all of which will be used to
       create the index.

       A B+Tree can be used to reduce the number of physical I/O operations needed to locate a record. For example, assume
       that a subfile contains 10,000
       records, and every block holds 10 records. This would imply that there are 1000 blocks, and to locate the last record
       would require 1000 I/O operations
       without a B+Tree index. If a B+Tree was added, it would look as shown in the following figure.

                                        +------+
                                +-------| Root |------+
                                |       | Node |       |
                                |       +------+       |
                                |                      |
                                |                      |
                             +------+              +------+
                     +-------| Node |   --------   | Node |------+
                     |       |      |   10 Nodes   |      |      |
                     |       +------+              +------+      |
                     |                                           |
                     |                                           |
                  +------+                                    +------+
          +-------| Leaf |   ------------------------------   | Leaf |-------+
          |       | Node |           100 Leaf Nodes           | Node |       |
          |       +------+                                    +------+       |
          |                                                                  |
          |                                                                  |
       +------+                                                        +------+
       | Data |   --------------------------------------------------   | Data |
       |      |                     1000 Data Blocks                   |      |
       |      |                                                        |      |
       +------+                                                        +------+

       To locate the last record using the B+Tree only requires four I/O operations. Further, the subfile could grow 10 times, but
       only one additional level of nodes
       would be created, adding only one I/O operation needed to locate the last record. Similar to block indexing, B+Tree
       indexing is performed at the subfile
       level. There is one B+Tree for each subfile in a file. The advantage of B+Tree indexing is that any number of node blocks
       can be created, so the subfile can
       grow to any size and the B+Tree is still effective. Block indexing is restricted to 1 block  and is not fully effective when the
       block is full.

       The B+Tree will always be balanced, which implies that the same number of I/O operations is required to locate any
       record in the subfile.


       So in summary -

       Balanced-tree) A technique for organizing indexes. In
                        order to keep access time to a minimum, it stores the
                        data keys in a balanced hierarchy that continually
                        realigns itself as items are inserted and deleted. Thus, all
                        nodes always have a similar number of keys.

                        B+tree is a version of B-tree that maintains a hierarchy
                        of indexes while also linking the data sequentially,
                        providing fast direct access and fast sequential access.
                        IBM's VSAM uses this.


Thanks,
0
 
LVL 3

Accepted Solution

by:
junfeb earned 100 total points
ID: 1025604
I'm posting this as an answer so that you'd look at this -

I would like to know why you rejected the answer. I recognised it was a school assignment the first time I saw the question and I would like to know if you have taken the effort to read through the whole document and understand it. Since I recognised it was a school assignment, I made sure that you got a good explanation.

I believe in giving answers so as to help the other person understand the subject, not just do their assignment questions for them.

I still think, you should read thro' the document and make your own table of differences.
I'm sure other experts would agree to this!!

If you've more questions, I'm sure either myself or other experts would be willing to help you.
0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

Introduction: Often, when running a query with joins, the results show up "duplicates", and often, those duplicates can be "eliminated" in the results using DISTINCT, for example. Using DISTINCT is simple: just add it after the SELECT keyword, an…
Never store passwords in plain text or just their hash: it seems a no-brainier, but there are still plenty of people doing that. I present the why and how on this subject, offering my own real life solution that you can implement right away, bringin…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now