join algorithms

Posted on 2003-03-26
Medium Priority
Last Modified: 2010-05-18

I need help for explain me the three differents king of join in an Oracle environment.

1) NESTED LOOP (with index)
2) SORT/MERGE (without index)
3) HASH JOIN (withou index)

Somebody can explain me with a little example, please!

Thanks a lot

Question by:Boubou
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
LVL 15

Accepted Solution

andrewst earned 200 total points
ID: 8211525
Rather than try to explain it myself (badly), here's a link to the documentation on this subject:

LVL 48

Expert Comment

ID: 8211542
I have some stuff, but it is on my home computer. So I collect some info from my office computer:
The performance of SORT MERGE joins and HASH joins is strongly impacted by certain initialization parameters. Join performance can be crippled if certain parameters are not set properly. Nested loop operations normally tend not to require particular parameters other than for general tuning purposes.
Nested loop joins are useful when small subsets of data are being joined and if the join condition is an efficient way of accessing the second table.

&#61623; <Parameter:OPTIMIZER_INDEX_CACHING> modifys the behavior of the Cost Based
Optimizer (CBO) to favor nested loops joins and IN-list iterators.
&#61623; <Parameter:OPTIMIZER_INDEX_COST_ADJ> Adjusts the cost of index use.
In 8, 8i and 9i , the default value seems to be to high, thus the CBO tends to choose full table table scan instead of an index scan

&#61623; SORT MERGE join

Sort merge joins can be used to join rows from two independent sources. Hash joins generally perform better than sort merge joins. On the other hand, sort merge joins can perform better than hash joins if both of the following conditions exist:
&#61623; The row sources are sorted already.
&#61623; A sort operation does not have to be done.

However, if a sort merge join involves choosing a slower access method (an index scan as opposed to a full table scan), then the benefit of using a sort merge might be lost.
Sort merge joins are useful when the join condition between two tables is an inequality condition (but not a nonequality) like <, <=, >, or >=. Sort merge joins perform better than nested loop joins for large data sets. (You cannot use hash joins unless there is an equality condition).

&#61623; <Parameter:DB_FILE_MULTIBLOCK_READ_COUNT> specifies how many blocks
Oracle should read at a time from disk when performing a full table scan. Since SORT MERGE joins often involve full table scans, setting this parameter will reduce overhead when scanning large tables.
Reference:<NOTE:1037322.6> What is the parameter DB_FILE_MULTIBLOCK_READ_COUNT
But on the other side high values of DB_FILE_MULTIBLOCK_READ_COUNT makes index access more expensive

&#61623; <Parameter:SORT_AREA_SIZE> specifies how much memory can be used
for sorting , and this has a strong impact on performance of all sorts. Since SORT MERGE joins require sorting of both row sources, SORT_AREA_SIZE strongly impacts SORT MERGE join performance. If an entire sort cannot be completed in the amount of memory specified by this parameter, then a temporary tablespace will be allocated .In this case, the sort will be performed in memory one part at a time and partial results will be stored on the disk in the temporary segment. If SORT_AREA_SIZE is et very small, then excessive disk I/O will be necessary to perform even the smallest of sorts. If it is set too high then OS may run out of physical memory and resort to swapping.
Reference:<NOTE:102339.1> Temporary Segments: What Happens When a Sort Occurs

&#61623; HASH join
Hash joins are used for joining large data sets. The optimizer uses the smaller of two tables or data sources to build a hash table on the join key in memory. It then scans the larger table, probing the hash table to find the joined rows.
This method is best used when the smaller table fits in available memory. The cost is then limited to a single read pass over the data for the two tables.
However, if the hash table grows too big to fit into the memory, then the optimizer breaks it up into different partitions. As the partitions exceed allocated memory, parts are written to temporary segments on disk.

&#61623; <Parameter:HASH_JOIN_ENABLED> dictates whether optimizer should consider using HASH joins .
If you do not want HASH joins to be used, set this parameter to false.
&#61623; <Parameter:HASH_AREA_SIZE> specifies how much memory can be used to build a hash table for a HASH join , and resembles the SORT_AREA_SIZE parameter. If this parameter is set too small , then partial hash tables will need to be stored in temporary segments. If this parameter is set too big, then physical memory would be exhausted. As with SORT_AREA_SIZE, HASH_AREA_SIZE indicates how much memory can be used per session. Many concurrent sessions can consume a lot of memory.
The default value of HASH_AREA_SIZE = 2 * SORT_AREA_SIZE.

Hash Join Processing
Hash Join (HJ) is a new type of join processing introduced in 7.3. The goal of
the implementation is to provide a better alternative to the Sort Merge Join
(SMJ) and Nested Loops (NL) Join.
Hash Joins should be more efficient that both SMJ and NL.
SMJ is comparatively slow as potentially both row sources have to be sorted
to enable the merge phase.
NL is inefficient if the driving table contains lots of rows (i.e. high
cardinality), thus forcing multiple probes of the inner table and the
inner table is expensive to visit.
HJ only requires a single pass of each table.
Hash Joins are only implemented when using the CBO and are selected by
comparing the cost of a Hash Join with SMJ and NL during query optimization.
The use of Hash Joins are controlled by the initialisation parameter
<Parameter:HASH_JOIN_ENABLED>. When this is set to false the optimizer will not
consider the use of hash joins when evaluating query plans. Note that a
/*+ HASH */ hint will override this parameter.
As with all other joins in Oracle, Hash Joins only ever join 2 row
sources at any one time.
o Decide on the number of partitions (buckets) we are going
to use to put hashed rows in. This is determined using the hash_area_size,
db_block_size and hash_multiblock_io_count parameters. The hash_area_size
is adjusted by 20% to allow for the overhead of storing partitions,
the bitmap of unique left input and the hash table - These will be discussed
Number of Partitions = 0.8 x hash_area_size)
(db_block_size x hash_multiblock_io_count)
o Smallest Row source is read (We will call this R from now on) and each row
is subjected to a hashing algorithm. We use 2 hashing algorithms to hash the
rows as an attempt to get a better spread of values around buckets.
Collectively these hash buckets (partitions) are called the Hash Table.
o The hashed rows are then placed in the appropriate partitions based on the
first hash values, the second hash value is stored with the row in case we
need to rehash later
o At the same time a bitmap of the hashed values is built using both hash values
o If the hash area fills up then the largest partition is chosen and written out
to disk. Any new rows that hash to a partition on disk will then update the
on disk partition. It is possible to only have part of a single
partition in memory, with the rest of this partition and all the others
residing on disk. Hopefully this will not happen as it will compromise
This continues until we have exhausted the initial row source R.
o We then do some processing to try to fit the maximum possible partitions in
memory. We sort the partitions in order of size and then try to fit in as
many as possible.
o We then start reading from the second row source. We hash each row in turn and
test to see if the hash value hits a partition that is currently in memory. If
it does then we return this row as a joined row. If it does not match a
partition in memory then we write the row to a new partition that stores
rows from row source S. This partition is built using the same hash algorithm
as used to hash R. This means that rows that hash to the same value in S will
be in the same partition number in the new set of partitions as similar
rows from R are in in the original set of partitions. This partition is stored
on disk.
o We continue through S until we have examined every row from that row source.
o We are then left with a bunch of rows that join and partitions on disk that
have been built from source R and from Source S.
o We then load up the smallest partition from either original row source's
partition set, building a hash table in memory from it using the second hash
function so as to distribute the rows more evenly.
o The new hash table in memory is probed using the matching partition from
the other (larger) row source. Note that we use the smaller row source
to build the hash table - this could be either R or S.
Any joining rows are returned.
o This process continues until all partitions have been exhausted.
Bitmap for unique join keys
This bitmap is effectively a flag that indicates if a particular hash bucket
contains values. The bitmap records the hash values that actually contain
records. The point of this is to strip out rows from S before they
are written to the partition on disk because they do not hash to any of the
partition(s) that are currently in memory. This bitmap is consulted before
the rows from S are put in to partitions on disk so that we do not have the
overhead of writing out and then reading back in rows that are never going
to join.
The filter really tells us what is NOT stored on disk as opposed to
what is. Since different values can hash to the same hash value, a match on a
hash value does not guarantee that the actual value is the one that caused the
hash partition to be populated. However it can filter a good proportion of
rows, hence its usage.
If the hash value is not present then the value is definately not
in the hash table.
It is possible to get false positives; keys that hash to a value in the hash
value set, but are not contained in the data set.
False positives are less likely as the hash value range increases.

Author Comment

ID: 8252698
Thank you andrewst and schwertner, it's very nice for your help ;o)

You help me a lot. I would like to divide the point, but it's not possible. ;o(

Have a good day. Jerome

Featured Post

New feature and membership benefit!

New feature! Upgrade and increase expert visibility of your issues with Priority Questions.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Why doesn't the Oracle optimizer use my index? Querying too much data Most Oracle developers know that an index is useful when you can use it to restrict your result set to a small number of the total rows in a table. So, the obvious side…
Have you ever had to make fundamental changes to a table in Oracle, but haven't been able to get any downtime?  I'm talking things like: * Dropping columns * Shrinking allocated space * Removing chained blocks and restoring the PCTFREE * Re-or…
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Via a live example, show how to restore a database from backup after a simulated disk failure using RMAN.
Suggested Courses

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question