Understand Oracle null values

Introduction

I had to explain to a fellow DBA the impact of NULL on processing where the columns that where null were also used in joining of tables. It took a while, but finally he caught on. It made me wonder how many folks are confused by this.

The major issue is that NULLs are not tracked in normal indexes. It is true that while bitmap indexes may be used to track NULL values, they have their own issues for OLTP environments and usually have very restricted use in OLTP environments.

I decided to develop a little test case just to show what is going on with NULLs. Let’s take a look at this test case:

Test Case Illustration

We will create a table named test and take a look at its schema:

    SQL> create table test as select * from dba_objects;
                      
                          Table created.

Open in new window

    SQL> desc test
                          Name Null? Type
                          ----------------------------------------- -------- ----------------------------
                          OWNER VARCHAR2(30)
                          OBJECT_NAME VARCHAR2(128)
                          SUBOBJECT_NAME VARCHAR2(30)
                          OBJECT_ID NUMBER
                          DATA_OBJECT_ID NUMBER
                          OBJECT_TYPE VARCHAR2(18)
                          CREATED DATE
                          LAST_DDL_TIME DATE
                          TIMESTAMP VARCHAR2(19)
                          STATUS VARCHAR2(7)
                          TEMPORARY VARCHAR2(1)
                          GENERATED VARCHAR2(1)
                          SECONDARY VARCHAR2(1)

Open in new window

Now create an index on the object_id field and check the analysis of it, in particual, examine how SQL Server will handle NULL values:

    SQL> create index test_ind on test(object_id);
                      
                          Index created.

Open in new window

    SQL> analyze table test compute statistics;
                      
                          Table analyzed.

Open in new window

Check on IS NOT NULL:

    SQL> set autotrace on
                          SQL> select count(*) from test where object_id is not null;
                      
                          COUNT(*)
                          ----------
                          33732
                      
                          Execution Plan
                          ----------------------------------------------------------
                          0 SELECT STATEMENT Optimizer=CHOOSE (Cost=19 Card=1 Bytes=4)
                          1 0 SORT (AGGREGATE)
                          2 1 INDEX (FAST FULL SCAN) OF 'TEST_IND' (NON-UNIQUE) (Cost=
                          19 Card=33732 Bytes=134928)
                      
                          Statistics
                          ----------------------------------------------------------
                          0 recursive calls
                          0 db block gets
                          79 consistent gets
                          0 physical reads
                          0 redo size
                          381 bytes sent via SQL*Net to client
                          499 bytes received via SQL*Net from client
                          2 SQL*Net roundtrips to/from client
                          0 sorts (memory)
                          0 sorts (disk)
                          1 rows processed

Open in new window

Check on IS NULL:

    SQL> select count(*) from test where object_id is null;
                      
                          COUNT(*)
                          ----------
                          11
                      
                          Execution Plan
                          ----------------------------------------------------------
                          0 SELECT STATEMENT Optimizer=CHOOSE (Cost=104 Card=1 Bytes=4)
                          1 0 SORT (AGGREGATE)
                          2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=104 Card=11 Bytes=44
                          )
                      
                          Statistics
                          ----------------------------------------------------------
                          0 recursive calls
                          0 db block gets
                          464 consistent gets
                          0 physical reads
                          0 redo size
                          379 bytes sent via SQL*Net to client
                          499 bytes received via SQL*Net from client
                          2 SQL*Net roundtrips to/from client
                          0 sorts (memory)
                          0 sorts (disk)
                          1 rows processed

Open in new window

Using IS NULL and IS NOT NULL based WHERE clauses we determined:

The IS NOT NULL type WHERE clause will use an index

The IS NULL type WHERE doesn’t use an index

Next we create a function-based index, fbi_test, using the NVL function to convert the NULL values into a 'default' value. When we use the NVL function in our SELECT in place of the IS NULL we get index usage and much better performance, however, this may lead to improper relationships between tables.

    SQL>
                          SQL> create index fbi_test on test( NVL(object_id,-1) );
                      
                          Index created.

Open in new window

    SQL> analyze table test compute statistics;
                      
                          Table analyzed.

Open in new window

    SQL> select count(*) from test where NVL( object_id,-1) = -1;
                      
                          COUNT(*)
                          ----------
                          11
                      
                          Execution Plan
                          ----------------------------------------------------------
                          0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=4)
                          1 0 SORT (AGGREGATE)
                          2 1 INDEX (RANGE SCAN) OF 'FBI_TEST' (NON-UNIQUE) (Cost=2 Ca
                          rd=12 Bytes=48)
                      
                          Statistics
                          ----------------------------------------------------------
                          0 recursive calls
                          0 db block gets
                          2 consistent gets
                          0 physical reads
                          0 redo size
                          379 bytes sent via SQL*Net to client
                          499 bytes received via SQL*Net from client
                          2 SQL*Net roundtrips to/from client
                          0 sorts (memory)
                          0 sorts (disk)
                          1 rows processed

Open in new window

As a final step, we replace the NULL values in test with a value of -1. Now, using that in our select instead of IS NULL we get response virtually identical with the use of NVL function based index.

    SQL> update test set object_id=-1 where object_id is null;
                      
                          11 rows updated.
                      
                          Execution Plan
                          ----------------------------------------------------------
                          0 UPDATE STATEMENT Optimizer=CHOOSE (Cost=104 Card=11 Bytes=44
                          )
                      
                          1 0 UPDATE OF 'TEST'
                          2 1 TABLE ACCESS (FULL) OF 'TEST' (Cost=104 Card=11 Bytes=44
                          )
                      
                          Statistics
                          ----------------------------------------------------------
                          2 recursive calls
                          78 db block gets
                          465 consistent gets
                          0 physical reads
                          9208 redo size
                          625 bytes sent via SQL*Net to client
                          548 bytes received via SQL*Net from client
                          3 SQL*Net roundtrips to/from client
                          2 sorts (memory)
                          0 sorts (disk)
                          11 rows processed

Open in new window

    SQL> commit;
                      
                          Commit complete.

Open in new window

    SQL> select count(*) from test where object_id=-1;
                      
                          COUNT(*)
                          ----------
                          11
                      
                          Execution Plan
                          ----------------------------------------------------------
                          0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=4)
                          1 0 SORT (AGGREGATE)
                          2 1 INDEX (RANGE SCAN) OF 'TEST_IND' (NON-UNIQUE) (Cost=2 Ca
                          rd=1 Bytes=4)
                      
                          Statistics
                          ----------------------------------------------------------
                          0 recursive calls
                          0 db block gets
                          2 consistent gets
                          0 physical reads
                          0 redo size
                          379 bytes sent via SQL*Net to client
                          499 bytes received via SQL*Net from client
                          2 SQL*Net roundtrips to/from client
                          0 sorts (memory)
                          0 sorts (disk)
                          1 rows processed

Open in new window

Of course the issues we are trying to prevent with all this is processing of IS NULL type queries forcing such items as full table scans and causing developers to have to use OUTER JOIN type syntax to resolve NULL conditions between tables. By eliminating these OUTER JOIN and IS NULL processing steps which result in large HASH and full table scans in many cases, we can dramatically reduce the execution times of queries. Notice in the example how we reduced cost figures from 104 for a full table scan to 2 for an index (either our normal or the function based index) scan.

Conclusion

So, what should we take from this? Several conclusions can be reached:

Whenever possible avoid the use of NULLable columns for columns that will be joined.

Use default values in place of NULL, note that to avoid outer joins, both parent and child tables must have a value, for example if we had a parent table the related to the OBJECT_ID in our example, we would have an entry of (-1, ‘NO ID’) in that parent to resolve the -1 entries in the child.

If IS NULL is required, consider replacing it with a NVL call and a function based index, while this may not help with outer joins, it will help with IS NULL type selects.

Comments (2)

Senthil B

Developer, Data Modeler, Scrum Master

Author

Commented: 2010-10-20

Thansk to the editor

Greg Clough

Senior Oracle DBA

Commented: 2013-12-09

Remember that Oracle won't include NULL rows in an index "If all columns in the index are NULL"... but if you have a composite index and some columns are NULL whilst others are populated, then the full row including the NULL columns **IS** included in a standard b-tree index.

Here's a reference on the subject by Mr. Tom Kyte:

http://tkyte.blogspot.co.uk/2006/01/something-about-nothing.html