Solved

Help me understand how Microsoft SQL handles Nulls

Posted on 2015-01-19
8
108 Views
Last Modified: 2015-01-19
Help me understand something.  I have a database of patients.  A small percentage (15%) will actually have a value in the field "MedicalRecordNumber".  Of that small percent, many will start with the characters "IN" or "IG" which denotes a particular status.  Recently I've been asked to modify a query to omit these users.  Seems simple enough but the results are messing with my head.  

Total records (select * from patientprofile where pstatus = 'A'; (Active patients) = 4,724

Count of all patients with IG or IN as a prefix

select count(*) from patientprofile
        where pstatus = 'A'
        and left(MedicalRecordNumber,2) in ('IG','IN')
Returns 531

Logically speaking, the inverse should be obvious and return the inverse of 4,724 - 531 = 4193 but by simply changing the IN to NOT IN returns 248 which I didn't expect.  This apparently has to do with the NULL values being ignored and I have to add "or MedicalRecordNumber is null" to the value.  Why is that?  And where the heck did the value of 248 come from??  A null value is not in the list values and should therefore be included. (By the way, I get the same behavior if I try to use <>  or != instead of the "IN".   I've worked with Oracle for many years and this doesn't seem to be the case.  I really could use an explanation or understanding of why this scenario occurs.
0
Comment
Question by:sifugreg
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 33

Expert Comment

by:Mike Eghtebas
ID: 40558604
See if you get 248 for

select count(*) from patientprofile
        where pstatus IS NULL

MedicalRecordNumber  is either PK (never is null) or FK (by definition could be null) but highly unlikely in medical business for this field.

Null is empty (missing) entry.
0
 
LVL 9

Accepted Solution

by:
Valliappan AN earned 500 total points
ID: 40558615
Have a look at here on how NULL values are handled by SQL Server:
http://technet.microsoft.com/en-us/library/ms191504%28v=sql.105%29.aspx

It is always better to handle NULL values properly in queries, to avoid such results:

There are multiple ways to handle NULL, like check for IS NULL, or convert NULL values to empty strings as follows:

select count(*) from patientprofile
        where pstatus = 'A'
        and left( ISNULL(MedicalRecordNumber,'') ,2) NOT IN ('IG','IN')

-- note the ISNULL function there.  

HTH
0
 
LVL 33

Expert Comment

by:Mike Eghtebas
ID: 40558659
If there is a patient with MRN = 100 just brought in to emergency and needs to be assigned to a room

1. This new patient's data gets entered into Patient table

A. Just patient info has been entered into table Patient
Patient table                Room table                                Doctor table
==============              ===========================            =================
MRN       Name   etc.       Room#    MRN     Vacant  DrID          DoctorID    DotcorName
---       ---               ---      ---     ---       ---          ---       ---
100       Joe                1      Null      Yes     Null          199     DR. A
                             2       150      No      199           200     DR. B

Open in new window

Because Room# 1 is available has no patient is assigned to it MRN is null so is DrID

B. Now it has been decide to put the new patient in room 1 (so MRN as FK in table Room is no longer null.
Patient table                Room table                                Doctor table
==============              ===========================            =================
MRN       Name   etc.       Room#    MRN     Vacant  DrID          DoctorID    DotcorName
---       ---               ---      ---     ---       ---          ---       ---
100       Joe                1      100      Yes       Null          199     DR. A
                             2       150      No      199           200     DR. B

Open in new window

But still DrID in  Room table is null (meaning no doctor assigned to visit MRN =100

B. Now Dr B assigned to visit the patient.
Patient table                Room table                                Doctor table
==============              ===========================            =================
MRN       Name   etc.       Room#    MRN     Vacant  DrID          DoctorID    DotcorName
---       ---               ---      ---     ---       ---          ---       ---
100       Joe                1      100      Yes       200          199     DR. A
                             2       150      No      199           200     DR. B

Open in new window


Null means not assigned here. Null is not the same as empty string or zero

And we never can say null in one cell is equal to null elsewhere. We cannot compare what doesn't exist.
0
 
LVL 33

Expert Comment

by:Mike Eghtebas
ID: 40558702
Also try:

select count(*) from patientprofile
        where pstatus Not IN('IG','IN','A')

to see 248 records. They may have a different status you don't know of.
0
Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

 
LVL 1

Author Closing Comment

by:sifugreg
ID: 40558704
Thanks, it makes sense but it is interesting that querying without specifications shows all fields with null values but when you specify a NOT it omits them.  A little confusing.  If ind it interesting that MS states "No two null values are equal." which lends itself toward supporting being included in a query that states "show me all cars that aren't red" to include cars that have not yet been painted.  But what do I know ;-).  I'll tear through the document you provided to see if it makes more sense.
0
 
LVL 1

Author Comment

by:sifugreg
ID: 40558706
eghtebas,

PSTATUS can't be null and is only A,I,O or X.  IG and IN were the first two characters of the MedicalRecordNumber.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40558730
NULL is the absence of value (in that location)

Because NULL has no value it cannot be equal to any compared value.
Nor can NULL be unequal to a compared value because you have to know if it equals something before you can decide it does not equal a value.

MedicalRecordNumber IN ('IG','IN')

is comparing the field to values, locating just those values that are equal to the listed values

Using IN ('IG','IN') it is the equivalent of:

( MedicalRecordNumber  = 'IG' OR MedicalRecordNumber  = 'IN' )

The reverse:
MedicalRecordNumber NOT IN ('IG','IN')

is comparing the field to values, locating just those values that are NOT equal to the listed values
but because NULL cannot be equal to anything it also cannot be NOT equal either

Using NOT IN ('IG','IN') it is the equivalent of:

NOT ( MedicalRecordNumber  = 'IG' OR MedicalRecordNumber  = 'IN' )

You have 531 records that meet IN ('IG','IN') and 248 records that meet NOT IN ('IG','IN')

531 + 248 = 779 records where MedicalRecordNumber  IS NOT NULL

4,724 - 779 = 3945 records where MedicalRecordNumber  IS NULL

try this:

select count(*) from patientprofile
where pstatus = 'A'
        and MedicalRecordNumber  IS NULL


See this blog:
http://weblogs.sqlteam.com/markc/archive/2009/06/08/60929.aspx

----------------------------------------
Note:
Using LEFT() or ISNULL() on MedicalRecordNumber  in the where clause will remove the ability of the query to use an index on the affected field. Try to avoid these "non-sargable" constructs.

sargable: Search ARGument ABLE
http://en.wikipedia.org/wiki/Sargable

Rules of thumb
Avoid applying functions on data values in a sql condition.
Avoid non-sargable predicates and replace them with sargable equivalents.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40558738
too late is seems

You have chosen a non-sargable solution regrettably.

compactness does not immediately mean better
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Suggested Solutions

Introduced in Microsoft SQL Server 2005, the Copy Database Wizard (http://msdn.microsoft.com/en-us/library/ms188664.aspx) is useful in copying databases and associated objects between SQL instances; therefore, it is a good migration and upgrade tool…
Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
Viewers will learn how the fundamental information of how to create a table.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now