Link to home
Start Free TrialLog in
Avatar of snyperj
snyperjFlag for United States of America

asked on

Avoiding Nulls

I am writing an access front end that will write some data to a SQL Server 2005 db.  In turn our IT folks will then be doing something further with the data.  I received a note from them that said make sure no 'NULL' values are written to the tables.   I have never worried about this with Access back ends (though maybe I should have been...)

In any event, can someone help me understand why it is important, and how should I save a field if the user has entered no data....?  Something like nz(txt1),"") maybe?
Avatar of DatabaseMX (Joe Anderson - Former Microsoft Access MVP)
DatabaseMX (Joe Anderson - Former Microsoft Access MVP)
Flag of United States of America image

Fire the IT peeps!

Nz(txt1,"") should do it.  Empty strings are horrible!

mx
"can someone help me understand why it is important,"
Actually, you should ask them!  WHAT ... is their issue with Nulls

mx
Avatar of snyperj

ASKER

I will... just wanted to make sure the answer wasn't obvious.  :)
"I will" ... meaning you will fire the IT peeps!  (lol)
I'm not an SQL Server guy ... so, it;s *possible* ... SS has some issue with nulls, but that seems unlikely.

mx
What about numeric datatypes, including datetime?  You cannot stuff a zero-length string in those fields.  

As a work-around you might have to use: Nz(fldNumeric,0) or Nz(fldDateTime,0)

Yes .... good point.  I should have clarified ... "" for Text fields.

mx
null means no value, whats the problem with that... I hate empty strings in table, and prefer null :) and I like queries like

select ... where col is null

:) if you don't know how to use nulls then it may be problem...
"I received a note from them that said make sure no 'NULL' values are written to the tables"

Reply as: Dear IT guys, when you get a NULL make sure you convert to "" 

:)
Since we don't know what other requirements there are, I don't think we should be firing the IT department yet.  They may have to run some third party software, mandated by management, that gags on NULLs.
I totally disagree with that article.  Nulls are a beautiful thing.  RE:

"null means no value, whats the problem with that.."

EXACTLY !!!
Really though guys ... lets *not* go down the Null Debate Road ... it's an endless deadend.

mx
If you actually read the article you would see that "null means no value" IS the problem.
Many shops have rules about avoiding nulls. There are many situations where they should be avoided--and for good reason--but a universal prohibition is a bit extreme.  In any case, if you must avoid them, then you either need a database design that doesn't require them or another value that you can substitute.  If you go with the substitution approach, then I hardly recommend you adopt a standard practice.  For strings, using a space or empty string (except for Oracle) is common practice.  For dates, something like 1/1/1800 or 12/31/9999 can often be used.  Numbers are a sticky wicket, because very often any value you choose is already a legitimate value.


 


   

   
   
     





 

"For strings, using a space ...."
WOW.  

"For dates, something like 1/1/1800 or 12/31/9999 "
Bogus dates ... unreal data ...

This pretty much says it all (from Allen B):

"A Null is "no entry" in a field. The alternative is to require an entry in every field of every record! You turn up at a hospital too badly hurt to give your birth date, and they won't let you in because the admissions database can't leave the field null? Since some fields must be optional, so you must learn to handle nulls."


mx
The Allen Browne article explains but does not contradict the first article.  The next page http://www.allenbrowne.com/casu-12.html goes over the problems with using NULLs inappropriately.
Key word: "inappropriately"
I'm very appropriate -)

mx
Nulls are a beautiful thing...like AutoNumbers?


If we could get null AutoNumbers, then MX should be very happy.   :>)


I've found that nulls can be both cute and insidious at the same time.  May I once again draw your attention to the prom queen.  


snyperj,

Your issue not where you should use Null in your own applications but that the system you are passing your data to does not use Nulls.   You will need to convert your Null values to the correct data type.

I deal with this a long when migrating data between applications.

Nulls are a beautiful thing...like AutoNumbers?
Ab So Lutely ...

Prom Queen.  OK ... just admit it dqmq ... you got dumped by a prom queen, and have never let that go, lol !!!
MX, you're getting more and more Inappropriate all the time...  The answer to NULLs is not Yes or No but Sometimes.  And the answer for @snyperj is do what you gotta do.
I was working on a Q for a OP who was doing some extensive updates to a half-million row table that contained lots of nulls.
The problem was that updating the nulls to real values was causing massive bloat as each updated record (the best guess by one expert) was being written to a new code page.  So I can see a situation where some of the IT folk may not want to have a pack of null values around.

The OP had his db bloat over the 2 GB limit as he did his processing, though his end solution wasn't something I comprehended.

The bottom line is that the IT guys want each field to have a default value.  That is easy enough to set up in table design.
In logic design that is not so easy!  I agree that nulls are useful--they mean unknown.  Now, you have to go through each field and decide on some placeholder value to mean 'unknown.'  You also need it to NOT be something that could be otherwise meaningful.  And you have to change your interface to hide these placeholders from the end-user.

I'm not a db admin.  I run the SQL Server--and I'm the only guy who could mess it up, so it doesn't scare me.
Why your IT guys have that rule, I don't know.  But I know and have discussed one scenario where it could make sense.
@JDettman: Like I suggested @ http:#a36532666   

@DaveBaldwin:
In the almost 19 years now I've worked 24x7x365 with Access ... I have zero fear of Nulls. And in an Access database, for me the Answer is Yes.  

And it's Microsoft's answer also ... because if you have say 6 fields in a table (text, numeric, datetime, memo, currency, ole object),  and assuming you have made none of these fields required, then for a give record where you have *not* entered any data into those fields, the value of those fields is Null.  Same of course when going to a new record.  They are all Null until a real value is entered.

"The bottom line is that the IT guys want each field to have a default value.  "
Even worse in my book, because it's totally arbitrary.  IF ... I need a default value for some purpose, then at the Form level, I use the BeforeInsert event to set that value.

mx
Avatar of snyperj

ASKER

Thanks for all the input and the links.  I am trying to get some clarification on the directive (but often, there is none to be given in these cases..).  In any event I have learned much on this thread.  Thanks again... I will be splitting points and awarding shortly.
mx@

The prom queen and I are getting along fine.  The main problem: sometimes she's not allowed in the bar because her ID is, well, blank.  Rather than fight a battle, we've just found a fake drivers license.

Given a mandate that prohibits nulls, what would you do? I mean, short of redesigning the database.  You suggested substituting a null string, but what about for dates and numbers?    
ASKER CERTIFIED SOLUTION
Avatar of Dave Baldwin
Dave Baldwin
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
<Given a mandate that prohibits nulls, what would you do? I mean, short of redesigning the database.  You suggested substituting a null string, but what about for dates and numbers>

Design or redesign the database is really the only choice.  
Every field needs to be looked at.  
Each one will need some sort of default value.  
For some, this will be easy, like 0 or "" 
For some this will be hard because these default 'non-null' placeholders need to be both distinctive and meaningless.  
Then every control on every form and report will need to be looked at.  
Are these 'non-null' something we want the end-user to see?  
In cases where that is NOT desired how do we hide or replace them?  
On forms do we muck with font color to be the same as the control background or use some other mechanism?  
On reports, do we muck with the underlying query so it doesn't return these placeholders?  
Or do we use unbound controls and the Detail_Format() event to fill in what we want to display?
How do we deal with users changing their minds about what they entered, and ensuring that a null isn't their input?
Before_Update() event on each form?
Or unbound forms and add/edit records through VBA code?

Never permitting nulls means <redesigning the database>
Which is why @mx is a bit indignant.
Because, given the amount of work involved there had better be a good operational reason for such a prohibition.

And there likely isn't
Avatar of dportas
dportas

There are perfectly sound reasons to avoid or minimise use of nulls. The question of when and why to put in a null is inherently a very subjective one and lots of database designers have different justifications for doing it and mean different things when they use a null.

For what it's worth, here is Microsoft's advice: "To minimize maintenance and possible effects on existing queries or reports, you should minimize the use of null values [sic]."
http://msdn.microsoft.com/en-us/library/ms191504.aspx

Fundamentally null isn't necessary for accurate recording of facts about the business domain so it can't be part of any properly formulated business requirement (a database is collection of facts and any fact about the world outside the database can be stated without using a null). Nulls are symbols created for the convenience (or otherwise) of database designers/developers and users rather than being something implicitly required from a data modelling perspective. Unfortunately, nulls do present certain logical problems and many of the common uses of null will inevitably lead to incorrect or incosistent results.

For example it was stated here that null means "no value". That is one of many possible interpretations. In that case however x + "no value" reasonably ought to be x, which is not the result returned by SQL from the expression x + null. So as far as SQL is concerned null doesn't consistently mean "no value".

It was also stated in this thread that null means "unknown" (another possible interpretation). That isn't true either though because if x is unknown then x=x is still a true statement, but SQL doesn't evaluate x=x as true.

There is another good reason for avoiding null. Since nulls are not part of the business domain they typically have to be removed from database outputs at some point (reports, screens, interfaces). So whenever the database designer decides to put in a null some other developer may have to do additional work to remove it again.
snyperj,

Your IT folks undoubtedly have very sound reasons for wanting to avoid the use of Nulls.  I do agree with dqmq, however that globally prohibiting nulls seems extreme.

Given the substantial effort that you as the developer are going to have to put into this requirement, it may be well worth your time to explore/discuss this requirement further with your IT people to ensure that you understand the requirements and reasons for it completely, that they understand what you need to do to implement it, and then decide on a plan.

For starters, ask them if the prohibition on nulls is indeed global or if nulls are okay in fields that are not specifically being manipulated or reported.   Mention any specific fields or data types that you foresee as possibly problematic (such as date fields).  Make sure that you agree upon what values you are going to use as null substitutes, and in what cases (if any) nulls are okay.

Just make sure that you are all seeing eye-to-eye, so that you come up with a sound solution that meets their requirements and makes sense.
I'm with mx on this one.  I do not agree that a null is not part of a business domain.  Start substituting 'fake' values in any of your aggregate functions and see what a mess you come up with.  
@dportas
If you have seen Marolw Thomas or Danny Kaye being surprised, you'll know what happened when I read your post.
I am keeping in mind JDettman's admonition that we should stick to @snyperj's needs.
Since perhaps he may encounter logic similar to yours when he goes to argue that this anti-null prohibition is inane, I'll take the time to demolish the logic behind your post.
Moreover, since these posts stick around, letting what you wrote go unchallenged would be a disservice to any coming after.

<Fundamentally null isn't necessary for accurate recording of facts about the business domain so it can't be part of any properly formulated business requirement (a database is collection of facts and any fact about the world outside the database can be stated without using a null). Nulls are symbols created for the convenience (or otherwise) of database designers/developers and users rather than being something implicitly required from a data modelling perspective. Unfortunately, nulls do present certain logical problems and many of the common uses of null will inevitably lead to incorrect or incosistent results.>

This is just false.
Let us say we decide to record the birthday of every customer.  Everyone has a birthday.  There is no reason for this field to be null.  
Except for all those customers who refuse to tell you their birthday.  
Now you have a field that is null--unknown.  Data should exist for it, but doesn't--and for a very good reason.
<...rather than being something implicitly required from a data modelling perspective>  
Sorry, this null comes from the data modelling; we have to deal with not knowing some of our data.  
<Nulls are symbols created for the convenience (or otherwise) of database designers/developers and users>
No, nulls come from set mathematics.  There is a difference between the empty set {} and undefined

The very link you quoted (http://msdn.microsoft.com/en-us/library/ms191504.aspx) demolishes the math you did
<For example it was stated here that null means "no value". That is one of many possible interpretations. In that case however x + "no value" reasonably ought to be x, which is not the result returned by SQL from the expression x + null. So as far as SQL is concerned null doesn't consistently mean "no value".>

x + "no value" does not reasonably have to be x, as your link shows.  The "no value" is not zero.  It could be anything from -infinity to +infinity
x + "no value" = "no value"  because you have no idea what "no value" is

Null = Null doesn't evaluate to True.  That is correct.  Logically you cannot state that two completely unknown things are equal--you don't know what they are!
Maybe one is actually 0 and the other 100--but you don't know that.

<they typically have to be removed from database outputs at some point (reports, screens, interfaces). So whenever the database designer decides to put in a null some ...[one]... may have to do additional work to remove it again. >
Here at least we agree.  Nulls have to be handled.  For this, in MS Access we have the very fine Nz() function.  In T-SQL we have ISNULL()
When we have data that is unknown we have to handle it.
But it is much easier to handle it at the code and query level than pushing the handling into the interface as I noted in ID:36540042.

And Null can and does capture information.  I have a field 'Miscellaneous Cost'
When it is null, I know that no-one has tried to enter a value.
When it is zero, I know that someone definitively knows that there is no miscellaneous cost, or has tried to make an entry and reversed themselves
These two things are different.
When you force a default value, this differentiating information is lost.

Ultimately, every null has to get handled.  Most times the user interface does it for us by displaying nothing.  Other times we do it for ourselves with Nz()
The fact that some information is not known is not avoidable.
Null exists for a reason, and the rules for handling null are well thought out and consistent.

Do @snyperj's people have sound reasons for prohibiting null?  Perhaps.
Consider.
Suppose you have a table with 10 currency field and a half-million records.
Now you'd like the sum of them.
If you permit null, and the table is very sparsely populated you could be running Nz()  many. many times.
And Nz()'s computational cost is NOT 0

So, maybe they have reasons.
Or maybe they subscribe to @dportas's logic
Nick:  

>Suppose you have a table with 10 currency field and a half-million records.
Now you'd like the sum of them.<  

You do not to 'handle" the null values of a field to get the sum.
But you can't do something like field1 + field2 + field3 ... and get something valid.
Have a null in there and you'll get null for an answer
And  so something like (and the syntax of this will undoutably be wrong because I never have cause to do these things)
Select Sum(field1 + field2 + field3...) as myTotal from SomeTable isn't going to fly
Select Sum(nz(field1,0) + nz(field2,0) + nz(field3,0)...) will work

But that is conceivably a LOT of Nz() calls
Have a look at query1 in the sample... and then think bigger
ManyNulls.mdb
Wouldn't it be faster to  sum the individual fields, and then sum the sums doing one Nz() on each sum?
I posed the example for the sake of theory
Sum() is null aware, so this works
Sum1: nz(Sum([field1])+Sum([field2]...),0)

So Sum() is handling the nulls internally, where '+' does not
@harfang has documented the cool things that the '+' operator can do in regard to nulls when you use it as the string concatenator instead of '&'
Have a look for his article about that, incidentally
But when '+' is arithmetic, it does not handle null.

I find null useful and valuable, personally.
Can I see instances where dbadmins find Null despicable?--yes I can imagine some.
Instances where the internal handling of null is processor intensive.
Feel free to offer more examples of such.

Although I do find it a bit odd that the 'nulls are bad' folks aren't offering @snyperj any concrete examples of why his dbadmins may have laid down that law.
Which is a shame.
in my test table gp are all nums, str are all nulls.

SELECT nz(sum(gp) + sum(str),0) as S2
FROM tbltest;

gives me a big fat zero instead of 1670 - the sum of gp
In the shim I posted
mytotal: Sum([field1]+[field2]+[field3]+[field4]+[field5]+[field6]+[field7]+[field8]+[field9])
is null
myHandledTotal: Sum(nz([field1],0)+nz([field2],0)+nz([field3],0)+nz([field4],0)+nz([field5],0)+nz([field6],0)+nz([field7],0)+nz([field8],0)+nz([field9],0))
=45
Sum1: nz(Sum([field1])+Sum([field2])+Sum([field3])+Sum([field4])+Sum([field5])+Sum([field6])+Sum([field7])+Sum([field8])+Sum([field9]),0)
=45

So I don't know why yours didn't work the same.
The shim is reposted to see that result


ManyNulls.mdb
I see the problem, none of your sums is a null.  Make any field nothing but nulls so the sum is a null and you will see where I am coming from.  Thanks for the repartee.  Time to go sailing.
Yup,
That it'd do it.
Sum() will return a value if at least one record is non-null.
When they were all null, it returned null, and then '+' cratered it all.

Which in the end goes back to the one irreducable truth of the discussion
One way or another, you have to handle the data you don't know, and handle it carefully.
The only question is "what is the best way?"
And that question has no 'right' answer

Happy sailing @GRayL
@Nick67: "Let us say we decide to record the birthday of every customer.  Everyone has a birthday.  There is no reason for this field to be null. Except for all those customers who refuse to tell you their birthday. Now you have a field that is null--unknown.  Data should exist for it, but doesn't--and for a very good reason."

Why would you want to create a null for a birthday that wasn't supplied? You could easily have put the birthdate in another table. Instead you apparently want to denormalize your database by putting the birthdate in a table where in Normal Form terms it does not belong. To do that you therefore choose to invent some nulls or some other dummy value to use a placeholders. Maybe you think you have a good technical reason for wanting to use nulls in that way. My point is that your reason for using null is purely a technical issue in database deisgn. Nulls are in no way implicit in the problem domain. A birthdate that isn't specified isn't a "null" birthdate at all and since null does not accurately model the semantics of some attribute being unknown it would only be a very poor approximation of reality in the situation you describe.

@Nick67: "Sorry, this null comes from the data modelling; we have to deal with not knowing some of our data."

It ought to be self evident that there are no nulls in the real world. They are purely a symbol used by database designers. If you doubt that then consider this: Scientists, mathematicians and everyone else have been modelling the real world without using null for thousands of years before SQL came along. They continue to do so without nulls today. Millions of people use or work with databases that don't have nulls. The sky is not going to fall in if you decide not to use nulls.

If you do choose to use nulls then you ought to do so with the wisdom that that is your choice and your responsibility as database designer. Don't try to use the business problem as an excuse for using null. Incidentally, I wonder why in your reply to me you seem to suggest that "no value" means the same as "unknown". Those are different things but anyway both of those terms are wrong as a description of nulls in SQL, as I already pointed out.


@dportas
Now we descend into farce.
Which I guess is what @mx and @JDettman warned about much earlier in the thread!

It may be instructive to @snyperj as he talks to his dbadmins about this prohibition.

<Why would you want to create a null for a birthday that wasn't supplied? You could easily have put the birthdate in another table. Instead you apparently want to denormalize your database by putting the birthdate in a table where in Normal Form terms it does not belong.>

<astonishment!>You aren't serious.  I can't think offhand of anything else that has a better one-to-one relationship than a human being and a birthday.  To suggest that people's birthdays being stored in the same table as other immutable things about them is not properly normalized just about renders me speechless.  Shall we put sex in it's own table for the Chaz Bono's of this world?</astonishment!>

But ok, just for fun let's grant that.  People may change their names, so let's put that in its own table.  People may not have middle initials so let's put each part of their name in its own table, with a foreign key back to the name table and a number that indicates first name, second name, third name, fourth name, and another boolean field that indicates surname for all those Spaniards and Brazilians with a gaziilion part names.  Let's have another field that indicates that the Cher's Madonna's and Prince's of this world only have one name.  Heck, most of those last three things might be null so lets bust them out into their own tables, too.  Let the name table have a foreign key back to the people table and these three name qualifier tables have foreign keys back to the name table. Now we are completely and utterly normalized and we never have to enter any nulls.  Now there won't be any records where there wasn't any data.

Great.  Now there aren't any nulls.
And when we left join people to names and left join names to PartsOfNames and left join those to the name qualifier tables...
... and run it to get something usable for someone's name?

Oh yeah, we're going to get NULL for all those values that are unknown, don't exist or weren't entered
Which is what we had in the first place.
Then we get to handle them as we concatenate the name string together.

You weren't serious, were you?

I'll take the explanation of null from the very excellent link you posted--and you'll note that in my posts I used unknown when talking about null.
<A value of NULL indicates that the value is unknown. A value of NULL is different from an empty or zero value. No two null values are equal. Comparisons between two null values, or between a NULL and any other value, return unknown because the value of each NULL is unknown.

Null values generally indicate data that is unknown, not applicable, or that the data will be added later. For example, a customer's middle initial may not be known at the time the customer places an order.>

So perhaps I should have used 'no known value' although in all fairness @HainKurt used that term and I quoted YOU in refuting it.

Bottom line: Nulls happen.  They may happen in your table, or they may happen when you LEFT JOIN tables.  But they happen.
How you deal with them is your choice @snyperj
<Don't try to use the business problem as an excuse for using null>
I'm not.  Nulls happen.  Don't pretend that absurd levels of normalization will make the need to handle NULL go away.
There is data that may be unknown or non-existent in every left join set you make.
No one needs <an excuse for using null>
Null happens.

Can I get that printed on a T-Shirt?
That'd be cool.

"NULL happens, man!" on the front
"Deal with it." on the back
@snyperj
Maybe something good does come out of farce.
While @dportas normalization suggestion is astonishing, he does touch on a good point.

Insufficiently normalized or poor modelled data may have a whack of nulls.
So, if your dbadmins are used to dealing with pikers, that may be why they gave you a heads-up
There can be a trade off between nulls and more normalization.
Take in Access, the Field.Properties collection
There can be about 28 or so properties: name, caption, datatype and so on.
When you go to query them though, if you didn't fill in the value, like the caption or smart tag, it throws an error.
Why?
Because so few of the possible poperties are used, rather than taking the space to hold the property and a non-existent value, the property doesn't exist in the collection until you add some data to it.
Much like the birthday table idea @dportas floated.

Good data design will mean a few nulls that makes sense about a unique item.
Bad data design will either mean a lot of nulls or a retarded level of normalization.

And that's worth thinking about.
LOL .... save me one of those shirts!

This is still my favorite example:


"A Null is "no entry" in a field. The alternative is to require an entry in every field of every record! You turn up at a hospital too badly hurt to give your birth date, and they won't let you in because the admissions database can't leave the field null? Since some fields must be optional, so you must learn to handle nulls."

No problem ... lets just put in some arbitrary birth date ... and hey ... maybe an arbitrary SSN also. Why not
<The alternative is to require an entry in every field of every record!>
@dportas astonishing alternative was that every field that could be null should be normalized out into its own table.
An unconscious person with XY chromosomes and XX plumbing arrives at a hospital.
The only entry in a table is an auto-number key value and the time of arrival.
Everything else has been normalized out into its own table because the values might be null.
Mocking the problem isn't a way of dealing with it.  And the admissions clerk or nurse knows nothing about nulls when they leave a field blank.  No matter how they go about it, the programmers and database people have to have a way to deal with it.  And when a Group of people have to use the data, they all need to know how it is being defined and handled.
"And the admissions clerk or nurse knows nothing about nulls when they leave a field blank."

How is that related to:

"because the admissions database can't leave the field null"

?

mx
To me it means the users don't care, they do what they need to get their work done.  If you make them enter something, someone will make up an 'arbitrary' date that tells the staff that it wasn't real.  On paper forms, people used to put "n/a" for "not available" rather than leaving the field blank.  In the long run, it's not really about nulls but about making something that works and doesn't cause stupid errors.  And that's the job of the programmers and database people.
No disagreement there.  I think we are almost all in agreement that no matter how you slice it, nulls happen and they have to be dealt with.  Some folks find the blanket prohibition on nulls in the table to be...excessive.
Nulls happen.  Deal with them.
@Nick67, Databases don't just "happen". They get designed with a purpose in mind. It is the database designers who chooses whether and when to use nulls. You obviously aren't going to discuss this sensibly so there's no point continuing with what might otherwise have been an interesting discussion.
This topic has already been discussed 150,000 times ... always with the same result:
Inconclusive, no agreement, countless opinions and so on. It's not even interesting anymore.

What we seem to have here is the SQL Server Camp and the Access Camp. In 19 years of working with Access, I have *never, ever* had an issue with Nulls ... and I consider them a beautiful thing, and very easy to deal with. And concerning Access ... note what I stated @ http:#a36533285 "And it's Microsoft's answer also ...".  That pretty much sums it up for the Access world as far as I'm concerned.

mx
@dportas
I find it...disagreeable...that you say 'I' am not going to discuss this sensibly.  That strays close to personal, which is always an error.  If you have a problem with the logic or factual accuracy of a statement, say so.  That is good.  Discussing personality or motives...not so much.
Now,  I have given you credit for raising the very valid point that insufficiently normalized data tends to contain a lot of null.  I have taken issue with your approach that ANY null value in field is a sign of insufficient normalization.  Virtually no one would grant that removing a person's birthday from a table describing an individual is a logical thing to do.  There is such a thing as too much normalization.  Left joins have computational costs, too.  And normalization pursued to its ultimate end results in all the tables being three field entities--an identity field, a foreign key, and a datum.  And that is an absurd data design.
Left joins happen.  Most North American addresses leave room for two address lines and then city, prov/state, and postal/zipcode.  Virtually no production databases will break out that mostly-unused second address line into it's own table to avoid the entry of null values.
Because when you left joined the two tables the null would be back--and it would have to be handled ANYWAY.  So you will have gained absolutely nothing by normalizing that field out to a new table.  And that isn't hard to grasp--which left me astonished that you had proposed such a mechanism.

As for the idea that null is somehow a new idea,  trust me, I was struggling with the idea that null != null long before Windows or SQL Server hit the scene.  Heck, Aristotle was struggling with the idea that since True or Null = True and False and Null = False, how could you describe the world in ways that the boolean results that equalled Null never occurred.
Null is a VERY old concept, and no matter what you do, null is going to happen--and need to be dealt with.

From Aristotle on down, folks have been trying to describe the world in ways that avoid null.  Hasn't happened yet.  You are welcome to make arguments about how a table design that permitted no null can then result in data that does not need to handle null.  Since left joins happen in the real world, you won't have a winning argument.  And that's ok, there is something to be learned by thinking it all the way through.

@dportas

"Scientists, mathematicians and everyone else have been modelling the real world without using null for thousands of years before SQL came along. "

And then we have this:

http://www.nullphysics.com

Seriously ...

@Nick67 You have some interesting ideas about Aristotle! Term Logic (and indeed Predicate Logic) as I learnt them include nothing like nulls and are firmly rooted in two-value logic. Please tell me what specific work of Aristotle you are referring to and tell me where I can read about his views on null.

Nulls were invented in the 1970s by the designers of SQL (Null pointers were invented in 1965 by Tony Hoare but nulls in the SQL sense are not null pointers). Not everyone uses SQL and its derivatives and not everyone uses nulls.
I see this thread has grown somewhat since my last entry.  I can't add anything else here so time to Stop Monitoring.
<are firmly rooted in two-value logic>
That is the 2500 year old problem.
The world does not conform to two value logic, and every convoluted attempt to make it conform fails.
<Nulls were invented in the 1970s by the designers of SQL>
That is about as true as saying the Arabs invented zero.
Zero existed long before the notation and concept was formalized.
The same thing is true about NULL.
Null as a concept is as old as set mathematics and arrays, but the notational idea of null as an empty set of a single element, that is a thing like zero.
It existed independently of its invention, but the need to formalize the concept is fairly recent.

If we have
A = {1,2,3,4,5,6,7}
B = {1,2,3,4,5,6}
C = {1,2,3,4,5}
we can do joins based on the first element equals 1.
When you ask the question what is the value of the seventh element of B and C the answer is {}
But that notation {} is not precise.
A union based on the first element = 9 is also {} but that empty set describing something different.
Hence the need for more precise terminology, and the rise of the word NULL.
But it is not an invention of the 1970's, it's a fundamental characteristic or reality.

<Not everyone uses SQL and its derivatives and not everyone uses nulls.>

I am genuinely interested in how a relational database system without left joins would function.
Because when you join domains with a different number of elements, you get something we typically describe as null.


@Nick67:
>> If we have  A = {1,2,3,4,5,6,7} B = {1,2,3,4,5,6} C = {1,2,3,4,5}
>> When you ask the question what is the value of the seventh element of B and C the answer is {}

These are supposed to be sets, right? Sets are unordered so there is no 1st, 7th or Nth element.

>> A union based on the first [sic] element = 9 is also {} but that empty set describing something different.

{} is the empty set. The empty set is not a null - not even in SQL databases and certainly not in the relational model. The empty set, like all sets, is a value. Like all values the empty set has the property of being equal to itself, something which is not true of null because null is not a value. If you are under the illusion that an empty set is remotely like a null then you might find a course in set theory useful.

>> I am genuinely interested in how a relational database system without left joins would function.
Maybe you aren't familiar with the relational algebra and the fact that an "outer" join is just a natural join or theta join followed by a union.

You still didn't explain the source of your claim about Aristotle. My strong suspicion is that you made it up.
I'm done.
This is about helping @snyperj, not about me you or anyone else.
If you want to quibble about the semantics of 'set' this is not the place.
The idea is straight-forward.  To go sideways from the idea because my description of it is nowhere near perfect seems rude and does nothing for the OP.
If you do not wish to discuss how someone can deal with the non-existent elements of a left join, ok.

But this is not being hijacked into anything else either.
<Maybe you aren't familiar with the relational algebra and the fact that an "outer" join is just a natural join or theta join followed by a union.>
Nope.
Please explain how that makes null disappear and what commercial products use such an approach.

You have asserted that null has no basis in reality and need have no basis in commercial database prodcuts.
I don't agree.  And there are MANY commercial database products that use and allow for null.
It ain't about Aristotle or logic.
It's about how a database programmer can avoid the use of null.
I say he can't.  One way or the other, null happens.

You beg to differ.  Ok fine. :)
>> It ain't about Aristotle or logic.

You brought Aristotle into the discussion, not me. I'm glad if you now agree that Aristotelian logic has nothing to do with nulls. We are certainly in agreement on that.

And I'm not the one hijacking the thread. I'm offering practical advice, not unsupported assertions. My practical suggestion for the snyperj is that he model the facts of the business process in the database without using nulls. Since all facts can be accurately represented without null (as Aristotle and thousands of years of logic and set theory amply demonstrate) that will not be a problem.

The safest princple to apply in database design is that of 5th Normal Form, which is sound practice for any database and very widely used even though you for some reason claim it is "too much". You say you aren't familiar with relational algebra so maybe you aren't familiar with database design theory and the closed world assumption either. That might explain the reasons why you take the point of view you do. This is not the place for a course in relational database theory. I will just say that if there are no nulls in the database then obviously there need be none in queries against that database - unless you choose to create them.

@dportas
This isn't the place for theory, it's the place for practicality
http://www.deeptraining.com/litwin/dbdesign/FundamentalsOfRelationalDatabaseDesign.aspx
http://www.phlonx.com/resources/nf3/

And at this point we are at potayto potawto.
Take a simple example.  We have people with a first name, a last name and an optional middle name.
Some contend that the decision to put all three name fields in one table is a choice to create a null
Some contend that spliting the nameparts into a table of their own so that nulls are not created in the table, but in a left join, is a choice to create a null as well.
I say the null results from the reality that the middle name is optional.

And before you'd like to go beating me on the head with my own source that relational database theory was formulated in 1969
<The relational database model was conceived by E. F. Codd in 1969>
make sure that you get down to the part where it is acknowledged that the formalization put words to realities that already existed
<To quote C.J. Date, the principles of database design are "nothing more than formalized common sense.">
Don't forget to note this part, too
<The model is based on branches of mathematics called set theory and predicate logic>
You can go here and buy Cobb's book (1990 second edition)
http://dl.acm.org/citation.cfm?id=77708&CFID=43614591&CFTOKEN=35592378
<In developing the relational model, I have tried to follow Einstein's advice, "Make it as simple as possible, but no simpler." >
How does that fit together with 'abstract everything possible null value into it's own table until no nulls need be created?"

And my reading of what the normal forms do
http://www.bkent.net/Doc/simple5.htm
Doesn't hang together with your assertion that you should move a person's birthday out of a table describing individuals, because we might not be able to collect and enter that datum.
<To summarize, a record is in second and third normal forms if every field is either part of the key or provides a (single-valued) fact about exactly the whole key and nothing else.>  Does a birthday provide a single valued fact about an indvidual? Yuuuuuuuuuuuuuup.  So why would we put it in its own table?

But it was good you brought up the normal forms.  Because that was on point.
"which is sound practice for any database and very widely used"
Somehow ... I seriously doubt that.
@mx
Most of the time the fourth and the fifth don't come into play
<If you've normalized your database to 3NF, you've likely also achieved Boyce/Codd Normal Form (and maybe even 4NF or 5NF).>
If you're consciously applying 4th and 5th normal forms, you do it because you've got buggery data that's hard to model.
<Fourth [5] and fifth [6] normal forms deal with multi-valued facts. The multi-valued fact may correspond to a many-to-many relationship, as with employees and skills, or to a many-to-one relationship, as with the children of an employee (assuming only one parent is an employee).>

Another one that can be real bitchy is prices for items.  An item has a price.  Ok, that's third normal.  But the price can change over time.  That's less fun.
When you don't think that one all the way through in financials it can be ... unpleasant.
'Let's update the price for X today.'
'Yeah, but then if we re-print yesterday's invoice, the total will be different.'

To make that even less fun, items have prices for time intervals :(
The price you want to display depends on the item and a time value that is between a start and end date.
And then when different customers can have different prices for the same item...it just gets ugly.
This then certainly comes into play.
<Normalization certainly doesn't remove all redundancies. Certain redundancies seem to be unavoidable, particularly when several multivalued facts are dependent rather than independent. >
Avatar of snyperj

ASKER

This is what I did and I am all set.
snyperj,
Just curious.  How is what is a question (what you Accepted) ...an answer to this question?
Part of your question was:

"and how should I save a field if the user has entered no data....?  Something like nz(txt1),"") maybe?"
Which was answered in the first couple of posts.

"This is what I did and I am all set."
What did you do?

mx

You asked your dbadmins for their desired defaults?
What were they and how many fields did you wind up altering?
And did they allow you to have ANY nulls?
@Nick67
>> I say the null results from the reality that the middle name is optional.
You previously stated that nulls mean something is unknown. Are you now suggesting they also be used to represent things which are known - as in when you want to record the known fact that a person doesn't have a middle name? Are there other known facts for which you would suggest using a null or is it just for personal names? If you are using nulls to represent so many different things how do you expect users to know what you mean in each place that you use a null? How do you expect them to get consistent results from such a database?

As you say, the relational model was indeed first described in 1969 (an internal IBM report). The better known paper which brought it a wider audience was the 1970 one "A Relational Model of Data for Large Shared Data Banks". Contrary to what you seem to be implying those papers don't mention nulls or anything like nulls. Codd's model is based on set theory and predicate calculus which don't include anything like nulls and never have done.

A form of nulls are mentioned in Codd's later papers (1979, 1986) and his book (1990) describes I-marks and A-marks but they are quite different from what SQL calls a null. The topic of this question is nulls in SQL, which date from Don Chamberlin and Ray Boyce's work in the 1970s.

This history lesson isn't really relevant but since you brought it up I hope that clarifies the place of nulls in history. You can't blame Codd or Aristotle or any other innocent logician for them!

Re your questions about normal form. I mentioned normal form in my post 36546434 because of its relevance to dealing with snyperj's question. None of the normal forms include nulls because a null doesn't satisfy any dependency (functional, multi-valued or join). 3NF, BCNF, 5NF all are written assuming that tables don't have nulls. Under 3NF/BCNF the FD A->B isn't satisfied if there is no value for B. Similarly, 5NF guarantees that every relation is the join of projections on its superkeys, which cannot be the case with nulls because "nulls don't join". A relational schema can always be made to satisfy 5NF (in most cases without loss of dependencies) and therefore any schema can be made into an equivalent or near-equivalent one without nulls.

If you want to use nulls in conjunction with normalization then I suggest you design a normalized database first and then decide whether and when to add nulls to that design. If you fail to do that then there's a good chance you may miss some dependencies and related problems with your design. In all cases the normalized design without nulls is what should result from accurately modelling the domain of discourse. The nulls are a technical compromise of that.














The OP has moved on and so have I.
You wish to belabor the point that data can be modelled so that the tables will have no nulls.
I have granted you that point for the sake of argument.
If we have a table with individuals, and three separate, related tables with first name, middle name and last name, we can enter the data without using null.
If middle name is optional, then to gather the entire name together we would left join the 'names' tables to the individuals tables.
Functionally, the resulting query would be identical to having a single table with null entries in 'middle name'
Operationally, the single table would probably perform better because left joins can be expensive operations.
I cannot see a valid reason for creating three additional tables and some expensive left joins to get the same results as a single table would provide.
Moreover, since middle name clearly correlates directly with the individual, it should be in the individuals table.

And any end-users who does not intuitively grasp that the null middle name means that the individual doesn't have one, wouldn't give one, or the data entry folks coudln't obtain one is too obtuse to be using a computer.

<I suggest you design a normalized database first>
Very sound advice.  In context, almost as useful as suggesting that someone should breathe
<then decide whether and when to add nulls to that design>
Also very sound advice.  After you have decided that a datum is fully dependent on the table's key is certainly the time to decide whether the datum will always exist and will always be able to be entered at the time new records are created.

<In all cases the normalized design without nulls is what should result from accurately modelling the domain of discourse>
I will politley disagree with that.  Removing an individual's birthday from a table contain data about an individual because it may never be known is not an action required to normalize the data.  You clearly believe the opposite.

Ok.
This discussion has arrived at it's conclusion
After you've made so many blatantly wrong claims about Aristotle, set theory, Codd and the relational model it might be helpful to Snyperj and others if you acknowledged you got those things wrong before "moving on" - especially given the grandstanding way you responded when I tried to correct you. That's just my opinion. I wasn't belabouring anything but simply responding to what you wrote when you made new and unfounded claims with each new comment.

Not clear what new things you are saying now. Are you trying to claim that a table that permits nulls does satisfy 5NF as you seemed to be suggesting it might in a previous post? It really is very simple to see that a join dependency isn't satisfied when nullable attributes are joined. As simple as writing a join query using columns that have nulls. If you still don't see that then I really don't think there is anything I can add.
We don't agree.
Ok.
http://www.databaseanswers.org/codds_rules.htm
Rule #3
<3. Systematic treatment of null values
The DBMS is required to support a representation of "missing information and inapplicable
information" that is systematic, distinct from all regular values
(for example, "distinct from zero or any other number," in the case of numeric values),
and independent of data type.
It is also implied that such representations must be manipulated by the DBMS in a
systematic way.
>

You clearly feel that nulls are an evil, a laziness, and not necessary.
Check.
I think I have support my case well that your postition is incorrect.
You do not.
Let anyone else who arrives at this thread be the judge.
I stand by all that I have written.
You feel that it is rubbish.

Ok.
Anything else I should add?

@dportas feels everything @Nick67 has written in this thread is rubbish

Done.
>> Rule #3
>> 3. Systematic treatment of null values
So now you are quoting a paper that Codd wrote in 1985 about the properties of DBMSs and that is not even about database design. How exactly is that supposed to be relevant to this discussion? It does not support any of your previous assertions.

>> You clearly feel that nulls are an evil, a laziness
>> @dportas feels everything @Nick67 has written in this thread is rubbish
Where did I say those things? More fantasy and lies on your part.

>> I stand by all that I have written
That Aristotle considered nulls and that an empty set is a null? Oh boy.
<<Let anyone else who arrives at this thread be the judge.>>
Well regarding the content of this thread, I'd have to say that what dportas claims makes perfect sense.