• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 554
  • Last Modified:

NULL in a date field

I have a table with several date field. The particular field is a timestamp with default '000-00-00 00:00:00'. There are two row types. One row type always has a date int the field in question. The other type presently has '000-00-00 00:00:00' because no value has been loaded there. There is an index on this field.

It is our intention at some point to also populate this field for the other row type.

My question is the following: would it be better to change the default zero to a null?
0
jasimon9
Asked:
jasimon9
  • 4
  • 3
  • 3
  • +1
1 Solution
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
both are equal afaik,
only small differences:
* using zeros will use more space, but as you want to fill in a non-null value anyhow, the zeros approach is better (as the row size does not change)
* when you query for the zeros values, the index can be used, for null values this is not possible (as indexes don't store information about the rows with the null value for the indexed field), but as you probably identity the rows with some other fields which are probably indexed, this doesn't matter either.

so, I claim the zeros solution to be better in your situation
0
 
AlThePal7Commented:
What I tend to do is set the default value to a value which the user will never use e.g. 1970-01-01 00:00:00.  I found that not having a default caused some problems when trying to update only certain fields in a record.  By using a valid but 'impossible'  value you can still easily remove these fields from reports etc.
0
 
jasimon9Author Commented:
Interesting comments, but both are in the "not too big a deal" category. Nuances rather than "killers."
0
Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

 
NovaDenizenCommented:
angelIII:
I have to disagree with you about NULLs in an index.  Using an 'IS NULL' clause in a query on an indexed field will make efficient use of the index.  In fact, with UNIQUE indexes multiple records are allowed to have NULLs in the indexed field (except in engine=BDB tables).  See http://dev.mysql.com/doc/refman/5.0/en/create-index.html

There's only one aspect I can think of that exceeds the "nuance" category, and possibly rises as high as an 'annoyance'.

Say you want to search for records older than a year.  If you're using nulls, you can just do "WHERE date < DATE_SUB(CURDATE(), INTERVAL 1 YEAR)", thanks to tri-value logic.  If you're using zero-dates, you have to do "WHERE date < DATE_SUB(CURDATE(), INTERVAL 1 YEAR) AND date > '0000-00-00 00:00:00'"

A "nuance":
When a field is defined as NOT NULL, the record just contains the constant-size field.  When the field is defined as possibly being NULL, there is a cost of an extra byte for each possibly-null field, needed to hold the null-or-not-null flag.  So records with possibly-null dates will take up an extra byte.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
>I have to disagree with you about NULLs in an index.
here is what I said:
... as indexes don't store information about the rows with the null value for the indexed field

this does NOT mean that you cannot have NULL values in the column that is indexed, but ONLY that the ROWS that have a NULL value in that field are NOT actually indexed.
hence, if you search for WHERE FIELD IS NULL, a full table scan will be the result.

AFAIK
0
 
NovaDenizenCommented:
I have an InnoDB table with a few thousand records, with a field about 50% populated with NULLs.  I added an index on that field, and did an "EXPLAIN SELECT * FROM mytable where field IS NULL", and it indicated that the index would be used.  I dropped the index, ran the same EXPLAIN, and it said it would use a full table scan on all rows.  This is on a 4.1 mysqld.

I think this indicates that rows with null values in indexed fields are actually included in the index.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
>I have an InnoDB table with a few thousand records, with a field about 50% populated with NULLs.
how many rows? I would expect for a large table that the index is not used, because looking up 50% of the rows using an index would not make sense...
Looks like I am wrong.
0
 
NovaDenizenCommented:
roughly 10000, 4500 nulls.

You don't need that many to see the effect with explain, though.

create temporary table a (a int);
insert into a values (if(rand() < 0.5, NULL, rand() * 100));
repeat that insert about 10 times
0
 
jasimon9Author Commented:
Great discussion, but as far as my question goes, still no "slam dunk" either way. Either way was advantages.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
There will be no "slam dunk" anyhow. just a matter of choice here.

Well, you COULD think of putting the 2 types or rows into 2 different tables...
with of course some overhead when programming etc.

With other database products like oracle and sql server 2005, you can make partitionned tables where this kind of things could be "more" effectful, but still no great thing...
0
 
jasimon9Author Commented:
the "2 types" or subtypes, is a HUGE design matter (meaning very significant) which is already been fully optimized. For the reason stated above by angelIII (that eventually all will have dates), that the 'zero' solution is better.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 4
  • 3
  • 3
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now