?
Solved

How to best represent Inheritance in a MS Sql Server database?

Posted on 2009-02-11
19
Medium Priority
?
427 Views
Last Modified: 2012-05-06
Say i have two classes that inherit from the same base class.  The base class has members A, B, and C.  Class one has members W and X.  Class two has members Y and Z.

What would be the best way to structure my database?  My main concerns are simplifying future updates, simplifying my queries in general and performance.
0
Comment
Question by:abemiester
  • 9
  • 7
  • 3
19 Comments
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23612670
Your database structure has nothing to do with your classes.

Your database structure should be based on the data your storing (google database normalization) . . . your classes just interact with that data.
0
 

Author Comment

by:abemiester
ID: 23612750
I want to store the data from instances of class one and class two in a database.  Hence the question...
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23612795
post what data you want to store among all classes.  How do they relate?

I would do a union of all columns and store it in a table.
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 

Author Comment

by:abemiester
ID: 23612878
I want to store all data.
For class 1 this would be: A,B,C,W, and X

For class 2 this would be A,B,C,Y, and Z

Please keep in mind that I have simplified this example.  In reality the two classes could have 95% of their members in common with only 5% being unique.
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23614934
I would do
ABC W X  Y Z in a table with WXYZ being nullable.
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23615004
scratch that .

ID A B C in Table Widget

setup a foreign key to another table
ID, FK_WidgetID, W,X   Extended Widget

ID, FK_WidgetID, Y,Z   Ultra Widget.


IF (big IF) W,X and Y,Z are of the same datatype and are always going to be the same datatype, you can combine in one table.

ID, FK_Widget, W/Y, X/Z, Type_Of_Widget


 
0
 

Author Comment

by:abemiester
ID: 23616318
yeah those were the two possible structures i came up with too.  I had started to lean towards the one table solution with nullable columns because it wouldn't require joins when selecting data.  Also i believe inserts and updates would be simpler.
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23616411
Don't be lazy, do what is right not what is easy.
0
 

Author Comment

by:abemiester
ID: 23616553
How am i being lazy?  Could you please explain the benefits of your solution vs. mine?  In my experience simpler solutions are almost always better and joins can hurt performance...
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23616635
Primary keys give you more physical indexes which are better for seeking data.

Say you have the same ABC but different values for each XY
in your table
ID,ABC X1W1Y1Z1
ID,ABC X1W2Y1Z1
etc.

in the 2 table system
ID,ABC

ID, FKID, X1 Y1
ID, FKID, X1 Y2

The data is normalized so you are not repeating any information.  Data size will grow less rapidly and searches will be faster.  Proper joins don't take a hit on performance as much as parsing that many more N rows.


0
 

Author Comment

by:abemiester
ID: 23616833
Ahh i see what you are saying.  Let me clarify my example:
ABC are members of the base class, not the same values.  WX are members of class1 and YZ are members of class 2.

maybe this real life example will help:

class Shape  //base class
{
   double area; //Member A
   string color;  //member B
   int perimeter; //Member C
}

class PolyGon : Shape // class 1
{
   int numberOfSides; //Member W
   string someOtherMember; //Member X
}

class Circle : Shape  // class 2
{
   double radius; //Member Y
   double diameter; //Member Z
}


Above would be an example of the class structure i have.
Now lets say i have the following code snippit

Circle c = new circle();
c.radius = 1.0;
c.diameter = 2.0;
c.SaveToDatabase();

Polygon p = new Polygon();
p.numberOfSides = 5;
p.someOtherMember = "who cares";
p.SaveToDatabase();


This code would ( and should) make two different entries in my database.  I am almost positive that hitting one table instead of hitting three tables with joins will be faster.  And again, the inserts and updates will be simpler...
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23617073
Would do 3 tables in this case.

Table Shape
ID, area, string, permiter,

Table PolyGon
Id, fk_shape_id, numSides, oMember

Table Circle
id, fk_shape_id, radius, diameter.

My lazy statement was not meant as an insult, but as encouragement . . . it's simpler/easier to do the 1 table approach, however you're still going to have nulls for circle if you have a polygon and vice versa . . . which in DB land is a "bad" thing if can be avoided.  (Sometimes it can't be, but most times there are ways around it.)

Just my advice though, do as you like.

0
 
LVL 22

Expert Comment

by:dportas
ID: 23627162
I second what Ged has said. The principle at work here is to create one table for each distinct sub class but then to push each attribute up to the highest level that it can belong in. That's the way to minimise redundancy in the hierarchy.
0
 

Author Comment

by:abemiester
ID: 23665299
Thank you for the input Dportas.  Do you think there is anything to the last paragraph on my previous comment:

"This code would ( and should) make two different entries in my database.  I am almost positive that hitting one table instead of hitting three tables with joins will be faster.  And again, the inserts and updates will be simpler..."

If I made those entries into one table I still wouldn't be repeating any data.  There would just be columns with null values...

Ged:
Why are nulls in a database a bad thing?  I'm not trying to be a smart ass I just really didn't know that :)
0
 
LVL 41

Accepted Solution

by:
Kyle Abrahams earned 2000 total points
ID: 23671519
Abe,

No issues.  I actually just read an article in a magazine about this.  Nulls in the "truest" sense represent an unknown.  Consider a table person with a column eye color.  You may not know what a person's eye color is, but you know they have one.  It's information that will be filled out eventually.

Null does not represent a void.  It is IMPOSSIBLE for a circle to have any number of sides.  As such you will always have to treat this data in your code . . . in essence re-defining null.  While it's fine for you and this project, consider a larger application with many programmers working on it.  If a programmer in group A doesn't mention it to group B, nulls can be handled much differently.  

Also, say everyone in group A is replaced or moves on to another project.  Some other group then takes up the charge of group C . . . and is left to figure out how group A re-defined null.

Like I said, it may be fine for you now but bad habits can form which in general should be avoided.  


you said: "If I made those entries into one table I still wouldn't be repeating any data.  There would just be columns with null values.."

While that's correct, you're inserting Extraneous data which doesn't need to be added.  
0
 

Author Closing Comment

by:abemiester
ID: 31545593
Thanks for the engaging discussion!  I think i will use the multi-table solution after all.
0
 
LVL 22

Expert Comment

by:dportas
ID: 23676371
>> Nulls in the "truest" sense represent an unknown

I don't know what "Nulls in the truest sense" means but presumably it does not refer to SQL. Nulls in SQL are not a true and accurate representation of something unknown. If X is an unknown quantity then we still know that X=X is true but that isn't so in SQL: X=X is neither true or false. So if you choose to use nulls to indicate "unknown" in SQL then you are still forced to create extra logic to re-define SQL's intended meaning of null if you want correct answers from your database.
0
 
LVL 41

Expert Comment

by:Kyle Abrahams
ID: 23683927
http://en.wikipedia.org/wiki/Null_(SQL)#Mathematical_and_string_concatenation

"Null was introduced by E. F. Codd as a method of representing missing data in the relational model. Codd later reinforced his requirement that all RDBMS' support Null to indicate missing data in a two-part series published in ComputerWorld magazine.[1][2] Codd also introduced a ternary (three-valued) logic, consisting of the truth values True, False, and Unknown, which is closely tied to the concept of Null. The Unknown truth value is generated whenever Null is compared with any data value, or with another Null.

Codd indicated in his 1990 book The Relational Model for Database Management, Version 2 that the single Null mandated by the SQL standard was inadequate, and should be replaced by two separate Null-type markers to indicate the reason why data is missing. These two Null-type markers are commonly referred to as 'A-Values' and 'I-Values', representing 'Missing But Applicable' and 'Missing But Inapplicable', respectively.[3] Codd's recommendation would have required SQL's logic system be expanded to accommodate a four-valued logic system. Because of this additional complexity, the idea of multiple Null-type values has not gained widespread acceptance."

Null is accepted for the A-Value . . . there is no use in keeping track of the I-value in a relational database when it can be avoided.

null can't be related to X because X is a representation of a value in the range of unknowns.  
0
 
LVL 22

Expert Comment

by:dportas
ID: 23685924
Codd's model of nulls was no better than SQL's model. In both cases many-valued logic demonstrably fails to represent the semantics of "unknown" according to every-day maths, logic or common sense - a fact which Codd admits in his book. Codd's choice of two null marks was not a sound one because there are more than two possible reasons for data to be missing. On the other hand, having only one such mark in SQL isn't helpful either - as evidence the multiple and contradictory uses that are made of null in the SQL language and in SQL databases.

Nulls don't add to the expressive power of any system, ie: there is no information expressible with nulls that isn't equally (or better) expressible without them. Nulls only add complexity and cause the system to generate wrong answers to some queries. Unfortunately the constraints of SQL can make it very hard (though not impossible) to manage without nulls. In a truly relational (non-SQL) system all data can be represented as (non null) values within tuples within relations and the problems inherent in SQL nulls go away.
0

Featured Post

Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article shows how to get a list of available printers for display in a drop-down list, and then to use the selected printer to print an Access report or a Word document filled with Access data, using different syntax as needed for working with …
Instead of error trapping or hard-coding for non-updateable fields when using QODBC, let VBA automatically disable them when forms open. This way, users can view but not change the data. Part 1 explained how to use schema tables to do this. Part 2 h…
Via a live example, show how to setup several different housekeeping processes for a SQL Server.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.
Suggested Courses

862 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question