Solved

Database Design Ideas

Posted on 2014-03-13
18
292 Views
Last Modified: 2014-03-26
Consider the following tables

tblPerson(PersonId, GroupId, Forname, Initial, Surname, DOB, etc)

A Person belongs to a group (like employee to employer)

tblGroup(GroupId, GroupName)

In the past we have used the following table such that a Supplier is always a Group and Owner, CurrentKeeper and Installer are a People.

tblItem(SupplierId, OwnerId, CurrentKeeperId, InstallerId)

But now we are faced with the fact that Supplier, Owner, CurrentKeeper and Installer can be a Person or a Group!

Ideas please.
0
Comment
Question by:Dodsworth
  • 6
  • 4
  • 4
  • +2
18 Comments
 
LVL 34

Expert Comment

by:PatHartman
ID: 39926445
This is handled by creating an Entity table which is the parent of Person and Group.  Parent and Group become 1-1 with Entity.   The Entity table includes an indicator to specify whether the entity is a Person or a Group and it contains all common attributes.  Attributes unique to a Person or Group are stored in their respective tables.  

ALL relationships to other tables are made to the Entity table.  Parent and Group are only ever related to Entity.
0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39926641
So

EntityTable(Id, IsGroup, Name, Address, GroupId, PersonId)

Group(GroupId, VATNo, CompanyNo)

Person(PersonId, DOB, JobTitle)

?
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39927443
No.

EntityTable(EntityId, EntityTypeID, Name, Address)

Group(GroupId, EntityID, VATNo, CompanyNo)

Person(PersonId, EntityID, DOB, JobTitle)
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 39927724
I think a "parent"/supertype of Entity with a "child"/"subtype" of Group (OR) Person could make sense here.  There might be better options, but it should be workable.

However, in that case I believe the cleanest design would be to include no detail columns in tblEntity, even columns (currently) present in both subtypes, such as Name and Address.  That's because validation rules could be different between subtypes, even for the same column name.  Besides, you might wish to name a common column something different for one subtype.

Therefore:

tblEntity ( EntityId, SubentityType, "SubentityId" ) --SubentityId will be Group (OR) Person id

tblGroup ( ..., EntityId ) --optional, but I'd "double-link" the tables myself to avoid excessive EntityId lookups
tblPerson ( ..., EntityId ) --optional, but I'd "double-link" the tables myself to avoid excessive EntityId lookups


[Btw, you really should get rid of the "tbl" prefix in the table names; that's an extremely dated practice.]
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39927845
In a relational database, the parent table never points to the child so SubIdentityID does not belong in tblEntity.  The child tables point to the parent.  tblEntity would only normally be one of the subtypes so you could technically get away with pointing in the wrong direction.  But, if you ever wanted to use the method in a situation where an entity could be multiple subtypes, you'd have to do it right.

The primary key of the child tables would be a long integer that is the foreign key to the parent table.  In this schema since there is no overlap, an entity is only a person or a group but never both, the set of keys in tblGroup would never duplicate any values in tblPerson.

Because the PK of the subtype is the Foreign key to the Entity, there is nothing that prevents you from joining directly to any other table.  For example, even though tblAddress would be related to tblEntity, it could be joined directly to either tblGroup or tblPerson if you were working with a query that needed only one of the entity types.  You could even join to either if you used a left join and some IIf() logic to pick out the common fields.  It is to avoid the IIf() logic that I put common columns in the entity table.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 39927936
>> In a relational database, the parent table never points to the child so SubIdentityID does not belong in tblEntity. <<

In that design, Entity and Group/Person are not actually parent and child, they are supertype and subtype.


>> tblEntity would only normally be one of the subtypes so you could technically get away with pointing in the wrong direction. <<

In this design, a tblEntity entry must be one and only one of the subtypes.  If an entity can be multiple types, then a different structure, with a separate table, must be used.


>> The primary key of the child tables would be a long integer that is the foreign key to the parent table. <<

It would only need to a bigint if you could ever have more than 2 billion entries.


Probably the easiest design would have the entityId that is assigned be used as the GroupId / PersonId key value also.  But I wasn't sure that could be used in this case because in the legacy data the existing GroupId and PersonId values almost certainly overlap.


>> It is to avoid the IIf() logic that I put common columns in the entity table. <<

You do avoid it just for those common columns, but you also prevent doing proper database validation of those columns because you're forced them all to be in the same column.

For example, say one subtype only allows a name up to 30 bytes but another allows 50 ... now you've got complex rules to implement that instead of just setting the actual column length to the proper value.
0
 
LVL 32

Accepted Solution

by:
Stefan Hoffmann earned 500 total points
ID: 39928660
btw, I'd prefer the model to be as precise as possible. A group cannot be a person. So using DRI otherwise you need triggers:
CREATE TABLE Entities
    (
      EntityID INT NOT NULL ,
      EntityTypeCode NVARCHAR(255) NOT NULL ,
      CONSTRAINT PK_Entities 
		PRIMARY KEY ( EntityID ) ,
      CONSTRAINT UQ_Entities_Superkey 
		UNIQUE ( EntityID, EntityTypeCode ) ,
      CONSTRAINT CK_Entities_EntityTypeCode 
		CHECK ( EntityTypeCode IN ( N'Group', N'Person' ) )
    );

CREATE TABLE Groups
    (
      EntityID INT NOT NULL ,
      EntityTypeCode NVARCHAR(255) NOT NULL ,
      Payload NVARCHAR(255) NOT NULL ,
      CONSTRAINT PK_Groups 
		PRIMARY KEY ( EntityID ) ,
      CONSTRAINT FK_Groups_Entities 
		FOREIGN KEY ( EntityID, EntityTypeCode ) 
		REFERENCES Entities ( EntityID, EntityTypeCode ) ,
      CONSTRAINT CK_Groups_EntityTypeCode 
		CHECK ( EntityTypeCode = N'Group' )
    );

CREATE TABLE Persons
    (
      EntityID INT NOT NULL ,
      EntityTypeCode NVARCHAR(255) NOT NULL ,
      Payload NVARCHAR(255) NOT NULL ,
      CONSTRAINT PK_Persons 
		PRIMARY KEY ( EntityID ) ,
      CONSTRAINT FK_Persons_Entities 
		FOREIGN KEY ( EntityID, EntityTypeCode ) 
		REFERENCES Entities ( EntityID, EntityTypeCode ) ,
      CONSTRAINT CK_Persons_EntityTypeCode 
		CHECK ( EntityTypeCode = N'Person' )
    );

Open in new window

0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39929101
Headache.. I hate fonts with serifs !
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39929365
In that design, Entity and Group/Person are not actually parent and child, they are supertype and subtype
Semantics.  The supertype does not point to the subtype.
It would only need to a bigint if you could ever have more than 2 billion entries.
I said long int not big int.  ACE does not support bigint.  Long Integer is the data type of autonumbers.  And since this thread is in the Access forum, that would seem to be relevant.
For example, say one subtype only allows a name up to 30 bytes but another allows 50 ... now you've got complex rules to implement that instead of just setting the actual column length to the proper value.
If they don't have the same business rules, they are not common and so would not be in the common table.
0
Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 
LVL 69

Expert Comment

by:ScottPletcher
ID: 39929475
In that design, Entity and Group/Person are not actually parent and child, they are supertype and subtype
Semantics.

There is a huge difference between a parent/child and a "supertype 'is' subtype" relationship that goes well beyond semantics.  Sometimes it can help non-design people to call a "supertype is subtype" similar to a "parent  / child" relationship, because people inherently understand those, but they are vastly different.


If they don't have the same business rules, they are not common and so would not be in the common table.

There could be a new subtype tomorrow that does not
0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39929513
I found this interesting.  What I found even more interesting is that in the 2012 version of AdventureWorks, They've changed the implementation to use a separate key for the Person and Store.

Is there any need for a Type column ? If you have a corresponding record in the group table then you can be sure that it is a group ?

(@Scott I don't know why I added tbl prefixes ! I stopped doing that in code years ago.  There again.. I am an extremely dated practitioner ;)
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39930717
And since this thread is in the Access forum, that would seem to be relevant.
Not really, it could also mean the author made a mistake by including the Microsoft Access Database topic.   Is there an AdventureWorks 2012 for MS Access?
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39930830
You've come late to the party Anthony.  Perhaps you should read the original post.  I also think we could do without the snide, unhelpful remarks.  But, to answer your question just in case you are not being snotty (in which case I would apologize), no there is not an Access application that replicates the entire front end to the AdventureWorks database but the Adventureworks database itself is used in some examples of Access applications because it is fairly comprehensive and easy to download.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 39930899
>> Is there any need for a Type column ? If you have a corresponding record in the group table then you can be sure that it is a group ? <<

I strongly believe so.  For integrity reasons, you want to force the type to be explicitly defined when the entity is created.  Also, if/as you add subtypes -- image 4, 5 or 6 instead of 2 -- you'll very much want to know the subtype immediately.

You might even prefer a design that has only a single table containing all Groups and People (and any other subtypes of the same supertype).  Columns that don't apply can be left null.  The big disadvantage of this, as noted above, is that custom editing for the different subtypes is somewhat harder.  But you avoid having to join to different tables.

For either design, the entity type needs to be explicitly stored.
0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39930981
I included Access, as well as SQL Server, as a topic because I often use Access to prototype and thought that it's inclusion would broaden the scope of the replies.

I'm still not seeing why the entity type needs to be explicitly stored, if both Person and Group exist as entities in there own right with no common columns included in the Entity table.

If the design is flawed, then why would MS have changed AW from a design that included CustomerType to one that didn't ?
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39931539
Many of Microsoft's samples are flawed.  This is especially true when you look at the samples they create for Access.  No professional would ever create such samples.  They are geared to the lowest common denominator.   When I asked them at an MVP conference, the answer was, they wanted the user to be able to easily substitute SharePoint Lists.  The discussion deteriorated from there.  The AdventureWorks sample is targeted to a higher caliber of developer though so I expect it to be more robust.  I have not examined the entire application so I don't know what difference this made behind the scenes.  You need to look at the front end too.

I have to agree with Scott.  I would include the type field.  If only to give yourself some flexibility for future expansion.  I would also consider a combined table since you have only two types and you seem to think that is extremely likely that there would ever be more than two types.  It sort of depends on how many unique fields you have for each subtype.  If you have the indicator in the supertype table, you can use it to choose which specific form to display.  Without the indicator, the user needs to specify.  This would come into play mostly in reporting/searching/analyzing.  It would be best to be able to combine those actions for the entity types rather than keep them separate.  Of course, when a client adds a record, the indicator would be required immediately so you would know which form to switch to.  So you just have to decide if you want/need to simplify the schema and deal with the differences in the interface or go with a purer schema and a completely independent interface.  But whichever you choose, I think the indicator even though some would consider it redundant, would simply certain processes.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39931749
I also think we could do without the snide, unhelpful remarks.
I could not agree more and I am glad the thread has got back on track.
0
 
LVL 32

Expert Comment

by:Stefan Hoffmann
ID: 39932483
Whether an indicator is necessary depends on the requirements. Thus the kind of cardinality 1:1 or 1:n (makes no sense in this case) and if you want to ensure data logic as DRI or with triggers. Using an indicator allows us to use DRI. Without you need triggers to ensure 1:1.

Just my 2¢.
0

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Join & Write a Comment

International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
Describes a method of obtaining an object variable to an already running instance of Microsoft Access so that it can be controlled via automation.
Viewers will learn how the fundamental information of how to create a table.
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now