Solved

Database Design Ideas

Posted on 2014-03-13
18
296 Views
Last Modified: 2014-03-26
Consider the following tables

tblPerson(PersonId, GroupId, Forname, Initial, Surname, DOB, etc)

A Person belongs to a group (like employee to employer)

tblGroup(GroupId, GroupName)

In the past we have used the following table such that a Supplier is always a Group and Owner, CurrentKeeper and Installer are a People.

tblItem(SupplierId, OwnerId, CurrentKeeperId, InstallerId)

But now we are faced with the fact that Supplier, Owner, CurrentKeeper and Installer can be a Person or a Group!

Ideas please.
0
Comment
Question by:Dodsworth
  • 6
  • 4
  • 4
  • +2
18 Comments
 
LVL 34

Expert Comment

by:PatHartman
ID: 39926445
This is handled by creating an Entity table which is the parent of Person and Group.  Parent and Group become 1-1 with Entity.   The Entity table includes an indicator to specify whether the entity is a Person or a Group and it contains all common attributes.  Attributes unique to a Person or Group are stored in their respective tables.  

ALL relationships to other tables are made to the Entity table.  Parent and Group are only ever related to Entity.
0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39926641
So

EntityTable(Id, IsGroup, Name, Address, GroupId, PersonId)

Group(GroupId, VATNo, CompanyNo)

Person(PersonId, DOB, JobTitle)

?
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39927443
No.

EntityTable(EntityId, EntityTypeID, Name, Address)

Group(GroupId, EntityID, VATNo, CompanyNo)

Person(PersonId, EntityID, DOB, JobTitle)
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39927724
I think a "parent"/supertype of Entity with a "child"/"subtype" of Group (OR) Person could make sense here.  There might be better options, but it should be workable.

However, in that case I believe the cleanest design would be to include no detail columns in tblEntity, even columns (currently) present in both subtypes, such as Name and Address.  That's because validation rules could be different between subtypes, even for the same column name.  Besides, you might wish to name a common column something different for one subtype.

Therefore:

tblEntity ( EntityId, SubentityType, "SubentityId" ) --SubentityId will be Group (OR) Person id

tblGroup ( ..., EntityId ) --optional, but I'd "double-link" the tables myself to avoid excessive EntityId lookups
tblPerson ( ..., EntityId ) --optional, but I'd "double-link" the tables myself to avoid excessive EntityId lookups


[Btw, you really should get rid of the "tbl" prefix in the table names; that's an extremely dated practice.]
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39927845
In a relational database, the parent table never points to the child so SubIdentityID does not belong in tblEntity.  The child tables point to the parent.  tblEntity would only normally be one of the subtypes so you could technically get away with pointing in the wrong direction.  But, if you ever wanted to use the method in a situation where an entity could be multiple subtypes, you'd have to do it right.

The primary key of the child tables would be a long integer that is the foreign key to the parent table.  In this schema since there is no overlap, an entity is only a person or a group but never both, the set of keys in tblGroup would never duplicate any values in tblPerson.

Because the PK of the subtype is the Foreign key to the Entity, there is nothing that prevents you from joining directly to any other table.  For example, even though tblAddress would be related to tblEntity, it could be joined directly to either tblGroup or tblPerson if you were working with a query that needed only one of the entity types.  You could even join to either if you used a left join and some IIf() logic to pick out the common fields.  It is to avoid the IIf() logic that I put common columns in the entity table.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39927936
>> In a relational database, the parent table never points to the child so SubIdentityID does not belong in tblEntity. <<

In that design, Entity and Group/Person are not actually parent and child, they are supertype and subtype.


>> tblEntity would only normally be one of the subtypes so you could technically get away with pointing in the wrong direction. <<

In this design, a tblEntity entry must be one and only one of the subtypes.  If an entity can be multiple types, then a different structure, with a separate table, must be used.


>> The primary key of the child tables would be a long integer that is the foreign key to the parent table. <<

It would only need to a bigint if you could ever have more than 2 billion entries.


Probably the easiest design would have the entityId that is assigned be used as the GroupId / PersonId key value also.  But I wasn't sure that could be used in this case because in the legacy data the existing GroupId and PersonId values almost certainly overlap.


>> It is to avoid the IIf() logic that I put common columns in the entity table. <<

You do avoid it just for those common columns, but you also prevent doing proper database validation of those columns because you're forced them all to be in the same column.

For example, say one subtype only allows a name up to 30 bytes but another allows 50 ... now you've got complex rules to implement that instead of just setting the actual column length to the proper value.
0
 
LVL 33

Accepted Solution

by:
ste5an earned 500 total points
ID: 39928660
btw, I'd prefer the model to be as precise as possible. A group cannot be a person. So using DRI otherwise you need triggers:
CREATE TABLE Entities
    (
      EntityID INT NOT NULL ,
      EntityTypeCode NVARCHAR(255) NOT NULL ,
      CONSTRAINT PK_Entities 
		PRIMARY KEY ( EntityID ) ,
      CONSTRAINT UQ_Entities_Superkey 
		UNIQUE ( EntityID, EntityTypeCode ) ,
      CONSTRAINT CK_Entities_EntityTypeCode 
		CHECK ( EntityTypeCode IN ( N'Group', N'Person' ) )
    );

CREATE TABLE Groups
    (
      EntityID INT NOT NULL ,
      EntityTypeCode NVARCHAR(255) NOT NULL ,
      Payload NVARCHAR(255) NOT NULL ,
      CONSTRAINT PK_Groups 
		PRIMARY KEY ( EntityID ) ,
      CONSTRAINT FK_Groups_Entities 
		FOREIGN KEY ( EntityID, EntityTypeCode ) 
		REFERENCES Entities ( EntityID, EntityTypeCode ) ,
      CONSTRAINT CK_Groups_EntityTypeCode 
		CHECK ( EntityTypeCode = N'Group' )
    );

CREATE TABLE Persons
    (
      EntityID INT NOT NULL ,
      EntityTypeCode NVARCHAR(255) NOT NULL ,
      Payload NVARCHAR(255) NOT NULL ,
      CONSTRAINT PK_Persons 
		PRIMARY KEY ( EntityID ) ,
      CONSTRAINT FK_Persons_Entities 
		FOREIGN KEY ( EntityID, EntityTypeCode ) 
		REFERENCES Entities ( EntityID, EntityTypeCode ) ,
      CONSTRAINT CK_Persons_EntityTypeCode 
		CHECK ( EntityTypeCode = N'Person' )
    );

Open in new window

0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39929101
Headache.. I hate fonts with serifs !
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39929365
In that design, Entity and Group/Person are not actually parent and child, they are supertype and subtype
Semantics.  The supertype does not point to the subtype.
It would only need to a bigint if you could ever have more than 2 billion entries.
I said long int not big int.  ACE does not support bigint.  Long Integer is the data type of autonumbers.  And since this thread is in the Access forum, that would seem to be relevant.
For example, say one subtype only allows a name up to 30 bytes but another allows 50 ... now you've got complex rules to implement that instead of just setting the actual column length to the proper value.
If they don't have the same business rules, they are not common and so would not be in the common table.
0
Enterprise Mobility and BYOD For Dummies

Like “For Dummies” books, you can read this in whatever order you choose and learn about mobility and BYOD; and how to put a competitive mobile infrastructure in place. Developed for SMBs and large enterprises alike, you will find helpful use cases, planning, and implementation.

 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39929475
In that design, Entity and Group/Person are not actually parent and child, they are supertype and subtype
Semantics.

There is a huge difference between a parent/child and a "supertype 'is' subtype" relationship that goes well beyond semantics.  Sometimes it can help non-design people to call a "supertype is subtype" similar to a "parent  / child" relationship, because people inherently understand those, but they are vastly different.


If they don't have the same business rules, they are not common and so would not be in the common table.

There could be a new subtype tomorrow that does not
0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39929513
I found this interesting.  What I found even more interesting is that in the 2012 version of AdventureWorks, They've changed the implementation to use a separate key for the Person and Store.

Is there any need for a Type column ? If you have a corresponding record in the group table then you can be sure that it is a group ?

(@Scott I don't know why I added tbl prefixes ! I stopped doing that in code years ago.  There again.. I am an extremely dated practitioner ;)
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39930717
And since this thread is in the Access forum, that would seem to be relevant.
Not really, it could also mean the author made a mistake by including the Microsoft Access Database topic.   Is there an AdventureWorks 2012 for MS Access?
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39930830
You've come late to the party Anthony.  Perhaps you should read the original post.  I also think we could do without the snide, unhelpful remarks.  But, to answer your question just in case you are not being snotty (in which case I would apologize), no there is not an Access application that replicates the entire front end to the AdventureWorks database but the Adventureworks database itself is used in some examples of Access applications because it is fairly comprehensive and easy to download.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39930899
>> Is there any need for a Type column ? If you have a corresponding record in the group table then you can be sure that it is a group ? <<

I strongly believe so.  For integrity reasons, you want to force the type to be explicitly defined when the entity is created.  Also, if/as you add subtypes -- image 4, 5 or 6 instead of 2 -- you'll very much want to know the subtype immediately.

You might even prefer a design that has only a single table containing all Groups and People (and any other subtypes of the same supertype).  Columns that don't apply can be left null.  The big disadvantage of this, as noted above, is that custom editing for the different subtypes is somewhat harder.  But you avoid having to join to different tables.

For either design, the entity type needs to be explicitly stored.
0
 
LVL 1

Author Comment

by:Dodsworth
ID: 39930981
I included Access, as well as SQL Server, as a topic because I often use Access to prototype and thought that it's inclusion would broaden the scope of the replies.

I'm still not seeing why the entity type needs to be explicitly stored, if both Person and Group exist as entities in there own right with no common columns included in the Entity table.

If the design is flawed, then why would MS have changed AW from a design that included CustomerType to one that didn't ?
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 39931539
Many of Microsoft's samples are flawed.  This is especially true when you look at the samples they create for Access.  No professional would ever create such samples.  They are geared to the lowest common denominator.   When I asked them at an MVP conference, the answer was, they wanted the user to be able to easily substitute SharePoint Lists.  The discussion deteriorated from there.  The AdventureWorks sample is targeted to a higher caliber of developer though so I expect it to be more robust.  I have not examined the entire application so I don't know what difference this made behind the scenes.  You need to look at the front end too.

I have to agree with Scott.  I would include the type field.  If only to give yourself some flexibility for future expansion.  I would also consider a combined table since you have only two types and you seem to think that is extremely likely that there would ever be more than two types.  It sort of depends on how many unique fields you have for each subtype.  If you have the indicator in the supertype table, you can use it to choose which specific form to display.  Without the indicator, the user needs to specify.  This would come into play mostly in reporting/searching/analyzing.  It would be best to be able to combine those actions for the entity types rather than keep them separate.  Of course, when a client adds a record, the indicator would be required immediately so you would know which form to switch to.  So you just have to decide if you want/need to simplify the schema and deal with the differences in the interface or go with a purer schema and a completely independent interface.  But whichever you choose, I think the indicator even though some would consider it redundant, would simply certain processes.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39931749
I also think we could do without the snide, unhelpful remarks.
I could not agree more and I am glad the thread has got back on track.
0
 
LVL 33

Expert Comment

by:ste5an
ID: 39932483
Whether an indicator is necessary depends on the requirements. Thus the kind of cardinality 1:1 or 1:n (makes no sense in this case) and if you want to ensure data logic as DRI or with triggers. Using an indicator allows us to use DRI. Without you need triggers to ensure 1:1.

Just my 2¢.
0

Featured Post

Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Why is this different from all of the other step by step guides?  Because I make a living as a DBA and not as a writer and I lived through this experience. Defining the name: When I talk to people they say different names on this subject stuff l…
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…

863 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

23 Experts available now in Live!

Get 1:1 Help Now