Database Design

I have a very unusual situation and wanted to get some thoughts from people on how they might handle this. Normally when you build a product database you have a table of items and in that table you might have fields that have the following information:

Name
Manufacturer (int)
Size
Weight
Color

And then you might create another table with manufacturers and then use the PK_ID field to fill the manufacturers field in the table above.

However, what if multiple manufacturers make an item. I have a situation where a table consists of about 5.9 million rows. The first table has all the basic fields in it like Name, Size, Weight, Color and that is consistant across all the items regardless of manufacturer. However, there can be a number of manufacturers and they all have unique data of their own that must be displayed in a single result set. So a release date might be a field, since maybe the manufacturers released the items at different times. So far I have had to have two 5.9 million row tables join each other by a PK and a FK in order to have multiple manufacturers for given item, but this is incredibly slow.

Anybody ever encounter this issue before? Any thoughts on how this should be setup.
tarrigoAsked:
Who is Participating?

[Webinar] Streamline your web hosting managementRegister Today

x
 
dis1931Connect With a Mentor Commented:
Hello tarrigo,

I'm no db expert but my thoughts are you would have three tables..

Table 1 -  Manufacturer ID, Manufacturer Name
Table 2 - Manufacturer ID, Item ID
Table 3 - Item ID, Item Name, Size, Weight, Color

Say you have the same item let's say item 23 that is made by three manufacturers you won't duplicate any rows....The item exists once and each manufacturer exists once.  Then Table 2 links those tables to be useful.  Your queries, reports, etc will have to take this into account or alternately you could use a view to abstract this from anyone outside of the dba.

Thanks,

dis1931
0
 
chilternPCCommented:
sounds like you are trying to design a 'one size fits all' database for all your manufacturers.
maybe each manufacturer has its own database(table) design since they all seem to have different criteria and variables.
0
 
dqmqConnect With a Mentor Commented:
Seems to me you have two issues.  One easy, one more challenging.

The original tables you described represent a 1:many relationship between manufacturer and item--a manufacture produces many items, but an item only has one manufacturer.   But what you want is a many:many--a manufacturer produces many items and and item is produced by many manufacturers.  To represent that, use an "intersection" or "associative" table (two names for the same thing).  It looks like this:

Manufacturer
  ManufacturerID
  ManufacturerName
 
Item
  ItemID
  ItemName
  Weight
  Color

ManufacturerItem
  ItemID
  ManufacturerID

That design is standard practice and very common. It's also the easy answer. Hard answer to follow.


0
Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

 
tarrigoAuthor Commented:
Yeah, I am in a way. You should know that the table has nothing to do with color and sizes and so on, just creating a shortened example. I am just a bit perplexed by it because I have never come across an item in the real world like a car that is made by multiple manufacturers. Usually it is just one manufacturer and so you could easily ask through a query to see all cars that match a given manufacturer. However the result that needs to come back would show all the size, name, and then all the specifics about the manufacturer.

So you might see a result like.

Name                                 Color            Weight                    Release Date
Chevrolet Sierra                Blue             10,000 lbs.              March 2008
Chevrolet Tahoe                Blue             10,000 lbs.              December 2007
0
 
xplinscottConnect With a Mentor Commented:
From what it sounds like, you have a Product table where some products only differ by their manufacturer id? Well in that case extract the manufacturer id from the table and create a third table called ProductJoinManufacturer consisting of a unique index on ProductID and ManufacturerID. If this is truly the case, this should cut down the size of your intial Product table seeming that it would be more normalized.
0
 
tarrigoAuthor Commented:
In response to dgmg. A linking table was my first choice, but the amount of items being 5.9+ plus is makeing it too slow. When I bring the manufacturer data into the main table, it is very fast because there is no join. I am considering denormalizing for performance.
0
 
dqmqCommented:
The harder part of your question is this.  What if you want to store different information about some items than others?  And what if the information varies by manufacturer?  Those are interesting and challenging design problems.

Your example about release date is a case where different manufacturers have a different value for the same item.  That's easy--you put that field in the association table.

But what if some items have fields that others do not.  For example, a cell phone has battery life, but a land phone does not.  There are several ways to handle it:

1.  Generalization hierarchy
2.  Nullable fields (some people avoid at all costs)
3.  User defined fields (also adamantly avoided in some circles)
4.  More granular entity definitions








0
 
dqmqCommented:
>In response to dgmg. A linking table was my first choice, but the amount of items being 5.9+ plus is makeing it too slow. When I bring the manufacturer data into the main table, it is very fast because there is no join. I am considering denormalizing for performance.

I seriously doubt you need to denormalize that for performance. There are better ways to address the performance issue.  With good indexing, decent SQL, and possibly partitioning, the joins should perform quite adequately.

In fact, I rather suspect that kind of denormalization may hinder performance for some kinds of access--certainly for updates of product information where you then have to propagate changes to multiple rows.  Consider that you will be repeating all the common product information for each manufacturer that produces the product.

 
0
All Courses

From novice to tech pro — start learning today.