Solved

Data tables - normalised?

Posted on 2004-08-19
4
281 Views
Last Modified: 2006-11-17
We have a database structure to design for NewsItems. They relate to Topics and to Countries. Any NewsItem can relate to nil, one or many countries; and to nil, one or many Topics. Later we need to be able to search for NewsItems as broadly as possible: by country, by topic, by combination of country and topic etc.

I thought we should set up a Country table, a Topic table and a NewsItem table, and then store a record in an Intersection table which had a CountryID, a TopicID and a NewsItemID. So if a NewsItem related to 2 countries and 3 topics there would be six records put into it.

Someone else suggested we do without the Country table, and just treat countries as a kind of topic. The Intersection table would then have just a TopicID and a NewsItemID. So if a NewsItem related to 2 countries and 3 topics there would be five records put into it. This seems to represent a saving.

I am sure there is something inherently wrong with this, but I am blowed if I could come up with it. I muttered some buzzwords like "data cube" and "normalised" so I think I got away with my reputation intact for the moment but I could use some ammunition later. Can anyone point out any obvious flaws in the second structure?

Obviously I am biased toward solutions that support my own thinking, but if someone shows why the second structure IS OK, I will nevertheless (hopefully gracefully) still award points for it, admit I was incorrect at work and ask for the official Office Dunce Cap to wear for the statutary week (we're a pretty enlightened office - we're only burnt at the stake if we make coding errors).

Discussion welcome.  

Martin C
0
Comment
Question by:MartinC
4 Comments
 
LVL 17

Accepted Solution

by:
BillAn1 earned 90 total points
Comment Utility
You really want to have two intersection tables, for your 3 entities NewsItems, Topics & Countries
one that relates NewsItems to Topics, and one that relates NewsItems to Countries
This is better than having one table with 6 records. The only reason for having a table with all 3 in one row is if you wanted to record the fact that certain topic/Country combinations were in the NewsItem, but no others.
e.g. (and I don't know what your data is really like, so excuse my examples) One news Item may have two topics - weather &  food, and be related to three countries - France, Italy and Spain.
Now if you needed to record the fact that weather was discussed for France & Italy only, and food was discussed for Italy and Spain only, then you need to go with your first model of a table with all 3. If however, the News item simply refers to weather in general, and Italy in general, and there is no specific link as to which topics went with which countires, then the correct method is to have two seperate tables.
Now, if the only thing you record about a country is the NAME, you could do without the table, and so have only tables
NewsItems , Topics, NewsItems_Topics and NewsItems_Country. The only drawback is that if you wanted to record any information about a country, e.g. national press contact number, or capital city or whatever, youcouldn't. Also, if a country were to change it's name (as does happen) you would have to update hundres of records, instead of 1.
So, the most correct, normalised view is 2 join tables, for 3 entities.
However, if you are designing for a data warehouse, then you would most probably use the single 'fact' table i.e. your first proposal with 6 records in a join. However, this should really be done as a second layer warehouseing databse, feeding from a more normalised operational one.


0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
I agree with BillAn1 in all he said!

Just to say that you might think about more complex situations (which bosses really LIKE to ask):
what if you want to search later news related to continents (ie a list of countries)

Store the Continent as country... --> Don't like that really (think about Australia)
Store the Continent as separate table + add a Continent_ID in the country... --> ok, how do you relate a news to the continent itself --> buzzzzz

Ok, here my full-blown complex design:

News Table
  News_ID
  News_Data

Entity Table
  Entity_ID
  Entity_Type (Country, Continent, Topic, Galaxy etc)

News_Relation Table
  Relation_ID
  News_ID
 
Relation_Entity Table
  Relation_ID
  Entity_ID

Ok, how to read the News_Relation and the Relation_Entity tables:
 Every News can have 0 or more News_Relation
 and every of those can relate to 0 (1) or more Entities:

Say you have news 1 that should relate to Europe (Entity_ID = 5)
News_Relation:  Relation_ID = 1, News_ID = 1
Relation_Entity: Relation_ID = 1, Entity_ID = 5

Say you have news 2 that relate to Spain (entity = 7) and Portugal (entity = 8), and on topic Food (entity = 10), which means to store this:
News_Relation: Relation_ID = 2, News_ID = 2
News_Relation: Relation_ID = 3, News_ID = 2
Relation_Entity: Relation_ID = 2, Entity_ID = 7
Relation_Entity: Relation_ID = 2, Entity_ID = 10
Relation_Entity: Relation_ID = 3, Entity_ID = 7
Relation_Entity: Relation_ID = 3, Entity_ID = 10


For your sample (So if a NewsItem related to 2 countries and 3 topics, which no condition between the topics and the countries):
News = 3
News_Relation: relation_id = 4, News_ID = 3
Relation_Entity_ relation_id = 4, Entity_ID = C1
Relation_Entity_ relation_id = 4, Entity_ID = C2
Relation_Entity_ relation_id = 4, Entity_ID = T1
Relation_Entity_ relation_id = 4, Entity_ID = T2
Relation_Entity_ relation_id = 4, Entity_ID = T3

To find all the related news for a given entity:
select n.*
from news n
join News_Relation r
on r.news_id = r.news_id
join Relation_Entity e
on e.relation_id = r.relation_id
where e.entity = <your entity id>

To find news to related to 2 entities:
select n.*
from news n
join News_Relation r
on r.news_id = r.news_id
where r.relation_id in
     ( select e.relation_id
       from Relation_Entity e
       where e.relation_id = r.relation_id
       and e.entity_id in ( <entity_1>, <entity_2> )
       group by e.relation_id
       having count(*) = 2
     )

etc

Cheers



0
 
LVL 4

Expert Comment

by:nmwis70
Comment Utility
Keeping country and topic entities seperate also allows for more flexibilty later on.  You may end up having content other than news items that also needs to relate to a specific country.  It may not make sense to call the destination of that relation a topic.
0
 

Author Comment

by:MartinC
Comment Utility
OK, thanks for all the discussion. Looks like the consensus is that the gains of separating out the Country table are small and therefore not worth the bother. Thanks to all who contributed: points to BillAn1 who got there fustest with the mostest.
0

Featured Post

What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

Join & Write a Comment

Suggested Solutions

Entity Framework is a powerful tool to help you interact with the DataBase but still doesn't help much when we have a Stored Procedure that returns more than one resultset. The solution takes some of out-of-the-box thinking; read on!
I annotated my article on ransomware somewhat extensively, but I keep adding new references and wanted to put a link to the reference library.  Despite all the reference tools I have on hand, it was not easy to find a way to do this easily. I finall…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

6 Experts available now in Live!

Get 1:1 Help Now