niico
asked on
DeNormalise data - easy?!
I have an old database. It currently stores customer data (along with contact details) in one table and, in a related table (joined on customersID), alternate contact information (name, phone, email). There are usually 0-3 of these alternate contacts for each customer, though sometimes there are more.
In my new system, the customers table itself has space for three alternate contacts, which can be filled in or left blank as required. How can I transfer the old format into a single customers table that includes room for the alt contacts, in other words denormalise the data (in a way, anyway)? Bear in mind that current customers may have no alternate contacts or more than three (in which case the 4th, 5th, etc. are just discarded).
Thanks...
ASKER
Thanks
I'm denormalising basically because the customers table is queried thousands of times a day. Pulling records from two tables will hit performance and complicate things a lot more than pulling from one, no?
OK, I'll give it a go!
Not necessarily.
It's a trade-off: you've just made the customer row much bigger than necessary and subject to more update activity, which means less efficient I/O overall.
How often does the contact information need to be referenced when you reference the customer?
How was the contact information clustered?
A halfway house would be to hold an indicator on the customer row to say that contact information is actually present, or to implement the customer / contact information as a view (like my select).
Do you perform name searches against this customer table? If you do, I'd expect performance to decrease markedly (depending on your server configuration).
Did you do any performance analysis before you considered this route?
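One possible reading of the "indicator" and "view" suggestions above, combined, is a view that computes the flag on demand rather than storing it. The sketch below uses Python's built-in sqlite3 module purely so it can be run anywhere; SQLite stands in for SQL Server here, and the table, view, and column names (Customer, CustomerWithAltFlag, HasAltContacts) are hypothetical, not taken from the thread.

```python
import sqlite3

# In-memory database with a hypothetical, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE AltContact (altid INTEGER PRIMARY KEY, CustomerID INTEGER,
                         name TEXT, phone TEXT);
INSERT INTO Customer VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO AltContact VALUES (10, 1, 'Ann', '111'), (11, 1, 'Bob', '222');

-- The view exposes a 0/1 indicator without widening the base table.
CREATE VIEW CustomerWithAltFlag AS
SELECT C.CustomerID, C.Name,
       CASE WHEN EXISTS (SELECT 1 FROM AltContact AS A
                         WHERE A.CustomerID = C.CustomerID)
            THEN 1 ELSE 0 END AS HasAltContacts
FROM Customer AS C;
""")

flags = conn.execute(
    "SELECT CustomerID, Name, HasAltContacts "
    "FROM CustomerWithAltFlag ORDER BY CustomerID"
).fetchall()
print(flags)
```

Whether a computed flag like this is cheaper than a stored one depends on indexing and workload, which is the kind of thing the performance analysis mentioned above would settle.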
ASKER
In this case it 'felt' like the right thing to do. Every time the customer record is accessed there would need to be a scan to at least check whether alternate contacts exist (your indicator is a good idea but adds even more complexity). Then, if contacts are added or updated, that adds a lot to the complexity of the update as well.
Denormalising speeds development and simplifies updates. Also, we've decided that three alternates is all that's necessary. If we go the normalised route there could be any number of alternates from 0 upwards (though we could artificially limit this), which requires additional interface complexity, and the same goes for updating.
Ultimately, a select with a join will be slower than a select with no join (if the combined row size is the same), no?
Yes, as you say, there is a downside to the larger row, but overall it makes things a lot easier. Performance hasn't been a problem; as you say, some analysis would be the absolute best route, but in this case it is probably not necessary.
Thanks for your views. I'll get back to you on the solution.
ASKER
It's working, but it's creating a new row for each alternate customer ID and putting all the IDs in the a1name column. Any ideas? Thanks...
ASKER CERTIFIED SOLUTION
ASKER
Hi, thanks again.
There is a key on the alt contacts table (altid, int); if there weren't, I could add one. Is the above code written for a table without a key?
Would a cursor be easier for this, as it's code I only need to run once and then it's done?
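For a run-once job, the cursor idea amounts to walking the alternate contacts in key order and filling the next free slot for each customer. The sketch below shows that logic as a plain Python loop over sqlite3, not as a T-SQL cursor. The a1name column comes from the thread; the other slot columns (a1phone through a3phone), the NewCustomer layout, and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical target table with three alternate-contact slots.
CREATE TABLE NewCustomer (CustomerID INTEGER PRIMARY KEY, Name TEXT,
    a1name TEXT, a1phone TEXT, a2name TEXT, a2phone TEXT,
    a3name TEXT, a3phone TEXT);
CREATE TABLE AltContact (altid INTEGER PRIMARY KEY, CustomerID INTEGER,
                         name TEXT, phone TEXT);
INSERT INTO NewCustomer (CustomerID, Name) VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO AltContact VALUES
    (10, 1, 'Ann', '111'), (11, 1, 'Bob', '222'),
    (12, 1, 'Cy', '333'),  (13, 1, 'Dee', '444');
""")

# Read all contacts first, ordered so each customer's rows arrive by altid.
contacts = conn.execute(
    "SELECT CustomerID, name, phone FROM AltContact "
    "ORDER BY CustomerID, altid"
).fetchall()

slots_used = {}  # CustomerID -> number of slots already filled
for customer_id, name, phone in contacts:
    slot = slots_used.get(customer_id, 0) + 1
    if slot > 3:
        continue  # 4th, 5th, ... alternates are discarded
    slots_used[customer_id] = slot
    # slot is always the integer 1, 2 or 3, so building the column name is safe.
    conn.execute(
        f"UPDATE NewCustomer SET a{slot}name = ?, a{slot}phone = ? "
        "WHERE CustomerID = ?",
        (name, phone, customer_id),
    )
conn.commit()

result = conn.execute(
    "SELECT CustomerID, a1name, a2name, a3name "
    "FROM NewCustomer ORDER BY CustomerID"
).fetchall()
print(result)
```

The row-by-row approach is slower than a set-based statement, but for a one-off migration that is rarely a concern.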
ASKER
Also, out of interest: the "(column list)" and "Select column List" parts of the SQL didn't work, and I had to remove them to execute the code. This is SQL Server 2000 (sql2k) SQL, isn't it?
ASKER
Thanks for the input; you seemed to be on the right track. As it was a one-off job and I needed it done fast, I just used a cursor in the end and it worked. Thanks again, though.
but
-- "(column list)" and "column List" are placeholders: substitute the real column names
insert into newcustomer
(column list)
Select column List
,A.name,A.phone
,A1.name,A1.phone
,A2.name,A2.phone
From OldCustomer as O
Left Outer Join AltContact as A
on O.CustomerID = A.CustomerID
-- each later alias only matches names that sort after the previous alias
Left Outer Join AltContact as A1
on O.CustomerID = A1.CustomerID
and A1.Name > A.Name
Left Outer Join AltContact as A2
on O.CustomerID = A2.CustomerID
and A2.Name > A1.Name
should do it...
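The join above does not pin A to the first contact, so a customer with several alternates comes back as several rows, which matches what the asker reported. A set-based alternative is to number each customer's contacts by altid and pivot the first three into columns. This is a sketch, not the hidden accepted solution. It uses a correlated COUNT(*) because SQL Server 2000 has no ROW_NUMBER(), and it runs against SQLite via Python's sqlite3 only to make it reproducible. The sample data and output column names other than a1name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OldCustomer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE AltContact (altid INTEGER PRIMARY KEY, CustomerID INTEGER,
                         name TEXT, phone TEXT);
INSERT INTO OldCustomer VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Core');
INSERT INTO AltContact VALUES
    (10, 1, 'Ann', '111'), (11, 1, 'Bob', '222'),
    (12, 1, 'Cy', '333'),  (13, 1, 'Dee', '444'),
    (14, 2, 'Eve', '555');
""")

PIVOT_SQL = """
SELECT O.CustomerID,
       MAX(CASE WHEN R.rn = 1 THEN R.name  END) AS a1name,
       MAX(CASE WHEN R.rn = 1 THEN R.phone END) AS a1phone,
       MAX(CASE WHEN R.rn = 2 THEN R.name  END) AS a2name,
       MAX(CASE WHEN R.rn = 2 THEN R.phone END) AS a2phone,
       MAX(CASE WHEN R.rn = 3 THEN R.name  END) AS a3name,
       MAX(CASE WHEN R.rn = 3 THEN R.phone END) AS a3phone
FROM OldCustomer AS O
LEFT OUTER JOIN (
    -- rn = position of this contact among the customer's contacts, by altid
    SELECT A.CustomerID, A.name, A.phone,
           (SELECT COUNT(*) FROM AltContact AS B
             WHERE B.CustomerID = A.CustomerID
               AND B.altid <= A.altid) AS rn
    FROM AltContact AS A
) AS R
  ON R.CustomerID = O.CustomerID AND R.rn <= 3  -- drop 4th, 5th, ...
GROUP BY O.CustomerID
ORDER BY O.CustomerID
"""

rows = conn.execute(PIVOT_SQL).fetchall()
for row in rows:
    print(row)
```

Each customer yields exactly one row: customers with no alternates get NULLs in every slot, and any contact ranked beyond third is ignored. The SELECT could feed an INSERT into the new customers table once the real column names are filled in.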