niico


DeNormalise data - easy?!

I have an old database. It currently stores customer data (along with contact details) and, in a related table (on customersID), stores alternate contact information (name, phone, email). There are usually 0-3 of these alternate contacts for each customer, though sometimes there are more.

In my new system there is space directly in the customers table for three alternate contacts, which can be filled in or left blank as required. How can I transfer the old format into a single customers table that includes room for the alt contacts, in other words denormalise the data (in a way, anyway)? Bear in mind that current customers may have no alternate contacts or more than three (in which case the 4th, 5th, etc. are just discarded).


Thanks...
Lowfatspread

I could ask why denormalise...?

But

Insert into NewCustomer
  (column list)
Select column List
      ,A.name, A.phone
      ,A1.name, A1.phone
      ,A2.name, A2.phone
  From OldCustomer as O
  Left Outer Join AltContact as A
    on O.CustomerID = A.CustomerID
  Left Outer Join AltContact as A1
    on O.CustomerID = A1.CustomerID
   and A1.Name > A.Name
  Left Outer Join AltContact as A2
    on O.CustomerID = A2.CustomerID
   and A2.Name > A1.Name

should do it...
niico

ASKER

Thanks

I'm denormalising basically because the customers table is queried thousands of times a day. Pulling records from two tables will hit performance and complicate things a lot more than pulling from one, no?

OK, I'll give it a go!
Not necessarily.

It's a trade-off: you've just made the customer row much bigger than necessary and subject to more update activity,
which means less efficient I/O overall...

How often does the contact information need to be referenced when you reference the customer?

How was the contact information clustered?
A halfway house would be to hold an indicator on customer to say that contact information is actually present,
or to implement the customer / contact information as a view (like my select).
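One way the indicator and view ideas could be combined is a view that derives a has-contacts flag on the fly, so nothing extra has to be kept in sync on the customer row. This is only a sketch: it runs against an in-memory SQLite database from Python as a stand-in for the real server, and the names CustName, AltID, CustomerWithFlag and HasAltContact are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the real server; all table contents are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OldCustomer (CustomerID INTEGER PRIMARY KEY, CustName TEXT);
CREATE TABLE AltContact (AltID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         Name TEXT, Phone TEXT);
INSERT INTO OldCustomer VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO AltContact VALUES (10, 1, 'Ann', '111'), (11, 1, 'Bob', '112');

-- Hypothetical view: the flag is computed at query time, not stored.
CREATE VIEW CustomerWithFlag AS
SELECT O.CustomerID, O.CustName,
       CASE WHEN EXISTS (SELECT 1 FROM AltContact AS A
                          WHERE A.CustomerID = O.CustomerID)
            THEN 1 ELSE 0 END AS HasAltContact
  FROM OldCustomer AS O;
""")

flags = conn.execute(
    "SELECT CustomerID, CustName, HasAltContact "
    "FROM CustomerWithFlag ORDER BY CustomerID"
).fetchall()
print(flags)
```

A stored indicator column, as suggested above, would avoid the EXISTS lookup on each read at the cost of having to maintain it whenever contacts are added or removed.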

Do you perform name searches against this customer table?
  If you do, I'd expect performance to decrease markedly (depending on your server configuration).


Did you do any performance analysis before you considered this route?

   

ASKER

In this case it 'felt' like the right thing to do. Every time the customer record is accessed there will need to be a scan to at least check whether alternate contacts exist (your indicator is a good idea but adds even more complexity). Then, if they are added or updated, that adds a lot to the complexity of the update as well.

It speeds development and simplifies updates. Also, we've decided that three alternates is all that's necessary. If we go the normalised route there could be any number of alternates from 0 upwards (though we could artificially limit this), which requires additional interface complexity etc., and the same goes for updating.

Ultimately a select with a join will be slower than a select with no join (if the combined row size is the same), no?

Yes, as you say, there is a downside with a larger row, but overall it makes things a lot easier (performance hasn't been a problem; as you say, some analysis would be the absolute best route, but in this case it is probably not necessary).

Thanks for your views - I'll get back to you on the solution.

ASKER

It's working, but it's creating a new row for each alternate customer ID and putting all the IDs in the a1name column. Any ideas? Thanks...
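The extra rows come from the chained joins: every combination of contacts that satisfies the Name comparisons yields its own output row. One way around that, offered as a sketch and not as the accepted answer, is to number each customer's contacts first and then pivot positions 1 to 3 into columns with MAX(CASE ...) and a GROUP BY. The example runs on in-memory SQLite from Python; the AltID key, CustName and the Alt1Name..Alt3Phone column names are assumptions. The numbering uses a correlated COUNT(*) because ROW_NUMBER() is not available on SQL Server 2000, but the query has not been tested there.

```python
import sqlite3

# In-memory SQLite stands in for the real server; all data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OldCustomer (CustomerID INTEGER PRIMARY KEY, CustName TEXT);
CREATE TABLE AltContact (AltID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         Name TEXT, Phone TEXT);
INSERT INTO OldCustomer VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cog');
INSERT INTO AltContact VALUES
  (10, 1, 'Ann', '111'), (11, 1, 'Bob', '112'),
  (12, 1, 'Cy',  '113'), (13, 1, 'Di',  '114'),
  (14, 2, 'Eve', '211');
""")

PIVOT_SQL = """
SELECT O.CustomerID,
       MAX(CASE WHEN R.Pos = 1 THEN R.Name  END) AS Alt1Name,
       MAX(CASE WHEN R.Pos = 1 THEN R.Phone END) AS Alt1Phone,
       MAX(CASE WHEN R.Pos = 2 THEN R.Name  END) AS Alt2Name,
       MAX(CASE WHEN R.Pos = 2 THEN R.Phone END) AS Alt2Phone,
       MAX(CASE WHEN R.Pos = 3 THEN R.Name  END) AS Alt3Name,
       MAX(CASE WHEN R.Pos = 3 THEN R.Phone END) AS Alt3Phone
  FROM OldCustomer AS O
  LEFT OUTER JOIN (
        -- Pos = 1, 2, 3... within each customer, ordered by the AltID key
        SELECT A.CustomerID, A.Name, A.Phone,
               (SELECT COUNT(*) FROM AltContact AS B
                 WHERE B.CustomerID = A.CustomerID
                   AND B.AltID <= A.AltID) AS Pos
          FROM AltContact AS A
       ) AS R
    ON R.CustomerID = O.CustomerID
   AND R.Pos <= 3              -- the 4th, 5th, etc. are discarded
 GROUP BY O.CustomerID
 ORDER BY O.CustomerID
"""

rows = conn.execute(PIVOT_SQL).fetchall()
for row in rows:
    print(row)
```

Each customer comes back exactly once: customers with fewer than three contacts get NULLs in the unused slots, and a customer with four contacts keeps only the first three by AltID. The SELECT could then feed an INSERT INTO the new table with an explicit column list.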
ASKER CERTIFIED SOLUTION
Lowfatspread
This solution is only available to members.

ASKER

Hi - thanks again.

There is a key on the alt contacts table (altid, int) (if not, I could add one). Is the above code meant for a table without a key?

Would a cursor be easier for this, as it's code I only need to run once and then it's done?
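For comparison, the row-by-row pass that a cursor would make can be sketched as follows. This is not the code that was actually used in this thread; it is a Python loop over an in-memory SQLite database that walks the contacts in key order and fills the next free slot for each customer. The NewCustomer layout and the Alt1Name..Alt3Phone column names are assumptions.

```python
import sqlite3

# In-memory SQLite stands in for the real server; all data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE NewCustomer (CustomerID INTEGER PRIMARY KEY,
                          Alt1Name TEXT, Alt1Phone TEXT,
                          Alt2Name TEXT, Alt2Phone TEXT,
                          Alt3Name TEXT, Alt3Phone TEXT);
CREATE TABLE AltContact (AltID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         Name TEXT, Phone TEXT);
INSERT INTO NewCustomer (CustomerID) VALUES (1), (2);
INSERT INTO AltContact VALUES
  (10, 1, 'Ann', '111'), (11, 1, 'Bob', '112'),
  (12, 1, 'Cy',  '113'), (13, 1, 'Di',  '114'),
  (14, 2, 'Eve', '211');
""")

slots_used = {}  # CustomerID -> number of alt slots already filled

contacts = conn.execute(
    "SELECT CustomerID, Name, Phone FROM AltContact "
    "ORDER BY CustomerID, AltID"
).fetchall()

for customer_id, name, phone in contacts:
    slot = slots_used.get(customer_id, 0) + 1
    if slot > 3:
        continue  # the 4th, 5th, etc. contacts are discarded
    # slot is always 1, 2 or 3, so building the column names is safe here
    conn.execute(
        f"UPDATE NewCustomer SET Alt{slot}Name = ?, Alt{slot}Phone = ? "
        "WHERE CustomerID = ?",
        (name, phone, customer_id),
    )
    slots_used[customer_id] = slot

result = conn.execute(
    "SELECT * FROM NewCustomer ORDER BY CustomerID"
).fetchall()
print(result)
```

For a one-off migration the loop is easy to reason about; a set-based statement is usually the better choice if the job has to run repeatedly or over a large table.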

ASKER

Also, out of interest: the "(column list)" and "Select column List" parts of the SQL didn't work, and I had to remove them to execute the code. This is sql2k SQL, isn't it?

ASKER

Thanks for the input - you seemed to be on the right track. As it was a one-off job and I needed it done fast, I just used a cursor in the end and it worked. Thanks again, though.