In a survey database, consider two tables, Respondents (rt) (one record for everyone who takes a given survey) and Responses (rs) (one record for each response to each item on any survey). We have about 500K records in rt and 35000K records in rs. Each rt record is related to about 70 rs records. (I.e. a typical survey has 70 items). The database as a whole is keeping track of about two dozen different unique surveys.
The problem comes when fields are added to rs to make sql query times acceptable. For example:
Obviously one field in rs is rtKey, a FK pointing to the PK in some rs record. And one field in rt is suKey, a FK pointing to the survey itself in some third table that defines survey parameters. Another field in rs is rsResponse, a tinyint representing the response on an item, and another is rsItemnr, the number of the survey item for that record (or possibly a pointer to an item bank, but for simplicity let's just say the item number itself.)
The problem: it turns out that if you want to do a simple query such as find the decimal mean score to item 1 on survey 1 across all respondents, you have something like:
onse)) from rs, rt WHERE rs.rtKey = rt.rtKey AND rt.suKey = 1 AND rs.rsItemnr=1
The execution time with a properly indexed SQLServer 2005 database is about 10 seconds over the subset of the 35 million records! And over 90% of that time is resolving the inner join, whose sole purpose is to determine which survey to select. So it is very tempting to denormalize the database and add an extra field to rs containing the survey number. This eliminates the inner join and reduces the execution time to under 1 second.
Thus the question, is it possible to enforce referential integrity, whereby a record in rs containing the survey number and also pointing to a record in rt that also contains the survey number is assured to be pointing to the same survey number? This is not a unique constraint, and doesn't appear to be a foreign key constraint. Is there a way to do this? (The new field would be rs.suKey).
SELECT AVG(CONVERT(decimal,rsResponse)) from rs, rt WHERE rs.rtKey = rt.rtKey AND rt.suKey = 1 AND rs.rsItemnr=1
SELECT AVG(CONVERT(decimal,rsResponse)) from rs WHERE rs.suKey = 1 AND rs.rsItemnr=1