We use referential integrity to validate data. But does a foreign key constraint help or hinder performance? The answer is both. A foreign key constraint can improve performance when reading data, but it slows down inserting, modifying, and deleting data.
When reading data, the optimizer can use foreign key constraints to create more efficient query plans, because the constraints are pre-declared rules about the data. This usually means skipping part of the query plan: the optimizer can see that, because of a foreign key constraint, that particular part of the plan does not need to be executed.
Let's take an example to understand the behavior with a foreign key constraint.
Create the following two tables...
create table Employee(EmployeeID int primary key);
create table EmployeeOrder(OrderID int primary key, EmployeeID int not null constraint fkOrderCust references Employee(EmployeeID));
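The select query that the next paragraphs refer to is not shown here. A query of the following shape matches the description (an EXISTS check in the WHERE clause against Employee). Treat it as an assumed reconstruction, not necessarily the exact statement that was run:

```sql
-- Assumed reconstruction: return the orders whose EmployeeID exists in Employee.
-- With a trusted foreign key on a NOT NULL column, the EXISTS check is redundant.
select OrderID
from EmployeeOrder
where exists (
    select *
    from Employee
    where Employee.EmployeeID = EmployeeOrder.EmployeeID
);
```

Run it in SQL Server Management Studio with the actual execution plan included to see which tables the optimizer touches.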
Notice that the optimizer did not access the Employee table, so it does not appear in the execution plan. The optimizer knows it does not need to execute the EXISTS operator in this query, because the foreign key constraint (a trusted constraint) requires every EmployeeOrder row to refer to an existing Employee, which is exactly what the WHERE clause checks.
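Since this shortcut depends on the constraint being trusted, it can be worth checking that first. A minimal sketch, assuming the constraint name fkOrderCust from the tables above and using the sys.foreign_keys catalog view:

```sql
-- Check whether the foreign key from the example is enabled and trusted.
-- is_disabled = 0 and is_not_trusted = 0 means the optimizer can rely on it.
select name, is_disabled, is_not_trusted
from sys.foreign_keys
where name = 'fkOrderCust';

-- If the constraint was re-enabled without validation it stays untrusted.
-- Re-validating the existing rows marks it trusted again.
alter table EmployeeOrder with check check constraint fkOrderCust;
```

The alter table statement scans the existing rows to validate them, so on a large table it is not free.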
Now drop the EmployeeOrder table by executing drop table EmployeeOrder, and recreate it with the following statement, this time without the foreign key constraint:
create table EmployeeOrder(OrderID int primary key, EmployeeID int not null);
Now execute the same select query again and look at the execution plan.
This time the optimizer executes the EXISTS operator, and the Employee table appears in the execution plan. With no foreign key constraint in place, SQL Server cannot be sure that all orders have valid employee references, so it has to execute the EXISTS operator.
In my test, the estimated subtree cost of the first query was 0.0376 and that of the second was 0.0443. Remember that these are empty tables. For a large table, this can make a huge difference in performance.
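One way to read these costs without the graphical plan is SET SHOWPLAN_ALL, which returns the estimated plan as rows and includes a TotalSubtreeCost column. This is only a sketch: the select statement is an assumed form of the query under discussion, and the numbers you get will depend on your server and version.

```sql
-- Return the estimated plan as rows instead of executing the statements.
-- SET SHOWPLAN_ALL must be the only statement in its batch, hence the GO lines.
set showplan_all on;
go
-- Assumed form of the query: orders whose EmployeeID exists in Employee.
select OrderID
from EmployeeOrder
where exists (select * from Employee where Employee.EmployeeID = EmployeeOrder.EmployeeID);
go
set showplan_all off;
go
```

The TotalSubtreeCost value on the first row of the output is the estimated cost of the whole statement. Run it once with the foreign key in place and once without to compare.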
But for any DML operation (insert, update, or delete), a foreign key constraint degrades performance, because SQL Server has to validate the data against the referenced column in the primary table. In exchange, the data is referentially correct, and queries that can rely on that integrity may save some time.
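The extra validation work is easy to see with a few statements. This sketch assumes the two tables exist as first created, with the fkOrderCust constraint in place. The ID values are made up for illustration:

```sql
-- Assumes Employee and EmployeeOrder exist with the fkOrderCust constraint.
insert into Employee(EmployeeID) values (1);

-- Succeeds: SQL Server looks up EmployeeID 1 in Employee before inserting.
insert into EmployeeOrder(OrderID, EmployeeID) values (100, 1);

-- Fails with error 547 (foreign key conflict): there is no Employee 2.
insert into EmployeeOrder(OrderID, EmployeeID) values (101, 2);

-- Also fails with error 547: Employee 1 is still referenced by order 100.
delete from Employee where EmployeeID = 1;
```

Every insert or update on EmployeeOrder costs a lookup in Employee, and every delete on Employee costs a check for referencing rows in EmployeeOrder.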
Comments (1)
An additional question about foreign keys: do we need to create an index on columns that have a foreign key constraint defined?