Link to home
Start Free TrialLog in
Avatar of Robert Berke
Robert BerkeFlag for United States of America

asked on

referential integrity makes my query run 30 times longer.

--------- summary of problem ----------------

I have a very simple snapshot query which was taking 30 seconds to run. I made it run much faster by dropping referential integrity.
But, that was an ugly solution because I usually prefer referential integrity rules.

For instance, in this application I was using referential integrity to insure that every sales order had a valid Region.  Dropping referential integrity was fairly safe because users choose the Region from a drop down list, but I don't like losing the additional protection of a referential integrity database rule.

I will live with my ugly solution, but I wonder if anybody has some comments about this performance issue?  

Of course, the standard anwer will probably be "switch to SQL Server", but that it is going to have to wait for a long time.


--------- details of problem  ----------------


RegionTable a record for each of our 4 company regions, North, South, East and West.   OrderTable has a records for every order with about 100,000 records.


The original query was very complicated, but the following simplification still takes 30 seconds to run.

    SELECT OrderTable.CustomerNumber
    FROM RegionTable INNER JOIN OrderTable ON RegionTable.region = OrderTable.[region]
    WHERE (((OrderTable.CustomerNumber)=222222) AND ((OrderTable.[Product Year])=2009) AND ((OrderTable.DateRun)>#1/5/2010#));

The two tables are in a Relationship with Join Type of 1 and "Enforce Referential Integrity" turned on.

When I turn off Referential Intergity,  the query runs in 1 second.

The purpose of the relationship was to insure that every OrderTable record had a legitimate Region.   I can understand why that relationship might make and append query run more slowly, but it does not make sense that a Snapshot query would run more slowly.


Relationships between Table1 and table2 are displayed follows: (the OO is my shorthand for the 'infinity' symbol sideways 8.)

RegionTable                OrderTable
Region  (1) ------(OO) Region    
TimeZone                  order number
                                DateRun
                                total sale
                                CustomerNumber
Avatar of Rey Obrero (Capricorn1)
Rey Obrero (Capricorn1)
Flag of United States of America image

see if this make any difference


SELECT OrderTable.CustomerNumber
    FROM OrderTable INNER JOIN RegionTable ON RegionTable.region = OrderTable.[region]
    WHERE (((OrderTable.CustomerNumber)=222222) AND ((OrderTable.[Product Year])=2009) AND ((OrderTable.DateRun)>#1/5/2010#));

also, have you created any index in your table?
Avatar of Robert Berke

ASKER

Swapping the order of the tables did not improve performance. (I would have been shocked if it did)

RegionTable has Region as primary key.  OrderTable has OrderNumber-DateRun as primary key   and Region as secondary key.
Another interesting point --  if the database is moved from the server to a local drive, everything runs very fast.
 
"if the database is moved from the server to a local drive, everything runs very fast"

Then this would be network-related. Check hardware connections, uplug/replug cables, check NICs error logs, etc etc.
ask  your IT to monitor the network connection.
I am the IT manager, and the programmer asked me.  Both of us are stumped.

The problem occurs on all clients (6 computers), so it is not a port issue.  
Removing the referential integrity solves the problem on all computers.

No events on client computer in any of the logs.

On server there a couple of "Success Audit" events but those are information only.

The problem occurs even if client is plugged directly into the server, so it is not a switch issue.

A compact and repair does not help.

All other Access applications run fine.  

Server is 6 months old and was purchased with plenty of power for future needs.  Cpu utilization typically runs below 3%.






SOLUTION
Avatar of Rey Obrero (Capricorn1)
Rey Obrero (Capricorn1)
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
The problem originated in a system with backend database on server and front end database on the clients with dozens of tables/queries/forms/reports/modules.  

Over the last week, I simplified and isolated the problem by combining the back end with front end, then stripped it down. I now have a single database with no links and only 3 components:  2 tables and 1 query.  Since I suspected database corruption, I then recreated that database from scratch by copying the structure, then running an append query to create the data.

This is the simplest system that demonstrates the problem.

I then followed your instructions which essentially asked me return to a backend/frontend model.
When the frontend is on local machine an backend is on server,  things still run slow.
When both data bases are on the local machine things run fast.
   
what os is running on the server?
sbs 2003 sp2
<When the frontend is on local machine an backend is on server,  things still run slow.
When both data bases are on the local machine things run fast. >

This will always be the case. Access is a "file server" type of database, which basically means it's just a text file. All processing is done by the frontend program, so when the FE requests a query Access must pull all the tables across the wire, perform whatever is needed, and then present the data. Obviously if the tables are on the same disc as the FE that would be much quicker since disc read/writes are much faster than processes over the wire.

Compare this to a server-type database, which includes a query parser. Progarms request information and the server processes that info returns a formed dataset to the caller. Access/Jet/ACE don't do this. The only exception to this is if your query includes all indexed fields. In that case, the FE can use only the index tables (which are smaller) and as such performance can be better.

In an Access setup the power of the workstation is more important than the power of the server.

Here's a webpage that may help. The most relevant would be the SMB issue, in my opinion (turn OFF Small Message Blocking):
http://www.granite.ab.ca/access/performancefaq.htm

Also, as Cap said earlier, indexing is very important.
LSMConsulting


The only one I could not try is the SMB issue, because it requires a reboot. I will try that tomorrow evening.

One of the tips mentioned opening a related recordset at login and leaving it open. That is very similar to a "solution" which I came up with in https://www.experts-exchange.com/questions/25860761/When-query-is-first-opened-it-takes-30-seconds-Afterwards-it-is-very-fast.html   It sort of works, the 30 second query happens immediately at login, then goes away for the rest of the session.  But it is almost as ugly as dropping the relationship.  

No luck on all the others. I went though the whole granite.ab.ca list (except, since my test database only has 1 query and two tables, I skipped everthing related to inserts/deletes/development gui/ forms/reports/shared access and other unrelated issue.)
Most of the remaining tips had been implemented long ago (for instance I turned off name tracking about 7 years ago, and periodically review it to make sure it does not get reset accidentally.)  

"I can understand why that relationship might make and append query run more slowly,"
Why? On the contrary.  If they are set up correctly, it should run faster ... which is one of the benefits of having relationships with RI enforced.  Something else is going on.

What happens if you make this a dynaset query?

Is Region the Primary Key of the Region table ?

mx

ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
mx:
query was originally dynaset and was also slow.  In fact, I only changed it to snapshot in an attempt to make it go faster.
RegionTable has Region as primary key.  OrderTable has OrderNumber-DateRun as primary key   and Region as secondary key.

Also, you ask why I thought a relationship might make an append query run more slowly.
Of course, I was assuming the tables were already indexed properly.
Without a relationship, an append into OrderTable is very straightforward. MS Access just appends the records to the end of the database then updates the OrderTable indexes.
With a relationship with referential integrity, appends to OrderTable must first be checked against the related tables which requires extra I/Os and should therefore go slower.
 
"I don't necessarily agree with mx re: "one of the benefits of having relationships with RI enforced". Indexing tends to make data inserts slower and data reads faster, as a general rule, so there is some balance that must go on"

"MS Access just appends the records to the end of the database then updates the OrderTable indexes."

But ... the SQL you posted at the top is a SELECT query, so no indexes are being updated.  Am I missing something?

mx
mx:  The data is slightly proprietary. Can I email it to you?
Sorry, the violates EE rules.  You can only upload here so any expert can participate ...

Did you see my question here @ http:#a31289767 ?

mx
Mx:  No, you are not missing something.  The slow query is Select and Snapshot  so it should not be affected by RI.  Nonetheless, RI still makes things go slower.  That is the whole point of this post  !!!!   Why the heck is RI hurting things ?


LSM:  I agree that too many indexes can clearly hurt update performance.  

The OrderTable in this particular database started off with 8 secondary indexes.  But, I eventually cut that number by 60%, and the problem still continued.


Also, I have seen cases where too many indexes even hurt INQUIRY performance.  I was never quite sure why, but dropping a few indexes actually helped. One theory was that MS Access accidentally used the "wrong" index and ended up taking longer to get at the data. But, I never proved it.
Removing proprietary data will take me a bit of work.  If I put in the effort do you have an SBS 2003 server network to test it on?  Or at least a Server 2003 network?
But you keep talking about 'update performance', which is not the issue here.

There is no argument that indexes are a double edge sword. I just see how that has anything to do with the issue here?

Well, if you can't upload ... don't really know what else to suggest.  I still say something else is going on.

mx
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Removed OrderTable's Pre-existing index and compacted.  Did not help.    The RegionTable's only had one index which was the primary index, so I did not remove that.

I have trimmed the database down so that a zip file is only 1 meg.  I will remove proprietary data and post the results soon.
 
The update performance is a tangent which is not productive. I only mentioned it briefly in my first post, and all subsequent mentions have simply been my responses to questions by the devoted followers of this thread.
Remember to post the mdb that is represents the original issue ... with whatever indexes, etc ... in other words not the one that you just 'Removed OrderTable's Pre-existing index and compacted.'  

So that we can compare avocados to avocados.

mx
Yeah, I a pretty well tuned in as to what people would like.

But, the bad news is that I am running out of time.

I did some mass changes replacing customer name address etc with all Xxxxs.  I changed too much.  Now the query is consistently slow with or without RI.

For what its worth, the production system is working better now that RI has been turned off (just between those two tables, another dozen RI relations remain unaffected.

I may play with this some more next week.

I will still try dropping the SMB and rebooting, and hope to do that tonight.

rberke
" I changed too much.  Now the query is consistently slow with or without RI."
Well ... that's ok ... if you still want to upload.  However, I will have to add a load of records the mess with.

mx
No--the record count has not changed, just the content of the records.  Something very weird is happening and it will have to wait until next week.

By the way, I made those SMB that channges suggested by LSM's link and rebooted.  They did not seem to help.  I returned them to their previous values.

I am closing this problem.  my "solution" was to remove referential integrity from a relation,
AND to put a dummy query into my database open routine.  

I wish I had more time to document details, but I have wasted too much time on this crud already.

For anybody who wants to see some of my false starts, here is a summary of the related problems.

When query is first opened it takes 30 seconds. Afterwards it is very fast.
   https://www.experts-exchange.com/questions/25860761/When-query-is-first-opened-it-takes-30-seconds-Afterwards-it-is-very-fast.html

How to reoganize all tables sequencing them by primary key?
    https://www.experts-exchange.com/questions/25877487/How-to-reoganize-all-tables-sequencing-them-by-primary-key.html

referential integrity makes my query run 30 times longer.
    https://www.experts-exchange.com/questions/25959786/referential-integrity-makes-my-query-run-30-times-longer.html
I feel bad abandoning this after having taken so much time for the contributing experts.

Unfortuantely I can't justify spending more time when my workarounds have eliminated the problem.

Thanks for everybody's help

"Author Comments:
I feel bad abandoning this after having taken so much time for the contributing experts.

Unfortuantely I can't justify spending more time when my workarounds have eliminated the problem.

Thanks for everybody's help"

You are welcome.

mx
Minor change:

The login form still stays open and invisible throughout the entire session.

But, originally it made things run fast by having the open event issue a Docmd.openquery "myQuery", acviewnormal, acreadonly.

But, since the open query was visible, it  would occasional appear to a user when other forms got closed.

To avoid the query showing, we changed the open event to issue a set rs = db.openrecordset("myQuery",dbopendynaset)  instead of the openquery.

The new code works just as well as the first solution. I tested it with both referential integrity both on and off, and it improves performance in both situations.