Link to home
Start Free TrialLog in
Avatar of Robert Berke
Robert BerkeFlag for United States of America

asked on

How to reoganize all tables sequencing them by primary key?

Are there any utilities that will do this automatically reorganize/reorg an ms access database.

In the good old days of Mainframes and IBM, the word "reorg" meant compact, repair, and resequence by primary key.

Of course, in the modern world we all have computers that are so fast that no one bothers to think about things like that.

Except when an applications starts to be very slow.

Compact and repair does NOT seem to resequence the tables.

To do it manaully, I copy Table1's structure to Table1New, then I write an append query with ORDER BY with the primary key listed.
After that runs, I disconnect Table1 from all the relationships, delete Table1, and rename Table1New back to Table1.

Of course, that is really impractical for a database with a bunch of tables.  Plus, heaven help me if the relationships include referential integrity.

Of course, I think the answer will be "Switch to SQL Server" which we may do. Wo, does SQL server have such a function?
ASKER CERTIFIED SOLUTION
Avatar of DatabaseMX (Joe Anderson - Former Microsoft Access MVP)
DatabaseMX (Joe Anderson - Former Microsoft Access MVP)
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi,

as you already said: In modern computers a database like Access should be fast enough to handle all necessary requests fast enough. I would say, if you have performance problems then sorting the table physically will not really boost the performance. You should analyze the database if there can be done something in the design to improve performance. In most cases it can.

In SQL Server (which handles a lot more data and users than Access) normally a primary key column is saved as clustered index which means that this is the physical order of the table - for this reason a clustered index should have values like automatically saved with an autoincrement feature (called "identity column" in SQL Server). But it is possible to use other column types as clustered index (of course only one clustered index per table) so SQL Server reorganizes itself everytime a record is saved.
I'm not sure if Access does the same with it's primary keys but as this is a desktop database and not meant as a database server (although you can use it with multiple users) I would guess it will not. Yes, you can change the order of a table by opening a table and save the order, but I don't believe that the physical order will be changed, only an attribute is saved in the system tables (like the width of the columns etc.).

For some informations about clustered indexes in SQL Server you can read here:

http://www.sql-server-performance.com/articles/per/index_data_structures_p1.aspx

Cheers,

Christian
This little bit of code will reset all the ordering of all tables to the primary key:

Public Sub ResetOrder()
  Dim db as Database, td as table def
  For each td in db.TableDefs
    td.Properties("OrderByOn")=False
  Next td
  db.TableDefs.Refresh
End Sub
gRay ... tables don't have an inherent OrderByOn property only OrderBy.

mx
mx:  from A2003

set db = currentdb
for each prop in db.TableDefs("customers").Properties:? prop.name:next prop
Name
Updatable
DateCreated
LastUpdated
Connect
Attributes
SourceTableName
RecordCount
ValidationRule
ValidationText
ConflictTable
ReplicaFilter
Description
FilterOn
OrderOn
DatasheetGridlinesBehavior
OrderByOn   '<---  what's this??
Orientation
NameMap
GUID
DefaultView
OrderBy
Humm ... doesn't show on property sheet.

mx
Well then ... seems you would want to do:

td.Properties("OrderBy")= ""
td.Properties("OrderByOn")=False

mx
No, its worse than that.  If a table has never been 'sorted', both fields do not exist.  Only after a sort has been made are the properties created.  Once created, the property tblName.fldName  cannot have a null or "".  Ideas?
both fields -> both properties   -   sorry
Well, you can create the properties ahead of time with that standard create property procedure.  Then you code checks each time to see if prop exists, if not ... create it (just happens once).

Yeah ... it's the old bit that you SEE IT on the prop sheet (OrderBy) but ... it don't exist yet, lol.  crazy!

mx
Does one need to reorder tables? Are tables used in queries in order of creation? Any performance benifits from doing so?
I have to agree that there is no real purpose of worrying about the sort order of a raw table.

mx
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Robert Berke

ASKER

The table's orderBy and orderByOn CANNOT be related to the physical order of the database.  Since a Back End database can be shared, two different users can ask for different sort sequences at the SAME TIME.  Since the the Back End database only has one physical table, both users cannot be satisfied.  So, to resolve the situation,  Access creates gives each viewer an invisible query that forces the data to display in the correct order. Each users gets a different invisible query.  Also, since each user has his or her own Front End database, each can "save" the table's view, and see their preferred view the next time they open it.  The OrderBy property gets stored in the Front End Database, not the Back End database

And, even if you ignore the Front End database, and open the Back End database directly Access still uses an invisible query. The only way I know of altering the physical sequence is the method I outlined in the initial post.

So, the talk about dropping the orderBy clause was interesting but not useful for my purposes.

Also, Bitsqueezer is absolutely right that clustered indexes exist for a reason. I used to be the database manager on a large company's IBM mainframe, and I can definitively say that YES the physical sequence of data sometimes dramatically affects performance.   That was 15 years ago and more modern database systems now minimize the physical sequence impact, but it still matters.

In my particular Access situation for one specific query, a properly sequenced table performs nearly 30 times faster than the unsequenced data.    Other queries are nearly unaffected.
When I first posted, I thought my original question was very clear, but I now realize I should have included an important word. It should have said "How to PHYSICALLY reoganize all tables sequencing them by primary key?"

I believe the answer is "this cannot be done automatically without a major amount of effer.  Even with Vba, there are many steps that would pose huge challanges."



SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi mx,

"There is absolutely no logical or good reason to have a back end physical table pre-sorted in some manner."

I thought I already answered this above. If there would be no logical readon could you tell us why SQL Server has a clustered server (and I'm sure this is not only the case in SQL Server)? Your own knowledgebase article quote stated that even Access sorts the table physically after "compact & repair" (I didn't knew that but it makes sense as Access always creates a temporary copy during compact & repair).

See here for further informations:
http://msdn.microsoft.com/en-us/library/aa933131(SQL.80).aspx

and to quote from there:
"Clustered indexes are also efficient for finding a specific row when the indexed value is unique. For example, the fastest way to find a particular employee using the unique employee ID column emp_id is to create a clustered index or PRIMARY KEY constraint on the emp_id column."

Cheers,

Christian
"Access sorts the table physically after "compact & repair" "
That's just what C&R happens to do.  Maybe I should clarify. I mean that there is no reason to have it sorted by other than the native Primary key sequence, and this is basically what is implied in my first post re Primary Keys.

mx
"users in a front end should *never* be directly exposed to a table in the back end."

I agree completely and should have used the word "developer" instead of the word "user".  

But, I have done some more testing and now agree that a compact and repair DOES physically resequence by primary key.  I am still confused about a few things and will post more shortly.

"I agree completely and should have used the word "developer" instead of the word "user".  "
Even less of an issue then.

mx
"Each developer gets a different invisible query"
Mx asked: "What are you talking about ?"

My answer is attached.
The 3 databases all have different views of Table1 and none of the views reflect the actual physical structure which is entry sequence. (To see table1 in its real physical sequence, open BackEnd then design table then delete index then close and reopen table.)

The only way I can explain this behavior is with an "invisible query".  Perhaps my terminology is wrong, or maybe mx can give me alternative explanation.  


My-Documents.zip
Before you delete the index, please note that BackEnd seems to show Table1's seqence as Key1-Ascending.
A compact and repair leaves that apparant sequence untouched.
But that is because the Index is NOT a primary index.  Compact and Repair only does a physical resequence if there is a primary index.

By the way, I still agree with Bitsqueezer the physical sequence can give a great performance boost.  But in my original question I thought  Compact and Repair could not change the physical sequence. Since I now know that I was wrong, I think we can stop talking about SQL Server.  SQL Server may do a far better job of utilizing clusters, but MS Access at least has the ability to put data records into a desired sequence.

Now, armed with this new knowledge, I will take a closer look at the offending database and try to figure out why I got a 30 fold improvement after the "Manual Reorg" that I described in my initial post.  I had assumed the reason was due to physical sequence of data, but I now think it must be something more subtle.  
This post is just in case people still don't see the point I am making about "invisible queries".

We have a programmer here that thought he could reorganize a database's physical sequence by opening the table in datasheet view, choosing a field, the sorting the field on ascending key, then saving the table.  Sometimes he followed those steps with a compact and repair.

He was disappointed that putting the table in that "better" sequence didn't improve performance, and came to the conclusion that physical sequence does not affect performance.

But, his attempts to resequence the database did NOT change the physical sequence of the database even a little bit, it only created my so-called "invisible queries".  So his conclusion was unwarranted.
"By the way, I still agree with Bitsqueezer the physical sequence can give a great performance boost. "

IF ... it's by Primary Key (or some other unique index).

mx
The performance will be greatest when the application hits the data in the same sequence as the primary key.

"Some other unique index" will only improve if the other index shares many of the same groups as the primary key.

For instance if primary key is a composite of State, City     Then an application process things using unique secondary key of customer number will probably not improve.
But, if customer numbers were assigned starting with number 1 in Anchorage Alaska, and ending with number 999,999 in Zulu, Wyoming then there might be an improvement.

MX: do you now understand and agree with my concept of "invisible query"?
I can't look at the mdb until later tonight ...

mx
I continue to investigate why two identical tables (which I have named SlowTable and FastTable) with identical indexes and identical primary keys perform radically differently in identical queries, even after a compact and repair.

I have noticed that REMOVING some secondary indexes seems to improve SlowTable dramatically.  Something very weird is happening.

But, it is time for me to go home, so I will leave further investigation until tomorrow.
It turns out that the 30 fold increase was totally unrelated to my posted question about the physical sequence of the database.
I will therefore be closing this question shortly and awarding points.

But, I still don't fully understand what is realy happening, so I will open another question soon.

Just in case folks are interested, the new question will focus on the 30 fold increase in performance and relational integrity.  The facts are as follows:

OldQueryA had this SQL:
    SELECT * FROM tblServiceTypes INNER JOIN LargeTableA ON SmallTable.Key1 = LargeTableA.Key1
    WHERE (((Key2)=222229) AND (([Key3])=2009) AND ((Key4)>#1/5/2010#));

I copied LargeTableA to LargeTableB and created NewQuery with this SQL:
    SELECT * FROM tblServiceTypes INNER JOIN LargeTableB ON SmallTable.Key1 = LargeTableB.Key1
    WHERE (((Key2)=222229) AND (([Key3])=2009) AND ((Key4)>#1/5/2010#));

For some reason, NewQueryB is 30 times faster than OldQueryA.
I believe the issue is related to referential integrity.

Database relationships were 1 to many enforcing referential integrity.  This showed as follows in the relationship window
    SmallTableA  (1)---------(OO) LargeTableA.    where 'OO' is actually the 'infinity' symbol which I. I can't type that into EE, so I type OO instead.

When I copied LargeTableA to LargeTableB there were NO RELATIONSHIPS for LargeTableB, so it ran 30 times faster.  
When I added a 1 to many relationship onto LargeTableB, it also ran ran much slower.



   
8 this can't be done :-)
Strange, I copied the infinity symbol from my MS Character Map where it looks like OO only smaller, and after posting it turned in '8'
Sorry, no points for GrayL.  His contributions were interesting, but they only affect the "invisible query" views of the database, so they have to impact on performance.  

And, I am still interested to know if DatabaseMX agrees with my concept of an "invisible query", and if he has any different terminology to describe it.