TADO and how to speed up iterating through an entire MDB file

I am trying to understand why a particular TADO operation takes so long to perform compared to the native performance of MS Access.

I have an MDB file containing 250,000+ records. It takes 25 minutes to step through the MDB, one record at a time, using the bare bones code below.  This is on a 2 ghz Pentium IV processor.  In contrast, it takes MS Access 2000 less than 1 minute to WRITE the entire database from scratch. IOW, if I were to open the MDB file in Access, and modify the table structure, such as truncating a field from 255 characters to 5 characters, it only takes less than 1 minute to restructure and rewrite the entire table! How is it that Access can WRITE 250,000 records afresh and yet my Delphi application takes 25 times longer just to READ the same number of records?

Is there a QUICKER way to iterate through a MDB file using Delphi 7?



type
  TDM = class(TDataModule)
    zipData: TADOConnection;
    tblZipcodes: TADOTable;
    tblZipcodesCITY: TWideStringField;
    tblZipcodesSTATE: TWideStringField;
    tblZipcodesZIP: TWideStringField;
    tblZipcodesLATITUDE: TWideStringField;
    tblZipcodesLONGITUDE: TWideStringField;
    Data: TADOConnection;
    tblUserTable: TADOTable;
    TST: TADOConnection;
  private
    { Private declarations }
  public
    { Public declarations }
  end;
 
var
  DM: TDM;
 
implementation
 
{$R *.dfm}
 
 
procedure Tform1.button1Click(Sender:TObject);
var
	Ctr : Integer;
	ZipCode : String;
begin
	dm.Data.Connected:=false;
	dm.Data.ConnectionString:='Provider=Microsoft.Jet.OLEDB.4.0;'+
                            'Data Source='+OpenDialog1.FileName+';'+
                            'Persist Security Info=False';
	dm.Data.Connected:=true;
	for Ctr :=1 to dm.tblUserTable.RecordCount do
	begin
	  if AbortOperation then break;
	  ZipCode := dm.tblUserTable.fieldbyname('zip').AsString;
	end;
end;

Open in new window

SofttechAsked:
Who is Participating?
 
senadCommented:
softech is right ...
if you have that many records then you must temp disable controls....
0
 
Geert GOracle dbaCommented:
iterating is the slowest possible way of doing something ...

what are you trying to accomplish ?

maybe a simpl update statement ?

update table set zip = 'X' + zip where zip like '1234%';

let the database unleash it's power, don't try and yield that same power yourself ...
0
 
SofttechAuthor Commented:
>> what are you trying to accomplish ?

We are trying to perform ZIP Code distance calculations.

The 250,000 record MDB contains: FIRSTNAME, LASTNAME, ZIPCODE

A second MDB contains 42,000 ZIP Codes.

The Delphi app prompts the user to:

1) Input a starting ZIP Code

2) A maximum distance radius (in miles)

3) The the Delphi app connects to the 250,000 record MDB and iterates through each record, stopping at each record to obtain the ZIPCODE value.  Then the app looks up the latitutude and longitude for that ZIPCODE and via trigonometry determines the distance between the user input ZIPCODE and the ZIPCODE in the 250,000 record MDB.  If the difference in mileage is less than the max. radius distance, then the record is written to memory for later saving to CSV.

So, how would you recommend this be accomplished faster?

Thanks.
0
Cloud Class® Course: Python 3 Fundamentals

This course will teach participants about installing and configuring Python, syntax, importing, statements, types, strings, booleans, files, lists, tuples, comprehensions, functions, and classes.

 
SofttechAuthor Commented:
>> A second MDB contains 42,000 ZIP Codes.

I should have clarified by adding that the 42,000 ZIP Code table also has the latitude and longitude values in it.  These would be the fields in this table:

ZIPCODE
LATITUDE
LONGITUDE

0
 
Geert GOracle dbaCommented:
well, the thing is, this table will never change again, unless they move a country around or change zipcodes of a town.
so you only ever need to do the calculation once for a certain combination of zipfrom to zipto

so create a zipcalc table to store these calculations

zipcalc table
zipfrom
zipto
distance

and then use a select with for example a square radius

select a.zipcode as zipfrom, b.zipcode as zipto, c.distance
from zipcodes a
  left join zipcalc b on a.latitude >= b.latitude - 5 and a.latitude <= b.latitude + 5
    and a.longitude >= b.longitude - 5 and a.longitude <= b.longitude + 5
  left join c on a.zipcode = c.zipfrom and b.zipcode = c.zipto

 
you only need to calculate when the distance is empty
the calculation process should add a record to the zipcalc table.

0
 
SofttechAuthor Commented:
>> this table will never change again

WHAT table are you referring to?  There are 2 MDBs/tables:

1) 42,000 records: ZIPCODE, LAT, LONG

2) 250,000 records: LASTNAME, FIRSTNAME, ZIPCODE

Both of the tables could change.  ZIP Codes are updated monthly by the US Postal Service.

>> so you only ever need to do the calculation once for a certain combination of zipfrom to zipto

But there are conceivable 42,000 possible combinations for every single record in the 250,000 record table.   The TEdit requesting a user inputed "center" ZIP Code is the variable, and can/will change from user to user.

>> so create a zipcalc table to store these calculations

You are talking about creating a table with nearly TWO BILLION values?!  You want to precalculate every possible combination that could exist?  That's approx. 1,764,000,000 combinations (42,000 x 42,000).  Creating such a table is going to take many times longer to create than my current method, and is going to be astronomical in size.
0
 
Geert GOracle dbaCommented:
lol, just kidding

why do you want to iterate ? i don't get this ...

i would only let the user select a destination and then calculate the distance
not calculate all the possibilities and show a list

or use a database function (in pseudo)

select a.*, calcdistance(a.zipcode, :tozip) distance from clients a
where a.id = :id

parambyname('id').asinteger := selectedclientid;
parambyname('tozip').asstring := selectedzipcodedest;





0
 
SofttechAuthor Commented:
>> then use a select with for example a square radius
>> a.latitude >= b.latitude - 5 and a.latitude <= b.latitude + 5

You do NOT calculate distances in miles between two geographic points by subtracting latitude and longitude values.  That would only get you a delta (difference), i.e. the two short sides of a right triangle.  Distances between point A and point B require that the hypotenuse be calculated...something that is not pre-stored in the 42,000 ZIPCode table, and therefore must be calculated on the fly.
0
 
Geert GOracle dbaCommented:
it's just a sample of a where clause ...

i you want to go that way, why use an access database ? it's so limited ...
0
 
SofttechAuthor Commented:
>> why do you want to iterate ?

How are you going to determine which of the records in the 250,000 record database fall within IntToStr(TEdit1.Text) miles of center ZIP Code (TEdit2.Text) unless you iterate through each of the 250,000 records and perform inverse calculations?

The only other option I can see is to create a third temporary table that would calculate 42,000 ZIP Code calcuations in advance, and store the distances, then use that third table to compare against the 250,000 record table.

But pre-calculating 42,000 values in a third temporary table would seem wasteful if the 250,000 record table actually only contained 2,000 records....which it may, since you can see from the source code that I allow the user to specify the Firstname/lastname/zipcode MDB via OpenDialog1.FileName.
0
 
SofttechAuthor Commented:
>>  why use an access database

We are writing an application that can be used with our client's database files, which actually start out as CSV files, but they create MDB's since Access is the most popular small office database application.  We have no say in the file format (although we could use their raw CSV, but that would be a major headache).
0
 
Geert GOracle dbaCommented:
it's just simple math ... with sin and cos

http://www.movable-type.co.uk/scripts/latlong.html
0
 
SofttechAuthor Commented:
>> it's just simple math

You still have to take into consideration that the earth is a sphere, not a flat sheet.  But regardless, we are getting off topic.  I already know how to calculate distances between ZIP Codes.  The calculation is not the issue here.
0
 
SofttechAuthor Commented:
>> iterating is the slowest possible way of doing something

Granted, it may be slower than the "ideal" method (whatever that might be) but I still fail to see why it would take 25 times longer.  There is no index on the 250,000 MDB, so we're talking about a very simple disk and memory access issue.  There's no index lookups on an un-indexed table.  Just start at the first physical record of the table and step to the next record, then to the next.  Why would this take 25 times longer than anything you could perform using a SQL query?
0
 
Geert GOracle dbaCommented:
well actually it is ... you should do it in the database.
it's a lot faster than getting 1 record from the database, calculating, next ...
this is just getting all records from the db (calculation included)

in pseudo your query would be

select X.* from (
  select a.*, distance(a.zipcode, :tozip) as distance from zipcodes a) as X
where X.distance < 5

then you could use same query like:

select * from clients
where zipcode in (
  select X.zipcode from (
    select a.*, distance(a.zipcode, :tozip) as distance from zipcodes a) as X
  where X.distance < 5)

0
 
zorfaelCommented:
Of course there is:

1. Keep your ADO Connection on the form.
2. Delete the ADO Table.
3. Add an ADO Query.
4. Link the ADO Query to the ADO Connection.
5. Setup the ADO Query SQL Property, something like this:

SELECT tblZipcodesCITY, tblZipcodesSTATE, tblZipcodesZIP, tblZipcodesLATITUDE, tblZipcodesLONGITUDE FROM Zipcodes

*Consider "Zipcodes " the table name that has the zip codes.

6. Set the ADO Query Active property to true within the code and iterate the rows as you would do with the table, however it is much faster.
0
 
SofttechAuthor Commented:
I discovered the simplest solution to my problem was to just add DisableControls to my TADO tables.  After disabling the data controls, the performance increased astonomically.  What used to take 1700 seconds now takes 25 seconds!
0
 
Geert GOracle dbaCommented:
my estimate for 42000 records on a database:
.2 seconds maximum for processing all records.

so your app would be (25 / .2) = 250 times slower than mine ...

my 2 cents: following the query way, either do all in the database or do all in delphi, not half each.
the power gets lots from copying data from 1 processing engine to an other (db to delphi)

I did this once for somebody's app once, it was processing 5000 records with iterating and took 5 days.
I modified it to use queries and in the end in 1 stored procedure.
Then it processed the records in 4 seconds.
just so you know what a huge difference the method of processing can make

did you ever see the cartoon ice age 2 ?
you are doing what that squirrel is doing at the beginning.  
disablecontrols is using the first finger to plug a hole, ...
0
 
Geert GOracle dbaCommented:
processing 1 record at a time is the speed problem
and this is what you are doing when you iterate
0
 
senadCommented:
true...
I personally think the whole concept of zipcode coordinates stuff, is wrong.
There are better way of doing this...
0
 
SofttechAuthor Commented:
>> zipcode coordinates stuff, is wrong.

And what precisely is wrong with this "stuff"?

>> There are better way of doing this

Such as...?
0
 
senadCommented:
If I am not mistaken this thing is used to calculate distances from the cities.
Am I right ?
0
 
senadCommented:
if so, there are tons of freeware for such stuff.why reinvent the wheel ?
0
 
senadCommented:
Check out geonames.org for geocoding of cities without using the
Google Maps API. You'll need to do the calculation yourself, but this
is a simple Haversine formula that can be found with a Google search.

The only requirement for doing this type of operation is the ability
to connect to and process an http request inside your program, which
can be done in Delphi.

0
 
senadCommented:
What you need is a database of post office locations that includes lat/long
information on each post office. Each Post Office is a zip code. Then you
need a database of whatever you're trying to locate, which also includes
lat/long information about the locations (zip code not necessary). Then you
create a query that uses a bit of trigonometry to get the distance from the
zip code to each of the locations in your locations table, by calculating
the distance between the lat/long of the post office to the lat/longs of
each location. This part can be little tricky, as calculating distances
between lat/longs means calculating distances on the surface of an
ellipsoid, not a sphere, but depending on how accurate you need the results
to be, you can approximate a sphere using the Mean Earth radius.

0
 
senadCommented:
0
 
Geert GOracle dbaCommented:
this is just what i have been saying all along:

use a query to get the information needed including a function to calculate the distance
1 query inwhich included a funcation and no iterating in Delphi !

or can't you create a function to do the calc in Access ?
0
 
ThePlagueCommented:
this question is becoming ridiculous, you keep asking for random stuff over and over, please get your solution and close this or ADMIN please remove me from this, don't wanna get more emails.
0
 
Geert GOracle dbaCommented:
you can just unsubscribe from this Q ... with the link Stop Monitoring
0
 
SofttechAuthor Commented:
>> What you need is a database of post office locations
>> that includes lat/long information on each post office.

We already have such a database.  I stated this in an early posting.

>> there are tons of freeware for such stuff.why reinvent the wheel ?

You do not understand what I am doing.  We are not reinventing the wheel.

>> or can't you create a function to do the calc in Access ?

No, because all the CUSTOMERS in future who I've never seen, are not going be inserting SQL code and macros into their MDBs.

I am writing a generic tool to be used by unforseen "others" so that these "others" don't have to know a lick of code.

>> you keep asking for random stuff over and over

Random stuff?  Please make a list of all the stuff I've specifically asked for "over and over".
0
 
SofttechAuthor Commented:
To put this question to sleep I am awarding points although none of the answers were accurate.
0
 
ThePlagueCommented:
s**t, I'm not monitoring this question and keep getting emails, admin can you please remove me from this endless discussion?
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.