Link to home
Start Free TrialLog in
Avatar of DANNYMAC
DANNYMAC

asked on

using an index to search access tables

I would like to search a very large table (several million records) by the index.  I need to find the relevant records and then peek at the record before and the record following.  The records I need to find first may be several hundreds apart.  In pseudo code I would like to:

primarykey =field(date)& field(Time) & Field(primeID)

select all records between  date_start and  date_end
do until EOF
   Seek next record where showID = myID
         'using the index into the table because it holds the
           sort order
      move back 1 record in the index
      print show title
      movenext & print  show title
      movenext & print  show title
   loop
The problem is searching even a small date range you would try to pull a couple million records (takes a life time) I would like to look at the primary index and move along it, find the start date, and only pull records where data is needed. If there is a better approach please let me know.  Any help would be greatly appreciated
Avatar of DANNYMAC
DANNYMAC

ASKER

Edited text of question.
Edited text of question.
Are you saying that there are still a couple million records between a short date range, or are you still pulling all the records eventhough you want to get back a few records from the table between the date range?

You should be able to pull only the records needed using a 'WHERE' clause in your SQL query.  E.g.

    .
    .
    .
    sQuery = "SELECT * FROM user_table " &_
        " WHERE start_date >= '" & sStartDate & "'" &_
          " AND end_date <= '" & sEndDate & "'"

    Set rs = db.OpenRecordset(sQuery)
    .
    .
    .

This should only get you the records between the date ranges.

Hope this helps.
ok use this


strSQL = "SELECT YourField " & _        "FROM YourTable " & _
         "WHERE ID = " & YourId
     
YourRecordset.CursorLocation = adUseClient    
YourRecordset.Open strSQL, g_conDatabase, adOpenStatic, adLockReadOnly

FirstId = YourRecordset!ID
YourRecorset.MovePrevious
IdBefore = YourRecordset!ID
YourRecorset.MoveNext
YourRecorset.MoveNext
IdBefore = YourRecordset!ID



GoodLUck

I am searching broadcast schedules.  The table index is 3 fields- station, date, time and a 4th field is the show.  The goal is to find a paticular show and then determine what is on before and after.
Search criteria are the dates of airing and the show.
If I select all records For 1 week I get 10 million records.  For a month ???
I need to get the 20 records with the paticular show and the 20 records before and after.  The show is easy.  Get the record before???
the table might look like (very simply table)
CNN, 1/1/99,12:30, Frontline
CNN, 1/1/99, 13:00, WeatherToday
CNN, 1/1/99, 13:30, LArry King
CNN, 1/1/99, 14:00, HEADLINES
....
....
TBS, 1/1/99,15:30, All in the Family
TBS, 1/1/99, 16:00, Black Statlion
TBS, 1/1/99, 16:30, Larry King
TBS, 1/1/99, 17:00, Local News
 If you do
where date between 1/1/99 and 2/1/99 and show = Lary King
you get
CNN, 1/1/99, 13:30, LArry King
TBS, 1/1/99, 16:30, Larry King
I need to know what aired before LarryKing and it airs on 2000 stations 5 days a week.
You should migrate over to SQL Server 7.0 and create stored procedures to process the information, because the database should handle all the processing of data especially for the amount you're working with.

To help with determining programs before and after a particular program, you can use cursors in the stored procedure to move up and down the records that qualify in your search criteria and then return the results back to your application.

If the results are too large or too many, then you can store them in a temporary table and retrieve from that table when you're done processing them in the stored procedure.

Good luck

Sorry, that answer should have been a comment.  But if it helped, then it's an answer.
Take my comment

it will work perfectly for You and it will be very fast

You just have to do 10 movePrevious and 10 moveNext.

Do you understand ?

Richard,  The proposed answer does not get the records needed.  Check the comments with the sample table data.  What I need to do in the example is track the the show Frontline and then tell what the show before it was for each instance of airing.  Your response would get all records for Frontline but not the records with the show before.  What I am looking for is to be able to "walk" the index, without opening the entire table, and pull a pointer to the schedule record.  Then have a routine pull the records needed into a temp table.  While the table example is simple there are 2200+ broadcast station and almost as many cable systems.  A single day of schedules is over 200,000.  The report is often run on a 6 month or 1 year period.
DannyMac
One more time, this is the way

strSQL ="SELECT YourField " 
        "FROM YourTable " & _
        "WHERE Show = '" & YourShow     & "'"
     
YourRecordset.CursorLocation = adUseClient    
YourRecordset.Open strSQL, g_conDatabase, adOpenStatic, adLockReadOnly

FirstShow = YourRecordset!YourShow

YourRecorset.MovePrevious
ShowBefore = YourRecordset!
''You can go up 10 times in your case

'then moveNext and go back to
YourRecorset.MoveNext
YourRecorset.MoveNext
showAfter = YourRecordset!YourShow



I so want you to be right.
using the above, and I looked for some hidden issues I was missing.
Did I miss something?
you query:
select show form mytable where show = 'larry King';
recordset:
CNN, 1/1/99, 13:30, Larry King
TBS, 1/1/99, 16:30, Larry King

the question is On cnn what played before Larry King at 13:00 and
On TBS What played before larry King at 16:00
Is it possible to return a pointer to the table where larry king exist? From that pointer can I look back 1 place in the index and see

CNN, 1/1/99, 13:00, WeatherToday

Or do I need to pull everything for 1/1/99 109,935 records.

that is the goal.  But to try to return a keyset of 100,000 records is just to big in memory and takes to long to return.
In most DBs you can access the index directly.  Is it possible in Access?

The reason PatTheRoucky's answer is no good for you is because you are using Access. Access *can* only work with client-side cursors, which means sending over the *whole* table, which means walking the dog, making coffee, getting lunch, go shopping, do the dishes, and return to your computer in the hope that the results are there.
It may not be the answer you were hoping to hear, but what you are looking for is Server-side cursors, which means a full-fledged database server, which means RichardLouie gave a good answer, even though it didn't return any records ;-)
Hmm, sounds harsh, doesn't it? If you are doing this for work, I would recommend you try to get your boss to invest some money.
If that's not an option, there are always workarounds of course. Take this: getting the records for a whole week is slow, agreed, but how many radio shows are broadcasting a whole week non-stop? You know the channel, time and date of the primary show, so why not select all the shows from the same channel and date, from two hours previous to that time to two hours after that time? (a simple WHERE clause, I'm sure you can do it, if not, we will do it for you) That will probably return a bit more moderated result set which is faster to look through.
Jeremy
thanks for the comments.  SQL7 is slated for 7/2000.  Can't wait but in the meantime I own this problem.  The real issue is this ends in an ASP page. Also the show might play at 14:00 on CNN, 7:00 on TBS, 3:00 on USA and so on.  SO in Access I guess I need to pull the whole thing by date.
Dan
I read this string with interest and a question. What is so complicated about the problem?

Use the query to find a record set of the desired shows.

Establish a recordset of the big database ordered by station, date and time.

Step thru the query rs getting each occurance.

Find the record on the big database's recordset by finding the station date and time. This will be fast if it is indexed on station, date and time.

get the before and after record on the big database rs (move previous -- move next).

repeat with  the next occurrance or the next record you should have a record for each occurrance in station, date order.

What am I missing?
Same thing for me

look my answer body .. you got your easy way ... that's the way



If you're stuck with Access and having to pull a lot of records from the database, then I would suggest beefing up your computer.  Use the fastest Pentium III out there and put in a lot of memory in it.

However, before you go out there and invest in any hardware, try these performance tuning methods:

   - Put your Access database on a
     separate drive and controller, or
     even a different processor.

   - Build indices on the columns
     you are qualifying in the WHERE
     clause of your SQL query.

   - Maybe setup your program to run
     with a higher priority than other
     programs.

Otherwise, I would still encourage you to migrate over to SQLServer.

Rich
Thanks for all the Helpful hints.  What the goal was to achive was to not pull a recordset which would be slow and very large.  My example was simple but in fact the query would involve several sub-tables and the index is so large the response is poor.  Yes the answer is move to SQL7, well soon.  In other dbs systems it is possible to "walk" the index and not actualy create a recordset.  My thought was to create a linked list or array of pointers building as I walked the index tothe records needed and then pull the 400 records I would need instead of opening up a monster 400,000 record set.  You see while opening upa large recordset is the obivous answer it totaly ignores response time and response is what it's all about.
My solution.
From the master schedule table I pull all the records for a date range.  
I do the findnext, moveback and movenext thing and write these to a temp table.
from the temp table I then run the query that gets all the aditional fields(title, episode number, duration, description, CC, repeat ...
which is what takes all the time to build.  This solution reduces the response from 2 minutes to 12 to 18 seconds. You see collecting the titles for 400 records is 1/1000 the work of collecting the title on 400,000
But the actual question was never answered.
Can I, in access, simply move along the index of a table without actually returning a recordset for a table?
I have been following the question since the beginning and think i understand the problem.

I think there's many thing you don't understand about your problem and you should read about all this.  Your databse is not that big, database can be big and fast, it's not a problem.  A lot of database have thousand of record and it's not a problem at all.  It's made for that.

 If you query your database with something like this :

strSQL ="SELECT YourField " 
        "FROM YourTable " & _
        "WHERE Show = '" & YourShow     & "'"

you will have a recordset will only one record and this will be fast.  For now, no problem.

After this query save your position and make a new query where your are looking for the 10 records above and the 10 records after. Something like that :


strSQL ="SELECT YourField " 
        "FROM YourTable " & _
        "WHERE showID BETWEEN " & SaveIdFromPreviousQuery -10 and SaveIdFromPreviousQuery +10

Now you have a recordset with 20 records.  You can navigate throught your recordset with
Recordset.moveNext
Recordset.movePrevious.

This is still easy and fast.
Go simple and it's not a problem.  A brand new powerfull database will not correct the problem if you don't know how to query your information.









perhaps we could converse.  I am at info@pubtv.net, send ne you address and I will give some detail.
  Maybe there is something I am missing but I don't think it's as simple as you think.
With PatTheRoucky's approach if you have 10,000 records in the first query you will have to build 10,000 separate queries to get before and after. Could be more time consuming.

Dan :-)
ASKER CERTIFIED SOLUTION
Avatar of Jeremy_D
Jeremy_D

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
This is what you write:

"Even if you only pull one record from the table, the file-server will have to send the complete table (and all the other tables that might be joined into the query later) over the network, that is, all the millions of records."

I really don't agree with this.  This is not different from access to Oracle or Sql server.  If you query one record, you will send over the network only this record.

Ok Access is maybe not the fastest when you got millions of record but if you query by an Index, response time will be really good.  I just made a test to check that.  I put 1 000 000 records in a relational access database and i made a query on one record.  Time is acceptable, about 1 second.  I have also try to query 20 records like your case, 10 above and 10 after. Same thing, working well.

I really don't know what to say anymore.  I wish I could be doing it for you.  ???










The trick in optimizing queries is knowing which fields to create indices on.  As I stated earlier, you basically want to create indices on the fields that are in your WHERE clause.

Here's an example.

SELECT tab2.col1, tab2.col2
   FROM tab1, tab2
   WHERE tab1.col1 = tab2.col1
     AND (tab2.col2 BETWEEN 100 AND 200)

You'll want to create a unique index on:
   
    tab1.col1 and tab2.col1

and a non-unique index on:

    tab2.col2


Danny, if you're still having performance problems even after checking hardware, network, creating indices, then you should consider SQLServer7.

Rich

Pat, the reason yuor answers will not work is you make the assumption that a show airs in a specific time peroid.  Just not true. It can air in any slot over 24 hours.  So you are unable to use time as a criteria.  It airs ona specific station.  Shows air on stations in reality not networks.  So if a show is on ABC.  It airs on several hundred affiliate stations and independent and cable systems.  So what we are left with is a show ID and date range.  Pick a show like "Barney & Friends."  Between 1/3 and 1/30, a 4 week peroid, it airs 13099 times on PBS stations alone.  Under your stratagy it would require me to run 28000 queries. Why 28000, 1 to get the first set, 2 for each record, I to read the temp table that now hold the working dataset.  And yes you need a temp table to browse the data. Do the math.  That could never be done in 15 seconds.
The question is answered by Jeremy, And Thanks Jeremy, and you can not move along an index without returning a recordset.
PatTheRoucky > If you query one record, you will send over the network only this record.

I'm sorry, but this is just not true. The JET engine doesn't work that way. Believe me, I wish it did.