Fetching rows from Oracle in a given range

Posted on 1999-01-08
Last Modified: 2007-12-19

      I have one table which has got around 10 million records and
   since this table is so big it takes a while to populate these records
in Crystal
   reports.  Instead i was thinking of a way to populate say first 1000
records and
   then the next 1000 records and so on. So is there any way to populate
   within a given range, say populating first 1000 records and then
   records from 1001-2000,2001-3000
   and so on. I thought Rownum might help but it doesn't.
   Can somebody help.
Question by:subingeorge

Accepted Solution

dslavin earned 100 total points
ID: 1083614
Here is an article I wrote for a company newspaper.  It contains within it a method that might do what you need to do.  Please read this and if you need clarification, I'll be glad to respond.

Hope this helps.

Processing All Records in Large Database Tables
If you are trying to process all of the records of a table with many records,
then you have probably encountered ORA-1555.

The traditional way to process all of the records in a table is to:
1) Create a PL/SQL stored procedure
2) In the Procedure, create a cursor which selects ALL of the table records
3) Loop through each record and perform the corresponding inserts, updates
   and deletes.
4) If the table has a large number of records, then perform a COMMIT at
   regular intervals to prevent the ROLLBACK SEGMENT from overflowing.

The problem with this methodology is that you are continuing to fetch
records from the same table "past" each commit.  Although Oracle allows
you to do this, this is not ANSI/SQL.  Also, even though you can keep the
cursor open between commits, the rollback segment is NOT getting flushed
with every commit.  Since the cursor is open against the same table, the
rollback segment keeps those entries until the cursor is closed.

There are three increasingly complex ways to deal with this problem.  Here
they are in order of complexity:

1) Increase the size of your rollback segments.  
   If you have privileges to choose which rollback segment to currently use,
   or even better, privileges to create your own rollback segments, then this
   is the easiest solution.

   In this case you can avoid having to execute a COMMIT after every so-many
   The problem with this method, however, is that you may not have the
   required permissions to alter or create rollback segments.  Even if you
   do, the database may not have enough space to support a large enough
   rollback segment.  Also, you may not know how large a rollback segment
   you will need.  Thus you are reduced to guessing/trying/erroring-out/...

2) Grab groups of records in order of ROWID.

   If your cursor is open against a single table, then you can use a simple
   algorithm to group your records by ROWID.  In this case, you can:

     - open a cursor for each smaller subset of records
     - process all of the records in each subset
     - close the cursor
     - perform a COMMIT

   The process above is repeated for each subset of records.  Thus you are
   no longer fetching past the commit.  Each commit is performed after
   the cursor is closed for that subset of processed records.  Here is some
   sample code to show you the algorithm:

      rows_per_commit NUMBER := 1000;   -- Rows to process between COMMITs
      last_ROWID      ROWID  := '00000000.0000.0000';
      max_ROWID       ROWID;

      -- Get Maximum ROWID from table being processed (EMP)
      select max(ROWID) into max_ROWID from EMP;

      -- Loop Through EVERY subset of records until last record is reached
      WHILE (last_ROWID < max_ROWID) LOOP

            -- Create Cursor for next subset of records:
            -- - Returns ROWID as well
            -- - Begins at record with ROWID greater than last ROW processed
            -- - Uses ROWNUM to select just 'rows_per_commit' records
            CURSOR emp_cur IS
                  ROWID > last_ROWID
                  ROWNUM <= rows_per_commit
            -- Perform Commit before each subset of records is processed

            -- Force database to use largest rollback segment

            -- Loop through all records in subset
            FOR emp_rec IN emp_cur LOOP

               -- Update each record to add COLA adjustment to salary
               update EMP set SAL = SAL*1.03 where ROWID = emp_rec.ROWID;

               -- Update value of last_ROWID to point to last record processed
               last_ROWID := emp_rec.ROWID;
            END LOOP;
      END LOOP;

      -- Perform final commit

3) Grab groups of records in order of Primary Key

   If the cursor you are opening is NOT against a single table, but against a
   group of joined tables and/or views, then you will not be able to use
   Method #2.  After all of the joins are performed, the records are no longer
   returned in any sort of order (even by ROWID.)  You also can't use ROWNUM
   in a consistent fashion since it is calculated BEFORE the execution of
   any ORDER BY statement.  Thus, if you just replace ROWID by, for example,
   EMPNO, and get the next 1000 records by adding 'AND ROWNUM <= 1000', then
   you can't be certain that you will be getting the next largest set of
   EMPNO's (even if you ORDER BY EMPNO.)

   If your table records are distributed evenly with respect to your primary
   key, then you can just guess at where you need to break for commits.  For
      Minimum EMPNO = 1
      Maximum EMPNO = 100,000
     No. of Records = 100,000

     In this easiest of all cases you know that every EMPNO between 1 and
     100,000 is used.  If you wish to process 1000 records between commits
     then you can use this clause in your cursor statement:


     You would initialize low_EMPNO to 1 and high_EMPNO to 1000.
     Each time through the outer loop you would increase low_EMPNO and
     high_EMPNO by 1000 until all of the records had been processed.

   In the worst case, however, your table records will not be evenly
   distributed with respect to your primary key.  If the records are
   sufficiently "clumped" then you could choose a range with more records
   than can be handled by your rollback segment, while other ranges are
   sparse or even empty of records.

   Following is sample code that uses a binary search algorithm to determine
   the "break points" such that each subset of records is approximately
   equal in size no matter how unevenly distributed the table records are:

      rows_per_commit NUMBER := 1000;  -- records to process between commits
      low_EMPNO NUMBER;     -- low EMPNO in range of records to process
      high_EMPNO NUMBER;    -- high EMPNO in range of records to process
      guess_EMPNO NUMBER;   -- guess at high EMPNO for next range of records
      max_EMPNO NUMBER;     -- max EMPNO in EMP table
      low_guess NUMBER;     -- lowest possible value for guess
      high_guess NUMBER;    -- highest possible value for guess
      count_EMPNO NUMBER;   -- number of records in range
      i_group NUMBER;       -- current group being processed
      n_group NUMBER;       -- total number of groups or records to process


      -- Set initial high_EMPNO to -1
      high_EMPNO := -1;

      -- Get Maximum EMPNO
      select max(EMPNO) into max_EMPNO from EMP;

      -- Get number of groups of records to process
      select ceil(count(*)/rows_per_commit) into n_group from EMP;

      -- Loop through each group of records
      FOR i_group IN 1..n_group LOOP
         -- Get next low value as just greater than high value of last group
         select min(EMPNO) into low_EMPNO from EMP
            where EMPNO > high_EMPNO;

         -- IF this is the last group, then set high value to max value
         if (i_group = n_group) then
            high_EMPNO := max_EMPNO;

         -- ELSE this in not the last group
            -- Guess high value by assuming that records are evenly distributed
            guess_EMPNO := ceil((max_EMPNO - low_EMPNO)/(n_group - i_group + 1))
                         + low_EMPNO;

            -- Set worst case limits to range of possible guesses
            low_guess := low_EMPNO;
            high_guess := max_EMPNO;
         end if;
         -- If this is not the last group, then loop until get high value
         WHILE (i_group < n_group) LOOP

            -- Get number of records between low and high values
            select count(*) into count_EMPNO from EMP where EMPNO between
               low_EMPNO and guess_EMPNO;

            -- If count is within 10% of 'rows_per_commit' then
            if (count_EMPNO between 0.9*rows_per_commit
                              and 1.1*rows_per_commit) then

               -- Set high value and EXIT while loop
               high_EMPNO := guess_EMPNO;

            -- Else there are too many records in the range
            elsif (count_EMPNO > 1.1*rows_per_commit) then
               -- Reset highest possible guess
               high_guess := guess_EMPNO;

               -- Use binary search algorithm to reestimate high value
               guess_EMPNO := round((guess_EMPNO - low_guess)/2.0 + low_guess);

            -- Else there are too few records in the range
               -- Reset lowest possible guess
               low_guess := guess_EMPNO;

               -- Use binary search algorithm to reestimate high value
               guess_EMPNO := round((high_guess - guess_EMPNO)/2 + guess_EMPNO);
            end if;
         END LOOP;

         -- Low and High values for range have been determined
            -- Create Cursor for next group of records:
            -- - Uses low and high values to limit records processed
            CURSOR emp_cur IS
                  EMP.DEPTNO = DEPT.DEPTNO
                  EMPNO BETWEEN low_EMPNO and high_EMPNO

            -- Perform Commit before each subset of records is processed

            -- Force database to use largest rollback segment

            -- Loop through all records in subset
            FOR emp_rec IN emp_cur LOOP

               -- Update each record to add COLA adjustment to salary
               if (emp_rec.dname = 'Engineering') then
                  update EMP set SAL = SAL*1.03 where EMPNO = emp_rec.EMPNO;
                  update EMP set SAL = SAL*1.10 where EMPNO = emp_rec.EMPNO;
               end if;
            END LOOP;
      END LOOP;

      -- Perform final commit

   With this binary search algorithm, I seldom had to guess more than about
   four times to get a good value for high_EMPNO.  This algorithm may seem
   rather complex, but I couldn't think of anything simpler.

   If ROWNUM were calculated AFTER an ORDER BY, then I could use an algorithm
   that was very similar to Method #2, but with EMPNO instead of ROWID.


Expert Comment

ID: 1083615
As I don't know exactly what you are trying to do and I dont use crystal reports (yet) here are couple of approaches that may give you some ideas.

You could try using  PL/SQL procedures that finds the primary key value that corresponds with a given record number in the result set and restrict the SQL query by these values. e.g.
SELECT col1, col2, col3
FROM tab1
WHERE col1 < row_value(2000)
AND col1 >= row_value(1000);

Or you could write a function that takes upper and lower range limits of the record span you want and a given value, and then returns Y or N if the the given value is within the limits.  This could be tied back into an SQL query e.g.
SELECT col1, col2, col3
FROM tab1
WHERE in_range(2000,3000,col1) = 'Y';
(i.e. you ask asking if the value of col1 is in the set of rows from 2000 to 3000)

Both approaches would need to order by some known criteria (and indexed column or rowid for example) to ensure consistancy.

Hope this helps

Author Comment

ID: 1083616
Thanks, You have given an excellent solution. I am going to use your second method, using Rowid. that should work.

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
report returning null 21 79
Read only access to a Procedure in oracle? 4 45
Converting a row into a column 2 43
Introduction A previously published article on Experts Exchange ("Joins in Oracle", makes a statement about "Oracle proprietary" joins and mixes the join syntax with gen…
From implementing a password expiration date, to datatype conversions and file export options, these are some useful settings I've found in Jasper Server.
This video explains at a high level with the mandatory Oracle Memory processes are as well as touching on some of the more common optional ones.
Via a live example, show how to restore a database from backup after a simulated disk failure using RMAN.

943 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now