Solved

"WHERE ROWID > ANY (" is too slow

Posted on 2015-01-27
Last Modified: 2015-01-28
Greetings, esteemed experts!

EE question:

In Oracle 11g, I have a table with numerous columns, but the user has requested to only keep rows that contain the “first” instance of a particular value within a specific column.

So, with this data originally:

myPK  ...  myColor
1          ...  red
2          ...  red
3          ...  blue
4          ...  blue
5          ...  red
6          ...  green

He’d like to keep only the rows with the FIRST instance of a particular color:

myPK  ...  myColor
1          ...  red
3          ...  blue
6          ...  green

Initially, the developer wrote something like this (which appears to work):

delete from MyTable a
 where a.rowid > any 
  (select b.rowid
     from MyTable b
    where a.myColor = b.myColor)

But, since the table now has over 66 million rows, it’s now taking much too long. Outside of ensuring that there’s an index on myColor, what would you suggest to speed this up? Is there a better way to write this?

Thanks in advance!
DaveSlash
Question by:daveslash
10 Comments
 
LVL 24

Expert Comment

by:chaau
ID: 40574131
Try this one:
delete from 
  (select *, ROW_NUMBER() OVER(PARTITION BY myColor ORDER BY myPK) as rn
     from MyTable) a
 where a.rn > 1 

 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 40574210
I'm not sure the above post is Oracle syntax but I'll try it later.

>>He’d like to keep only the rows with the FIRST instance of a particular color:

First instance based on the primary key?

If so, is that primary key like the example and is a simple number column or is it more complex?
 
LVL 18

Author Comment

by:daveslash
ID: 40574220
chaau2015: Thanks for your response. I'll try that. I wasn't aware you can delete from a sub-query. (I've been a DB2-guy for a couple decades, and I'm just now forced to get more familiar with Oracle.)

slightwv:

> First instance based on the primary key?

Yes

> is that primary key like the example and is a simple number column or is it more complex?

It's a simple number ... a surrogate key.

Thanks for your help! I appreciate it.

 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 40574257
The first post is incorrect syntax.  I confirmed it.

Sorry for my mistake.  The PK doesn't appear to matter.  I read it too quickly.

>>Outside of ensuring that there’s an index on myColor,

Is there an index on mycolor?

Here are two alternatives that should delete the same rows.  Without an index, both are full table scans.  Not sure if either is 'better'.

delete from mytable where mypk not in (
	select min(mypk) from mytable group by mycolor
)
/

delete from mytable a
 where a.rowid > 
  (select min(b.rowid)
     from mytable b
    where a.myColor = b.myColor)
/

Out of the 66 million rows, how many will you be deleting?
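
One rough way to answer that without deleting anything is to compare the total row count with the number of distinct colors. This is only a sketch: it assumes mycolor contains no NULLs (count(distinct ...) ignores them), and the column aliases are made up for illustration.

```sql
-- Assumption: mycolor has no NULLs; the aliases are illustrative only.
select count(*)                                 as total_rows,
       count(distinct mycolor)                  as rows_to_keep,
       count(mycolor) - count(distinct mycolor) as rows_to_delete
  from mytable;
```

If rows_to_delete turns out to be most of the table, rebuilding the table is likely to be more attractive than a large delete.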

It might be faster to create a new table containing only the rows you want to keep, then drop the original and rename the new table.  This gets complex if there are foreign keys and/or constraints involved (more so if the constraints are cascading).
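
A minimal sketch of that rebuild, untested against the real table. The name mytable_keep is hypothetical, and the sketch assumes nothing else references mytable. Create-table-as-select does not carry over indexes, the primary key, grants, or triggers, so those would have to be re-created by hand.

```sql
-- Hypothetical name: mytable_keep is invented for this sketch.
-- Keep only the row with the lowest mypk for each color.
create table mytable_keep as
select *
  from mytable
 where mypk in (select min(mypk) from mytable group by mycolor);

-- Only after checking the row counts in mytable_keep:
drop table mytable;
rename mytable_keep to mytable;

-- Then re-create the primary key, indexes, grants and triggers as needed.
```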
 
LVL 76

Accepted Solution

by:
slightwv (Netminder) earned 500 total points
ID: 40574297
Another possibility based on the first post.

From my test it appears it might be slightly better but I don't have 66 million rows in my test...

delete from mytable where mypk in (
select mypk from (
		select mypk, row_number() over(partition by mycolor order by mypk) rn from mytable
)
where rn>1
)
/
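
A possible variation on the same idea, offered only as a sketch: select the ROWID instead of mypk, so the outer delete locates rows by ROWID rather than through the primary key. The alias rid is made up for illustration, and I have not tested whether this is actually faster at 66 million rows. The "first" row is still decided by mypk, not by ROWID order.

```sql
-- Sketch only: rid is an illustrative alias for the ROWID pseudocolumn.
delete from mytable
 where rowid in (
       select rid
         from (select rowid as rid,
                      row_number() over (partition by mycolor order by mypk) rn
                 from mytable)
        where rn > 1
       );
```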

 
LVL 18

Author Comment

by:daveslash
ID: 40574303
Thanks slightwv! When you say, "The first post is incorrect syntax", are you referring to mine or chaau's?

> Is there an index on mycolor?

Yes.

> Here are two alternatives ...

Thanks! I'll try those out.

> Out of the 66 million rows, how many will you be deleting?

I'm not sure. I'll check and get back to you.

> It might be faster to ...

Good point. I'll try that, too. Thanks!
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 40574313
>>are you referring to mine or chaau's?

Chaau's.  You cannot delete directly from a subquery in Oracle (that I know about).

>> Is there an index on mycolor?... yes...

That changes things.  Let me retest and I'll update my posts after creating the index.

Is it a simple index on the single column or a compound index on more than one column?

The closer the information you provide is to your exact environment, the closer our test models will be.

>>Good point. I'll try that, too. Thanks!

Try my queries I've posted.

You don't need to actually execute them to get a good 'guess' on performance.

You can generate an execution plan w/o actually executing anything. It is a great starting point.

explain plan for
delete ...
;
SELECT * FROM TABLE(dbms_xplan.display);
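
For example, filling in the template with the row_number() delete from http:#a40574297 would look something like this. Nothing is deleted; only the plan is generated and displayed.

```sql
-- Generates the plan only; the delete itself is not executed.
explain plan for
delete from mytable
 where mypk in (
       select mypk
         from (select mypk,
                      row_number() over (partition by mycolor order by mypk) rn
                 from mytable)
        where rn > 1
       );

select * from table(dbms_xplan.display);
```

Keep in mind the plan shows the optimizer's estimates, so it is a guide rather than a measurement.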

 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 40574326
Based on my simple test, even with an index on mycolor (I also tried compound index variations), the last one I posted in http:#a40574297 seems to work the best.

Your mileage may vary.
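
Whichever delete you run, a quick sanity check before committing is to look for any color that still has more than one row. A sketch using the table and column names from the question:

```sql
-- Should return no rows once the duplicates are gone.
select mycolor, count(*)
  from mytable
 group by mycolor
having count(*) > 1;
```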
 
LVL 18

Author Closing Comment

by:daveslash
ID: 40576328
You nailed it! Thanks, slightwv!
 
LVL 76

Expert Comment

by:slightwv (Netminder)
ID: 40576332
Glad to help!