"WHERE ROWID > ANY (" is too slow

Greetings, esteemed experts!

EE question:

In Oracle 11g, I have a table with numerous columns, but the user has requested to only keep rows that contain the “first” instance of a particular value within a specific column.

So, with this data originally:

myPK  ...  myColor
1          ...  red
2          ...  red
3          ...  blue
4          ...  blue
5          ...  red
6          ...  green

He’d like to keep only the rows with the FIRST instance of a particular color:

myPK  ...  myColor
1          ...  red
3          ...  blue
6          ...  green

Initially, the developer wrote something like this (which appears to work):

delete from MyTable a
 where a.rowid > any 
  (select b.rowid
     from MyTable b
    where a.myColor = b.myColor)

But since the table now has over 66 million rows, the delete takes far too long. Apart from ensuring that there’s an index on myColor, what would you suggest to speed this up? Is there a better way to write this?

Thanks in advance!
DaveSlash
Asked by: Dave Ford (Software Developer / Database Administrator)

chaau Commented:
Try this one:
delete from 
  (select *, ROW_NUMBER() OVER(PARTITION BY myColor ORDER BY myPK) as rn
     from MyTable) a
 where a.rn > 1 

slightwv (Netminder) Commented:
I'm not sure the above post is Oracle syntax but I'll try it later.

>>He’d like to keep only the rows with the FIRST instance of a particular color:

First instance based on the primary key?

If so, is that primary key like the example and is a simple number column or is it more complex?
Dave Ford (Software Developer / Database Administrator, Author) Commented:
chaau2015: Thanks for your response. I'll try that. I wasn't aware you can delete from a sub-query. (I've been a DB2-guy for a couple decades, and I'm just now forced to get more familiar with Oracle.)

slightwv:

> First instance based on the primary key?

Yes

> is that primary key like the example and is a simple number column or is it more complex?

It's a simple number ... a surrogate key.

Thanks for your help! I appreciate it.

slightwv (Netminder) Commented:
The first post is incorrect syntax.  I confirmed it.

Sorry for my mistake.  The PK doesn't appear to matter.  I read it too quickly.

>>Outside of ensuring that there’s an index on myColor,

Is there an index on mycolor?

Here are two alternatives that should delete the same rows, but without an index they are both full table scans.  Not sure if either is 'better'.

delete from mytable where mypk not in (
	select min(mypk) from mytable group by mycolor
)
/

delete from mytable a
 where a.rowid > 
  (select min(b.rowid)
     from mytable b
    where a.myColor = b.myColor)
/

Out of the 66 million rows, how many will you be deleting?
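
One way to get that number without deleting anything is sketched below. It assumes myColor is never NULL; if it can be, the NULL rows need separate handling, since the delete statements above do not all treat them the same way.

```sql
-- Rows that would be deleted = total rows minus one survivor per distinct color.
-- Assumption: mycolor is NOT NULL (count(distinct ...) ignores NULLs).
select count(*) - count(distinct mycolor) as rows_to_delete
  from mytable;
```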

It might be faster to create a new table containing only the rows you want to keep, then drop the original and rename the new table into its place.  This gets complex if there are foreign keys and/or constraints involved (more so if the constraints are cascading).
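
A rough sketch of that approach follows. The names mytable_keep and mytable_old are placeholders made up for illustration, and it assumes the lowest mypk decides which row is "first". Note that CREATE TABLE AS SELECT does not carry over indexes, grants, triggers, or most constraints, so those would have to be recreated on the new table.

```sql
-- Build a new table holding only the first row (lowest mypk) per color.
-- mytable_keep is a hypothetical name.
create table mytable_keep as
select *
  from mytable
 where mypk in (select min(mypk) from mytable group by mycolor);

-- After checking the row count in mytable_keep, swap the tables.
-- mytable_old is a hypothetical name; keeping it allows a rollback.
rename mytable to mytable_old;
rename mytable_keep to mytable;

-- Only once everything is verified and indexes/constraints are recreated:
-- drop table mytable_old;
```

Renaming the original instead of dropping it right away costs disk space but leaves a way back if something was missed.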
slightwv (Netminder) Commented:
Another possibility based on the first post.

From my test it appears it might be slightly better but I don't have 66 million rows in my test...

delete from mytable where mypk in (
  select mypk from (
    select mypk, row_number() over(partition by mycolor order by mypk) rn
      from mytable
  )
  where rn > 1
)
/
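
A variation on the same idea, offered only as a sketch: match on rowid instead of mypk, so the outer delete can locate rows directly rather than going back through the primary key. This is not something tested against 66 million rows; compare the execution plans before trusting it.

```sql
-- Same row_number() logic, but the rows to delete are identified by rowid.
-- rid is just a column alias chosen for this example.
delete from mytable
 where rowid in (
   select rid from (
     select rowid as rid,
            row_number() over(partition by mycolor order by mypk) rn
       from mytable
   )
   where rn > 1
 )
/
```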

Dave Ford (Software Developer / Database Administrator, Author) Commented:
Thanks slightwv! When you say, "The first post is incorrect syntax", are you referring to mine or chaau's?

> Is there an index on mycolor?

Yes.

> Here are two alternatives ...

Thanks! I'll try those out.

> Out of the 66 million rows, how many will you be deleting?

I'm not sure. I'll check and get back to you.

> It might be faster to ...

Good point. I'll try that, too. Thanks!
slightwv (Netminder) Commented:
>>are you referring to mine or chaau's?

Chaau's.  You cannot delete directly from a subquery in Oracle (that I know about).

>> Is there an index on mycolor?... yes...

That changes things.  Let me retest and I'll update my posts after creating the index.

Is it a simple index on the single column or a compound index on more than one column?

The closer the information you provide is to your exact environment, the closer our test models will be.

>>Good point. I'll try that, too. Thanks!

Try my queries I've posted.

You don't need to actually execute them to get a good 'guess' on performance.

You can generate an execution plan w/o actually executing anything. It is a great starting point.

explain plan for
delete ...
;
SELECT * FROM TABLE(dbms_xplan.display);

slightwv (Netminder) Commented:
Even with an index on mycolor (I also tried compound index variations), based on my simple test, the last one I posted in http:#a40574297 seems to work the best.

Your mileage may vary.
Dave Ford (Software Developer / Database Administrator, Author) Commented:
You nailed it! Thanks, slightwv!
slightwv (Netminder) Commented:
Glad to help!