best possible solution

if Im trying to delete duplicates by comparing two columns from two different tables and each table has too many rows to do a simple delete query what would be the best possible solution should I create a function like


rs.open"select * from table"
rs2.open"select distinct column1 from table"

do until rs.eof
if rs.column1 = rs2.colunmn then
delete.addcurrent
rs.movenext
end if
loop
LVL 8
thenoneAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Leo EikelmanDirector, IT and Business DevelopmentCommented:
if you have restraints set up it might be difficult to just delete rows in a table.

Follow this link for instruction on how to properly remove duplicates

http://weblogs.sqlteam.com/jeffs/archive/2004/10/07/2190.aspx


Leo
0
thenoneAuthor Commented:
im not using sql im using mysql
0
thenoneAuthor Commented:
also I have tried just a straight sql statement but the tables are too big for my cpu to cope with the sql statement.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Leo EikelmanDirector, IT and Business DevelopmentCommented:
Well if you can't use SQL functions because of your cpu, then it's going to be quite difficult to get all the entries from one table and see if they exist in another table.

If you wanted to delete all the duplicates from a single table, in mysql you can do

ALTER IGNORE TABLE tableName ADD UNIQUE INDEX(a,b);

I'll can try to do some more digging, but I'm not sure how you are going to get all the entires from table A and comapre to table B without using SQL SELECT statements.

Leo
0
thenoneAuthor Commented:
I can still use sql select statements but when I tried to do lets say delete table from table1.column where union table2.column = table1.column

my cpu goes into a tales spin becasue there are too many records to go thru.
0
thenoneAuthor Commented:
not sure if I exactly wrote that correctly.but in either case it still throws my processor into a tales spin.
0
Arthur_WoodCommented:
how big are your tables?  I find it hard to believe that they could be 'too big for my cpu to cope with the SQL statement' (not that I really understand what that statment means, in the first place)

How about

SQL = "DELETE FROM TABLE2 where Table2.Column1 in (Select Column1 from Table1)"

AW
0
Leo EikelmanDirector, IT and Business DevelopmentCommented:
Well in that case you can do paging to select only a set number of records at a time.

SELECT TOP 200 *
FROM YourTableName
WHERE YourCondition

Try to do this and then run a query that failed before.

Leo
0
thenoneAuthor Commented:
Ive tried this and still recieved too much processing from my cpu.

"DELETE FROM TABLE2 where Table2.Column1 in (Select Column1 from Table1)"


one table I have almost 200,000 records and the other is about the same.
0
Leo EikelmanDirector, IT and Business DevelopmentCommented:
did you try my above comment?

Try this to see if it is actually a cpu issue

"DELETE FROM Table2 WHERE Table2.Column1 IN(
SELECT TOP 200 FROM Table1)"


Leo
0
thenoneAuthor Commented:
I will give that a try but is it possible that there are too many records for the query?
0
thenoneAuthor Commented:
also if that is the case how can I do all of the records not just the top 200?
0
Leo EikelmanDirector, IT and Business DevelopmentCommented:
it gets kinda complicated.

so, if i want records 120 to 150, i"l do this:
DELETE FROM Table2 WHERE Table2.cloumn1 IN
(SELECT TOP 150 * FROM Table1 ORDER BY someOrderField) L
LEFT JOIN
(SELECT TOP 119 * FROM Table1 ORDER BY someOrderField) R
ON L.id = R.id
WHERE R.id IS NULL
ORDER BY L.someOrderField

Here you want to make sure that in your first and second select they are ordered by the same variable.

I haven't actually tried this before but it should work.

Leo
0
thenoneAuthor Commented:
so you have never ran into a problem where you couldn't delete records with a sql statenment becasue there was too many records to delete and it would cause your computer to lock up?
0
Leo EikelmanDirector, IT and Business DevelopmentCommented:
personally no.  Most of the computers I've worked with have 4GB of memory.  But, have you tried my suuggestion up top to see if your computer will process that?

Leo
0
thenoneAuthor Commented:
I wonder if it could be my default settings for mysql.My server that I am running it on is a p4 with 512mb of memory it should be able to handle that query.If not couldn't I just loop through each record one at a time?
0
PSSUserCommented:
The statement
"DELETE FROM TABLE2 where Table2.Column1 in (Select Column1 from Table1)"
doesn't make effecient use of indexes. A better statement would be:
DELETE FROM TABLE2 WHERE EXISTS (SELECT Column1 FROM Table1 WHERE Table1.Column1=Table2.Column1)
provided you have indexes on Column1 in both tables 1 and 2.

Otherwise if you wish to do this in code the best way would be:

rs.open"select distinct column1 from table2"

do until rs2.eof
  Con.Execute "DELETE FROM Table1 WHERE Column1='" & rs.Column1.value & "'"
  rs.movenext
loop

Where Con is an ADO Connection object.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
thenoneAuthor Commented:
so basically what you are saying is,is the reason why my cpu is over killing is because I do not have indexes set up for those columns?
0
PSSUserCommented:
A lack of indexes certainly won't help as the server will have to do table scans.

It may be that you will still have a problem with the tables indexed, but far less likely.

Thanks for the points
0
thenoneAuthor Commented:
I figured it out I changed my columns from blob text to varchar
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Visual Basic Classic

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.