Solved

manipulate duplicate data

Posted on 2014-11-22
15
120 Views
Last Modified: 2014-11-22
My DOCUMENT table has  duplicate entries when I group BY name, size

The result of this query is

SELECT name, COUNT( * ) , id,filename  FROM DOCUMENT GROUP BY name, size HAVING COUNT( * ) >1

name          COUNT(*)      id      filename
--------------------------------------------------------------
docu1             45               33     fname1
docu2             85               59     fname2
docu3             43               33     fname5

I  want to change the  "filename" of all recurring entries to the filename of the first  entry of the group. (That means the first record of the group will be unchanged)

Can any body help me,
Thank you so much
0
Comment
Question by:myyis
  • 8
  • 7
15 Comments
 
LVL 58

Expert Comment

by:Gary
Comment Utility
So where for example the ID is 33 all entries should be filename1 (not filename5 etc)?
0
 

Author Comment

by:myyis
Comment Utility
Like this

name      size        id       filename
---------------------------------------------
docu1      10         33      fname1
docu1      10         34      fname4    change  to  fname1
docu1      10         35      fname6    change  to  fname1
docu2      20         36      fname7
docu2      20         36      fname8    change  to  fname7

Also  I need to have the list of the old (changed) "filename" values ("fname4","fname6","fname8")

Thank you
0
 
LVL 58

Accepted Solution

by:
Gary earned 500 total points
Comment Utility
Try
UPDATE table1 a
INNER JOIN
(SELECT filename,name FROM table1) b
ON a.name = b.name
SET a.filename=b.filename

Open in new window


list of the old (changed) "filename"
a list where?
0
 

Author Comment

by:myyis
Comment Utility
if possible using SELECT somewhere
0
 
LVL 58

Expert Comment

by:Gary
Comment Utility
Yes, but to do what with it?
But you cannot do a recordset select while doing an update - it's one or the other
0
 

Author Comment

by:myyis
Comment Utility
I will use the list to delete the repeating documents
0
 

Author Comment

by:myyis
Comment Utility
I mean delete from directory. So may be I can use a SELECT first, then the 2d query to change
0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 
LVL 58

Expert Comment

by:Gary
Comment Utility
Delete from what directory?
This question is in the MySQL zone.
0
 

Author Comment

by:myyis
Comment Utility
Yeah I know,
if you can provide me also  the SELECT query, I can use the result set of records to will be changed.
Thank you.
0
 
LVL 58

Expert Comment

by:Gary
Comment Utility
You can use this which will give a comma seperated field called dupes which contains all the grouped filenames

SELECT GROUP_CONCAT(filename) AS dupes FROM table GROUP BY name

Open in new window

0
 

Author Comment

by:myyis
Comment Utility
Thank you for the select but also it gives the results that are unique.
I need something like this ("fname4","fname6","fname8"). Please check above
0
 
LVL 58

Expert Comment

by:Gary
Comment Utility
Is ID a unique auto increment field?
0
 

Author Comment

by:myyis
Comment Utility
No, the PK is (ID,ORID)
0
 
LVL 58

Assisted Solution

by:Gary
Gary earned 500 total points
Comment Utility
SELECT filename
FROM table1 a
WHERE filename NOT IN 
(SELECT filename FROM (select name,filename from table1 group by name) b) 

Open in new window

0
 

Author Closing Comment

by:myyis
Comment Utility
Great!
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Suggested Solutions

Fore-Foreword Today (2016) Maxmind has a new approach to the distribution of its data sets.  This article may be obsolete.  Instead of using the examples here, have a look at the MaxMind API (https://www.maxmind.com/en/geolite2-developer-package). …
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …
This video explains how to create simple products associated to Magento configurable product and offers fast way of their generation with Store Manager for Magento tool.

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now