Avatar of Sha1395
Sha1395
 asked on

how to remove duplicate rows

I have the following query to retrieve bunch of data but somehow it creates duplicate rows for certain records.I tried distinct but no use

What i am doing wrong here
SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, AssessmentEcosystemCreditID,dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 
(COALESCE(dbo.tblManagementZone.SiteValueCurrentScore,0) 
-COALESCE(dbo.tblManagementZone.SiteValueFutureScore,0)) AS LossinSiteValueScore, 
5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp, 
dbo.tblVegetationZone.EECID, 
CASE WHEN dbo.tblVegetationZone.EECID > 0 THEN 3.0 
ELSE 1.0 
END AS EECOffSetMultiplier 
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblVegetationType 
ON dbo.tblAssessmentEcosystemCredit.VegTypeID = dbo.tblVegetationType.VegTypeID 
INNER JOIN dbo.tblManagementZone 
ON dbo.tblAssessmentEcosystemCredit.ManagementZoneID = dbo.tblManagementZone.ManagementZoneID 
INNER JOIN dbo.tblVegetationZone 
ON dbo.tblVegetationZone.VegetationZoneID = dbo.tblManagementZone.VegetationZoneID 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= @AssessmentVersionID

Open in new window

Microsoft SQL Server 2008

Avatar of undefined
Last Comment
Sha1395

8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
Peter Chan

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
Sha1395

ASKER
Thanks HuaMinChen, after modified the query based on your suggestion still populate the duplicate records(same record count)
Peter Chan

Do you mean the same record appears more than once? Can you show the output data?
Sha1395

ASKER
Even i modified the query like below

select distinct a.* from(SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, AssessmentEcosystemCreditID,dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 

5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp 
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= 2151)a

Open in new window

Book1.xlsx
Your help has saved me hundreds of hours of internet surfing.
fblack61
Sha1395

ASKER
huh found it, this probably helps me

select distinct ManagementZoneID from tblAssessmentEcosystemCredit where AssessmentVersionID=2151

Open in new window


will return only 14  records, thats the correct count, thats what i am trying to do. Any idea how to modify my main query
Peter Chan

In your Excel, can you point out which rows are duplicated ones?
Sha1395

ASKER
ManagementZoneID is the duplicate ones ,for example ManagementZoneID="14340" keep duplicate
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Peter Chan

But the so-called duplicated rows are having different values of

AssessmentEcosystemCreditID,AssessmentVersionID,...
Sha1395

ASKER
Check the ManagementZOne ID "14340" all the values are same ,so it clearly shows its duplicate,if i delete the record based on ManangementZone Id, i will solve this issue
Book2.xlsx
Peter Chan

if i delete the record based on ManangementZone Id, i will solve this issue

Please do the backup of the records and proceed to next if possible.
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
Sha1395

ASKER
Do you suggest to delete to ? or still am i missing something ?
Sha1395

ASKER
Sorry and what's the best approach to delete those duplicate records based on ManagementZoneID ?
Peter Chan

Can you list out the records of this

SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, AssessmentEcosystemCreditID,dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 
(COALESCE(dbo.tblManagementZone.SiteValueCurrentScore,0) 
-COALESCE(dbo.tblManagementZone.SiteValueFutureScore,0)) AS LossinSiteValueScore, 
5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp, 
dbo.tblVegetationZone.EECID, 
CASE WHEN dbo.tblVegetationZone.EECID > 0 THEN 3.0 
ELSE 1.0 
END AS EECOffSetMultiplier 
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblVegetationType 
ON dbo.tblAssessmentEcosystemCredit.VegTypeID = dbo.tblVegetationType.VegTypeID 
INNER JOIN dbo.tblManagementZone 
ON dbo.tblAssessmentEcosystemCredit.ManagementZoneID = dbo.tblManagementZone.ManagementZoneID 
INNER JOIN dbo.tblVegetationZone 
ON dbo.tblVegetationZone.VegetationZoneID = dbo.tblManagementZone.VegetationZoneID 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= @AssessmentVersionID
order by 1

Open in new window


to one list again and show it to me?
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Sha1395

ASKER
I have enclosed the output for your view. Thanks again for your help
Book3.xlsx
Peter Chan

In the following
23
do you mean the records are repeating? but how about the 2nd column is having different values?
Sha1395

ASKER
second column in primary (column)id for that table its called "AssessmentEcoSystemCreditID"
Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
Peter Chan

Can you exclude 2nd column and use my original way (by my 1st reply above)?
Sha1395

ASKER
i tried the following

Returns 492 record wich is wrong

select distinct a.* from(SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 
(COALESCE(dbo.tblManagementZone.SiteValueCurrentScore,0) 
-COALESCE(dbo.tblManagementZone.SiteValueFutureScore,0)) AS LossinSiteValueScore, 
5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblVegetationType 
ON dbo.tblAssessmentEcosystemCredit.VegTypeID = dbo.tblVegetationType.VegTypeID 
INNER JOIN dbo.tblManagementZone 
ON dbo.tblAssessmentEcosystemCredit.ManagementZoneID = dbo.tblManagementZone.ManagementZoneID 
INNER JOIN dbo.tblVegetationZone 
ON dbo.tblVegetationZone.VegetationZoneID = dbo.tblManagementZone.VegetationZoneID 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= 2151)a

Open in new window


I tried the following query return 14 Record which is correct

Select distinct a.* from (SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, dbo.tblAssessmentEcosystemCredit.AssessmentVersionID  from tblAssessmentEcosystemCredit where AssessmentVersionID =2151)a

Open in new window

Peter Chan

I told you 2nd column is the reason
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Sha1395

ASKER
even i removed the second column from the main query not returning the correct records
Sha1395

ASKER
So do you think remove the duplicates from the table is a wise idea like below

delete from tblAssessmentEcosystemCredit where AssessmentEcosystemCreditID not in(
select min (AssessmentEcosystemCreditID) from tblAssessmentEcosystemCredit where AssessmentVersionID=2151 group by ManagementZoneID)
and AssessmentVersionID=2151 

Open in new window

Peter Chan

Yes, but please do backup to the tables and ensure you can restore them back if needed!
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
Sha1395

ASKER
Thanks a lot for your help and advice HuaMinChen.
Sha1395

ASKER
I just used  delete query to remove the duplicates.