Link to home
Start Free TrialLog in
Avatar of Sha1395
Sha1395

asked on

how to remove duplicate rows

I have the following query to retrieve bunch of data but somehow it creates duplicate rows for certain records.I tried distinct but no use

What i am doing wrong here
SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, AssessmentEcosystemCreditID,dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 
(COALESCE(dbo.tblManagementZone.SiteValueCurrentScore,0) 
-COALESCE(dbo.tblManagementZone.SiteValueFutureScore,0)) AS LossinSiteValueScore, 
5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp, 
dbo.tblVegetationZone.EECID, 
CASE WHEN dbo.tblVegetationZone.EECID > 0 THEN 3.0 
ELSE 1.0 
END AS EECOffSetMultiplier 
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblVegetationType 
ON dbo.tblAssessmentEcosystemCredit.VegTypeID = dbo.tblVegetationType.VegTypeID 
INNER JOIN dbo.tblManagementZone 
ON dbo.tblAssessmentEcosystemCredit.ManagementZoneID = dbo.tblManagementZone.ManagementZoneID 
INNER JOIN dbo.tblVegetationZone 
ON dbo.tblVegetationZone.VegetationZoneID = dbo.tblManagementZone.VegetationZoneID 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= @AssessmentVersionID

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Peter Chan
Peter Chan
Flag of Hong Kong image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Sha1395
Sha1395

ASKER

Thanks HuaMinChen, after modified the query based on your suggestion still populate the duplicate records(same record count)
Do you mean the same record appears more than once? Can you show the output data?
Avatar of Sha1395

ASKER

Even i modified the query like below

select distinct a.* from(SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, AssessmentEcosystemCreditID,dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 

5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp 
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= 2151)a

Open in new window

Book1.xlsx
Avatar of Sha1395

ASKER

huh found it, this probably helps me

select distinct ManagementZoneID from tblAssessmentEcosystemCredit where AssessmentVersionID=2151

Open in new window


will return only 14  records, thats the correct count, thats what i am trying to do. Any idea how to modify my main query
In your Excel, can you point out which rows are duplicated ones?
Avatar of Sha1395

ASKER

ManagementZoneID is the duplicate ones ,for example ManagementZoneID="14340" keep duplicate
But the so-called duplicated rows are having different values of

AssessmentEcosystemCreditID,AssessmentVersionID,...
Avatar of Sha1395

ASKER

Check the ManagementZOne ID "14340" all the values are same ,so it clearly shows its duplicate,if i delete the record based on ManangementZone Id, i will solve this issue
Book2.xlsx
if i delete the record based on ManangementZone Id, i will solve this issue

Please do the backup of the records and proceed to next if possible.
Avatar of Sha1395

ASKER

Do you suggest to delete to ? or still am i missing something ?
Avatar of Sha1395

ASKER

Sorry and what's the best approach to delete those duplicate records based on ManagementZoneID ?
Can you list out the records of this

SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, AssessmentEcosystemCreditID,dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 
(COALESCE(dbo.tblManagementZone.SiteValueCurrentScore,0) 
-COALESCE(dbo.tblManagementZone.SiteValueFutureScore,0)) AS LossinSiteValueScore, 
5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp, 
dbo.tblVegetationZone.EECID, 
CASE WHEN dbo.tblVegetationZone.EECID > 0 THEN 3.0 
ELSE 1.0 
END AS EECOffSetMultiplier 
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblVegetationType 
ON dbo.tblAssessmentEcosystemCredit.VegTypeID = dbo.tblVegetationType.VegTypeID 
INNER JOIN dbo.tblManagementZone 
ON dbo.tblAssessmentEcosystemCredit.ManagementZoneID = dbo.tblManagementZone.ManagementZoneID 
INNER JOIN dbo.tblVegetationZone 
ON dbo.tblVegetationZone.VegetationZoneID = dbo.tblManagementZone.VegetationZoneID 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= @AssessmentVersionID
order by 1

Open in new window


to one list again and show it to me?
Avatar of Sha1395

ASKER

I have enclosed the output for your view. Thanks again for your help
Book3.xlsx
In the following
User generated image
do you mean the records are repeating? but how about the 2nd column is having different values?
Avatar of Sha1395

ASKER

second column in primary (column)id for that table its called "AssessmentEcoSystemCreditID"
Can you exclude 2nd column and use my original way (by my 1st reply above)?
Avatar of Sha1395

ASKER

i tried the following

Returns 492 record wich is wrong

select distinct a.* from(SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, dbo.tblAssessmentEcosystemCredit.AssessmentVersionID , 
(COALESCE(dbo.tblManagementZone.SiteValueCurrentScore,0) 
-COALESCE(dbo.tblManagementZone.SiteValueFutureScore,0)) AS LossinSiteValueScore, 
5 AS SaveType, dbo.ufn_varbintohexstr(dbo.tblAssessmentEcosystemCredit.RowTimestamp) AS RowTimestamp
FROM dbo.tblAssessmentEcosystemCredit 
INNER JOIN dbo.tblVegetationType 
ON dbo.tblAssessmentEcosystemCredit.VegTypeID = dbo.tblVegetationType.VegTypeID 
INNER JOIN dbo.tblManagementZone 
ON dbo.tblAssessmentEcosystemCredit.ManagementZoneID = dbo.tblManagementZone.ManagementZoneID 
INNER JOIN dbo.tblVegetationZone 
ON dbo.tblVegetationZone.VegetationZoneID = dbo.tblManagementZone.VegetationZoneID 
INNER JOIN dbo.tblAssessmentVersion AV ON 
AV.AssessmentVersionID = dbo.tblAssessmentEcosystemCredit.AssessmentVersionID 
INNER JOIN tblAssessment TBA ON 
TBA.AssessmentID = AV.AssessmentID 
WHERE dbo.tblAssessmentEcosystemCredit.AssessmentVersionID= 2151)a

Open in new window


I tried the following query return 14 Record which is correct

Select distinct a.* from (SELECT Distinct dbo.tblAssessmentEcosystemCredit.ManagementZoneID, dbo.tblAssessmentEcosystemCredit.AssessmentVersionID  from tblAssessmentEcosystemCredit where AssessmentVersionID =2151)a

Open in new window

I told you 2nd column is the reason
Avatar of Sha1395

ASKER

even i removed the second column from the main query not returning the correct records
Avatar of Sha1395

ASKER

So do you think remove the duplicates from the table is a wise idea like below

delete from tblAssessmentEcosystemCredit where AssessmentEcosystemCreditID not in(
select min (AssessmentEcosystemCreditID) from tblAssessmentEcosystemCredit where AssessmentVersionID=2151 group by ManagementZoneID)
and AssessmentVersionID=2151 

Open in new window

Yes, but please do backup to the tables and ensure you can restore them back if needed!
Avatar of Sha1395

ASKER

Thanks a lot for your help and advice HuaMinChen.
Avatar of Sha1395

ASKER

I just used  delete query to remove the duplicates.