We help IT Professionals succeed at work.

Check out our new AWS podcast with Certified Expert, Phil Phillips! Listen to "How to Execute a Seamless AWS Migration" on EE or on your favorite podcast platform. Listen Now

x

Perenatge of Duplicates

Anthony Cardullo
on
Medium Priority
246 Views
Last Modified: 2012-05-07
I am trying to find  the Percenatge of duplicates found in a given table e.g I need to know the % of Service calls get a follow up call.
The code below already groups the select result from a view and displays only the records that have a duplicate serial  entry in the view.
THank you
ALTER PROCEDURE [dbo].[usp_CallBackPerSerialNum90Days]
	-- Add the parameters for the stored procedure here
	
AS
 
BEGIN
 
	-- SET NOCOUNT ON added to prevent extra result sets from
	-- interfering with SELECT statements.
	SET ROWCOUNT 0
	SET NOCOUNT ON;
 
SELECT     COUNT(*) AS Recurrance, MODEL, SERIAL, VENDOR, PROGRAM_TYPE, INSTALL_DATE_SEQ, CALL_ID, CALL_STATUS_SCA, CALL_TYPE, 
                      SYMPTOM_CODE, OPEN_DATE_SCA
FROM         ServiceCalls90Days
GROUP BY SERIAL, MODEL, VENDOR, PROGRAM_TYPE, INSTALL_DATE_SEQ, CALL_ID, CALL_STATUS_SCA, CALL_TYPE, SYMPTOM_CODE, 
                      OPEN_DATE_SCA
HAVING      (COUNT(*) > 1)
END

Open in new window

Comment
Watch Question

Anthony CardulloSystem Administrator

Author

Commented:
Upped the points value
Try some of my code from below.  It should be pretty close to what you need, though of course I can't test it.  Let me know!  And if it doesn't work as advertised, let me know specifically which database engine you're using.

Enjoy! :-)



ALTER PROCEDURE [dbo].[usp_CallBackPerSerialNum90Days]
        -- Add the parameters for the stored procedure here
        
AS
 
SET NOCOUNT ON
 
 
/* declare all variables */
DECLARE @numTotal int
DECLARE @numUnique int
DECLARE @numUnqDupes int
DECLARE @numDupes int
DECLARE @numSingles int
DECLARE @pctDupeEntries float
DECLARE @pctUnqDupes float
 
 
/* get counts from database */
 
SET @numTotal = (SELECT COUNT(*) FROM ServiceCalls90Days)
 
SET @numUnique = (SELECT COUNT(DISTINCT SERIAL) FROM ServiceCalls90Days)
 
SET @numUnqDupes = (SELECT COUNT(*)
	FROM (SELECT SERIAL, COUNT(SERIAL) as NumSERIAL FROM ServiceCalls90Days GROUP BY SERIAL) TT
	WHERE TT.NumSERIAL > 1
  )
 
SET @numDupes = (SELECT COUNT(*) FROM ServiceCalls90Days WHERE SERIAL IN (
	SELECT SERIAL
	FROM (SELECT SERIAL, COUNT(SERIAL) as NumSERIAL FROM ServiceCalls90Days GROUP BY SERIAL) TT
	WHERE TT.NumSERIAL > 1
  )
 
SET @numSingles = (SELECT (@numTotal - @numDupes))
/* OR */
--SET @numSingles = (SELECT (@numUnique - @numUnqDupes))
/* OR */
--SET @numSingles = (SELECT COUNT(*) FROM
--	(SELECT SERIAL, COUNT(SERIAL) as NumSERIAL FROM ServiceCalls90Days GROUP BY SERIAL) TT
--	WHERE TT.NumSERIAL = 1
--  )
 
 
/* determine percentage of database entries that are non-unique */
SET @pctDupeEntries = (SELECT (CAST(@numDupes as float)/CAST(@numTotal as float)))
 
/* determine percentage of unique serial numbers that exist more than once in database */
SET @pctUnqDupes = (SELECT (CAST(@numUnqDupes as float)/CAST(@numUnique as float)))
 
 
/* choose the percentage you want, then select as your result */
SELECT @pctDupeEntries as PercentDupes

Open in new window

Anthony CardulloSystem Administrator

Author

Commented:
Msg 156, Level 15, State 1, Procedure usp_CallBackPerSerialNum90Days, Line 45
Incorrect syntax near the keyword 'SET'.
Anthony CardulloSystem Administrator

Author

Commented:
Btw  ServiceCalls90Days  is a view
Anthony CardulloSystem Administrator

Author

Commented:
I think thew problem is in here somwhere
SET @numDupes = (SELECT COUNT(*) FROM ServiceCalls90Days WHERE SERIAL IN (
      SELECT SERIAL
      FROM (SELECT SERIAL, COUNT(SERIAL) as NumSERIAL FROM ServiceCalls90Days GROUP BY SERIAL) TT
      WHERE TT.NumSERIAL > 1
  )
Anthony CardulloSystem Administrator

Author

Commented:
Also  shouldnt there be a * 100  in there somewhere to calculate the actuall % and shouldnt the variables be decimal 10,4 or something?  THanks
Anthony CardulloSystem Administrator

Author

Commented:
SOrry for the many post but i missed your question.  im running 2005 SQL
Sorry, there was a typo-- a parenthesis missing in the block of code you identified.  Try the code below and let me know if you have any further errors.

The fact that ServiceCalls90Days is a view isn't a problem in theory.  Depending on the complexity of the view and its feeder tables, there's always the possibility of efficiency issues.  You'll just have to see how quick your responsiveness is when you implement this stored proc.  However, there's nothing too heavy duty about this code, so any slow-downs you experience will be more the result of the view than anything else.

As for percentages, you can format things however you want.  A percentage is, by definition, a fraction, therefore I formatted this query to return percentages as a decimal number between 0 and 1.  Presumably you're implementing this stored procedure to be called by a program or webpage, and so you should probably do your formatting in the code that calls the stored proc.  If, however, you're going to be displaying these figures exactly as they appear, you could do something like this:

SELECT CAST(@pctDupeEntries*100 as numeric(5,2)) as PercentDupes

The 5 indicates the number of total digits, whereas the 2 indicates the number of digits to the right of the decimal.  Since there won't be a number greater than 100, you only need 3 digits left of the decimal.  However, again, unless you're displaying the number exactly as it's returned by the stored proc, I recommend returning it as a float, because you get greater accuracy that way-- then you can format it in your calling code.

ALTER PROCEDURE [dbo].[usp_CallBackPerSerialNum90Days]
        -- Add the parameters for the stored procedure here
        
AS
 
SET NOCOUNT ON
 
 
/* declare all variables */
DECLARE @numTotal int
DECLARE @numUnique int
DECLARE @numUnqDupes int
DECLARE @numDupes int
DECLARE @numSingles int
DECLARE @pctDupeEntries float
DECLARE @pctUnqDupes float
 
 
/* get counts from database */
 
SET @numTotal = (SELECT COUNT(*) FROM ServiceCalls90Days)
 
SET @numUnique = (SELECT COUNT(DISTINCT SERIAL) FROM ServiceCalls90Days)
 
SET @numUnqDupes = (SELECT COUNT(*)
        FROM (SELECT SERIAL, COUNT(SERIAL) as NumSERIAL FROM ServiceCalls90Days GROUP BY SERIAL) TT
        WHERE TT.NumSERIAL > 1
  )
 
SET @numDupes = (SELECT COUNT(*) FROM ServiceCalls90Days WHERE SERIAL IN (
        SELECT SERIAL
        FROM (SELECT SERIAL, COUNT(SERIAL) as NumSERIAL FROM ServiceCalls90Days GROUP BY SERIAL) TT
        WHERE TT.NumSERIAL > 1
    )
  )
 
SET @numSingles = (SELECT (@numTotal - @numDupes))
/* OR */
--SET @numSingles = (SELECT (@numUnique - @numUnqDupes))
/* OR */
--SET @numSingles = (SELECT COUNT(*) FROM
--      (SELECT SERIAL, COUNT(SERIAL) as NumSERIAL FROM ServiceCalls90Days GROUP BY SERIAL) TT
--      WHERE TT.NumSERIAL = 1
--  )
 
 
/* determine percentage of database entries that are non-unique */
SET @pctDupeEntries = (SELECT (CAST(@numDupes as float)/CAST(@numTotal as float)))
 
/* determine percentage of unique serial numbers that exist more than once in database */
SET @pctUnqDupes = (SELECT (CAST(@numUnqDupes as float)/CAST(@numUnique as float)))
 
 
/* choose the percentage you want, then select as your result */
SELECT @pctDupeEntries as PercentDupes

Open in new window

Anthony CardulloSystem Administrator

Author

Commented:
I am calling this procedure from the reporting services.
Anthony CardulloSystem Administrator

Author

Commented:
OK  this si strange
the Numtotal  and numDupes  are yielding the same acount  and the numUnique and numUnqDupes are also yieliding the same amount?
Anthony CardulloSystem Administrator

Author

Commented:
and the reulst of the query is 1  which is not possible becau ei have way more dupliacte srial numbers in that table
Unlock this solution with a free trial preview.
(No credit card required)
Get Preview
Anthony CardulloSystem Administrator

Author

Commented:
Thank you. Great Contributor
Thank you. :-)
Anthony CardulloSystem Administrator

Author

Commented:
Just wanted to say that i am very happy wiht your solution and the time you took to show me how you derived your comnclusions.  Thanks again!
It was truly my pleasure.  I love this stuff!
Anthony CardulloSystem Administrator

Author

Commented:
Bob, if you dont mind.  How would i alter this SP to reflect the call back %  for every REPRESENTATIVE (field in souce view)  thanks.   ill be happy to open new question if youw ant me too.
Since it's beyond the bounds of your original question, why don't you go ahead and open a new question.  Feel free to repost the code I shared above-- or cross reference this page-- and make sure to specify which percentage calculation you want.  The solution will involve nesting a few queries.  Anyway, once you post the new question, if you want to put a link here, I'll try to take a look at it.  It might be a little while, though, so if someone else can get to it first, that's great too!
Unlock the solution to this question.
Thanks for using Experts Exchange.

Please provide your email to receive a free trial preview!

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.