Link to home
Start Free TrialLog in
Avatar of drotkopf
drotkopfFlag for United States of America

asked on

Detect Duplicate Values

Hello Experts,
Here's a simple one :-)
I have a table with (simplified) these fields:
DateTime, PhoneNumber, Duration.
I want to find
(A) a phone number that was called 5 times during a datetime interval (let's say, between '2004-09-20 08:00:00' and '2004-09-25 20:00:00' , and know the DateTime value for that. What's the SQL statement I should use?
and
(B) the same as (A), but I want to narrow down the datetime interval per day. For example, the first day between 1pm and 1:30 pm, the second day between 9:15 am and 9:35 am, etc.
I'm not a SQL expert, so please be precise and complete in your syntax.
Thanks!
Avatar of geotiger
geotiger

(A)

select phonenumber, count(*) as cnt
from your_table
where [DATETIME] between cast('2004-09-20 08:00:00' as datetime ) and cast('2004-09-25 20:00:00' as datetime )
having count(*) = 5
group by phonenumber;

(B) the same with specific time period:

select phonenumber, count(*) as cnt
from your_table
where [DATETIME] between cast('2004-09-20 13:00:00' as datetime ) and cast('2004-09-20 13:30:00' as datetime )
Or [DATETIME] between cast('2004-09-21 09:15:00' as datetime ) and cast('2004-09-21 09:35:00' as datetime )
having count(*) = 5
group by phonenumber;
Avatar of Scott Pletcher
I may have misinterpreted your q, but if not, I think the second one is somewhat trickier than that because you could have 5 calls within one time period only and presumably you wouldn't want a match then, that is, only match if there was at least one call in every different period and 5 overall, say.  

If I have misinterpreted, please let me know.

Something like this should guarantee at least one call in each time period and a certain number overall:


SELECT phoneNumber, time1Count + time2Count AS [Total Calls]
FROM (
      SELECT phonenumber,
            SUM(CASE WHEN [datetime] BETWEEN '2004-09-20 13:00:00' AND '2004-09-20 13:30:00' THEN 1 ELSE 0 END) AS time1Count,
            SUM(CASE WHEN [datetime] BETWEEN '2004-09-21 09:15:00' AND '2004-09-21 09:35:00' THEN 1 ELSE 0 END) AS time2Count
      FROM yourTable
      GROUP BY phonenumber
) AS derived
WHERE time1Count > 0 AND time2Count > 0
AND (time1Count + time2Count) >= 5
Avatar of drotkopf

ASKER

geotinger: it works fine, but I'm getting too many answers (750+).

ScottPletcher: you  had a very good idea, however, please check it, because I'm getting 0 records as result, even though I'm entering from 1 am to 11 pm (calls are made 8-8). If I'm getting 750+ using geotinger's, I should get them with you statement also when using the 1 am to 11 pm range for each of the days. Am I wrong?
Also, I think we can do away with (time1Count + time2Count + ...) >= 5 because that comes automatic with WHERE time1Count > 0 AND time2Count > 0... Isn't it?

I'd appreciate you input.
ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
>> Also, I think we can do away with (time1Count + time2Count + ...) >= 5 because that comes automatic with WHERE time1Count > 0 AND time2Count > 0... Isn't it? <<

You're right, depending on the number of different date ranges.  If you have five different ranges, and only require a total of 5 or more, then yes, you don't need it.  But of course if you have only 2 or 3 different ranges but want a total of 5 calls, then you will need to compare the total also.