asked on

SQL 2012 identify continous numbers

I have the following numbers
Id FirstNumber SecondNumber
1 14978 15135
2 15135 15256
3 15137 15256
4 15154 15256
5 15156 15256
6 15157 15175
7 15175 15256
8 15256 15292
I would like to be able to remove records where the second number is not continuous to the first record.
So the result would be
Id FirstNumber SecondNumber
1 14978 15135
2 15135 15256
3 15256 15292

Here is some of the code to create the table

Create table #NumberExample
(ID identity(1,1)
,FirstNumber int
,SecondNumber int)

insert into (FirstNumber,SecondNumber)
select 14978, 15135 union all
select 15135, 15256 union all
select 15137, 15256 union all
select 15154, 15256 union all
select 15156, 15256 union all
select 15157, 15175 union all
select 15175, 15256 union all
select 15256, 15292

Thanks for all the help!

Jim Horn

That's going to be a little more involved than a simple answer, as it's asking for 'gaps and islands' in a series, but I have an article out there called T-SQL: Identify bad dates in a time series that does exactly that for a date time series, which you can manipulate to handle your situation.

Good luck.

yanci1179

ASKER

I'm sorry if I don't understand, but your example has policy_number, which allows for some level of grouping. In my example, I don't have an identifying number other than the identity key. Also, I don't have null values or the first number will not be greater than the second number.

Thanks again for the help.

yanci1179

ASKER

for this example, I can flag the min and max id's where 15256 exists and then I can get the min of number1 and max of number 2. this works for this example, but it doesn't for the following:

ID Number1 Number2
1 16 40
2 29 339
3 40 41
4 41 339

The result should be
Number1 Number 2
16 40
40 41
41 339

Aaron Tomosky

How about simply looking for a distinct number2? In your example the first distinct number2 is the correct one and the bad ones are all duplicates. So something like this (I don't use partition by often so this may not run as is)

select
    id,
    number1,
    number2

from
(
select 
    *, 
    row_number() over (partition by number2 order by id) as RowNbr 

from YourTable
) source

where RowNbr = 1

Open in new window

Mark Wills

Interesting problem, especially the second example where they are not in sequence and would need to be sorted first.

Is there ever case where they might overlap, or, numbers might go missing ?

for example, in your 2nd example, there really isn't a justifiable starting point for second row, but assuming that is the case, could we also get (say) another row 500 - 600 and hen another overlapping 399-602 ?

While you are thinking of that, I will think of how I might solve...

Thanks for the question, sounds challenging :)

PortletPaul

try this please:

select
       ne.ID, ne.Number1, ne.Number2
from NumberExample2 ne
cross apply (
        SELECT 1
        FROM NumberExample2 ne2
        WHERE ne.Number1 < ne2.Number2 AND ne2.Number1 < ne.Number2
        AND NOT (ne.Number1 = ne2.Number1 AND ne2.Number2 = ne.Number2)
            ) as ca (isLapped)
group by
       ne.ID, ne.Number1, ne.Number2
having count(*) = 1
;

Open in new window

note I was using the smaller second set of numbers to test with.

see: http://sqlfiddle.com/#!6/d48ee/2

    CREATE TABLE NumberExample2
        ([ID] int, [Number1] int, [Number2] int)
    ;
        
    INSERT INTO NumberExample2
        ([ID], [Number1], [Number2])
    VALUES
        (1, 16, 40),
        (2, 29, 339),
        (3, 40, 41),
        (4, 41, 339)
    ;
    
    
**Query 1**:

    select
           ne.ID, ne.Number1, ne.Number2
    from NumberExample2 ne
    cross apply (
            SELECT 1
            FROM NumberExample2 ne2
            WHERE ne.Number1 < ne2.Number2 AND ne2.Number1 < ne.Number2
            AND NOT (ne.Number1 = ne2.Number1 AND ne2.Number2 = ne.Number2)
                ) as ca (isLapped)
    group by
           ne.ID, ne.Number1, ne.Number2
    having count(*) = 1
    

**[Results][2]**:
    | ID | Number1 | Number2 |
    |----|---------|---------|
    |  1 |      16 |      40 |
    |  3 |      40 |      41 |
    |  4 |      41 |     339 |

Open in new window

PortletPaul

mmm, seems I should have used an OUTER APPLY (not CROSS APPLY) but my suggestion doesn't produce the expected result for the first set. sorry.

Nick67

Doing something in VBA in Access, or in SQL Server in a stored procedure wouldn't be that hard
You'd load your table in an ordered fashion.
You'd create a table variable
You'd throw SecondNumber into a variable @TheSecond
Then you'd walk down the ordered table, and toss any record where FirstNumber <= @TheSecond into the table variable. For any record failing that test, the associated SecondNumber becomes @TheSecond
Keep walking to the end of the ordered table
Join the table variable to the table on the PK and delete what's there.

But that's a RBAR solution, and not straight SQL

But @Paul, if I use these numbers in your fiddle

(1, 16, 40),
(2, 29, 339),
(3, 40, 41),
(4, 41, 339),
(5, 15, 21),
(6, 92, 440),
(7, 65, 600)

I would expect to get ID=5 and ID = 2 back
But your solution yields 5 and 3.
I defer to you, though, on set based solutions

Nick67

Kevin Cross

Is this really SQL 2012?
If yes, remember you have the LEAD and LAG functions as well as the ability to do more windowing. Therefore, I can search for the LEAD OR LAG value is one off from my own. Can go into this further if we are dealing with SQL 2012.

EugeneZ

you need something to do with your sample data

1 14978 15135
2 15135 15256
3 15137 15256
4 15154 15256
5 15156 15256
6 15157 15175
7 15175 15256
8 15256, 15292

and as an another idea for the code

select e.id,LEAD(e.firstNumber,0) OVER (ORDER BY e.firstNumber
) Num1,
LAG(e.SecondNumber,0) OVER (ORDER BY e.firstNumber)Num2
from NumberExample e

ASKER CERTIFIED SOLUTION

Mark Wills

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

yanci1179

ASKER

Hi Mark,

it does work when the numbers are separate, but all the records are in one table. I have actually a total of 134 records for this batch. Attached is an insert table for the records. I did a convoluted join....like the following and seemed to work for this example, but it did not work for a different batch of numbers....so I'm missing something

select * from #NumberExample
where ID not in (
select
distinct result1.ID
from #NumberExample result1
inner join #NumberExample result2
on result1.FirstNumber = result2.SecondNumber
) and ID = 1
union
select distinct result1.*
from #NumberExample result1
inner join #NumberExample result2
on result1.FirstNumber = result2.SecondNumber

Thank you everyone for the help! I have been going a little crazy on trying to figure this out. If someone has another solution or sees where I have a flaw in this query please let me know!
SeriesNumbers_v1.sql

SOLUTION

EugeneZ

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

yanci1179

ASKER

OH WOW!! thank you so much!! That solved it!

yanci1179

ASKER

Mark Thank you again for your response. I really appreciate everyone's help.

Mark Wills

Hi yanci1179,

Sorry I didn't respond to your comment earlier, disadvantage of time-zone variations. But very happy that there is great support in Experts-Exchange whereby EugeneZ was able to step-up and help both of us.

And glad the Experts were able to help you find a solution. I for one, enjoyed your question :)

EugeneZ, thanks mate, much appreciated and most welcomed your collaboration and support.

Cheers,
Mark