Query to display missing sequence numbers/range for separate groups/subgroups


I have been struggling with this query for a while now. Table structure is:

Table A : Aid, sequenceNumber, CategoryId, SubCategoryId
TableCategory: CategoryId, CategoryName
TableSubCategory: SubCategoryId, SubCategoryName

For each subcategory the sequence number will reset as well as for each category so for example:
For Category A, subcategory A it will be 1,2,3...n
For Category A, subcategory B it will be 1,2,3...n  
For Category B, subcategory A it will be 1,2,3...n and so on . It will always start from 1 to whatever.

I would like to write a query to find missing values in a sequence which is grouped by category and by subcategory as well as gaps - 'from', 'to' values

So for example, I should be able to tell values 2,3 and 4 through 8 are missing in Category A, subcategory B and so forth... I was able to write it without groupings but I am not sure how to handle 2 tier grouping and make sure each group and subgroup have the correct sequence order.

Let me know if I am not clear and any help would be appreciated.
Who is Participating?
cyberkiwiConnect With a Mentor Commented:
As ranges instead of individual records

select A.CategoryId, A.SubCategoryId, isnull((
      select top(1) c.sequenceNumber from TableA C
      where A.CategoryId=C.CategoryId and A.SubCategoryId=C.SubCategoryId
      and c.sequenceNumber<A.sequenceNumber
      order by c.sequenceNumber desc
      )+1,1) [From], A.sequenceNumber-1 [To]
from TableA A
where A.sequenceNumber>1 and not exists (
      select * from TableA B
      where A.CategoryId=B.CategoryId and A.SubCategoryId=B.SubCategoryId
      and B.sequenceNumber=A.sequenceNumber -1)
Would you care to put out some result rows complete with columns desired?
This gives you the missing numbers
;with A AS (
  select CategoryId, SubCategoryId, max(sequenceNumber) as MaxSeq
  from TableA
  group by CategoryId, SubCategoryId)
select A.CategoryId, A.SubCategoryId, v.number
from A
inner join master..spt_values v on v.type='P' and v.number between 1 and A.MaxSeq
left join TableA B on A.CategoryId=B.CategoryId and A.SubCategoryId=B.SubCategoryId and B.sequenceNumber = v.number
Where B.sequenceNumber is null

Open in new window

2018 Annual Membership Survey

Here at Experts Exchange, we strive to give members the best experience. Help us improve the site by taking this survey today! (Bonus: Be entered to win a great tech prize for participating!)

askaspAuthor Commented:
This was my reference when I started to build up my query.

But I get stuck when I tried to implement 2 levels of grouping. Query is to double check the sequence order so the result set would only contain information about the missing values. An example would be:

Table A_Name  CategoryName SubcategoryName  MissingValueStartAt   MissingValueStopAt
    test                        A                        A1                             20                                 20
    test15                    A                        A2                             3                                   8
    test20                    B                        B3                              123                             123
    test24                    C                        C4                              47                                50

So for Category A, subcategory A1 -  I am missing the value "20".
For Category A , subcategory A2 -  I am missing values 3,4,5,6,7,8
For Category  B, subcategory B3   - I am missing the value "123".

I hope it is a little bit more clear, if not I'd happy to try again.
askaspAuthor Commented:
I think this is working so far at least with my sample data, I will be able to test it with the actual data at work tomorrow.

Have two questions though:
 - What happens if sequenceNumber hasn't set and is NULL for the entire SubCategory? seems like it ignores it?
 - Can you explain the purpose of isnull and "select top(1)" in your query?

The where clause finds sequences that are hanging (has no immediately preceeding sequence number).
So, given a range like   1,3,4,5,8,9,  it will produce 3 and 8.

The top 1.. order by sequencenumber desc goes backwards to find the first (largest) number after the gap.
For 8, it will find 5 and for 3, 1.
If it does find a number, we add 1 to it - that is the first missing number.
If it does not find any, for example a sequence like 4,5,6,7,9  (there is no number before 4), it returns null and we sub that for 1.

Hope that helps.
>   - What happens if sequenceNumber hasn't set and is NULL for the entire SubCategory? seems like it ignores it?

Exactly.  Without a Max sequenceNumber to work with, what should it return?
Do you want to identify this group based on every possible conbinations of CategoryId x SubCategoryId that does not exist in TableA?

union all
select C.CategoryId, S.SubCategoryId, 1 [From], 1 [To]
From TableCategory C
Cross join TableSubCategory S
left join tableA A on A.CategoryId=C.CategoryId and S.SubCategoryId=A.SubCategoryId
askaspAuthor Commented:
Thank you very much. I will try it out and can test better at work tomorrow. I will let you know the first thing in the morning.
askaspAuthor Commented:
Well actually for that I do have a query that double checks for unassigned values and identifies them for each Category and Subcategory.

The application I am developing is at its final stage where user will re-order each entry the way they want to see on the application interface. So it will be the user re-ordering 3k plus entries around under each category and subcategory, and I was trying to implement as many checks for them to make the process a little bit easier :)
askaspAuthor Commented:

It works the way I expected; however there are couple of things I'd like to ask you for improvement.

 - Is it possible to write the same query with "inner/outer join" syntax which could be more efficient?

 - And again since I will be showing the names of CategoryName and SubCategoryName instead of just their id's in TableA, would it be possible to join the tables with their foreign keys so that I can access to the fields in other tables as well instead of using only self join on TableA. (CategoryId and SubCategoryId are FK's in TableA)

So maybe work off of a joined subquery and then loop through the missing values/ranges?

Thank you for your time and effort on this.
askaspAuthor Commented:
Ok so after playing around a little bit , this is what I have so far

select Category.CategoryName, SubCategory.SubCategoryName
, isnull((
select top(1) c.sequenceNumber from TableA C   --inner join
where A.CategoryId=C.CategoryId and A.SubCategoryId=C.SubCategoryId
and c.sequenceNumber1 and not exists (
select * from TableA B       --inner join A on
where A.CategoryId=B.CategoryId and A.SubCategoryId=B.SubCategoryId
and B.sequenceNumber=A.sequenceNumber -1)
I managed to join the tables to pull CategoryName and SubCategoryName.  But I couldn't get the other ones right (the bold commented out text). And at the bottom I can not use inner join A as an alias since it gives an error so this is as far as I could come by tweaking your query. Thought I'd let you know to make myself more clear.
And if we can't get this one I will go with what we have.
askaspAuthor Commented:

I think I am happy with the way it is right now after tweaking it a little bit more. Thank you for your help.

askaspAuthor Commented:
Solution was clear and put me in the right direction.
All Courses

From novice to tech pro — start learning today.