Solved

Advanced SQL Query (MS-SQL 2008)

Posted on 2011-03-08
5
391 Views
Last Modified: 2012-05-11
I am having problems with a query... unfortunately I need to use fictitious data so here goes.

I need to grab data from 4 different tables.  Table A has a primary key called "id".  This is a foreign key in the other 3 tables, B,C and D.

There is a 'date' field in Table A.  There are 'updated' date fields in the 3 other tables.

There are also 'groupnumber' fields in each of the three tables, B, C and D but not all tables may necessarily have a groupnumber for a given id (well, a.id more specifically)

I need to select all of the "id" from table A that are within a date range and the latest updated 'groupnumbers' along with other fields from the corresponding table.   If a groupnumber does not exist in one of the three tables (B, C, and D), it should return NULL for the fields selected in that table.

For example:

Table A (id, date, ...)
1, Mar 8 2011
2, Mar 8 2011
...

Table B (id, table_a_id, updated, groupnumber, fieldx, fieldy, fieldz)
32423, 1, Mar 8 2011 @ 1pm, 55, 1, 12, 13
32424, 1, Mar 8 2011 @ 3pm, 57, 14, 15, 16
32425, 1, Mar 8 2011 @ 2pm, 55, 17, 18, 19
32426, 2, Mar 8 2011 @ 4pm, 55, 20, 21, 21

Table C (id, table_a_id, updated, groupnumber, fielda, fieldb)
43534, 1, Mar 8 2011 @ 12pm, 55, 23, 24
43535, 2, Mar 8 2011 @ 1pm, 55, 26, 27
43536, 2, Mar 8 2011 @ 1pm, 57, 29, 30

Table D (id, table_a_id, updated, groupnumber, fieldm, fieldn)
23345, 1, Mar 8 2011 @ 3pm, 57, 32, 33
23345, 2, Mar 8 2011 @ 3pm, 54, 35, 36

If the where clause was A.date > 'Mar 8 2011' and A.date < 'Mar 9 2011'

The returned data should be:
(table_a_id, groupnumber, fieldx, fieldy, fieldz, fielda, fieldb, fieldm, fieldn)
1, 55, 17, 18, 19, 26, 27, NULL, NULL
1, 57, 14, 15, 16, 29, 30, 32, 33
2, 54, NULL, NULL, NULL, NULL, NULL, 35, 36

Hope this makes sense!

To recap,

- SELECT all id values from table A within date range
- There will be a unique ID, GroupNumber in each returned row
- I need the most recent groupnumber row (most recent 'dateupdated')
- If groupnumber in one of the three tables does not exist, return NULL for the columns that were to be selected.

I believe I may need to use CTE and/or RANK() OVER PARTITION...  but not certain.

Thank you!
Adrian
0
Comment
Question by:adrian78
  • 2
  • 2
5 Comments
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 35073896
yes use row_number()

but i do't see the need for a cte....

you just need to left join (?) (or Inner Join)
to the FULL JOINed set of the other 3 tables B,C,D

like this
Select a.ID
      ,z.groupnumber
      ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
  From TableA as A
 Left Outer Join
(
  Select coalesce(b.tablea_id,c.tablea_id,d,tablea_id) as id
        ,coalesce(b.groupnumber,c.groupnumber,d.groupnumber) as groupnumber
        ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
    from
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableB as x
       ) as Y
  Where rn=1
) as B
Full Outer Join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableC as x
       ) as Y
  Where rn=1
) as C
on b.tablea_id=c.tablea_id
and b.groupnumber=c.groupnumber
full outer join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableD as x
       ) as Y
  Where rn=1
) as d
on c.tablea_id=b.tablea_id
and c.groupnumber=b.groupnumber
) as Y
) as Z
on A.id=Z.id
Where a,date between '20110308' and '20110309'
order by 1

Open in new window

0
 
LVL 40

Accepted Solution

by:
Sharath earned 350 total points
ID: 35075032
Try this query. Replace @A, @B, @C and @D with your actual table names. I doubt on your expected result.For table_a_id 1 and groupnumber 57, there is no corresponding records in table C, hence fielda and fieldb should be NULL.
;WITH DriverCTE 
     AS (SELECT t1.*, 
                t2.groupnumber 
           FROM @A t1 
                LEFT JOIN (SELECT table_a_id, 
                                  groupnumber 
                             FROM @B 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @C 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @D) t2 
                  ON t1.id = t2.table_a_id) 
SELECT a.id, a.[date], a.groupnumber, b.fieldx, b.fieldy, b.fieldz, c.fielda, c.fieldb, d.fieldm, d.fieldn 
  FROM DriverCTE a 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @B) B 
                   WHERE rn = 1) b 
         ON a.id = b.table_a_id 
            AND a.groupnumber = b.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @C) C 
                   WHERE rn = 1) c 
         ON a.id = c.table_a_id 
            AND a.groupnumber = c.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @D) D 
                   WHERE rn = 1) d 
         ON a.id = d.table_a_id 
            AND a.groupnumber = d.groupnumber 
 WHERE a.[date] BETWEEN '20110308' AND '20110309'

Open in new window

Tested with your sample data.
DECLARE  @A  TABLE( 
                       id INT, 
                   [date] DATETIME 
                   ) 

INSERT @A 
VALUES(1,'2011-03-08'), 
      (2,'2011-03-08') 

DECLARE  @B  TABLE( 
                            id INT, 
                    table_a_id INT, 
                       updated DATETIME, 
                   groupnumber INT, 
                        fieldx INT, 
                        fieldy INT, 
                        fieldz INT 
                   ) 

INSERT @B 
VALUES(32423,1,'2011-03-08 13:00:00',55,1,12,13), 
      (32424,1,'2011-03-08 15:00:00',57,14,15,16), 
      (32425,1,'2011-03-08 14:00:00',55,17,18,19), 
      (32426,2,'2011-03-08 16:00:00',55,20,21,21) 

DECLARE  @C  TABLE( 
                            id INT, 
                    table_a_id INT, 
                       updated DATETIME, 
                   groupnumber INT, 
                        fielda INT, 
                        fieldb INT 
                   ) 

INSERT @C 
VALUES(43534,1,'2011-03-08 12:00:00',55,23,24), 
      (43535,2,'2011-03-08 13:00:00',55,26,27), 
      (43536,2,'2011-03-08 13:00:00',57,29,30) 

DECLARE  @D  TABLE( 
                            id INT, 
                    table_a_id INT, 
                       updated DATETIME, 
                   groupnumber INT, 
                        fieldm INT, 
                        fieldn INT 
                   ) 

INSERT @D 
VALUES(23345,1,'2011-03-08 15:00:00',57,32,33), 
      (23345,2,'2011-03-08 15:00:00',54,35,36); 

;WITH DriverCTE 
     AS (SELECT t1.*, 
                t2.groupnumber 
           FROM @A t1 
                LEFT JOIN (SELECT table_a_id, 
                                  groupnumber 
                             FROM @B 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @C 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @D) t2 
                  ON t1.id = t2.table_a_id) 
SELECT a.id, a.[date], a.groupnumber, b.fieldx, b.fieldy, b.fieldz, c.fielda, c.fieldb, d.fieldm, d.fieldn 
  FROM DriverCTE a 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @B) B 
                   WHERE rn = 1) b 
         ON a.id = b.table_a_id 
            AND a.groupnumber = b.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @C) C 
                   WHERE rn = 1) c 
         ON a.id = c.table_a_id 
            AND a.groupnumber = c.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @D) D 
                   WHERE rn = 1) d 
         ON a.id = d.table_a_id 
            AND a.groupnumber = d.groupnumber 
 WHERE a.[date] BETWEEN '20110308' AND '20110309'
 /*
 id	date	groupnumber	fieldx	fieldy	fieldz	fielda	fieldb	fieldm	fieldn
1	2011-03-08 00:00:00.000	55	17	18	19	23	24	NULL	NULL
1	2011-03-08 00:00:00.000	57	14	15	16	NULL	NULL	32	33
2	2011-03-08 00:00:00.000	54	NULL	NULL	NULL	NULL	NULL	35	36
2	2011-03-08 00:00:00.000	55	20	21	21	26	27	NULL	NULL
2	2011-03-08 00:00:00.000	57	NULL	NULL	NULL	29	30	NULL	NULL
 */

Open in new window

0
 

Author Comment

by:adrian78
ID: 35075144
Hi LowFat... I'm having a problem getting your translated query to execute.  I think there is a missing bracket along the way so I just need to flush that out first and then hopefully get it to run.

Sharath, thanks for your solution.  I will give it a go as well.

Thanks, guys!
0
 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 150 total points
ID: 35080004
yes sorry had extra closing bracket, also had incorrect join on clause for tabled
see attached....

is there any further differentiation tp be applied to determine the latest row when the updated time is the same?

(if you want all rows in those cases then replace row_number() with Dense_rank()   )
Select a.ID
      ,z.groupnumber
      ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
  From tableA as A
 Left Outer Join
(
  Select coalesce(b.tablea_id,c.tablea_id,d.tablea_id) as id
        ,coalesce(b.groupnumber,c.groupnumber,d.groupnumber) as groupnumber
        ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
    from
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableB as x
       ) as Y
  Where rn=1
) as B
Full Outer Join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableC as x
       ) as Y
  Where rn=1
) as C
on b.tablea_id=c.tablea_id
and b.groupnumber=c.groupnumber
full outer join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableD as x
       ) as Y
  Where rn=1
) as d
-- corrected join conditions
on (c.tablea_id=d.tablea_id
and c.groupnumber=d.groupnumber)
or (b.tablea_id=d.tablea_id
and b.groupnumber=d.groupnumber)
--) as Y  
) as Z
on A.id=Z.id
Where a.date between '20110308' and '20110309'
order by 1,2

Open in new window

0
 

Author Comment

by:adrian78
ID: 35092048
Hi Guys,

Lowfat... thank you for your responses.  I tried it out and I did get it to execute but it didn't return the desired result.  It was very close but there were some duplicate entries (ie.  The same id,groupnumber appeared more than once a couple of times out of 100 or so records)... so close but the query needs a bit more tweaking.

Sharath, your solution worked right out of the box and returned the desired result.

I'm not sure who's query is more efficient.. I think if I'm able to get Lowfats solution to work it may be a bit faster but I'm not sure.  Because Sharath's solution worked I'm going to award him with the correct solution but also credit some towards lowfat's solution which is close.
0

Featured Post

Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

Join & Write a Comment

In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now