Solved

Advanced SQL Query (MS-SQL 2008)

Posted on 2011-03-08
5
398 Views
Last Modified: 2012-05-11
I am having problems with a query... unfortunately I need to use fictitious data so here goes.

I need to grab data from 4 different tables.  Table A has a primary key called "id".  This is a foreign key in the other 3 tables, B,C and D.

There is a 'date' field in Table A.  There are 'updated' date fields in the 3 other tables.

There are also 'groupnumber' fields in each of the three tables, B, C and D but not all tables may necessarily have a groupnumber for a given id (well, a.id more specifically)

I need to select all of the "id" from table A that are within a date range and the latest updated 'groupnumbers' along with other fields from the corresponding table.   If a groupnumber does not exist in one of the three tables (B, C, and D), it should return NULL for the fields selected in that table.

For example:

Table A (id, date, ...)
1, Mar 8 2011
2, Mar 8 2011
...

Table B (id, table_a_id, updated, groupnumber, fieldx, fieldy, fieldz)
32423, 1, Mar 8 2011 @ 1pm, 55, 1, 12, 13
32424, 1, Mar 8 2011 @ 3pm, 57, 14, 15, 16
32425, 1, Mar 8 2011 @ 2pm, 55, 17, 18, 19
32426, 2, Mar 8 2011 @ 4pm, 55, 20, 21, 21

Table C (id, table_a_id, updated, groupnumber, fielda, fieldb)
43534, 1, Mar 8 2011 @ 12pm, 55, 23, 24
43535, 2, Mar 8 2011 @ 1pm, 55, 26, 27
43536, 2, Mar 8 2011 @ 1pm, 57, 29, 30

Table D (id, table_a_id, updated, groupnumber, fieldm, fieldn)
23345, 1, Mar 8 2011 @ 3pm, 57, 32, 33
23345, 2, Mar 8 2011 @ 3pm, 54, 35, 36

If the where clause was A.date > 'Mar 8 2011' and A.date < 'Mar 9 2011'

The returned data should be:
(table_a_id, groupnumber, fieldx, fieldy, fieldz, fielda, fieldb, fieldm, fieldn)
1, 55, 17, 18, 19, 26, 27, NULL, NULL
1, 57, 14, 15, 16, 29, 30, 32, 33
2, 54, NULL, NULL, NULL, NULL, NULL, 35, 36

Hope this makes sense!

To recap,

- SELECT all id values from table A within date range
- There will be a unique ID, GroupNumber in each returned row
- I need the most recent groupnumber row (most recent 'dateupdated')
- If groupnumber in one of the three tables does not exist, return NULL for the columns that were to be selected.

I believe I may need to use CTE and/or RANK() OVER PARTITION...  but not certain.

Thank you!
Adrian
0
Comment
Question by:adrian78
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
5 Comments
 
LVL 50

Expert Comment

by:Lowfatspread
ID: 35073896
yes use row_number()

but i do't see the need for a cte....

you just need to left join (?) (or Inner Join)
to the FULL JOINed set of the other 3 tables B,C,D

like this
Select a.ID
      ,z.groupnumber
      ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
  From TableA as A
 Left Outer Join
(
  Select coalesce(b.tablea_id,c.tablea_id,d,tablea_id) as id
        ,coalesce(b.groupnumber,c.groupnumber,d.groupnumber) as groupnumber
        ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
    from
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableB as x
       ) as Y
  Where rn=1
) as B
Full Outer Join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableC as x
       ) as Y
  Where rn=1
) as C
on b.tablea_id=c.tablea_id
and b.groupnumber=c.groupnumber
full outer join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableD as x
       ) as Y
  Where rn=1
) as d
on c.tablea_id=b.tablea_id
and c.groupnumber=b.groupnumber
) as Y
) as Z
on A.id=Z.id
Where a,date between '20110308' and '20110309'
order by 1

Open in new window

0
 
LVL 41

Accepted Solution

by:
Sharath earned 350 total points
ID: 35075032
Try this query. Replace @A, @B, @C and @D with your actual table names. I doubt on your expected result.For table_a_id 1 and groupnumber 57, there is no corresponding records in table C, hence fielda and fieldb should be NULL.
;WITH DriverCTE 
     AS (SELECT t1.*, 
                t2.groupnumber 
           FROM @A t1 
                LEFT JOIN (SELECT table_a_id, 
                                  groupnumber 
                             FROM @B 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @C 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @D) t2 
                  ON t1.id = t2.table_a_id) 
SELECT a.id, a.[date], a.groupnumber, b.fieldx, b.fieldy, b.fieldz, c.fielda, c.fieldb, d.fieldm, d.fieldn 
  FROM DriverCTE a 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @B) B 
                   WHERE rn = 1) b 
         ON a.id = b.table_a_id 
            AND a.groupnumber = b.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @C) C 
                   WHERE rn = 1) c 
         ON a.id = c.table_a_id 
            AND a.groupnumber = c.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @D) D 
                   WHERE rn = 1) d 
         ON a.id = d.table_a_id 
            AND a.groupnumber = d.groupnumber 
 WHERE a.[date] BETWEEN '20110308' AND '20110309'

Open in new window

Tested with your sample data.
DECLARE  @A  TABLE( 
                       id INT, 
                   [date] DATETIME 
                   ) 

INSERT @A 
VALUES(1,'2011-03-08'), 
      (2,'2011-03-08') 

DECLARE  @B  TABLE( 
                            id INT, 
                    table_a_id INT, 
                       updated DATETIME, 
                   groupnumber INT, 
                        fieldx INT, 
                        fieldy INT, 
                        fieldz INT 
                   ) 

INSERT @B 
VALUES(32423,1,'2011-03-08 13:00:00',55,1,12,13), 
      (32424,1,'2011-03-08 15:00:00',57,14,15,16), 
      (32425,1,'2011-03-08 14:00:00',55,17,18,19), 
      (32426,2,'2011-03-08 16:00:00',55,20,21,21) 

DECLARE  @C  TABLE( 
                            id INT, 
                    table_a_id INT, 
                       updated DATETIME, 
                   groupnumber INT, 
                        fielda INT, 
                        fieldb INT 
                   ) 

INSERT @C 
VALUES(43534,1,'2011-03-08 12:00:00',55,23,24), 
      (43535,2,'2011-03-08 13:00:00',55,26,27), 
      (43536,2,'2011-03-08 13:00:00',57,29,30) 

DECLARE  @D  TABLE( 
                            id INT, 
                    table_a_id INT, 
                       updated DATETIME, 
                   groupnumber INT, 
                        fieldm INT, 
                        fieldn INT 
                   ) 

INSERT @D 
VALUES(23345,1,'2011-03-08 15:00:00',57,32,33), 
      (23345,2,'2011-03-08 15:00:00',54,35,36); 

;WITH DriverCTE 
     AS (SELECT t1.*, 
                t2.groupnumber 
           FROM @A t1 
                LEFT JOIN (SELECT table_a_id, 
                                  groupnumber 
                             FROM @B 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @C 
                           UNION 
                           SELECT table_a_id, 
                                  groupnumber 
                             FROM @D) t2 
                  ON t1.id = t2.table_a_id) 
SELECT a.id, a.[date], a.groupnumber, b.fieldx, b.fieldy, b.fieldz, c.fielda, c.fieldb, d.fieldm, d.fieldn 
  FROM DriverCTE a 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @B) B 
                   WHERE rn = 1) b 
         ON a.id = b.table_a_id 
            AND a.groupnumber = b.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @C) C 
                   WHERE rn = 1) c 
         ON a.id = c.table_a_id 
            AND a.groupnumber = c.groupnumber 
       LEFT JOIN (SELECT * 
                    FROM (SELECT *, 
                                 ROW_NUMBER() 
                                   OVER(PARTITION BY table_a_id,groupnumber ORDER BY updated DESC) rn
                            FROM @D) D 
                   WHERE rn = 1) d 
         ON a.id = d.table_a_id 
            AND a.groupnumber = d.groupnumber 
 WHERE a.[date] BETWEEN '20110308' AND '20110309'
 /*
 id	date	groupnumber	fieldx	fieldy	fieldz	fielda	fieldb	fieldm	fieldn
1	2011-03-08 00:00:00.000	55	17	18	19	23	24	NULL	NULL
1	2011-03-08 00:00:00.000	57	14	15	16	NULL	NULL	32	33
2	2011-03-08 00:00:00.000	54	NULL	NULL	NULL	NULL	NULL	35	36
2	2011-03-08 00:00:00.000	55	20	21	21	26	27	NULL	NULL
2	2011-03-08 00:00:00.000	57	NULL	NULL	NULL	29	30	NULL	NULL
 */

Open in new window

0
 

Author Comment

by:adrian78
ID: 35075144
Hi LowFat... I'm having a problem getting your translated query to execute.  I think there is a missing bracket along the way so I just need to flush that out first and then hopefully get it to run.

Sharath, thanks for your solution.  I will give it a go as well.

Thanks, guys!
0
 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 150 total points
ID: 35080004
yes sorry had extra closing bracket, also had incorrect join on clause for tabled
see attached....

is there any further differentiation tp be applied to determine the latest row when the updated time is the same?

(if you want all rows in those cases then replace row_number() with Dense_rank()   )
Select a.ID
      ,z.groupnumber
      ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
  From tableA as A
 Left Outer Join
(
  Select coalesce(b.tablea_id,c.tablea_id,d.tablea_id) as id
        ,coalesce(b.groupnumber,c.groupnumber,d.groupnumber) as groupnumber
        ,fieldx,fieldy,fieldz,fielda,fieldb,fieldm,fieldn
    from
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableB as x
       ) as Y
  Where rn=1
) as B
Full Outer Join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableC as x
       ) as Y
  Where rn=1
) as C
on b.tablea_id=c.tablea_id
and b.groupnumber=c.groupnumber
full outer join
(
Select y.*
  from (
select x.*
      ,row_number() over (partition by tableA_id
                             ,groupnumber 
                           order by updated desc) as rn
  from tableD as x
       ) as Y
  Where rn=1
) as d
-- corrected join conditions
on (c.tablea_id=d.tablea_id
and c.groupnumber=d.groupnumber)
or (b.tablea_id=d.tablea_id
and b.groupnumber=d.groupnumber)
--) as Y  
) as Z
on A.id=Z.id
Where a.date between '20110308' and '20110309'
order by 1,2

Open in new window

0
 

Author Comment

by:adrian78
ID: 35092048
Hi Guys,

Lowfat... thank you for your responses.  I tried it out and I did get it to execute but it didn't return the desired result.  It was very close but there were some duplicate entries (ie.  The same id,groupnumber appeared more than once a couple of times out of 100 or so records)... so close but the query needs a bit more tweaking.

Sharath, your solution worked right out of the box and returned the desired result.

I'm not sure who's query is more efficient.. I think if I'm able to get Lowfats solution to work it may be a bit faster but I'm not sure.  Because Sharath's solution worked I'm going to award him with the correct solution but also credit some towards lowfat's solution which is close.
0

Featured Post

Stressed Out?

Watch some penguins on the livecam!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I'm trying, I really am. But I've seen so many wrong approaches involving date(time) boundaries I despair about my inability to explain it. I've seen quite a few recently that define a non-leap year as 364 days, or 366 days and the list goes on. …
International Data Corporation (IDC) prognosticates that before the current the year gets over disbursing on IT framework products to be sent in cloud environs will be $37.1B.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question