asked on

DB2 loop / union select help

I have code that I run one month at a time to look for events that occur during each period. I subsequently join all the data into one large dataset. I’d like instead to set a couple of variables and accomplish a single dataset via some kind of loop, but I am db2 impaired.

I’ve included an example of the code for a specific month (Aug 2006). The code is unecessarily complex because, when it was developed, DB2 couldn't do group by's on derived column names.

I usually change the two dates (‘2006-08-01’ at the third select and then about ½ way through the code) and rerun for each month. I’d like instead to establish a loop that has a begin month and an end month defined and then loops through each month appending the dataset together.

Appreciate any help!

SELECT
CASE WHEN BU IS NULL THEN '00 - xx' ELSE BU END AS BUSINESS_UNIT,
DIV AS DIVISION,
DEPT AS DEPARTMENT,
CC AS COST_CENTER,
JCODE AS JOB_DESCRIPTION,
TV4 AS DATE,
COUNT(DISTINCT TV5) AS POSITIONS,
COUNT(DISTINCT TV6) AS EMPLOYEES
FROM
(SELECT
UNIT_DEPTIDE CONCAT ' - ' CONCAT UNIT_DESC AS BU,
DIV_DEPTIDE CONCAT ' - ' CONCAT DIV_DESC AS DIV,
DEPT_DEPTIDE CONCAT ' - ' CONCAT DEPT_DESC AS DEPT,
SUBSTR(TV1,07,04) CONCAT ' - ' CONCAT CC_DESC AS CC,
TV2 CONCAT ' - ' CONCAT TV3 AS JCODE,
TV4,TV5, TV6
FROM
(SELECT
A.EMPLID AS TV6,
A.DEPTID AS TV1,
A.JOBCODE AS TV2,
B.DESCR AS TV3,
A.POSITION_NBR AS TV5,
A.EMPL_STATUS,
A.LOCATION,
A.REG_TEMP,
A.FULL_PART_TIME,
A.STD_HOURS,
B.SETID,
B.JOBCODE,
B.EFFDT,
('2006-08-01') AS TV4
FROM
HRQRY.PS_JOB A,
HRQRY.PS_EMPLMT_SRCH_QRY A1,
HRQRY.PS_JOBCODE_TBL B,
HRQRY.PS_SET_CNTRL_REC B2
WHERE
B.JOBCODE = A.JOBCODE
AND B2.SETCNTRLVALUE = A.BUSINESS_UNIT
AND B2.RECNAME = 'JOBCODE_TBL'
AND B2.SETID = B.SETID
AND A.EMPLID = A1.EMPLID
AND A.EMPL_RCD = A1.EMPL_RCD
AND A1.ROWSECCLASS = 'ALLPANLS'
AND (A.EFFDT =(
SELECT
MAX(A_ED.EFFDT)
FROM
HRQRY.PS_JOB A_ED
WHERE
A.EMPLID = A_ED.EMPLID
AND A.EMPL_RCD = A_ED.EMPL_RCD
AND A_ED.EFFDT <= '2006-08-01'
)
AND A.EFFSEQ =(
SELECT
MAX(A_ES.EFFSEQ)
FROM
HRQRY.PS_JOB A_ES
WHERE
A.EMPLID = A_ES.EMPLID
AND A.EMPL_RCD = A_ES.EMPL_RCD
AND A.EFFDT = A_ES.EFFDT
)
AND A.JOBCODE <> '000004'
AND A.REG_TEMP = 'R'
AND A.STD_HOURS > 19.99
AND A.EMPL_STATUS IN ('A',
'L',
'P',
'S')
AND B.EFFDT =(
SELECT
MAX(B_ED.EFFDT)
FROM
HRQRY.PS_JOBCODE_TBL B_ED
WHERE
B.SETID = B_ED.SETID
AND B.JOBCODE = B_ED.JOBCODE
AND B_ED.EFFDT <= A.EFFDT
))
ORDER BY
7,
2 DESC,
4) AS T1
LEFT OUTER JOIN
(SELECT
P.DEPTID AS CC_DEPTIDE, P.DESCR AS CC_DESC,
SUBSTR(P.DEPTID,01,06) AS DEPT_DEPTIDE, QDT.DESCR AS DEPT_DESC,
SUBSTR(P.DEPTID,01,04) AS DIV_DEPTIDE, QDV.DESCR AS DIV_DESC,
SUBSTR(P.DEPTID,01,02) AS UNIT_DEPTIDE, QUN.DESCR AS UNIT_DESC
FROM
HRQRY.PS_DEPT_TBL AS P
LEFT OUTER JOIN
(SELECT SETID, DEPTID, DESCR
FROM HRQRY.PS_DEPT_TBL D1
WHERE EFFDT =
(SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D2
WHERE D2.SETID = D1.SETID
AND D2.DEPTID = D1.DEPTID
AND D2.EFFDT <= CURRENT DATE)) AS QDT
ON P.SETID = QDT.SETID
AND SUBSTR(P.DEPTID,01,06) = QDT.DEPTID
LEFT OUTER JOIN
(SELECT SETID, DEPTID, DESCR
FROM HRQRY.PS_DEPT_TBL D3
WHERE EFFDT =
(SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D4
WHERE D4.SETID = D3.SETID
AND D4.DEPTID = D3.DEPTID
AND D4.EFFDT <= CURRENT DATE)) AS QDV
ON P.SETID = QDV.SETID
AND SUBSTR(P.DEPTID,01,04) = QDV.DEPTID
LEFT OUTER JOIN
(SELECT SETID, DEPTID, DESCR
FROM HRQRY.PS_DEPT_TBL D5
WHERE EFFDT =
(SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D6
WHERE D6.SETID = D5.SETID
AND D6.DEPTID = D5.DEPTID
AND D6.EFFDT <= CURRENT DATE)) AS QUN
ON P.SETID = QUN.SETID
AND SUBSTR(P.DEPTID,01,02) = QUN.DEPTID
WHERE (LENGTH(RTRIM(LTRIM(P.DEPTID))) = 10)
AND SUBSTR(P.DEPTID,01,02) <> 'TE'
AND P.EFFDT = (SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D2
WHERE D2.DEPTID = P.DEPTID
AND D2.EFFDT <= CURRENT DATE)) AS T2

ON T1.TV1=T2.CC_DEPTIDE) AS T3
WHERE SUBSTR(BU,01,02) <> '31' OR BU IS NULL
GROUP BY
BU, DIV, DEPT, CC, JCODE, TV4
WITH UR

Kent Olsen

Hi pflugg,

That's quite a bit of SQL for a single query. :)

My guess is that this can be made quite a bit simpler and/or efficient. But it is probably a fair effort to do so.

Let me think on this for a bit and I'll be back to you. By some change, have you run an explain plan on this?

Kent

momi_sabag

this query actually works ?
you have an order by in the middle,
i don't think db2 will accept that

pflugg

ASKER

yes, it actually works. am happy to make it simpler. it grew and grew over time, and had some has sub selects because we couldn't do groupby's on derived column names.

have not run an explain plan on it.

Kent Olsen

You can group by a derived column, but you need a wrapper around it

select a, sum(b), count(*)
from
(
select a, b from sometable, anothertable where sometable.id = anothertable.id
) t0
group by a;

pflugg

ASKER

yeah, that's what this code is doing. the top part is a wrapper.

Kent Olsen

What version of DB2 are you using and what is your host platform?

Do you have any of the OLAP extensions?

Kent

Kent Olsen

Hi momi,

>you have an order by in the middle,
>i don't think db2 will accept that

Db2 allows this, but several RDBMS don't. Oracle comes to mind....

Kent

Kent Olsen

Hi pflugg,

I reformatted the SQL so that I could make better sense of it and I must say that the SQL really isn't too bad. I don't see any gross inefficiencies in it.

I do have a couple of comments. First -- INDENT! There are quite a few styles for indenting SQL. Find one that you like and stick with it. It makes queries much easier to quickly understand. Second -- left(). DB2 has left() function that returns characters from the left. left (Somestring, 2) is shorthand for substr (Somestring, 1, 2). No significant change to the explain plan here, but it's sometimes easier to grok the SQL.

So back to the original question. You could put a date in a table and query based on this value. Or you could put the query into a stored procedure/function and pass the date to the sp/function. Any preference?

Kent

pflugg

ASKER

can you shoot back the formatted sql? thanks for taking the time to do that!!
on your questions,
1) we're using db2 v9 on unix
2) if i can, i'd like to do it in straight sql.
3) i have limited access to create anything other than views. the dba generally frowns on stored procedures for our reporting purposes.
4) i could have the dba create a temporary table that i can cram something into. we do that in a couple of other cases. he'd be much more inclined to do that rather than go the stored procedure route.

Kent Olsen

I'm just thinking about ways to "pass" a date to the script. A parameter to a SP/function is easy, so is reading the date from another table. Other solutions get a bit more "creative". :)

SELECT CASE WHEN BU IS NULL THEN '00 - xx' ELSE BU END AS BUSINESS_UNIT, DIV AS DIVISION, DEPT AS DEPARTMENT, CC AS COST_CENTER, JCODE AS JOB_DESCRIPTION, TV4 AS DATE, COUNT(DISTINCT TV5) AS POSITIONS, COUNT(DISTINCT TV6) AS EMPLOYEES
FROM
(
SELECT UNIT_DEPTIDE CONCAT ' - ' CONCAT UNIT_DESC AS BU, DIV_DEPTIDE CONCAT ' - ' CONCAT DIV_DESC AS DIV, DEPT_DEPTIDE CONCAT ' - ' CONCAT DEPT_DESC AS DEPT, SUBSTR(TV1,07,04) CONCAT ' - ' CONCAT CC_DESC AS CC, TV2 CONCAT ' - ' CONCAT TV3 AS JCODE, TV4,TV5, TV6
FROM
(
SELECT
A.EMPLID AS TV6, A.DEPTID AS TV1, A.JOBCODE AS TV2, B.DESCR AS TV3, A.POSITION_NBR AS TV5, A.EMPL_STATUS, A.LOCATION, A.REG_TEMP, A.FULL_PART_TIME, A.STD_HOURS, B.SETID, B.JOBCODE, B.EFFDT, ('2006-08-01') AS TV4
FROM
HRQRY.PS_JOB A,
HRQRY.PS_EMPLMT_SRCH_QRY A1,
HRQRY.PS_JOBCODE_TBL B,
HRQRY.PS_SET_CNTRL_REC B2
WHERE B.JOBCODE = A.JOBCODE
AND B2.SETCNTRLVALUE = A.BUSINESS_UNIT
AND B2.RECNAME = 'JOBCODE_TBL'
AND B2.SETID = B.SETID
AND A.EMPLID = A1.EMPLID
AND A.EMPL_RCD = A1.EMPL_RCD
AND A1.ROWSECCLASS = 'ALLPANLS'
AND (A.EFFDT =
(
SELECT MAX(A_ED.EFFDT)
FROM HRQRY.PS_JOB A_ED
WHERE A.EMPLID = A_ED.EMPLID
AND A.EMPL_RCD = A_ED.EMPL_RCD
AND A_ED.EFFDT <= '2006-08-01'
)
AND A.EFFSEQ =
(
SELECT MAX(A_ES.EFFSEQ)
FROM HRQRY.PS_JOB A_ES
WHERE A.EMPLID = A_ES.EMPLID
AND A.EMPL_RCD = A_ES.EMPL_RCD
AND A.EFFDT = A_ES.EFFDT
)
AND A.JOBCODE <> '000004'
AND A.REG_TEMP = 'R'
AND A.STD_HOURS > 19.99
AND A.EMPL_STATUS IN ('A', 'L', 'P', 'S')
AND B.EFFDT =
(
SELECT MAX(B_ED.EFFDT)
FROM HRQRY.PS_JOBCODE_TBL B_ED
WHERE B.SETID = B_ED.SETID
AND B.JOBCODE = B_ED.JOBCODE
AND B_ED.EFFDT <= A.EFFDT
))
ORDER BY 7, 2 DESC, 4
) AS T1
LEFT OUTER JOIN
(
SELECT P.DEPTID AS CC_DEPTIDE, P.DESCR AS CC_DESC, SUBSTR(P.DEPTID,01,06) AS DEPT_DEPTIDE, QDT.DESCR AS DEPT_DESC, SUBSTR(P.DEPTID,01,04) AS DIV_DEPTIDE, QDV.DESCR AS DIV_DESC, SUBSTR(P.DEPTID,01,02) AS UNIT_DEPTIDE, QUN.DESCR AS UNIT_DESC
FROM HRQRY.PS_DEPT_TBL AS P
LEFT OUTER JOIN
(
SELECT SETID, DEPTID, DESCR
FROM HRQRY.PS_DEPT_TBL D1
WHERE EFFDT =
(
SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D2
WHERE D2.SETID = D1.SETID
AND D2.DEPTID = D1.DEPTID
AND D2.EFFDT <= CURRENT DATE
)
) AS QDT
ON P.SETID = QDT.SETID
AND SUBSTR(P.DEPTID,01,06) = QDT.DEPTID
LEFT OUTER JOIN
(
SELECT SETID, DEPTID, DESCR
FROM HRQRY.PS_DEPT_TBL D3
WHERE EFFDT =
(
SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D4
WHERE D4.SETID = D3.SETID
AND D4.DEPTID = D3.DEPTID
AND D4.EFFDT <= CURRENT DATE
)
) AS QDV
ON P.SETID = QDV.SETID
AND SUBSTR(P.DEPTID,01,04) = QDV.DEPTID
LEFT OUTER JOIN
(
SELECT SETID, DEPTID, DESCR
FROM HRQRY.PS_DEPT_TBL D5
WHERE EFFDT =
(
SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D6
WHERE D6.SETID = D5.SETID
AND D6.DEPTID = D5.DEPTID
AND D6.EFFDT <= CURRENT DATE
)
) AS QUN
ON P.SETID = QUN.SETID
AND SUBSTR(P.DEPTID,01,02) = QUN.DEPTID
WHERE (LENGTH(RTRIM(LTRIM(P.DEPTID))) = 10)
AND SUBSTR(P.DEPTID,01,02) <> 'TE'
AND P.EFFDT =
(
SELECT MAX(EFFDT)
FROM HRQRY.PS_DEPT_TBL D2
WHERE D2.DEPTID = P.DEPTID
AND D2.EFFDT <= CURRENT DATE
)
) AS T2
ON T1.TV1=T2.CC_DEPTIDE
) AS T3
WHERE SUBSTR(BU,01,02) <> '31'
OR BU IS NULL
GROUP BY BU, DIV, DEPT, CC, JCODE, TV4
WITH UR

pflugg

ASKER

sounds familiar. we've had many areas were we've had to be creative and do things by ugly brute force.

Kent Olsen

The two occurrences of the date.

...
A.REG_TEMP, A.FULL_PART_TIME, A.STD_HOURS, B.SETID, B.JOBCODE, B.EFFDT, ('2006-08-01') AS TV4
...
AND A_ED.EFFDT <= '2006-08-01'
...

Can this date be programmatically generated or is this an "on demand" kind of report?

Kent

pflugg

ASKER

another thought may be to do something where you select based upon the 1st day of each month. that, in effect, is what's going on -- it's grabbing a count as of the 1st of each month. that would get rid of any looping or such. don't know the syntax, but the logic would be something like the code below. dunno. may be a crazy idea.

select count(id)
from the_big_sql_mess
where
eff_date between '2006-01-01' and '2007-04-01 and
dayofmonth(eff_date) = 1

pflugg

ASKER

it's could be programatically generated, probably. now()-something; now()-2*something for the previous month, and so forth

pflugg

ASKER

kent: have to run out for a little while, but will check back in the early evening. thanks much for your help!

ASKER CERTIFIED SOLUTION

Kent Olsen

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

pflugg

ASKER

i think doing that i'd get all dates, though, not one date per month (specifically the first of each month). sorry, am being obtuse i'm sure.