garyjgs
asked on
How do I write DB2 SQL to return all employee records with proper WHERE clause?
I have the below DB2 SQL that I am trying to retrun all 200,000 employees including those that have a History record pay changes:
SELECT
A.HM_DIST_CO AS "Dist Company"
,VARCHAR(A.HM_ACCT_UNIT) AS "Dist Acct Unit"
,VARCHAR(A.HM_ACCOUNT) AS "Dist Account"
,A.EMP_STATUS AS "Status"
,A.TERM_DATE AS "Term Date"
,Z.COMPANY AS "HRH Company"
,VARCHAR(Z.EMPLOYEE) AS "HRH Employee"
,Z.FLD_NBR AS "HRH Fld Nbr"
,Z.N_VALUE AS "HRH N Value"
,Z.DATE_STAMP AS "HRH Date Stamp"
,Z.ACT_OBJ_ID AS "HRH Act Obj ID"
,C.HM_PHONE_NBR AS "Phone"
,C.SEX AS "Gender"
,C.BIRTHDATE AS "Birth Date"
FROM
DBHREMP A
JOIN DBHRPEM C
ON A.COMPANY = C.COMPANY
AND A.EMPLOYEE = C.EMPLOYEE
JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
WHERE Z.FLD_NBR =730
AND Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y
WHERE Y.EMPLOYEE = Z.EMPLOYEE)
ORDER BY A.EMPLOYEE
The issue with the above SQL is that with the Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y WHERE Y.EMPLOYEE = Z.EMPLOYEE) statement only 7,100 records are returned.
How do I structure the DB2 SQL retruning records of those with pay changes and those that don't have pay reocrd changes to include all 200,000 employees?
SELECT
A.HM_DIST_CO AS "Dist Company"
,VARCHAR(A.HM_ACCT_UNIT) AS "Dist Acct Unit"
,VARCHAR(A.HM_ACCOUNT) AS "Dist Account"
,A.EMP_STATUS AS "Status"
,A.TERM_DATE AS "Term Date"
,Z.COMPANY AS "HRH Company"
,VARCHAR(Z.EMPLOYEE) AS "HRH Employee"
,Z.FLD_NBR AS "HRH Fld Nbr"
,Z.N_VALUE AS "HRH N Value"
,Z.DATE_STAMP AS "HRH Date Stamp"
,Z.ACT_OBJ_ID AS "HRH Act Obj ID"
,C.HM_PHONE_NBR AS "Phone"
,C.SEX AS "Gender"
,C.BIRTHDATE AS "Birth Date"
FROM
DBHREMP A
JOIN DBHRPEM C
ON A.COMPANY = C.COMPANY
AND A.EMPLOYEE = C.EMPLOYEE
JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
WHERE Z.FLD_NBR =730
AND Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y
WHERE Y.EMPLOYEE = Z.EMPLOYEE)
ORDER BY A.EMPLOYEE
The issue with the above SQL is that with the Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y WHERE Y.EMPLOYEE = Z.EMPLOYEE) statement only 7,100 records are returned.
How do I structure the DB2 SQL retruning records of those with pay changes and those that don't have pay reocrd changes to include all 200,000 employees?
what happens if you remove Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y WHERE Y.EMPLOYEE = Z.EMPLOYEE) ?
ASKER
DBHREMP A is the mater table which maintains 200,000 employees. When doing an OUTER JOIN DBHRHRH Z, I receive an error: Column qualifier or table Z undefined.
ASKER
what happens if you remove Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y WHERE Y.EMPLOYEE = Z.EMPLOYEE) ? - When this statement is removed, 300,000 plus records are returned since an employee may have more than one pay history record. I am trying to return the most recent pay history record.
try
SELECT
A.HM_DIST_CO AS "Dist Company"
,VARCHAR(A.HM_ACCT_UNIT) AS "Dist Acct Unit"
,VARCHAR(A.HM_ACCOUNT) AS "Dist Account"
,A.EMP_STATUS AS "Status"
,A.TERM_DATE AS "Term Date"
,Z.COMPANY AS "HRH Company"
,VARCHAR(Z.EMPLOYEE) AS "HRH Employee"
,Z.FLD_NBR AS "HRH Fld Nbr"
,Z.N_VALUE AS "HRH N Value"
,Z.DATE_STAMP AS "HRH Date Stamp"
,Z.ACT_OBJ_ID AS "HRH Act Obj ID"
,C.HM_PHONE_NBR AS "Phone"
,C.SEX AS "Gender"
,C.BIRTHDATE AS "Birth Date"
FROM
DBHREMP A
JOIN DBHRPEM C
ON A.COMPANY = C.COMPANY
AND A.EMPLOYEE = C.EMPLOYEE
left outer JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
WHERE Z.FLD_NBR =730
AND Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y
WHERE A.EMPLOYEE = Z.EMPLOYEE)
ORDER BY A.EMPLOYEE
SELECT
A.HM_DIST_CO AS "Dist Company"
,VARCHAR(A.HM_ACCT_UNIT) AS "Dist Acct Unit"
,VARCHAR(A.HM_ACCOUNT) AS "Dist Account"
,A.EMP_STATUS AS "Status"
,A.TERM_DATE AS "Term Date"
,Z.COMPANY AS "HRH Company"
,VARCHAR(Z.EMPLOYEE) AS "HRH Employee"
,Z.FLD_NBR AS "HRH Fld Nbr"
,Z.N_VALUE AS "HRH N Value"
,Z.DATE_STAMP AS "HRH Date Stamp"
,Z.ACT_OBJ_ID AS "HRH Act Obj ID"
,C.HM_PHONE_NBR AS "Phone"
,C.SEX AS "Gender"
,C.BIRTHDATE AS "Birth Date"
FROM
DBHREMP A
JOIN DBHRPEM C
ON A.COMPANY = C.COMPANY
AND A.EMPLOYEE = C.EMPLOYEE
left outer JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
WHERE Z.FLD_NBR =730
AND Z.DATE_STAMP = (SELECT MAX(Y.DATE_STAMP) FROM DBHRHRH Y
WHERE A.EMPLOYEE = Z.EMPLOYEE)
ORDER BY A.EMPLOYEE
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Using the syntax:
lleft outer JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
the query hangs. I am unable to return records.
lleft outer JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
the query hangs. I am unable to return records.
I suspect that the query is just taking a long time.
How many rows in each of the tables?
How many rows in each of the tables?
ASKER
The primary table DBHREMP A has 212,000 records.
The history table DBHRHRH Z has probably 600,000 records (or maybe more).
The history table DBHRHRH Z has probably 600,000 records (or maybe more).
Ok.
In the primary table, can there be duplicates with regard to fields COMPANY and EMPLOYEE? That is:
SELECT * FROM DBHREMP where COMPANY = x and EMPLOYEE = y
Will that produce exactly 1 row at most?
In the primary table, can there be duplicates with regard to fields COMPANY and EMPLOYEE? That is:
SELECT * FROM DBHREMP where COMPANY = x and EMPLOYEE = y
Will that produce exactly 1 row at most?
ASKER
In the primary table, there are 2 companies, comapany 100 and company 125.
In the primary table, there is only one employee. An employee cannot be entered more than once.
For example, select * from DBHREMP where company = 100 and employee =1 returns only one record.
(Note: Using the syntax:
lleft outer JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
I have left the query running over 2 hours and no records have returned as of yet)
In the primary table, there is only one employee. An employee cannot be entered more than once.
For example, select * from DBHREMP where company = 100 and employee =1 returns only one record.
(Note: Using the syntax:
lleft outer JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
I have left the query running over 2 hours and no records have returned as of yet)
Hi Gary,
That's not that big a join, suggesting that there is a cartesian product being generated. i.e. many millions of rows.
Can you post the query that you're running this time? Maybe another set of eyes will help.
Kent
That's not that big a join, suggesting that there is a cartesian product being generated. i.e. many millions of rows.
Can you post the query that you're running this time? Maybe another set of eyes will help.
Kent
ASKER
After working with the DBA, it was identifiied that there are over 2 million history records. The query that I was trying to write was going to be used in a Crystal Report. Having heard the volume of history records another approach is going to have to be identified within the context of the erp software or a view is going to have to be created as not to disrupt the database.
A traditional view won't make this any faster. In fact, depending on usage, it could make it worse.
A materialized view may be the best solution. Check with you DBA.
Kent
A materialized view may be the best solution. Check with you DBA.
Kent
ASKER
This would be the best solution if the number of records were not as they are. The follow up was also very helpful.
I don't understand your data, but I suspect that it lies with the fact that you're doing inner joins, when you need outer joins.
FROM
DBHREMP A
JOIN DBHRPEM C
ON A.COMPANY = C.COMPANY
AND A.EMPLOYEE = C.EMPLOYEE
JOIN DBHRHRH Z
ON A.COMPANY = Z.COMPANY
AND A.EMPLOYEE = Z.EMPLOYEE
In the three tables joined here, which has the "master list" of all employees? That should probably be the table that the other two are "outer joined" to.
Kent