jvonhendy
asked on
sql server 2008
I have two tables one is the employee table and the other is the training table. The training table has the course id(s), employee id who has taken the class and the date the course was taken.
I need to make a report that show what employees haven't taken the course(s). It seems I will have to have a[ not in] subquerry but it is not bringing back the results I need.
Thanks for your help,
Jerry
I need to make a report that show what employees haven't taken the course(s). It seems I will have to have a[ not in] subquerry but it is not bringing back the results I need.
Thanks for your help,
Jerry
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Is this a Crystal question or just SQL?
mlmcc
mlmcc
I'd endorse either of Sharath's suggestions, probably preferring the first one.
I think however you need to include something to cater for this:
" haven't taken the course(s) "
Regarding this statement:
>>It seems I will have to have a[ not in] subquerry
NOT IN() or IN() are both valid syntax and exceptionally handy - however they really are just shortcut methods for lots of OR's
e.g.
where course_id in (1,2,3)
can be rewritten as:
where (course_id = 1 OR course_id = 2 OR course_id = 3)
All that is fine, but IF the number of values inside the brackets gets large then this approach gets slow to execute.
where employ_id IN( select distinct employee_id from training where course_id in (1,2,3) )
is NOT as efficient as the left join, or in many cases as efficient as using NOT EXISTS
long way of saying:
take care when using IN(subquery-here) that the number of results in that subquery isn't huge.
I think however you need to include something to cater for this:
" haven't taken the course(s) "
SELECT e.*
FROM employee e
LEFT JOIN (
SELECT DISTINCT employee_id
FROM training
where course_id in (1,2,3) --<< i.e. the course(s) you are interested in
) t ON e.employee_id = t.employee_id
WHERE t.employee_id IS NULL
-- AND date_of_termination IS NULL --<< i.e. add conditions for being a current employee
A similar set of course id conditions can be added to the NOT EXISTS method too.Regarding this statement:
>>It seems I will have to have a[ not in] subquerry
NOT IN() or IN() are both valid syntax and exceptionally handy - however they really are just shortcut methods for lots of OR's
e.g.
where course_id in (1,2,3)
can be rewritten as:
where (course_id = 1 OR course_id = 2 OR course_id = 3)
All that is fine, but IF the number of values inside the brackets gets large then this approach gets slow to execute.
where employ_id IN( select distinct employee_id from training where course_id in (1,2,3) )
is NOT as efficient as the left join, or in many cases as efficient as using NOT EXISTS
long way of saying:
take care when using IN(subquery-here) that the number of results in that subquery isn't huge.
ASKER
These work but it doesn't show the course name that is not taken; these only show the employee ID. I will need both employee ID and course name.
Thanks for your help,
Jerry
Thanks for your help,
Jerry
Do you have another table called courses which holds all courses? I think trainings table has data only when an employee takes a course.
It would assist all involved if we used the real table and field names, and had some sample data.
Below I have assumed 3 tables:
Employees
Courses
CoursesTaken
and prepared a tiny set of sample data.
Below I have assumed 3 tables:
Employees
Courses
CoursesTaken
and prepared a tiny set of sample data.
CREATE TABLE Employees
([ID] int, [Name] varchar(6))
;
INSERT INTO Employees
([ID], [Name])
VALUES
(1, 'Fred'),
(2, 'Barney'),
(3, 'Wilma'),
(4, 'Betty'),
(5, 'Sleepy'),
(6, 'Sneezy'),
(7, 'Happy')
;
CREATE TABLE Courses
([ID] int, [CourseName] varchar(7))
;
INSERT INTO Courses
([ID], [CourseName])
VALUES
(1, 'SQL'),
(2, 'Crystal'),
(3, 'PHP')
;
CREATE TABLE CoursesTaken
([EmployeeID] int, [CourseID] int)
;
INSERT INTO CoursesTaken
([EmployeeID], [CourseID])
VALUES
(1, 1),
(1, 2),
(1, 3),
(2, 1),
(2, 3),
(3, 1),
(3, 2),
(4, 1),
(5, 2),
(6, 3)
;
**Query 1**:
SELECT
c.CourseName
, e.Name AS "Has Not Been Taken By"
FROM Employees AS e
CROSS JOIN Courses AS c
LEFT JOIN CoursesTaken AS t ON e.id = t.EmployeeID AND c.id = t.CourseID
WHERE t.EmployeeID IS NULL
ORDER BY
c.CourseName
, e.Name
**[Results][2]**:
| COURSENAME | HAS NOT BEEN TAKEN BY |
|------------|-----------------------|
| Crystal | Barney |
| Crystal | Betty |
| Crystal | Happy |
| Crystal | Sneezy |
| PHP | Betty |
| PHP | Happy |
| PHP | Sleepy |
| PHP | Wilma |
| SQL | Happy |
| SQL | Sleepy |
| SQL | Sneezy |
**Query 2**:
SELECT
c.CourseName
, e.Name AS "Has Not Been Taken By"
FROM Employees AS e
CROSS JOIN (
SELECT id, CourseName
FROM Courses
WHERE id IN (1,2) --<< limit range of courses here
) AS c
LEFT JOIN CoursesTaken AS t ON e.id = t.EmployeeID AND c.id = t.CourseID
WHERE t.EmployeeID IS NULL
ORDER BY
c.CourseName
, e.Name
**[Results][3]**:
| COURSENAME | HAS NOT BEEN TAKEN BY |
|------------|-----------------------|
| Crystal | Barney |
| Crystal | Betty |
| Crystal | Happy |
| Crystal | Sneezy |
| SQL | Happy |
| SQL | Sleepy |
| SQL | Sneezy |
[1]:
see: http://sqlfiddle.com/#!3/e9f1a/9
Most likely part of the solution here will be to use a CROSS JOIN which is the method to produce all possible combinations between employees and courses (this is also known as a Cartesian product). When using this method it is good practice to limit the potential size of this by limiting to each member of the Cartesian product first. So here we should limit both Employees and Courses like this:
Hopefully you do have a data model such as I have used here, and it would be helpful if we knew more specifics about the tables actually involved.
SELECT
c.CourseName
, e.Name AS "Has Not Been Taken By"
FROM (
select id, Name
from Employees
where TerminationDate IS NULL --<< limit range of Employees here
) AS e
CROSS JOIN (
SELECT id, CourseName
FROM Courses
WHERE id IN (1,2) --<< limit range of courses here
) AS c
LEFT JOIN CoursesTaken AS t ON e.id = t.EmployeeID AND c.id = t.CourseID
WHERE t.EmployeeID IS NULL
ORDER BY
c.CourseName
, e.Name
;
see: http://sqlfiddle.com/#!3/69983/1Hopefully you do have a data model such as I have used here, and it would be helpful if we knew more specifics about the tables actually involved.
ASKER
This is the code I used from your suggestion. Thanks
SELECT distinct replace(Emp.Name,'~',', ')+' '+ Emp.EmployeeId Name_ID,train.CourseId
FROM WhiteOak.dbo.XHR_Employee Emp,
WhiteOak.dbo.XHR_training train
where emp.Status in ( 'E','C' )
and not exists (select emp.PersonId
from WhiteOak.dbo.XHR_training t
where emp.PersonId = t.PersonId
AND train.CourseId = t.CourseId )
SELECT distinct replace(Emp.Name,'~',', ')+' '+ Emp.EmployeeId Name_ID,train.CourseId
FROM WhiteOak.dbo.XHR_Employee Emp,
WhiteOak.dbo.XHR_training train
where emp.Status in ( 'E','C' )
and not exists (select emp.PersonId
from WhiteOak.dbo.XHR_training t
where emp.PersonId = t.PersonId
AND train.CourseId = t.CourseId )
You really should use "cross join", it clearly articulates that this is deliberate and not accidental (which is what the current query looks like)
SELECT DISTINCT
REPLACE(Emp.Name, '~', ', ') + ' ' + Emp.EmployeeId Name_ID
, train.CourseId
FROM WhiteOak.dbo.XHR_Employee Emp
CROSS JOIN WhiteOak.dbo.XHR_training train
WHERE emp.Status IN ('E', 'C')
AND NOT EXISTS (
SELECT
emp.PersonId
FROM WhiteOak.dbo.XHR_training t
WHERE emp.PersonId = t.PersonId
AND train.CourseId = t.CourseId
)
By the way the accepted answer does not mention cross join (which is essential to your solution).
Open in new window