Link to home
Start Free TrialLog in
Avatar of jvonhendy
jvonhendy

asked on

sql server 2008

I have two tables one is the employee table and the other is the training table. The training table has the course id(s), employee id who has taken the class and the date the course was taken.

I need to make a report that show what employees haven't taken the course(s). It seems I will have to have a[ not in] subquerry but it is not bringing back the results I need.

Thanks for your help,

Jerry
ASKER CERTIFIED SOLUTION
Avatar of Sharath S
Sharath S
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
or this.
select e.*
  from employee e
 where not exists (select 1 from training t where e.employee_id = t.employee_id)

Open in new window

Avatar of Mike McCracken
Mike McCracken

Is this a Crystal question or just SQL?

mlmcc
I'd endorse either of Sharath's suggestions, probably preferring the first one.

I think however you need to include something to cater for this:
" haven't taken the course(s) "
SELECT e.*
FROM employee e
LEFT JOIN (
	SELECT DISTINCT employee_id
	FROM training

        where course_id in (1,2,3) --<< i.e. the course(s) you are interested in

	) t ON e.employee_id = t.employee_id
WHERE t.employee_id IS NULL
-- AND date_of_termination IS NULL --<< i.e. add conditions for being a current employee

Open in new window

A similar set of course id conditions can be added to the NOT EXISTS method too.

Regarding this statement:
>>It seems I will have to have a[ not in] subquerry

NOT IN() or IN() are both valid syntax and exceptionally handy - however they really are just shortcut methods for lots of OR's
e.g.

where course_id in (1,2,3)

can be rewritten as:

where (course_id = 1 OR course_id = 2 OR course_id = 3)

All that is fine, but IF the number of values inside the brackets gets large then this approach gets slow to execute.

where employ_id IN( select distinct employee_id from training where course_id in (1,2,3) )

is NOT as efficient as the left join, or in many cases as efficient as using NOT EXISTS

long way of saying:
take care when using IN(subquery-here) that the number of results in that subquery isn't huge.
Avatar of jvonhendy

ASKER

These work but it doesn't show the course name that is not taken; these only show the employee ID. I will need both employee ID and course name.

         Thanks for your help,

          Jerry
Do you have another table called courses which holds all courses? I think trainings table has data only when an employee takes a course.
It would assist all involved if we used the real table and field names, and had some sample data.
Below I have assumed 3 tables:
Employees
Courses
CoursesTaken
and prepared a tiny set of sample data.
    
    CREATE TABLE Employees
    	([ID] int, [Name] varchar(6))
    ;
    	
    INSERT INTO Employees
    	([ID], [Name])
    VALUES
    	(1, 'Fred'),
    	(2, 'Barney'),
    	(3, 'Wilma'),
    	(4, 'Betty'),
    	(5, 'Sleepy'),
    	(6, 'Sneezy'),
    	(7, 'Happy')
    ;
    
    CREATE TABLE Courses
    	([ID] int, [CourseName] varchar(7))
    ;
    	
    INSERT INTO Courses
    	([ID], [CourseName])
    VALUES
    	(1, 'SQL'),
    	(2, 'Crystal'),
    	(3, 'PHP')
    ;
    
    CREATE TABLE CoursesTaken	
    	([EmployeeID] int, [CourseID] int)
    ;
    	
    INSERT INTO CoursesTaken	
    	([EmployeeID], [CourseID])
    VALUES
    	(1, 1),
    	(1, 2),
    	(1, 3),
    	(2, 1),
    	(2, 3),
    	(3, 1),
    	(3, 2),
    	(4, 1),
    	(5, 2),
    	(6, 3)
    ;

**Query 1**:

    SELECT
          c.CourseName
        , e.Name        AS "Has Not Been Taken By"
    FROM Employees AS e
    CROSS JOIN Courses AS c
    LEFT JOIN CoursesTaken AS t ON e.id = t.EmployeeID AND c.id = t.CourseID
    WHERE t.EmployeeID IS NULL
    ORDER BY
          c.CourseName
        , e.Name
    

**[Results][2]**:
    
    | COURSENAME | HAS NOT BEEN TAKEN BY |
    |------------|-----------------------|
    |    Crystal |                Barney |
    |    Crystal |                 Betty |
    |    Crystal |                 Happy |
    |    Crystal |                Sneezy |
    |        PHP |                 Betty |
    |        PHP |                 Happy |
    |        PHP |                Sleepy |
    |        PHP |                 Wilma |
    |        SQL |                 Happy |
    |        SQL |                Sleepy |
    |        SQL |                Sneezy |


**Query 2**:

    SELECT
          c.CourseName
        , e.Name        AS "Has Not Been Taken By"
    FROM Employees AS e
    
    CROSS JOIN (
                SELECT id, CourseName
                FROM Courses
                WHERE id IN (1,2) --<< limit range of courses here
               ) AS c
    
    LEFT JOIN CoursesTaken AS t ON e.id = t.EmployeeID AND c.id = t.CourseID
    WHERE t.EmployeeID IS NULL
    ORDER BY
          c.CourseName
        , e.Name
    

**[Results][3]**:
    
    | COURSENAME | HAS NOT BEEN TAKEN BY |
    |------------|-----------------------|
    |    Crystal |                Barney |
    |    Crystal |                 Betty |
    |    Crystal |                 Happy |
    |    Crystal |                Sneezy |
    |        SQL |                 Happy |
    |        SQL |                Sleepy |
    |        SQL |                Sneezy |



  [1]: 

Open in new window

see: http://sqlfiddle.com/#!3/e9f1a/9
Most likely part of the solution here will be to use a CROSS JOIN which is the method to produce all possible combinations between employees and courses (this is also known as a Cartesian product). When using this method it is good practice to limit the potential size of this by limiting to each member of the Cartesian product first. So here we should limit both Employees and Courses like this:
SELECT
      c.CourseName
    , e.Name        AS "Has Not Been Taken By"
FROM (
      select id, Name
      from Employees
      where TerminationDate IS NULL --<< limit range of Employees here
     ) AS e

CROSS JOIN (
            SELECT id, CourseName
            FROM Courses
            WHERE id IN (1,2) --<< limit range of courses here
           ) AS c

LEFT JOIN CoursesTaken AS t ON e.id = t.EmployeeID AND c.id = t.CourseID
WHERE t.EmployeeID IS NULL
ORDER BY
      c.CourseName
    , e.Name
;

Open in new window

see: http://sqlfiddle.com/#!3/69983/1
Hopefully you do have a data model such as I have used here, and it would be helpful if we knew more specifics about the tables actually involved.
This is the code I used from your suggestion. Thanks

SELECT distinct  replace(Emp.Name,'~',', ')+'  '+ Emp.EmployeeId Name_ID,train.CourseId
  FROM WhiteOak.dbo.XHR_Employee Emp,
         WhiteOak.dbo.XHR_training train
  where emp.Status in ( 'E','C' )
        and not exists (select emp.PersonId
                        from WhiteOak.dbo.XHR_training t
                        where emp.PersonId = t.PersonId
                                   AND train.CourseId = t.CourseId )
You really should use "cross join", it clearly articulates that this is deliberate and not accidental (which is what the current query looks like)
SELECT DISTINCT
        REPLACE(Emp.Name, '~', ', ') + '  ' + Emp.EmployeeId Name_ID
      , train.CourseId
FROM WhiteOak.dbo.XHR_Employee Emp
        CROSS JOIN WhiteOak.dbo.XHR_training train
WHERE emp.Status IN ('E', 'C')
        AND NOT EXISTS (
                SELECT
                        emp.PersonId
                FROM WhiteOak.dbo.XHR_training t
                WHERE emp.PersonId = t.PersonId
                        AND train.CourseId = t.CourseId
        )

Open in new window

By the way the accepted answer does not mention cross join (which is essential to your solution).