Solved

TSQL question

Posted on 2012-04-04
Medium Priority
Last Modified: 2012-04-05
I hope I explain this sufficiently. I am working on MS SQL Server 2008 and am looking to code an ANSI TSQL statement (or statements) that will do the following.

The table being accessed is defined as such (I have cut out non-relevant columns):
CREATE TABLE PSExtract.temp.RJZ_Temp
(KeyID INT NOT NULL
,DedupID INT NOT NULL
,DistinctID VARCHAR(61) NULL
,EmployeeCount INT
,OSLevelID int NULL)

KeyID identifies a unique company,

DedupID defines a unique individual within a unique company

DistinctID = KeyID & ' - ' & DedupID

EmployeeCount is the total # of employees in a unique company (keyed on KeyID)

OSLevelID is the level of the employee in the food chain at the company, the lower the number, the higher the employee is at the company.  
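
A side note on the DistinctID definition above: in T-SQL, `&` is the bitwise AND operator, so the concatenation would more likely be written with `+` and explicit casts. A minimal sketch, assuming the ' - ' separator shown above; the alias DistinctID_Check is made up for illustration:

```sql
-- Sketch only: rebuild DistinctID from its two parts so it can be compared
-- with the stored column. An INT needs at most 11 characters as text.
SELECT KeyID,
       DedupID,
       DistinctID,
       CAST(KeyID AS VARCHAR(11)) + ' - ' + CAST(DedupID AS VARCHAR(11)) AS DistinctID_Check  -- hypothetical alias
FROM PSExtract.temp.RJZ_Temp;
```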

What I am trying to do is extract a list from this table. I want to select only 10,000 rows (the table has 150,000+ rows in it).

I want to select the companies that have the highest # of employees first
BUT I only want 3 employees per company
AND the 3 employees for each company are the highest level employees there  (lowest OSLevelID value) (I realize that there may be employees of the same level who are not selected).

I am only 3 weeks into this job as a junior TSQL programmer so...  don't assume I know anything.

Thanks folks,
Rich
Question by:RichNH
8 Comments
 
LVL 6

Expert Comment

by:Patrick Tallarico
ID: 37808728
So in your table, are all the employees listed by DedupID, and does each row have an accurate count of the total number of employees for that company in the EmployeeCount field?

I suspect you would need some sort of process running to update the EmployeeCount field whenever new data is entered?

Either way, I would think you could count the rows per KeyID, grouping by company to find the companies with the greatest number of employees, then cycle through those results using a cursor. I realize that cursors can be quite slow, so I suppose you would have to see how it performs.

DECLARE @counter INT, @rownum INT, @empcnt INT, @keyid INT;
/* create a table to dump the final records into; add {more fields for report} as needed */
CREATE TABLE temptable ([Rank] INT, DedupID INT, KeyID INT, OSLevelID INT);
/* use a counter to stop at roughly 10,000 rows */
SET @counter = 0;
/* begin cursor for the ranked companies, largest employee count first */
DECLARE c CURSOR LOCAL FAST_FORWARD FOR
    SELECT ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, KeyID) AS rn,
           COUNT(*) AS EmployeeCount,
           KeyID AS CompanyId
    FROM PSExtract.temp.RJZ_Temp
    GROUP BY KeyID
    ORDER BY rn;
/* cycle through the ranked companies using the cursor */
OPEN c;
FETCH NEXT FROM c INTO @rownum, @empcnt, @keyid;
WHILE @@FETCH_STATUS = 0 AND @counter < 10000
BEGIN
    /* insert the three highest-level employees (lowest OSLevelID) for this company */
    INSERT INTO temptable ([Rank], DedupID, KeyID, OSLevelID)
    SELECT TOP (3) @rownum, DedupID, KeyID, OSLevelID
    FROM PSExtract.temp.RJZ_Temp
    WHERE KeyID = @keyid
    ORDER BY OSLevelID ASC;
    /* add the rows just inserted to the counter (the last company may overshoot 10,000 by up to two rows) */
    SET @counter = @counter + @@ROWCOUNT;
    /* move to the next company returned by the ranked query */
    FETCH NEXT FROM c INTO @rownum, @empcnt, @keyid;
END
CLOSE c;
DEALLOCATE c;

Forgive me, my syntax may be a bit off, as I haven't had access to my test environment, but I hope what I've written could point you to a possible solution.
 
LVL 7

Accepted Solution

by:
Anoo S Pillai earned 2000 total points
ID: 37809367
A correlated subquery will do the trick in this context. The following query should give you a quick start:

SELECT      TOP 10000 *
FROM      RJZ_Temp Employee
WHERE      Employee.DistinctID  IN  
            (
            SELECT      TOP 3 DistinctID
            FROM      RJZ_Temp Top3Emp
            WHERE      Top3Emp.KeyID =  Employee.KeyID
            ORDER BY OSLevelID ASC )  
ORDER BY EmployeeCount DESC

The table and data I used for testing follow:

CREATE TABLE RJZ_Temp
(KeyID INT NOT NULL
,DedupID INT NOT NULL
,DistinctID VARCHAR(61) NULL
,EmployeeCount INT
,OSLevelID int NULL)
GO
INSERT INTO RJZ_Temp VALUES ( 1 , 1 , '1-1' , 10 , 1 )
INSERT INTO RJZ_Temp VALUES ( 1 , 2 , '1-2' , 10 , 5 )
INSERT INTO RJZ_Temp VALUES ( 1 , 3 , '1-3' , 10 , 4 )
INSERT INTO RJZ_Temp VALUES ( 1 , 4 , '1-4' , 10 , 2 )
INSERT INTO RJZ_Temp VALUES ( 1 , 5 , '1-5' , 10 , 3 )
GO
INSERT INTO RJZ_Temp VALUES ( 2 , 1 , '2-1' , 20 , 2 )
INSERT INTO RJZ_Temp VALUES ( 2 , 2 , '2-2' , 20 , 1 )
INSERT INTO RJZ_Temp VALUES ( 2 , 3 , '2-3' , 20 , 3 )
INSERT INTO RJZ_Temp VALUES ( 2 , 4 , '2-4' , 20 , 4 )
INSERT INTO RJZ_Temp VALUES ( 2 , 5 , '2-5' , 20 , 5 )
GO
INSERT INTO RJZ_Temp VALUES ( 3 , 1 , '3-1' , 100 , 2 )
INSERT INTO RJZ_Temp VALUES ( 3 , 2 , '3-2' , 100 , 1 )
INSERT INTO RJZ_Temp VALUES ( 3 , 3 , '3-3' , 100 , 3 )
INSERT INTO RJZ_Temp VALUES ( 3 , 4 , '3-4' , 100 , 4 )
INSERT INTO RJZ_Temp VALUES ( 3 , 5 , '3-5' , 100 , 5 )
GO

As the number of rows is large, I would prefer to convert this query into an equivalent JOIN statement. If you need help with that, please post back.
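
A rough sketch of what such a JOIN version might look like, assuming SQL Server 2005 or later so that ROW_NUMBER() is available. The derived-table alias Ranked and the column rn are names invented here for illustration, and the DedupID tie-breaker is an assumption added only to make the choice among equal OSLevelID values repeatable:

```sql
SELECT TOP (10000)
       Employee.KeyID, Employee.DedupID, Employee.DistinctID,
       Employee.EmployeeCount, Employee.OSLevelID
FROM RJZ_Temp AS Employee
INNER JOIN (
        -- number each company's employees from highest level (lowest OSLevelID) down
        SELECT KeyID,
               DedupID,
               ROW_NUMBER() OVER (PARTITION BY KeyID
                                  ORDER BY OSLevelID ASC, DedupID ASC) AS rn
        FROM RJZ_Temp
     ) AS Ranked
        ON  Ranked.KeyID   = Employee.KeyID
        AND Ranked.DedupID = Employee.DedupID
WHERE Ranked.rn <= 3                      -- keep only the top three per company
ORDER BY Employee.EmployeeCount DESC, Employee.KeyID ASC, Employee.OSLevelID ASC;
```

Against the test data above this should return nine rows, three for each of the three companies, with company 3 first. Note that OSLevelID is nullable and SQL Server sorts NULL first in ascending order, so rows with a NULL level would rank ahead of level 1 unless they are filtered out.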
 
LVL 32

Expert Comment

by:Ephraim Wangoya
ID: 37809873
try

select top 10000 A.KeyID, B.DedupID, A.DistinctID, A.EmployeeCount, A.OSLevelID
from (select top 100 percent * from RJZ_Temp order by EmployeeCount desc) A
inner join (select top 3 DedupID from RJZ_Temp order by OSLevelID) B on B.DedupID = A.DedupID


 
LVL 32

Expert Comment

by:Ephraim Wangoya
ID: 37809888
You could do a join with the DistinctID instead

select top 10000 A.KeyID, A.DedupID, B.DistinctID, A.EmployeeCount, A.OSLevelID
from (select top 100 percent * from RJZ_Temp order by EmployeeCount desc) A
inner join (select top 3 DedupID from RJZ_Temp order by OSLevelID) B on B.DistinctID= A.DistinctID

 
LVL 32

Expert Comment

by:Ephraim Wangoya
ID: 37809890
correction

select top 10000 A.KeyID, A.DedupID, B.DistinctID, A.EmployeeCount, A.OSLevelID
from (select top 100 percent * from RJZ_Temp order by EmployeeCount desc) A
inner join (select top 3 DistinctID from RJZ_Temp order by OSLevelID) B on B.DistinctID= A.DistinctID

 
LVL 32

Expert Comment

by:Ephraim Wangoya
ID: 37809899
Actually that seems a little bit too complicated but should work

Try this other one

select top 10000 A.KeyID, B.DedupID, A.DistinctID, A.EmployeeCount, A.OSLevelID
from RJZ_Temp A
inner join (select top 3 DedupID from RJZ_Temp order by OSLevelID) B on B.DedupID = A.DedupID
order by A.EmployeeCount DESC, A.KeyID ASC
 

 
LVL 32

Expert Comment

by:Ephraim Wangoya
ID: 37809917
Disregard my first three queries,

And here is yet another variation

select * from
(
  select top 10000 A.KeyID, B.DedupID, A.DistinctID, A.EmployeeCount, A.OSLevelID
  from RJZ_Temp A
  inner join (select top 3 DedupID from RJZ_Temp order by OSLevelID) B on B.DedupID = A.DedupID
  order by A.EmployeeCount DESC, A.KeyID ASC
) X
order by KeyID, OSLevelID
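
One thing to watch in the queries above: the derived table `select top 3 DedupID from RJZ_Temp order by OSLevelID` is evaluated once for the whole table, so it yields three DedupID values overall rather than three per company. A possible per-company variation, sketched with CROSS APPLY (available from SQL Server 2005 onward); the aliases C, T and E are made up for this example:

```sql
select top 10000 E.KeyID, E.DedupID, E.DistinctID, E.EmployeeCount, E.OSLevelID
from (select distinct KeyID from RJZ_Temp) C       -- one row per company
cross apply (
    -- re-evaluated for each company: its three lowest OSLevelID rows
    select top 3 T.KeyID, T.DedupID, T.DistinctID, T.EmployeeCount, T.OSLevelID
    from RJZ_Temp T
    where T.KeyID = C.KeyID
    order by T.OSLevelID
) E
order by E.EmployeeCount desc, E.KeyID, E.OSLevelID
```

The outer ORDER BY puts the largest companies first, so the TOP 10000 cut-off falls on the smallest companies.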

 
LVL 1

Author Closing Comment

by:RichNH
ID: 37811392
Thank you. Your solution correctly returned what I was looking for. I realized after the fact that there were other factors I hadn't considered, but they were easily surmounted once I was pointed in the correct direction.

If you have time, I'd love to see how this all worked as an inner join. FWIW, your solution went into a larger query that was three levels deep. The first two levels are inner joins; I'll look to make this level an inner join too, but I sometimes struggle with this stuff.


Question has a verified solution.
