Solved

TSQL question

Posted on 2012-04-04
Last Modified: 2012-04-05
I hope I explain this sufficiently. I am working on MS SQL Server 2008, and I am looking to code an ANSI TSQL statement (or statements) that will do the following.

The table being accessed is defined as such (I have cut out non-relevant columns):
CREATE TABLE PSExtract.temp.RJZ_Temp
(KeyID INT NOT NULL
,DedupID INT NOT NULL
,DistinctID VARCHAR(61) NULL
,EmployeeCount INT
,OSLevelID int NULL)

KeyID identifies a unique company,

DedupID defines a unique individual within a unique company

DistinctID = KeyID & ' - ' & DedupID

EmployeeCount is the total # of employees in a unique company (keyed on KeyID)

OSLevelID is the level of the employee in the food chain at the company; the lower the number, the higher the employee is at the company.

What I am trying to do is extract a list from this table. I only want to select 10,000 rows (the table has 150,000+ rows in it).

I want to select the companies that have the highest # of employees first
BUT I only want 3 employees per company
AND the 3 employees for each company are the highest level employees there  (lowest OSLevelID value) (I realize that there may be employees of the same level who are not selected).

I am only 3 weeks into this job as a junior TSQL programmer so...  don't assume I know anything.

Thanks folks,
Rich
Question by:RichNH
8 Comments
 
LVL 6

Expert Comment

by:Patrick Tallarico
ID: 37808728
So in your table, are all the employees listed by DedupID, and does each row have an accurate count of the total number of employees for that company in the EmployeeCount field?

I suspect you would need some sort of process running to update the EmployeeCount field whenever new data is entered?

Either way, I would think you could count your KeyIDs for each company, group by the company to find the companies with the greatest number of employees, then cycle through those results using a cursor. I realize that cursors can be quite slow, so I suppose you would have to see how it performs:

DECLARE @counter INT, @rownum INT, @empcnt INT, @keyid INT
/* create a table to dump the final records into */
CREATE TABLE temptable ([Rank] INT, DedupID INT, KeyID INT /* , {more fields for report} */)
/* use a counter to stop at roughly 10,000 rows */
SET @counter = 0
/* begin cursor for the companies, ranked by number of employees */
DECLARE c CURSOR FOR
    SELECT ROW_NUMBER() OVER (ORDER BY COUNT(KeyID) DESC) AS rn,
           COUNT(KeyID) AS EmployeeCount,
           KeyID AS CompanyId
    FROM PSExtract.temp.RJZ_Temp
    GROUP BY KeyID
    ORDER BY rn
/* cycle through the ranked companies using the cursor */
OPEN c
FETCH NEXT FROM c INTO @rownum, @empcnt, @keyid
WHILE @@FETCH_STATUS = 0 AND @counter < 10000
BEGIN
    /* insert the three highest-level employees of this company */
    INSERT INTO temptable
    SELECT TOP 3 @rownum, DedupID, KeyID /* , {more fields for report} */
    FROM PSExtract.temp.RJZ_Temp
    WHERE KeyID = @keyid
    ORDER BY OSLevelID
    /* add the rows just inserted to the counter (it can overshoot 10,000 by up to 2) */
    SET @counter = @counter + @@ROWCOUNT
    /* move to the next record returned by the ranked query */
    FETCH NEXT FROM c INTO @rownum, @empcnt, @keyid
END
CLOSE c
DEALLOCATE c

Forgive me, my syntax may be a bit off, as I haven't had access to my test environment, but I hope what I've written could point you to a possible solution.
 
LVL 7

Accepted Solution

by:
Anoo S Pillai earned 500 total points
ID: 37809367
A correlated subquery will do the trick in this context. The following query should give you a quick start:

SELECT      TOP 10000 *
FROM      RJZ_Temp Employee
WHERE      Employee.DistinctID  IN  
            (
            SELECT      TOP 3 DistinctID
            FROM      RJZ_Temp Top3Emp
            WHERE      Top3Emp.KeyID =  Employee.KeyID
            ORDER BY OSLevelID ASC )  
ORDER BY EmployeeCount DESC

The table and data I used to test follow:

CREATE TABLE RJZ_Temp
(KeyID INT NOT NULL
,DedupID INT NOT NULL
,DistinctID VARCHAR(61) NULL
,EmployeeCount INT
,OSLevelID int NULL)
GO
INSERT INTO RJZ_Temp VALUES ( 1 , 1 , '1-1' , 10 , 1 )
INSERT INTO RJZ_Temp VALUES ( 1 , 2 , '1-2' , 10 , 5 )
INSERT INTO RJZ_Temp VALUES ( 1 , 3 , '1-3' , 10 , 4 )
INSERT INTO RJZ_Temp VALUES ( 1 , 4 , '1-4' , 10 , 2 )
INSERT INTO RJZ_Temp VALUES ( 1 , 5 , '1-5' , 10 , 3 )
GO
INSERT INTO RJZ_Temp VALUES ( 2 , 1 , '2-1' , 20 , 2 )
INSERT INTO RJZ_Temp VALUES ( 2 , 2 , '2-2' , 20 , 1 )
INSERT INTO RJZ_Temp VALUES ( 2 , 3 , '2-3' , 20 , 3 )
INSERT INTO RJZ_Temp VALUES ( 2 , 4 , '2-4' , 20 , 4 )
INSERT INTO RJZ_Temp VALUES ( 2 , 5 , '2-5' , 20 , 5 )
GO
INSERT INTO RJZ_Temp VALUES ( 3 , 1 , '3-1' , 100 , 2 )
INSERT INTO RJZ_Temp VALUES ( 3 , 2 , '3-2' , 100 , 1 )
INSERT INTO RJZ_Temp VALUES ( 3 , 3 , '3-3' , 100 , 3 )
INSERT INTO RJZ_Temp VALUES ( 3 , 4 , '3-4' , 100 , 4 )
INSERT INTO RJZ_Temp VALUES ( 3 , 5 , '3-5' , 100 , 5 )
GO

As the number of rows is large, I would prefer to convert this query into an equivalent JOIN statement; if you need help with that, please post back.
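
One possible JOIN form, offered as a minimal sketch rather than a tuned rewrite: number each company's employees with ROW_NUMBER() in a derived table, then join back and keep only the first three per company. The names Ranked and rn and the DedupID tie-breaker are illustrative choices of mine, and the join on KeyID plus DedupID assumes that pair identifies one row, as the question describes. ROW_NUMBER() is available from SQL Server 2005 onward.

```sql
-- Sketch only: Ranked, rn and the DedupID tie-breaker are illustrative assumptions.
SELECT TOP 10000 Employee.*
FROM RJZ_Temp Employee
INNER JOIN
    (
    -- Number each company's employees, most senior (lowest OSLevelID) first.
    SELECT KeyID,
           DedupID,
           ROW_NUMBER() OVER (PARTITION BY KeyID
                              ORDER BY OSLevelID ASC, DedupID ASC) AS rn
    FROM RJZ_Temp
    ) Ranked
    ON  Ranked.KeyID   = Employee.KeyID
    AND Ranked.DedupID = Employee.DedupID
WHERE Ranked.rn <= 3
ORDER BY Employee.EmployeeCount DESC, Employee.KeyID ASC, Ranked.rn ASC
```

Against the test data above, this should return nine rows: 3-2, 3-1, 3-3, then 2-2, 2-1, 2-3, then 1-1, 1-4, 1-5. Because the table is ranked once in the derived table instead of running a TOP 3 lookup for every outer row, it may scale better on 150,000+ rows, but compare the execution plans before relying on that.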
 
LVL 32

Expert Comment

by:ewangoya
ID: 37809873
try

select top 10000 A.KeyID, B.DedupID, A.DistinctID, A.EmployeeCount, A.OSLevelID
from (select top 100 percent * from RJZ_Temp order by EmployeeCount desc) A
inner join (select top 3 DedupID from RJZ_Temp order by OSLevelID) B on B.DedupID = A.DedupID

 
LVL 32

Expert Comment

by:ewangoya
ID: 37809888
You could do a join with the DistinctID instead

select top 10000 A.KeyID, A.DedupID, B.DistinctID, A.EmployeeCount, A.OSLevelID
from (select top 100 percent * from RJZ_Temp order by EmployeeCount desc) A
inner join (select top 3 DedupID from RJZ_Temp order by OSLevelID) B on B.DistinctID= A.DistinctID


 
LVL 32

Expert Comment

by:ewangoya
ID: 37809890
correction

select top 10000 A.KeyID, A.DedupID, B.DistinctID, A.EmployeeCount, A.OSLevelID
from (select top 100 percent * from RJZ_Temp order by EmployeeCount desc) A
inner join (select top 3 DistinctID from RJZ_Temp order by OSLevelID) B on B.DistinctID= A.DistinctID

 
LVL 32

Expert Comment

by:ewangoya
ID: 37809899
Actually that seems a little bit too complicated but should work

Try this other one

select top 10000 A.KeyID, B.DedupID, A.DistinctID, A.EmployeeCount, A.OSLevelID
from RJZ_Temp A
cross apply (select top 3 T.DedupID from RJZ_Temp T where T.KeyID = A.KeyID order by T.OSLevelID) B
where B.DedupID = A.DedupID
order by A.EmployeeCount DESC, A.KeyID ASC
 

 
LVL 32

Expert Comment

by:ewangoya
ID: 37809917
Disregard my first three queries,

And here is yet another variation

select * from
(
  select top 10000 A.KeyID, B.DedupID, A.DistinctID, A.EmployeeCount, A.OSLevelID
  from RJZ_Temp A
  cross apply (select top 3 T.DedupID from RJZ_Temp T where T.KeyID = A.KeyID order by T.OSLevelID) B
  where B.DedupID = A.DedupID
  order by A.EmployeeCount DESC, A.KeyID ASC
) X
order by KeyID, OSLevelID

 
LVL 1

Author Closing Comment

by:RichNH
ID: 37811392
Thank you, your solution correctly returned what I was looking for. I realized after the fact that there were other factors I hadn't considered, but they were easily surmounted once I was pointed in the correct direction.

If you have time, I'd love to see how this all worked as an inner join. FWIW, your solution went into a larger query that was three levels deep. The first two levels are an inner join. I'll look to make this level an inner join as well, but I sometimes struggle with this stuff.