Brian Pringle
asked:
SQL Denormalize Column into One Row with Many Columns
I need a SQL query that will convert output from a normalized database table into a denormalized table with one row per record with many columns.
What I have:
Invoice_Num Emp_ID
A1234 1
A1234 2
A1234 3
B2345 3
B2345 6
C4567 4
C4567 2
D9876 1
D9876 6
E1234 1
F6543 6
F6543 2
F6543 3
F6543 4
What I need:
Invoice_Num Emp1 Emp2 Emp3 Emp4 --- and so on dynamically
A1234 1 2 3
B2345 3 6
C4567 4 2
D9876 1 6
E1234 1
F6543 6 2 3 4
You can use Mitchell Sellers' dynamic pivot technique. It is a two-step process: first build the list of new column names dynamically, then pivot using dynamic SQL.
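As a rough sketch of that two-step idea (not Mitchell Sellers' exact code; it targets the sample `invoiceemp` table from later in this thread, and orders employees within an invoice by `Emp_ID` only because the table has no sequence column):

```sql
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Step 1: build the dynamic column list Emp1..EmpN, where N is the
-- largest number of employees on any one invoice.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME('Emp' + CAST(n.num AS varchar(10)))
    FROM (
        SELECT DISTINCT ROW_NUMBER() OVER (PARTITION BY Invoice_Num
                                           ORDER BY Emp_ID) AS num
        FROM invoiceemp
    ) AS n
    ORDER BY n.num
    FOR XML PATH('')
), 1, 1, '');

-- Step 2: pivot on the per-invoice row number (not the Emp_ID itself),
-- so values slide left into Emp1, Emp2, ...
SET @sql = N'
SELECT Invoice_Num, ' + @cols + N'
FROM (
    SELECT Invoice_Num, Emp_ID,
           ''Emp'' + CAST(ROW_NUMBER() OVER (PARTITION BY Invoice_Num
                                             ORDER BY Emp_ID) AS varchar(10)) AS col
    FROM invoiceemp
) AS src
PIVOT (MAX(Emp_ID) FOR col IN (' + @cols + N')) AS p
ORDER BY Invoice_Num;';

EXEC sp_executesql @sql;
```

Pivoting on the row number rather than the employee ID is what keeps the result to one column per position instead of one column per distinct employee.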
ASKER
Can you provide a working solution for the table that I provided? I was able to get his example working, but could not get it to work with my list. I am wondering if it is because I am not using aggregated data?
ASKER
I really would like to get this working dynamically instead of having to hard-code all of the case statements.
The hard-coding is just the max number of emp_ids to have in the result.
If you really want to limit it to the existing data, you'd get the max # of emp ids from this query:
DECLARE @Emp_Count int

SELECT @Emp_Count = MAX(Emp_Count)
FROM (
    SELECT COUNT(*) AS Emp_Count
    FROM table_name
    GROUP BY Invoice_Num
) AS derived
Then use that counter to generate the appropriate number of:
MAX(CASE WHEN row_num = nn THEN Emp_ID END) AS Empnn,
columns.
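A minimal sketch of the shape Scott describes, hard-coded to four columns for the sample data (`table_name` as in his snippet; `row_num` is assumed to come from `ROW_NUMBER()`, since the source table has no sequence column):

```sql
SELECT Invoice_Num,
       MAX(CASE WHEN row_num = 1 THEN Emp_ID END) AS Emp1,
       MAX(CASE WHEN row_num = 2 THEN Emp_ID END) AS Emp2,
       MAX(CASE WHEN row_num = 3 THEN Emp_ID END) AS Emp3,
       MAX(CASE WHEN row_num = 4 THEN Emp_ID END) AS Emp4
FROM (
    SELECT Invoice_Num, Emp_ID,
           ROW_NUMBER() OVER (PARTITION BY Invoice_Num
                              ORDER BY Emp_ID) AS row_num
    FROM table_name
) AS numbered
GROUP BY Invoice_Num
ORDER BY Invoice_Num;
```

The counter from the query above would decide how many of these `MAX(CASE ...)` lines to generate in the dynamic SQL string.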
ASKER
I know. The problem is that it is possible for 40-50 people to touch a particular invoice before it is released. I don't want to have that many columns if the invoice requested only has 2 people. To get this information currently, they use Excel to perform a pivot on the data, and it only takes a couple of seconds. I want to be able to do that dynamically in SQL.
Yeah, for a small list of particular invoice numbers, it's probably not too much overhead to get the max count first and then use that info to build the dynamic query/result.
For all or a lot of invoice numbers, having to get the max count first is likely almost as costly as the full retrieval query would be.
Bottom line, in SQL Server, there's not a super-easy way to do what you want to do.
ASKER
That is why I am hoping to build it ahead of time so that I can script it for when they start needing that information. I can use the PIVOT function, but have not figured out a good way to make it work without aggregated data.
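For what it's worth, PIVOT does require an aggregate, but if each (Invoice_Num, column) cell holds exactly one value, which `ROW_NUMBER()` guarantees, then `MAX` simply passes that value through. A static sketch against the sample data (column list hard-coded to Emp1-Emp4; `invoiceemp` is the sample table name used later in this thread):

```sql
SELECT Invoice_Num, [Emp1], [Emp2], [Emp3], [Emp4]
FROM (
    SELECT Invoice_Num, Emp_ID,
           'Emp' + CAST(ROW_NUMBER() OVER (PARTITION BY Invoice_Num
                                           ORDER BY Emp_ID) AS varchar(10)) AS col
    FROM invoiceemp
) AS src
PIVOT (MAX(Emp_ID) FOR col IN ([Emp1], [Emp2], [Emp3], [Emp4])) AS p;
```

So the lack of a "real" aggregate is not the blocker; the numbering step is what makes PIVOT applicable to non-aggregated data.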
Not sure if it's helpful, but I use this code to dynamically build tables:
CREATE TABLE #test (Invoice_Num VARCHAR(50))
DECLARE @num INT
DECLARE @end INT
DECLARE @sql VARCHAR(MAX)
SET @num = 1
SET @end = 100
WHILE @num <= @end
BEGIN
    SET @sql = 'ALTER TABLE #test ADD Emp' + CAST(@num AS VARCHAR(50)) + ' INT'
    EXEC(@sql)
    SET @num = @num + 1
END
SELECT * FROM #test
DROP TABLE #test
@Russell - that's a lot less code than I wrote! :)
But it seems as if it doesn't left-justify like in the first post; instead it puts each empID under a column labeled after itself. I think if you had 1,000 different emp_ids, each used once, you'd have 1,001 columns.
@Dustin: yes, but I think that's what Brian wants.
For testing I wrote a data generator to match the data Brian has in his sample:
--Generate Utility Table
CREATE TABLE #Alphabet (letter VARCHAR(50))
INSERT INTO #Alphabet VALUES ('A'),('B'),('C'),('D'),('E'),('F'),('G'),('H'),('I');
GO
CREATE TABLE #Length (lgth INT)
INSERT INTO #Length VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9)
GO
--Populate Fake Data
DECLARE @a VARCHAR(50)
DECLARE @b INT
DECLARE @start INT
DECLARE @end INT
DECLARE @employeesTotal INT
DECLARE @maxEmployeePerInvoice INT
DECLARE @thisInvoice VARCHAR(50)
DECLARE @thisEmployee INT
DECLARE @iteration INT
DECLARE @maxIteration INT
SET @start = 1
SET @end = 4000
SET @employeesTotal = 2000
SET @maxEmployeePerInvoice = 40
SET @iteration = 2
SET @maxIteration = 1
WHILE @start <= @end
BEGIN
    IF @iteration > @maxIteration
    BEGIN
        SET @a = (SELECT TOP 1 * FROM #Alphabet ORDER BY NEWID())
        SET @b = (SELECT TOP 1 * FROM #Length ORDER BY NEWID()) * 1000
               + (SELECT TOP 1 * FROM #Length ORDER BY NEWID()) * 100
               + (SELECT TOP 1 * FROM #Length ORDER BY NEWID()) * 10
               + (SELECT TOP 1 * FROM #Length ORDER BY NEWID())
        SET @thisInvoice = (SELECT @a + CAST(@b AS VARCHAR(50)))
        SET @iteration = 1
        SET @maxIteration = (SELECT FLOOR(RAND()*(@maxEmployeePerInvoice-1)+1))
    END
    SET @thisEmployee = (SELECT FLOOR(RAND()*(@employeesTotal-1)+1))
    INSERT INTO invoiceemp (Invoice_Num, Emp_ID) VALUES (@thisInvoice, @thisEmployee)
    SET @iteration = @iteration + 1
    SET @start = @start + 1
END
--Clean Up
DROP TABLE #Length
DROP TABLE #Alphabet
Just change this line to the table you want to put test data in:
INSERT INTO invoiceemp (Invoice_Num, Emp_ID) VALUES (@thisInvoice, @thisEmployee)
and
SET @start = 1
SET @end = 4000
SET @employeesTotal = 2000
SET @maxEmployeePerInvoice = 40
Where @end is the total number of records to create, @employees total is the total pool of empIDs to pull from, @maxEmployeePerInvoice is the max number of empids that can be on one invoice.
There were two issues I found with not sliding the results to the left: first, if there are 1,000 unique empIDs used, then there are 1,000 columns, most of which are NULLs; second, if there are more than 1,023 unique IDs, it hits SQL Server's column limit.
My results are as such (a sample is attached; the results here are for 4,000 records with up to 40 empIDs per invoice). I think it's exactly what you wanted, but an expert might be able to condense my code; I think I naturally approach these things the way I would in C# rather than as an elegant query.
resultsQuery.png
ASKER
@Russell Fox - This is what I needed, but I couldn't use it for the project that I was working on yesterday. I am going to save that for in the future, though.
@ScottPletcher - I wanted something dynamic, but ended up using yours for the project that I was working on yesterday.
@Dustin Saunders - I am going to save that for my bag of tricks for later!
Thanks, guys! I really appreciate it!