Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

SQL Query to pull joined data without a pivot

Posted on 2009-12-22
13
Medium Priority
?
264 Views
Last Modified: 2012-05-08
I need to write a SQL query to pull data from two tables.  This calls for a join, which is the easy part.  The second table has data in rows instead of columns, which could be solved via a pivot except that the data is random (address information) and there is no aggregate function since the data is text not numerical.  Here is an example:

Table1
UID       USER         FNAME         LNAME
1          JohnDoe    John             Doe
2          JaneDoe    Jane             Doe
3          BobDole     Bob              Dole

Table2
ID         UID         FIELD         VALUE
1          1            City             Austin
2          1            State           TX
3          1            Phone         (000) 111-1234
4          2            City             Carmel
5          2            State           IN

I want a resulting set:

Result1
UID         USER         FNAME         LNAME         City          State           Phone
1            JohnDoe    John             Doe               Austin      TX               (000) 111-1234
2            JaneDoe    Jane             Doe               Carmel      IN                NULL
3            BobDole     Bob              Dole               NULL        NULL          NULL

There can be any number of fields in Table2 corresponding to a user in Table1.  The solution can pull in all fields or only a set of fields that I specify, like address, city, state, etc, whichever is easiest because the data can be trimmed once the final result is exported to Excel.  Table1 is the main table for the join and not all records in Table1 will have corresponding records in Table2 (defining the type of JOIN).

My main concern is the pivot, or lack thereof.  If a pivot works, that's great, but in my research pivot requires an aggregate which does not work with text values.  Any solution that works with SQL Server 2005 is fine, including stored procedures that I can easily adapt to the real data.

Thanks in advance for your suggestions.
0
Comment
Question by:dageyra
  • 6
  • 6
13 Comments
 
LVL 41

Expert Comment

by:Sharath
ID: 26106934
You need a PIVOT for this. check this script.

SELECT T1.UID,T1.[USER],T1.FNAME,T1.LNAME,T2.City,T2.[State],T2.Phone
  FROM Table1 AS T1
  LEF JOIN (SELECT UID,[City],[State],[Phone]
            FROM (SELECT UID,FIELD,[VALUE] FROM Table2 ) AS T1
             PIVOT (MAX([VALUE]) FOR FIELD IN ([City],[State],[Phone])) AS P) AS T2
    ON T1.UID = T2.UID


0
 
LVL 41

Expert Comment

by:Sharath
ID: 26106965
In fact, you need a dynamic PIVOT, in case if you have more than 3 Fields in Table2. check this script.

declare @sql varchar(8000)
set @sql = (select distinct ',['+VALUE+']' from Table2 for xml path(''))
set @sql = '
select T1.*, + @sql +
  from Table 1 as T1
  left Join (SELECT UID' + @sql  + '
               FROM  (SELECT UID,FIELD,VALUE FROM Table2) AS T1
              PIVOT (MAX(VALUE) FOR FIELD IN ('+ substring(@sql,2,8000)  +')) AS p) as t2
    on T1.UID = T2.UID'
exec (@sql)


0
 
LVL 9

Expert Comment

by:Stephan_Schrandt
ID: 26107015
Attached a script that allows you to create pivot with every table (limit is only the number of columns a table can have)The instr function is needed by the stored proc, so create it first.
Explanation about the parameters:

@GroupByColumn is the field that will be used as record identifier, in your case [UID]
@FieldNamesColumn defines the columnname which has the attribute names, in your case [FIELD]
@FieldValuesColumn defines the columnname which has the attribute values, in your case [VALUE]
@SourceTable is the source tablename, in your case [Table2]
@FirstRowsFieldNames -> Set it to 1 when you know that every record group contains all possible attributes. Otherwise it will scan the whole table for attribute names in the @FieldNamesColumn column.

All fields are declared as nvarchar(max), you can cast it later in a query if needed. The output table is hardcoded: tblDataPivot, change it by search&replace.
pivot.txt
Instr.txt
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 1

Author Comment

by:dageyra
ID: 26107696
Hello Sharath:

Thanks for your comment.  I have not messed with your second comment on the stored procedure but have been working with your first suggestion.  It does appear to be close to what I need, but there are some issues.  First, there is a third table in the mix that stores the available values for FIELD (like City, State, Telephone).  I incorporated this by creating a JOIN of the inner-select right before the PIVOT.  However, I either get too little information or the right information but with multiple rows.  The query is below.  If I use LEFT JOIN, I get only the City (no Telephone which does exist, the Telephone column is NULL even for records with values) and only one record per user (which I want only one record).  If I use JOIN, I get multiple records per user for each type of data (so one record for City, one record for Telephone), but the data is in columns, which is good.

You will notice that this is all DotNetNuke-specific data, which is what I am trying to get at.  I use the following SELECT to pull the data on which to PIVOT:

SELECT UserID, [PropertyValue], PropertyCategory, PropertyName
                   FROM UserProfile UP
                   JOIN ProfilePropertyDefinition PPD
                   ON UP.PropertyDefinitionID = PPD.PropertyDefinitionID

When I run this query, it does return data similar to my Table2 above, so I know that part is working correctly.  By this I mean there is multiple rows per UID, each with a unique field and value.  So I think the issue is on the LEFT JOIN following the Users table.  Notice that I am only pulling City & Telephone for now just to get a working test case with only a sample of the data.

Does this make sense and is there a way to modify the query below in order to get all of the fields on one row in the final result set?  Also, some of the users are missing regardless of the JOIN type.  Any user that does not have data in the UserProfile table is not included in the final result set, so that also leads me to believe that I am doing something wrong with the JOINs.

Any suggestions?
SELECT U.UserID, U.Username, U.FirstName, U.LastName,
T2.City,T2.Telephone
from Users AS U
LEFT JOIN
(SELECT UserID,[City],[Telephone]
           FROM (SELECT UserID, [PropertyValue], PropertyCategory, PropertyName 
                   FROM UserProfile UP 
                   JOIN ProfilePropertyDefinition PPD 
                   ON UP.PropertyDefinitionID = PPD.PropertyDefinitionID) AS T1
            PIVOT (MAX([PropertyValue]) FOR PropertyName IN ([City],[Telephone])) AS P) AS T2
   ON U.UserID = T2.UserID

Open in new window

0
 
LVL 41

Expert Comment

by:Sharath
ID: 26107801
>> First, there is a third table in the mix that stores the available values for FIELD (like City, State, Telephone).

You did not mention about your third table in the main query. Can you provide some sample data from this table also?
0
 
LVL 41

Accepted Solution

by:
Sharath earned 2000 total points
ID: 26107837
Can you run this query and check if you are getting correct result.


SELECT T1.UID,T1.[USER],T1.FNAME,T1.LNAME,T2.City,T2.[State],T2.Phone
 FROM Users AS T1
 LEF JOIN (SELECT UID,[City],[State],[Phone]
           FROM (SELECT UID,PropertyName,[PropertyValue]
                   FROM (SELECT UserID, [PropertyValue], PropertyCategory, PropertyName
                           FROM UserProfile UP
                           JOIN ProfilePropertyDefinition PPD
                             ON UP.PropertyDefinitionID = PPD.PropertyDefinitionID) AS T4 ) AS T3
            PIVOT (MAX([PropertyValue]) FOR PropertyName IN ([City],[State],[Phone])) AS P) AS T2
   ON T1.UID = T2.UID
0
 
LVL 1

Author Comment

by:dageyra
ID: 26107892
Hello Sharath:

I actually just realized there was a 3rd table when I went to implement your solution, but it does not change things much.  I'll go back to my original example and show how this 3rd table fits in to help make sense of it.

In the original example, Table2 contains the name of the field in the FIELD column.  This is not how it is for obvious reasons.  There is a 3rd table that contains the name of the field and additional field properties (length, etc).  So Table1 would be just the same and there would be a 3rd table:

Table3
FID             NAME                   TYPE                    LENGTH
1                City                       Text                      NULL
2                State                     CHAR                  2
3                Phone                   Phone                  10

This changes the second table because the field name is not there, but there is an ID for the field (FID):

Table2
ID          UID        FID      VALUE
1          1            1         Austin
2          1            2         TX
3          1            3         (000) 111-1234
4          2            1         Carmel
5          2            2         IN

Does that make sense?  This is why my inner-most select is a join of these two tables based on the ID of the profile definition table.  Since that inner-select when ran as a separate query produces a result just like my original Table2, I would not expect that to have much of an impact on your solution.

Also to help, in my actual code that I recently posted Table1 = Users, Table2 = UserProfile, & Table3 = ProfilePropertyDefinition.  As you can see, I can translate the basic example to my actual tables if it's easier for you to continue using the basic nomenclature of Table1, Table2, etc.
0
 
LVL 41

Expert Comment

by:Sharath
ID: 26107928
Did you try my last query?
0
 
LVL 1

Author Comment

by:dageyra
ID: 26107950
Hello Sharath:

You latest post was very helpful, with a few issues.  I made some changes to run the query and it is very close.  There is only one record per user now, which was the problem I was having.  However, some users are still missing (those without data in the profile table).  I included the updated, correct working SQL below, any suggestions on how to get all users regardless of whether they have profile data or not?
SELECT T1.UserID,T2.City,T2.Telephone
FROM Users AS T1
LEFT JOIN (SELECT UserID,[City],[Telephone]
          FROM (SELECT UserID, PropertyName,[PropertyValue]
                   FROM (SELECT UserID, [PropertyValue], PropertyCategory, PropertyName
                          FROM UserProfile UP
                          JOIN ProfilePropertyDefinition PPD
                            ON UP.PropertyDefinitionID = PPD.PropertyDefinitionID) AS T4 ) AS T3
           PIVOT (MAX([PropertyValue]) FOR PropertyName IN ([City],[Telephone])) AS P) AS T2
  ON T1.UserID = T2.UserID

Open in new window

0
 
LVL 1

Author Comment

by:dageyra
ID: 26107952
Also, can you explain to me what you changed to get the data on one row?
0
 
LVL 1

Author Comment

by:dageyra
ID: 26107982
Hello Sharath:

Ignore my last comment about missing data.  I double-checked the row count and they are all there.  I included a ORDER BY UserID because the rows were coming out-of-order, however all users are accounted for in the final result list.  I assume I can just add the additional columns to the query that I want to include (like State for example), and if you could let me know what changed to get the data all on one row?

I will accept your working query as the proper solution, it has worked great.
0
 
LVL 1

Author Closing Comment

by:dageyra
ID: 31669087
Worked great, with a few modifications.
0
 
LVL 41

Expert Comment

by:Sharath
ID: 26108121
glad i could help. Let me know if you are still looking for clarification.
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

An alternative to the "For XML" way of pivoting and concatenating result sets into strings, and an easy introduction to "common table expressions" (CTEs). Being someone who is always looking for alternatives to "work your data", I came across this …
Recursive SQL is one of the most fascinating and powerful and yet dangerous feature offered in many modern databases today using a Common Table Expression (CTE) first introduced in the ANSI SQL 99 standard. The first implementations of CTE began ap…
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.
Suggested Courses

810 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question