Denormalize Excel Data

propertytax
I don't know that "denormalize" is technically the correct term, so to you database guys, I apologize up front, but it's the best I could come up with to describe my need.

I am being given the output of a relational database in Excel format, where there is a one-to-many relationship between "person name" and "position code". For each person, one row is output for each position held. The "key" is the name, which is the "one" side of the relationship.

I'm not bad with Excel, but this is beyond my pay grade: I want to have, for each name, a cell that has the concatenated set of positions to which that person is assigned. An example:

Input
Name   | Position
John   | AA
John   | BB
Tom    | AA
Harry  | BB
Harry  | CC
Harry  | DD


Desired Output:
Name   | Positions
John   | AA, BB
Tom    | AA
Harry  | BB, CC, DD
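
As a reference for checking results only (it is outside Excel, so it is not the no-VBA solution being asked for), the transformation above amounts to grouping rows by name and joining each group's positions. A minimal Python sketch, where the function name `concat_positions` is hypothetical:

```python
# Sample rows copied from the Input table above.
rows = [
    ("John", "AA"), ("John", "BB"),
    ("Tom", "AA"),
    ("Harry", "BB"), ("Harry", "CC"), ("Harry", "DD"),
]

def concat_positions(rows, sep=", "):
    """Group positions by name (first-seen order) and join each group."""
    grouped = {}
    for name, position in rows:
        grouped.setdefault(name, []).append(position)
    return {name: sep.join(positions) for name, positions in grouped.items()}

result = concat_positions(rows)
for name, positions in result.items():
    print(f"{name:<6} | {positions}")
```

This version does not depend on the rows being sorted or grouped, so it can serve as a cross-check for any formula-based answer.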


If at all humanly possible, I do NOT want to use VBA, and stay fully within the Excel environment to solve the problem.
Most Valuable Expert 2011
Awarded 2010

Commented:
Hello,

Excel does not provide that functionality out of the box. You would need a custom formula written with VBA.

The closest you can get with out of the box functionality is to build a pivot table as the attached.

cheers, teylyn
Pivot.xlsx

Author

Commented:
Yeah, I ran down the Pivot Table rabbit hole myself; I've seen some very fancy things done with VLOOKUP and am hoping someone clever can put something together, but if VBA is the only solution I'm going to have to go another route :-(

Thanks for the suggestion, though.
Hi, propertytax.

A few questions, please...
(1) In your source example, all of the entries for an individual are grouped together. Is this true for the actual data?
(2) May we add a helper column (or two!) beside the source data?
(3) May we assume a maximum no. of entries?
(4) How does the source data get from the database to the spreadsheet which will contain the results?

Thanks,
Brian.

Author

Commented:
1) The entries for each name are grouped by name; further, they are sorted by name ascending, which is the primary (and only) sort key

2) Yes, we're free to add any helper columns we would care to, the data is "branched" from the database and we can manipulate at will

3) We may assume a maximum of 4 positions per name

4) The data is delivered to me as an Excel spreadsheet; I have no direct access to the database itself.

I've been playing with ARRAY formulas but I am just so terrible at those, I haven't a clue where to start.
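
Given the answers above (rows grouped and sorted by name, helper columns allowed), one possible formula-only route is a running concatenation in a helper column. The Python below only simulates that logic so it can be checked step by step. The cell layout and formulas in the comments are assumptions for illustration and are not taken from either attached workbook.

```python
# Simulates a hypothetical helper column C and result column D, assuming
# names in column A, positions in column B, sorted by name, header in row 1:
#   C2: =IF(A2=A1, C1 & ", " & B2, B2)   -> running list for the current name
#   D2: =IF(A2<>A3, C2, "")              -> shown only on a name's last row
rows = [
    ("John", "AA"), ("John", "BB"),
    ("Tom", "AA"),
    ("Harry", "BB"), ("Harry", "CC"), ("Harry", "DD"),
]

def running_concat(rows):
    """Return (helper, result) columns, one entry per input row."""
    helper, result = [], []
    for i, (name, position) in enumerate(rows):
        # Column C: extend the previous row's list while the name repeats.
        same_as_previous = i > 0 and rows[i - 1][0] == name
        helper.append(helper[-1] + ", " + position if same_as_previous else position)
    for i, (name, _) in enumerate(rows):
        # Column D: keep the list only on the last row of each name.
        is_last_row_for_name = i == len(rows) - 1 or rows[i + 1][0] != name
        result.append(helper[i] if is_last_row_for_name else "")
    return helper, result

helper, result = running_concat(rows)
for (name, position), h, r in zip(rows, helper, result):
    print(f"{name:<6}| {position} | {h:<10} | {r}")
```

The approach relies entirely on the sort order: if the same name appears in two separate runs, each run gets its own list.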
Commented:
Hello,

please see attached a suggestion with helper columns that can be hidden.

Column E has the formula

=IF(ISERROR(MATCH($D2&E$1,INDEX($A$1:$A$20&$B$1:$B$20,0),0)),0,E$1)

Column F has the following formula, which is copied across to column H:
=IF(COUNTA($E2:E2)>0,E2&", ","")&IF(ISERROR(MATCH($D2&F$1,INDEX($A$1:$A$20&$B$1:$B$20,0),0)),0,F$1)

Hide columns E to H. In column I use

=SUBSTITUTE(SUBSTITUTE(H2,", 0",""),"0, ","")

This can be expanded to as many positions as there are in the data.
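
To make the logic of those formulas easier to follow, here is a rough Python simulation of the grid, assuming unique names in column D and the four position codes AA to DD as headers in E1:H1 (the sample data and the function name `concatenated_positions` are illustrative, not read from the attached workbook).

```python
rows = [
    ("John", "AA"), ("John", "BB"),
    ("Tom", "AA"),
    ("Harry", "BB"), ("Harry", "CC"), ("Harry", "DD"),
]
names = ["John", "Tom", "Harry"]        # unique names, as in column D
positions = ["AA", "BB", "CC", "DD"]    # unique positions, as in E1:H1

# Mirrors the $A$1:$A$20 & $B$1:$B$20 lookup keys used inside MATCH.
pairs = {name + position for name, position in rows}

def concatenated_positions(name):
    # One cell per position header: the code if the name holds it, else "0".
    cells = [p if name + p in pairs else "0" for p in positions]
    # The last helper column ends up holding all cells joined by ", ".
    joined = ", ".join(cells)
    # Same clean-up as the two nested SUBSTITUTE calls in column I.
    return joined.replace(", 0", "").replace("0, ", "")

for name in names:
    print(name, "|", concatenated_positions(name))
```

One caveat that follows from the clean-up step: a position code that itself begins or ends with the digit 0 could be mangled by the substitutions, so the placeholder may need changing for such data.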

You can use a pivot table to set up the initial grid of unique names in the rows and unique list of positions in the column headers. Then copy the unique names and use paste special > values to paste to a new table. Copy the unique position names and paste special > values across the top.

cheers, teylyn
NotPivot.xlsx
Thanks, propertytax.

Please see the attached. A few points...
(1) The formulas are in the yellow (and one red) cells. Copy them down for as many rows as you have data in column A. You can overshoot (as I have in rows 28 to 31) without breaking anything.
(2) If you want to be fancy, the formulas in columns F and H:I only need to go down as many rows as the number of unique names (E2).
(3) The red cell is highlighted because, apologies, its formula is different from the other cells in that column.
(4) If you may have more than 5,000 entries in column A then the formulas in column F need to be changed.

Regards,
Brian.
Denormalize-No-VBA.xlsx

Author

Commented:
Sorry, both - came down with something or other, keeping me on my back for now. I'll get back to this shortly, though. I appreciate the responses.
Thanks for the update, propertytax. All the best for a speedy recovery,

Commented:
Hello, thanks for closing the question, but can you please explain the B grade? If you don't assign an A grade, you should explain what aspect of the answers you are not happy with, so experts can improve their suggestions.

If you are not happy with the way that Excel works, then that does not mean the question gets a B, though. Experts are not responsible for the way Excel works.

Author

Commented:
I've requested that the moderators change the grade on this question.
Thanks, all.
