Link to home
Start Free TrialLog in
Avatar of Calyx Teren
Calyx TerenFlag for United States of America

asked on

Excel: Concatenate multiple cell values in rows with common ID, and reverse

We prototyped a new taxonomy for files by using pivot tables in Excel. There are several categories in the taxonomy, and some frequently have multiple values for the same file, while others remain the same. To make the pivot tables work, we have to have a separate row for each value. However, to upload the data into the real tool, the multiple values have to be in one cell for each file, separated by a semicolon with no spaces.

We anticipate a few revisions, and so it would be useful to have macros that can go in both directions: one to concatenate the rows and delete duplicates, and the other to split the rows up to make one value per row. There are about 5000 files in the data set.

I have found a number of scripts online that do almost what I need, but my VBA skills aren't up to finishing the job. Thanks in advance for your help. I've attached a mini sample file that contains the common variations. There's one worksheet showing the starting point (split rows), and a second one showing the desired end point (concatenated rows).
Avatar of byundt
byundt
Flag of United States of America image

I've attached a mini sample file that contains the common variations
The attachments didn't appear in your question. When you upload a file, you need to:
1.  Click the Attach File link
2.  Click the Browse... button to find the file in your computer or network
3.  After you have selected the file, click the "Open" button in the file browser
4.  Click the Upload File button in the Experts Exchange Comment (or question)
5.  Add some descriptive text in the field with light gray font that says "Enter a brief description of your file (required)"

Omitting step 5 is a common mistake. I usually say "Sample file" because the field doesn't display very many words.
Avatar of Calyx Teren

ASKER

Thanks for the instructions. That field is subtle indeed. Even after I knew it was there somewhere I had to look around for it.
Sample-File.xlsm
Here is the code for concatenating and deleting rows. Install it in a regular module sheet.

As written, the macro works on the active worksheet, starting in row 2. The macro figures out how many rows of data exist. It concatenates the unique values in columns C and E, putting the concatenated results in the corresponding column of the first row of the Content ID in column A.  Finally, the macro deletes the extra rows.
Sub Concatenater()
Dim Col As Variant, v As Variant
Dim i As Long, iCol As Long, iKeep As Long, j As Long, n As Long
Dim rg As Range
Dim s As String, sConcat As String, sDelimiter As String
Application.ScreenUpdating = False
sDelimiter = ";"    'Delimiter for concatenated string
With ActiveSheet
    Set rg = .Range("A2")
    Set rg = Range(rg, .Cells(.Rows.Count, rg.Column).End(xlUp)) 'All the data in column A
End With
n = rg.Rows.Count
For Each Col In Array("C", "E")     'Columns to be concatenated
    sConcat = sDelimiter
    iCol = Range(Col & "1").Column
    v = ""
    For i = 1 To n
        s = rg.Cells(i, iCol).Value
        If s <> "" Then s = s & sDelimiter
        If v = rg.Cells(i, 1).Value Then
            If s <> "" Then
                If InStr(1, sConcat, sDelimiter & s) = 0 Then sConcat = sConcat & s
            End If
        Else
            If v <> "" Then rg.Cells(iKeep, iCol) = Mid(sConcat, Len(sDelimiter) + 1, Len(sConcat) - 2 * Len(sDelimiter))
            iKeep = i
            sConcat = sDelimiter
            If s <> "" Then sConcat = sConcat & s
            v = rg.Cells(i, 1).Value
        End If
    Next
    If v <> "" Then rg.Cells(iKeep, iCol) = Mid(sConcat, Len(sDelimiter) + 1, Len(sConcat) - 2 * Len(sDelimiter))
Next
v = ""
For i = n To 1 Step -1
    If rg.Cells(i, 1).Value = v Then
        rg.Cells(i + 1).EntireRow.Delete
    Else
        v = rg.Cells(i, 1).Value
    End If
Next

End Sub

Open in new window


It's rather late for me, so I will revert to this thread for the deconcatenation macro tomorrow--unless someone else has already taken care of that for you.

Brad
Sample-FileQ28324185.xlsm
Thank you, Brad! I just ran it and got an "Invalid procedure call or argument" error pointing to line 25, starting with "rg." Haven't got time to try debugging that right at the moment but will see if I can tomorrow or soon. Off on a trip tomorrow so I might not be able to get back to this until tomorrow night or the next day, but I'm definitely going to be on it soon. Again, thank you for your quick help.
ASKER CERTIFIED SOLUTION
Avatar of byundt
byundt
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Here is a macro to deconcatenate your concatenated columns. Please note that the deconcatenated results may not be the same as the original data. This is because duplicates are omitted from the concatenation.

In addition, values in one column to be concatenated may not line up with values in another column. For example, suppose Category values for four rows of a certain Content ID are Poultry, Poultry, Meat, and Meat. Suppose also that Cuisine values for those four rows are Chinese, French, Chinese, and Chinese. The deconcatenated rows (two of them) will be: Poultry/Chinese and Meat/French. Besides having two rows that cannot be reconstructed, the second deconcatenated row has a combination Meat/French that never existed in the original data.

If the deconcatenated data must be exactly the same as the original, then it would be far easier to keep the original data (perhaps on a different worksheet) than to modify the data structure so it can be reconstructed.
Sub Deconcatenater()
Dim Col As Variant, ConcatColumns As Variant, v As Variant
Dim i As Long, iCol As Long, k As Long, nRows As Long, nCols As Long
Dim rg As Range
Dim s As String, sConcat As String, sDelimiter As String
Application.ScreenUpdating = False
sDelimiter = ";"                    'Delimiter for concatenated string
ConcatColumns = Array("C", "E")     'Concatenated columns
With ActiveSheet
    Set rg = .Range("A2")   'First cell with Content ID. Header labels assumed to be in previous row.
    nCols = .Cells(rg.Row - 1, .Columns.Count).End(xlToLeft).Column - rg.Column + 1
    Set rg = Range(rg, .Cells(.Rows.Count, rg.Column).End(xlUp)) 'All the data in column A
    nRows = rg.Rows.Count
    Set rg = rg.Resize(nRows, nCols)    'All the data in table
End With
For i = nRows To 1 Step -1
    k = 0   'Number of rows to be added
    For Each Col In ConcatColumns
        iCol = Range(Col & "1").Column
        s = rg.Cells(i, iCol).Value
        k = Application.Max(k, Len(s) - Len(Replace(s, sDelimiter, "")))
    Next
    If k > 0 Then
        rg.Rows(i + 1).Resize(k).EntireRow.Insert
        rg.Rows(i + 1).Resize(k).Value = rg.Rows(i).Value
        For Each Col In ConcatColumns
            iCol = Range(Col & "1").Column
            s = rg.Cells(i, iCol).Value
            If s <> "" Then rg.Cells(i, iCol).Resize(k + 1, 1).Value = Application.Transpose(Split(s, sDelimiter))
        Next
    End If
Next
End Sub

Open in new window

Sample-FileQ28324185.xlsm
Thank you! Turned out that I didn't have any internet access during my trip, but I will try this tonight or tomorrow and test it. Can't wait to see the results.
The Concatenater script works!!! You are brilliant. Thank you. You can imagine how much time this will save me.

Your suggestion to use the concatenater as a one-way trip, and just maintain the original deconcatenated table separately, is probably the best suggestion. I can work with that.

That said, when I tried the deconcatenater on the example you suggested, it didn't work quite the way I expected even with your caution.

Starting point (though from your description, two are complete duplicates):
Content ID Title      Category      Sodium      Cuisine
123      Title      Poultry      5      Chinese
123      Title      Poultry      5      French
123      Title      Meat      5      Chinese
123      Title      Meat      5      Chinese

Concatenated:
Content ID      Title      Category      Sodium      Cuisine
123      Title      Poultry;Meat      5      Chinese;French

Deconcatenated:
Content ID      Title      Category      Sodium      Cuisine
123      Title      Poultry      5      Chinese
123      Title      Meat      5      French

That gives the combination that doesn't exist in the original (Meat; French), but is missing a combination that does exist in the original--Meat;Chinese. It would be useful if the deconcatenater did the full permutation into Poultry;Chinese - Poultry;French - Meat;Chinese - Meat;French.

Like I said, I can work with the one-way script, using just the concatenater. You've already helped me out to the tune of probably dozens of hours, so no more is necessary. However, if you feel like going farther, let me know. If I don't hear back in a day or so, I'll come back and close this with full points. Again, thank you, and happy new year!
The "lossy" method used for concatenation means that the original can not be reconstructed faithfully, and that artifacts such as you pointed out are likely.

If you would be willing to accept both duplicates and blanks in the concatenation, then the deconcatenation will work perfectly. For example, the Category would be concatenated as Poultry;Poultry;Meat;Meat. Even if this "loss-free" concatenation were stored in a cell comment or an auxiliary column, it could be used to reconstruct the original.

But since I suspect that your real problem is more complex than the cooking example posted in your question, I stand by my previous suggestion that the better alternative is to preserve the original worksheet before concatenation.

Brad
Quick, helpful, willing to explain and even do extra work--I truly appreciate Brad's help.
I decided to store a "lossless" concatenation in a comment for each cell being concatenated. I could then use the text of that comment to perform a perfect reconstruction during deconcatenation. Text in the cells being concatenated is not affected.
Sub Concatenater()
Dim Col As Variant, v As Variant
Dim i As Long, iCol As Long, iKeep As Long, j As Long, n As Long
Dim rg As Range
Dim s As String, sConcat As String, sDelimiter As String, sFull
Application.ScreenUpdating = False
sDelimiter = ";"    'Delimiter for concatenated string
With ActiveSheet
    Set rg = .Range("A2")
    Set rg = Range(rg, .Cells(.Rows.Count, rg.Column).End(xlUp)) 'All the data in column A
End With
n = rg.Rows.Count
For Each Col In Array("C", "E")     'Columns to be concatenated
    sConcat = sDelimiter
    iCol = Range(Col & "1").Column
    v = ""
    For i = 1 To n
        If Not rg.Cells(i, iCol).Comment Is Nothing Then rg.Cells(i, iCol).Comment.Delete
        s = rg.Cells(i, iCol).Value
        If s <> "" Then s = s & sDelimiter
        If v = rg.Cells(i, 1).Value Then
            sFull = sFull & rg.Cells(i, iCol).Value & sDelimiter
            If s <> "" Then
                If InStr(1, sConcat, sDelimiter & s) = 0 Then sConcat = sConcat & s
            End If
        Else
            If (v <> "") And (Len(sConcat) > 2 * Len(sDelimiter)) Then _
                rg.Cells(iKeep, iCol) = Mid(sConcat, Len(sDelimiter) + 1, Len(sConcat) - 2 * Len(sDelimiter))
            If Len(sFull) > 2 * Len(sDelimiter) Then
                rg.Cells(iKeep, iCol).AddComment Mid(sFull, Len(sDelimiter) + 1, Len(sFull) - 2 * Len(sDelimiter))
            End If
            iKeep = i
            sConcat = sDelimiter
            sFull = sDelimiter & rg.Cells(i, iCol).Value & sDelimiter
            If s <> "" Then sConcat = sConcat & s
            v = rg.Cells(i, 1).Value
        End If
    Next
    If Len(sFull) > 2 * Len(sDelimiter) Then
        rg.Cells(iKeep, iCol).AddComment Mid(sFull, Len(sDelimiter) + 1, Len(sFull) - 2 * Len(sDelimiter))
    End If
    If (v <> "") And (Len(sConcat) > 2 * Len(sDelimiter)) Then _
        rg.Cells(iKeep, iCol) = Mid(sConcat, Len(sDelimiter) + 1, Len(sConcat) - 2 * Len(sDelimiter))
Next
v = ""
For i = n To 1 Step -1
    If rg.Cells(i, 1).Value = v Then
        rg.Cells(i + 1).EntireRow.Delete
    Else
        v = rg.Cells(i, 1).Value
    End If
Next

End Sub

Sub Deconcatenater()
Dim Col As Variant, ConcatColumns As Variant, v As Variant
Dim i As Long, iCol As Long, k As Long, nRows As Long, nCols As Long
Dim rg As Range
Dim s As String, sConcat As String, sDelimiter As String
Application.ScreenUpdating = False
sDelimiter = ";"                    'Delimiter for concatenated string
ConcatColumns = Array("C", "E")     'Concatenated columns
With ActiveSheet
    Set rg = .Range("A2")   'First cell with Content ID. Header labels assumed to be in previous row.
    nCols = .Cells(rg.Row - 1, .Columns.Count).End(xlToLeft).Column - rg.Column + 1
    Set rg = Range(rg, .Cells(.Rows.Count, rg.Column).End(xlUp)) 'All the data in column A
    nRows = rg.Rows.Count
    Set rg = rg.Resize(nRows, nCols)    'All the data in table
End With
For i = nRows To 1 Step -1
    k = 0   'Number of rows to be added
    For Each Col In ConcatColumns
        iCol = Range(Col & "1").Column
        If Not rg.Cells(i, iCol).Comment Is Nothing Then
            s = rg.Cells(i, iCol).Comment.Text
            k = Application.Max(k, Len(s) - Len(Replace(s, sDelimiter, "")))
        End If
    Next
    If k > 0 Then
        rg.Rows(i + 1).Resize(k).EntireRow.Insert
        rg.Rows(i + 1).Resize(k).Value = rg.Rows(i).Value
        For Each Col In ConcatColumns
            iCol = Range(Col & "1").Column
            If Not rg.Cells(i, iCol).Comment Is Nothing Then
                s = rg.Cells(i, iCol).Comment.Text
                If s <> "" Then rg.Cells(i, iCol).Resize(k + 1, 1).Value = Application.Transpose(Split(s, sDelimiter))
            End If
        Next
    End If
    For Each Col In ConcatColumns
        iCol = Range(Col & "1").Column
        If Not rg.Cells(i, iCol).Comment Is Nothing Then rg.Cells(i, iCol).Comment.Delete
    Next
Next
End Sub

Open in new window

Brad
Sample-FileQ28324185.xlsm
Wow! Can't wait to test this!
I've been using the non-lossy concatenate and deconcatenate scripts for a couple of days. They work perfectly. Thank you!
calyx_teren,
I'm glad you are finding the lossless concatenate and deconcatenate macros so helpful.

The question you asked is one that is likely to come up again, so I added this thread to my Experts Exchange Knowledgebase. For this reason, I'd appreciate your continuing to post in this thread if issues come up with the code. Doing so will make this question even more useful as a future reference for people.

Thanks for posting back with your results!

Brad