Link to home
Start Free TrialLog in
Avatar of saqibtq
saqibtq

asked on

Formula to Split a Cell Part of a Row and Keep the Rest of the Rows (For a large dataset) - Google Spreadsheets

I am trying to do the following:  I have a large data set with 500+ rows and 3 columns:  names, id, value.  I want to split the names column into separate rows with one name each, but with copying over the same values for id and value.

For example, I have this:

names            id     value
bob, sam, bill	99	10
harris	        95	11
george, john	98	14
tom, harry	97	13

Open in new window


And I need to get this:

bob	99	10
sam	99	10
bill	99	10
harris	95	11
george	98	14
john	98	14
tom	97	13
harry	97	13

Open in new window


I also need to be able to do this in one iteration or two, since I have a total of 500+ rows.  I setup a test sheet here Also I need the solution to be in Google Sheets.

Thanks
Avatar of Saqib Husain
Saqib Husain
Flag of Pakistan image

Try

Sub splitcell2rows()
    Dim cel As Range
    Set cel = Range("A1")
    Do While cel <> ""
        If InStr(cel, ",") <> 0 Then
            cel.Offset(1).EntireRow.Insert
            cel.EntireRow.Copy cel.Offset(1)
            cel.Value = Left(cel, InStr(cel, ",") - 1)
            cel.Offset(1).Value = Trim(Right(cel.Offset(1), Len(cel.Offset(1)) - InStr(cel.Offset(1), ",")))
        End If
        Set cel = cel.Offset(1)
    Loop
End Sub
The above macro works perfectly in Excel, but you'd need to re-write it in Javascript to be able to do the same thing in Google Sheets.

The farthest I've gotten with this is create a vertical listing of all names with one function:
=transpose(split(join(", ",$A$3:$A$6),", "))

But, once the names are listed, an alternate function that emulates a VLOOKUP will be needed.  I'm hoping an array function will work in this case, but I haven't worked out the components yet.
I've got it.

In my example, I'm inserting the calculated values in columns E through G, starting in row 3 (same row as in your example).

In cell E3:
=transpose(split(join(", ",$A$3:$A$500),", "))
In cell F3:
=offset($A$2,match("*"&$E3&"*",$A$3:$A$500,0),1)
In cell G3:
=offset($A$2,match("*"&$E3&"*",$A$3:$A$500,0),2)

Copy down the formulas in cells F3 & G3 (E3 will populate down automatically)

The result looks like the attached screenshot.

You'll enter three equations and copy down two.  Then, if you must replace the original columns A:C with the new values, you'll either copy and paste the values into columns A:C and remove the calculated values, or copy and replace the calculated values and delete the columns to the left.

Caveat:  This will only work as long as there are unique names in column A.  Any name that is repeated will produce erroneous results.  If this the case with your data, you will be forced to write a javascript routine to emulate the macro submitted above.

Regards,
Glenn
GoogleSheet.jpg
PS  I'm trying to work out an algorithm that will phase shift the search array so that non-unique names will still return their correct IDs and Values.  It's rather tricky, but I think it's possible.

 Follow-up:  I'm holding off on this.  If you have cases where the exact same name value occurs more than once and the second column is indeed a unique ID (identifier), then there would be an issue with the data integrity.  I'm going to presume for the moment that this won't occur.

 I've revised the formulas above to work for a 500-row data set.

 -Glenn
Avatar of saqibtq
saqibtq

ASKER

Hi Glenn,

I really appreciate your efforts! Sorry, I think that I should of made this clearer and specified it in the question, but the data set does have alot of repeated names.
Avatar of saqibtq

ASKER

Sorry, but I do need an algorithm so that non-unique names would still return their correct ID's and values
Man, this was really difficult, but I do think I have a solution for you.  Rather than create a very complex pair of formulas for ID and Values, I've broken out the logic with three additional columns that you can hide if you wish.

Following my previous example with a new table adjacent to the existing one, starting on row 3, here are the formulas you'd input:

Cell E3 (Names):
=transpose(split(join(", ",$A$3:$A$20),", "))
This formula will auto-populate down; no need to copy.  Very cool, actually (not a feature available in Excel)

Cell F3 (ID):
=OFFSET($A$2,J3,1)

Cell G3 (Value):
=OFFSET($A$2,J3,2)

Cell H3 (Occurrence of name)
=COUNTIF($E$3:E3,E3)

Cell I3 (Number of repeats of name)
=COUNTIF($A$3:$A$20,"*"&E3&"*")

Cell J3 (Array Row of occurance)
=IF(H3=I3,IF(H3=1,MATCH("*"&E3&"*",$A$3:$A$20,0),IF(ISNA(MATCH("*"&E3&"*",INDIRECT("$A$"&J2+3&":$A$20"),0)),J2,MATCH("*"&E3&"*",INDIRECT("$A$"&J2+3&":$A$20"),0)+J2)),IF(H3=1,MATCH("*"&E3&"*",$A$3:$A$20,0),IF(ISNA(MATCH("*"&E3&"*",INDIRECT("$A$"&J2+3&":$A$20"),0)),J2+1,MATCH("*"&E3&"*",INDIRECT("$A$"&J2+3&":$A$20"),0)+J2)-(IF(MATCH("*"&E3&"*",INDIRECT("$A$"&J2+2&":$A$20"),0)=1,1,0))))

Select cells F3:J3 and copy down same number of rows as values in column E.

I've tried to set up a link to my GoogleSheets example for you to see.  Let me know if there are any issues.

Regards,
Glenn

Google Sheets Example
ASKER CERTIFIED SOLUTION
Avatar of Glenn Ray
Glenn Ray
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of saqibtq

ASKER

Glenn, I really appreciate your hard work and help.  
Although the problem wasn't quite solved (in Google Sheets), I think this still may benefit others.
Thanks again.
Avatar of saqibtq

ASKER

Great, the algorithm does still work to some degree and is useful.
Thanks for the help.
saqibtq...I have a solution: it's significantly simpler than the one I previously submitted AND handles all possible occurrences of names (including sub-names like "Paul/Paula", "Sam/Samantha" "Bert, Robert", etc.).  

I'll create a modified GoogleSheets example later today and post here when done.  

Sometimes, a little gardening work helps me solve problems like this. :-)
Okay, this is a whole new algorithm, but works as you require for your dataset.  See my example GoogleSheet for the results with expanded data (including repeated names and sub-names)

Cell D2 (Interim name-row list):
=SUBSTITUTE(A3,", ","|"&ROW()-2&", ")&"|"&ROW()-2
Copy this formula down as many rows as original

Cell E2 (Transposed name-row):
=transpose(split(join(", ",$D$3:$D$20),", "))
Expands automatically

Cell F2 (New List, Name column):
=LEFT(E3,FIND("|",E3)-1)

Cell G2 (New List, ID column):
=OFFSET($A$2,MID(E3,FIND("|",E3)+1,999),1)

Cell H2 (New List, Values column):
=OFFSET($A$2,MID(E3,FIND("|",E3)+1,999),2)

Copy cells F2:H2 down to align with column E

Hide columns D:E if desired.

The interim step adds an  identifier - the source row position of each name - to the names.  This not only makes each name unique, but passes along its position to use after the data is transposed.

-Glenn

Google Sheets - New Algorithm
Avatar of saqibtq

ASKER

Wow, thank you so much!
This works, thanks again for all of your effort!