saqibtq
asked on
Formula to Split a Cell Part of a Row and Keep the Rest of the Rows (For a large dataset) - Google Spreadsheets
I am trying to do the following: I have a large data set with 500+ rows and 3 columns: names, id, value. I want to split the names column into separate rows with one name each, but with copying over the same values for id and value.
For example, I have this:
And I need to get this:
I also need to be able to do this in one iteration or two, since I have a total of 500+ rows. I setup a test sheet here Also I need the solution to be in Google Sheets.
Thanks
For example, I have this:
names id value
bob, sam, bill 99 10
harris 95 11
george, john 98 14
tom, harry 97 13
And I need to get this:
bob 99 10
sam 99 10
bill 99 10
harris 95 11
george 98 14
john 98 14
tom 97 13
harry 97 13
I also need to be able to do this in one iteration or two, since I have a total of 500+ rows. I setup a test sheet here Also I need the solution to be in Google Sheets.
Thanks
The above macro works perfectly in Excel, but you'd need to re-write it in Javascript to be able to do the same thing in Google Sheets.
The farthest I've gotten with this is create a vertical listing of all names with one function:
=transpose(split(join(", ",$A$3:$A$6),", "))
But, once the names are listed, an alternate function that emulates a VLOOKUP will be needed. I'm hoping an array function will work in this case, but I haven't worked out the components yet.
The farthest I've gotten with this is create a vertical listing of all names with one function:
=transpose(split(join(", ",$A$3:$A$6),", "))
But, once the names are listed, an alternate function that emulates a VLOOKUP will be needed. I'm hoping an array function will work in this case, but I haven't worked out the components yet.
I've got it.
In my example, I'm inserting the calculated values in columns E through G, starting in row 3 (same row as in your example).
In cell E3:
=transpose(split(join(", ",$A$3:$A$500),", "))
In cell F3:
=offset($A$2,match("*"&$E3 &"*",$A$3: $A$500,0), 1)
In cell G3:
=offset($A$2,match("*"&$E3 &"*",$A$3: $A$500,0), 2)
Copy down the formulas in cells F3 & G3 (E3 will populate down automatically)
The result looks like the attached screenshot.
You'll enter three equations and copy down two. Then, if you must replace the original columns A:C with the new values, you'll either copy and paste the values into columns A:C and remove the calculated values, or copy and replace the calculated values and delete the columns to the left.
Caveat: This will only work as long as there are unique names in column A. Any name that is repeated will produce erroneous results. If this the case with your data, you will be forced to write a javascript routine to emulate the macro submitted above.
Regards,
Glenn
GoogleSheet.jpg
In my example, I'm inserting the calculated values in columns E through G, starting in row 3 (same row as in your example).
In cell E3:
=transpose(split(join(", ",$A$3:$A$500),", "))
In cell F3:
=offset($A$2,match("*"&$E3
In cell G3:
=offset($A$2,match("*"&$E3
Copy down the formulas in cells F3 & G3 (E3 will populate down automatically)
The result looks like the attached screenshot.
You'll enter three equations and copy down two. Then, if you must replace the original columns A:C with the new values, you'll either copy and paste the values into columns A:C and remove the calculated values, or copy and replace the calculated values and delete the columns to the left.
Caveat: This will only work as long as there are unique names in column A. Any name that is repeated will produce erroneous results. If this the case with your data, you will be forced to write a javascript routine to emulate the macro submitted above.
Regards,
Glenn
GoogleSheet.jpg
PS I'm trying to work out an algorithm that will phase shift the search array so that non-unique names will still return their correct IDs and Values. It's rather tricky, but I think it's possible.
Follow-up: I'm holding off on this. If you have cases where the exact same name value occurs more than once and the second column is indeed a unique ID (identifier), then there would be an issue with the data integrity. I'm going to presume for the moment that this won't occur.
I've revised the formulas above to work for a 500-row data set.
-Glenn
Follow-up: I'm holding off on this. If you have cases where the exact same name value occurs more than once and the second column is indeed a unique ID (identifier), then there would be an issue with the data integrity. I'm going to presume for the moment that this won't occur.
I've revised the formulas above to work for a 500-row data set.
-Glenn
ASKER
Hi Glenn,
I really appreciate your efforts! Sorry, I think that I should of made this clearer and specified it in the question, but the data set does have alot of repeated names.
I really appreciate your efforts! Sorry, I think that I should of made this clearer and specified it in the question, but the data set does have alot of repeated names.
ASKER
Sorry, but I do need an algorithm so that non-unique names would still return their correct ID's and values
Man, this was really difficult, but I do think I have a solution for you. Rather than create a very complex pair of formulas for ID and Values, I've broken out the logic with three additional columns that you can hide if you wish.
Following my previous example with a new table adjacent to the existing one, starting on row 3, here are the formulas you'd input:
Cell E3 (Names):
=transpose(split(join(", ",$A$3:$A$20),", "))
This formula will auto-populate down; no need to copy. Very cool, actually (not a feature available in Excel)
Cell F3 (ID):
=OFFSET($A$2,J3,1)
Cell G3 (Value):
=OFFSET($A$2,J3,2)
Cell H3 (Occurrence of name)
=COUNTIF($E$3:E3,E3)
Cell I3 (Number of repeats of name)
=COUNTIF($A$3:$A$20,"*"&E3 &"*")
Cell J3 (Array Row of occurance)
=IF(H3=I3,IF(H3=1,MATCH("* "&E3&"*",$ A$3:$A$20, 0),IF(ISNA (MATCH("*" &E3&"*",IN DIRECT("$A $"&J2+3&": $A$20"),0) ),J2,MATCH ("*"&E3&"* ",INDIRECT ("$A$"&J2+ 3&":$A$20" ),0)+J2)), IF(H3=1,MA TCH("*"&E3 &"*",$A$3: $A$20,0),I F(ISNA(MAT CH("*"&E3& "*",INDIRE CT("$A$"&J 2+3&":$A$2 0"),0)),J2 +1,MATCH(" *"&E3&"*", INDIRECT(" $A$"&J2+3& ":$A$20"), 0)+J2)-(IF (MATCH("*" &E3&"*",IN DIRECT("$A $"&J2+2&": $A$20"),0) =1,1,0))))
Select cells F3:J3 and copy down same number of rows as values in column E.
I've tried to set up a link to my GoogleSheets example for you to see. Let me know if there are any issues.
Regards,
Glenn
Google Sheets Example
Following my previous example with a new table adjacent to the existing one, starting on row 3, here are the formulas you'd input:
Cell E3 (Names):
=transpose(split(join(", ",$A$3:$A$20),", "))
This formula will auto-populate down; no need to copy. Very cool, actually (not a feature available in Excel)
Cell F3 (ID):
=OFFSET($A$2,J3,1)
Cell G3 (Value):
=OFFSET($A$2,J3,2)
Cell H3 (Occurrence of name)
=COUNTIF($E$3:E3,E3)
Cell I3 (Number of repeats of name)
=COUNTIF($A$3:$A$20,"*"&E3
Cell J3 (Array Row of occurance)
=IF(H3=I3,IF(H3=1,MATCH("*
Select cells F3:J3 and copy down same number of rows as values in column E.
I've tried to set up a link to my GoogleSheets example for you to see. Let me know if there are any issues.
Regards,
Glenn
Google Sheets Example
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Glenn, I really appreciate your hard work and help.
Although the problem wasn't quite solved (in Google Sheets), I think this still may benefit others.
Thanks again.
Although the problem wasn't quite solved (in Google Sheets), I think this still may benefit others.
Thanks again.
ASKER
Great, the algorithm does still work to some degree and is useful.
Thanks for the help.
Thanks for the help.
saqibtq...I have a solution: it's significantly simpler than the one I previously submitted AND handles all possible occurrences of names (including sub-names like "Paul/Paula", "Sam/Samantha" "Bert, Robert", etc.).
I'll create a modified GoogleSheets example later today and post here when done.
Sometimes, a little gardening work helps me solve problems like this. :-)
I'll create a modified GoogleSheets example later today and post here when done.
Sometimes, a little gardening work helps me solve problems like this. :-)
Okay, this is a whole new algorithm, but works as you require for your dataset. See my example GoogleSheet for the results with expanded data (including repeated names and sub-names)
Cell D2 (Interim name-row list):
=SUBSTITUTE(A3,", ","|"&ROW()-2&", ")&"|"&ROW()-2
Copy this formula down as many rows as original
Cell E2 (Transposed name-row):
=transpose(split(join(", ",$D$3:$D$20),", "))
Expands automatically
Cell F2 (New List, Name column):
=LEFT(E3,FIND("|",E3)-1)
Cell G2 (New List, ID column):
=OFFSET($A$2,MID(E3,FIND(" |",E3)+1,9 99),1)
Cell H2 (New List, Values column):
=OFFSET($A$2,MID(E3,FIND(" |",E3)+1,9 99),2)
Copy cells F2:H2 down to align with column E
Hide columns D:E if desired.
The interim step adds an identifier - the source row position of each name - to the names. This not only makes each name unique, but passes along its position to use after the data is transposed.
-Glenn
Google Sheets - New Algorithm
Cell D2 (Interim name-row list):
=SUBSTITUTE(A3,", ","|"&ROW()-2&", ")&"|"&ROW()-2
Copy this formula down as many rows as original
Cell E2 (Transposed name-row):
=transpose(split(join(", ",$D$3:$D$20),", "))
Expands automatically
Cell F2 (New List, Name column):
=LEFT(E3,FIND("|",E3)-1)
Cell G2 (New List, ID column):
=OFFSET($A$2,MID(E3,FIND("
Cell H2 (New List, Values column):
=OFFSET($A$2,MID(E3,FIND("
Copy cells F2:H2 down to align with column E
Hide columns D:E if desired.
The interim step adds an identifier - the source row position of each name - to the names. This not only makes each name unique, but passes along its position to use after the data is transposed.
-Glenn
Google Sheets - New Algorithm
ASKER
Wow, thank you so much!
This works, thanks again for all of your effort!
This works, thanks again for all of your effort!
Sub splitcell2rows()
Dim cel As Range
Set cel = Range("A1")
Do While cel <> ""
If InStr(cel, ",") <> 0 Then
cel.Offset(1).EntireRow.In
cel.EntireRow.Copy cel.Offset(1)
cel.Value = Left(cel, InStr(cel, ",") - 1)
cel.Offset(1).Value = Trim(Right(cel.Offset(1), Len(cel.Offset(1)) - InStr(cel.Offset(1), ",")))
End If
Set cel = cel.Offset(1)
Loop
End Sub