Solved

A concatenate formula to create a list of all and any possible combinations of rows in excel (OSX 2008)

Posted on 2010-08-31
28
640 Views
Last Modified: 2013-11-13
Hello,

I have looked into every nook and simply cannot find the exit here, so if anybody can, please provide some insight.. (Mac OSX user - MS Excel 2008 (no VBA)

I have a list of content categories (string) from different sources (index) and my task is to create a list for a mapping document that will concatenate (or not) each row from each source in all and any combination(s) (there are about 1000).

This formula is the closest that I have found (assumes 5 columns (not rows, but I can transpose them):

=$A$1 & IF(MOD(ROW(),2)>=1,"+" & B$1,"")& IF(MOD(ROW(),4)>=2,"+" & C$1,"")& IF(MOD(ROW(),8)>=4,"+" & D$1,"")& IF(MOD(ROW(),16)>=8,"+" & E$1,"")

It looks as though the iterations (n^y) are programmed manually, although I could conceivably see these intervals being read form a list elsewhere.

I will need to keep the association between each category and its source index (maybe some kind of delimiter within another) and also apply a new index to each resulting group (unique to group).

I hope that's enough information?

Many thanks in advance!

AF
0
Comment
Question by:jeaniem
  • 12
  • 11
  • 5
28 Comments
 
LVL 11

Expert Comment

by:Runrigger
ID: 33568647
can you attach a brief sample of source data + target list?
0
 

Author Comment

by:jeaniem
ID: 33571006
HI Runrigger,

Extremely kind of you to come back to me - I'm no mathematician except that this is quite an interesting problem (conceptually..).

Find attached - I did this by hand so there could be some errors although I think it's cool as I can see N^Y patters emerging here...

Thank you very much indeed for having a look at this.

AF

NB.  Here we're working with 4 rows (producing 31 in target) - I have roughly 1000!
SRC-TGT-data-.xls
0
 

Author Comment

by:jeaniem
ID: 33572612
Sorry -

Just to add, the sample is on the 2nd tab.. forgot to delete the first.

AF
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33585181
Yes 'transposed' your formula works perfectly for the number of categories you have right now.
=$A$7 & IF(MOD(ROW(),2)>=1,"+" & A$8,"")& IF(MOD(ROW(),4)>=2,"+" & A$9,"")& IF(MOD(ROW(),8)>=4,"+" & A$10,"")& IF(MOD(ROW(),16)>=8,"+" & A$11,"")
If you are looking at up to 1000 categories, you could never get sufficient characters in your formula bar to cope.

So we will need to right an UDF (user defined formula) to do the concatenation, as in VBA we can loop.
I'll play around and see what comes out...
0
 

Author Comment

by:jeaniem
ID: 33585206
Thank you very much!  I just wish that MS provided VBA in Mac Excel as it's restrictive otherwise...
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33585330
I notice in your list that you don't have the permutation
electro + ballet   (as an example)
all of them start with rock.
Do you need these other permutations as well?
0
 

Author Comment

by:jeaniem
ID: 33585352
Yes, sorry - I did it by hand and it's more than possible that I missed some off.

NB.  Are you sure that you're looking at the second tab, pls disregard the first.
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33585407
No I was looking at tab1. I don't understand what I am seeing in Tab2 or the logic behind the CPindex and Groupindex. What does this have to do with permutations?
0
 

Author Comment

by:jeaniem
ID: 33585442
Apologies - my bad:

CP_Index is the index from the source (CP means Content Provider in my private universe!), and group index a new reference created for the target so that I can identify each different group of combinations / permutations after the event.

In this target data, rather than grouping each in a string and delimiting with a "+", I have organised them into rows where each different "string" (in tab 1 terms) is renamed a "group" and given a new index (Group_Index) to link them together.

Have you heard of Real Studio as an alternative to VBA on Mac?  I'm downloading it now.
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33585565
I only know that VBA on Mac is being discontinued by Microsoft. I don't use MacOS so no knowledge about any details.

Back to your problem: take line 19 - theatre 3 12
So I assume this is the 'short notation' for ATMs + theatre
and the number 12 is just a sequential number .

If you were to use binary numbers for your index, then you can read out the permutation straight away:
rock  0001
musicals 0010
theatre 0100
ATM   1000

so: 1010 is ATM & musicals

That will make it far easier to create your list, if you know what I mean.

The only thing I don't understand is how you want to use this list on Tab2. You now have 'musicals' on 9 locations, so what are these going to be linked to?
0
 
LVL 11

Expert Comment

by:Runrigger
ID: 33585579
jeaniem, it is still not very clear what it is that you are trying to achieve.

How exactly do you derive the index, see first 7 rows, what does this mean?

rock      1      1
rock      1      2
musicals      2      2
rock      1      4
theatre      3      4
rock      1      5
ATM's      4      5
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33585633
RunRigger,
attached a kind of commented list:

SRC-TGT-data-.xls
0
 

Author Comment

by:jeaniem
ID: 33585686
Yes, that's basically it - effectively I'm transposing it back into a list:

"line1:  musicals+ATM's+theatre"

can also be described as:

Category             Line
musicals             1
ATM's                 1
theatre                1

...where line 1 represents the group index (ie. a number assigned to the target iteration).

I need to carry over more data however as musicals could conceivably come from more than one provider and I'll need to configure at that level of granularity:  I've therefore added a new column in the target data that inherits the source "CP" (content provider) index:

Category            CP_Index             Line ("Group Index")
musicals             1                           1
ATM's                 2                           1
theatre                3                           1

- - - - - - - - - - - -

Does that help at all?  Honestly though, this is mostly just formatting so I can probably shuffle things around a bit if this takes up too much of your time:  it's generating the results that I'm struggling with.

Thank you both..

0
 
LVL 11

Expert Comment

by:Runrigger
ID: 33585743
OK, now I am beginning to understand and I think that with a little more clarity on the requirements, we may be able to help;

If you have say a 1000 categories, how do you want to proceed with determining the various combinations or groups of categories;

Within your example (rock, musicals,theatres,atms), rock can have any combination of all 4, as could musicals, atms and theatres for that matter.

If you had 20 categories, are you then wanting to expand rock to included any and all combinations with all 20 categories?

I hope not, because if you indicate yes, you are talking millions and millions of combinations!

I think that I can safely assume not therefore! However, what we need to know is how we determine what combinations are possible and how logically we can determine that?
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 
LVL 6

Expert Comment

by:sijpie
ID: 33585857
For ten items in your list, the number of permutations will be over 1000, and so your target list would be about 3000-4000 items long in the way you've set it up.

for 1000 items you are talking about >10^150 items. More than Excel can deal with, or any program on this earth.

Even if you restrict your permuations to combinations for only 4 items from your source list, than with a 1000 items in the source you will have 41 billion (41 * 10^6) permutations.

Time for a sense check
0
 

Author Comment

by:jeaniem
ID: 33586071
Agree -

So we'll cut our cloth... do you need to know exactly how many rows right now or is it a value that we can change retrospectively?

Incredible maths.
0
 
LVL 11

Expert Comment

by:Runrigger
ID: 33586350
might help us to know your objective, and what you have had to perform manually previous to trying to develop this solution.

0
 
LVL 6

Expert Comment

by:sijpie
ID: 33586514
Like mentioned above, anything over say five items is difficult to capture in a formula, due to the length the formula would become.
Using a 'binary' solution you could make it easy up to nine items in your source list, and possibly a couple more working with strings.

Any more items in the source list and things become unwieldy I think.

Your problem is the way you want to present it. I still don't understand why you would want to present it then way you propose in tab2, as it has so many superfluous lines: lines 19, 20 & 21 all mean the same thing, so why print it out three times?

Your tab1 is easier to achive for any number of source items (with a practical limit at about 10)
0
 

Author Comment

by:jeaniem
ID: 33586585
Okay, I'll take your advice:  

Really I think it's fine to provide the information as in tab 1 as ultimately it's the content that counts rather than the aesthetics!

Is there anything that I can provide to help?  You are both so kind for offering to help with this rubic's cube of a problem...

AF
0
 
LVL 11

Expert Comment

by:Runrigger
ID: 33586991
jeaniem, I truly believe that the problem lies in the huge numbers of permutations that you will be needing to generate (look at the UK lottery, 6 numbers from 49 balls, gives odds of 14mn to 1, so you can imagine the numbers 1000 categories is going to generate).

Without knowing your specific objective, its is going to be virtually impossible to proceed.

I have enjoyed the discussion, but I don't believe I can be of any further help, so good luck (I will continue to monitor and if I can say anything useful, I will).
Dave
0
 

Author Comment

by:jeaniem
ID: 33587045
I totally understand.

Thank you very much indeed for you kind time and help - perhaps the reality is that this approach is simply not realistic and that there's another way to achieve this.

I am in awe of your mathematical ability, sijpie equally - very grateful to you both.

Bon continuation..
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33587495
Have a look here. I've amended your sheet with a little 'userform'

In column A you can add up to 8 categories (more if you amend the formulas)
in column B the index for the category gets generated
in the top entry box (D2:E10) you can select the categories for the song (in columnD) and it will give you the index number for the song in E3

in the bottom entry box (D14:E15) you enter a index number and it will give you the categories.

The lists in I,J & K yo don't need for this example, but it shows how the formula in E15 works.
0
 

Author Comment

by:jeaniem
ID: 33587525
Hi Sijpie,

Wow - that is great:  I'll learn a lot from looking into this:

I'm not sure that I can see anything attached however?

(!) AF
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33587531
Something went wron uploading thefile
SRC-TGT-data-.xls
0
 

Author Comment

by:jeaniem
ID: 33587631
Incredible -

There are formulas in there that I never knew existed!  I would have been stumped for decades had it not been for your help.

I wil deconstruct what I can to learn how to manipulate it and play around with the data to see what it can produce ongoing.

Wizards! Thank you very, very much again.
0
 
LVL 6

Accepted Solution

by:
sijpie earned 500 total points
ID: 33587740
Don't forget to acknowledge a correct answer!
0
 

Author Closing Comment

by:jeaniem
ID: 33587821
Really kind to have spent the time looking into my conundrum - deeply appreciated!
0
 
LVL 6

Expert Comment

by:sijpie
ID: 33587896
In the sheet the formula =DEC2BIN() is used (column J). That is a formula that becomes available by installing the Analysis Toolpack Add-in (Tools, Add-in...)

But it is not used other than for demo here.
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

Suggested Solutions

Before You Read The Article Please make sure you understand these two concepts: Variable Scope (http://www.php.net/manual/en/language.variables.scope.php) and Property Visibility (http://www.php.net/manual/en/language.oop5.visibility.php).  And to …
Graphs within dashboards are meant to be dynamic, representing data from a period of time that will change each time the dashboard is updated with new data. Rather than update each graph to point to a different set within a static set of data, t…
This Micro Tutorial will demonstrate how to create pivot charts out of a data set. I also added a drop-down menu which allows to choose from different categories in the data set and the chart will automatically update.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now