Link to home
Start Free TrialLog in
Avatar of Mr_Fulano
Mr_FulanoFlag for United States of America

asked on

Find ALL possible combinations of the elements in an Array.

Hello, I am trying to determine an elegant and efficient manner to find all the possible combinations of the elements in an array of Integers.


I want to do this through code to produce the sets. I do not want to use the Combinations formula, which uses Factorials. I know how to do that calculation, but what I need is to produce the actual combinations or "sets" themselves and be able to visually see the difference, not just know a final result amount.  I use the formula to know if my result  is correct or not.   


The main rule is that combinations, or "sets" cannot be in a different order. Thus {1, 2} and {2, 1} are the same set, but in a different order. Its not a new combination, its simply a set "ordered in a different way."


Lets assume myArray = { 1, 2 }

There are only 3 possible combinations of the array's elements. Those being: {1, 1} , {1,2} and {2,2}.


If myArray = {1,2,3} there are 10 possible combinations. Those being : {1, 1, 1} , {1, 1, 3}. {2, 2, 3}, {1, 2, 2}, {1, 3, 3}, {1, 2, 3}, {3, 3, 3}, {2, 3, 3}, {1, 1, 2} and {2, 2, 2}.

And so on...


I currently have some code that simply takes each element of the array and produces random combinations. Thus, I simply iterate through a FOR Loop and keep adding the random combinations to a HashSet, which of course, accepts only unique / discreet combinations and ultimately, through somewhat of a "brute-force" approach, I arrive at a maximum number of possible combinations,.  It works, but it's a bad approach. 


There are two main problems with my approach...  One, the more elements I add to the array the longer it takes, because even when I arrive at "the maximum", I am still producing random combination to fulfill the requirements of the FOR Loop.  Secondly, another  downside is, the number of iterations though the loop is also nothing more than a guess. The larger the number of elements in the array, the more I have to increase my repetitions (or guess) (i.e. 100, 1,000, 10,000, 1,000,000, etc). This is extremely time-consuming, because even if I found the maximum at 889 repetitions, I still have to go all the way through to - say 10,000 iterations.  If nothing else, its inefficient. Its also ineffective, because if I guess too low at the number of repetitions, I may miss out on other possible combinations. - Again... its a very bad approach. 


There has to be a more elegant and efficient manner to find all possible combinations of a series of number (integers).


If anyone here know of a way, please share. It would be greatly appreciated.


Thank you.


 


Avatar of Kyle Abrahams, PMP
Kyle Abrahams, PMP
Flag of United States of America image

There's a formula for that:
https://www.cs.sfu.ca/~ggbaker/zju/math/perm-comb-more.html#:~:text=If%20we%20are%20selecting%20an,1)%20ways%20to%20do%20so.

(n+r−1)!/r!(n−1)! 

Open in new window

N = number of items in your set.
R = Total slots in the combination.

(3 + 3 - 1)!  / 3!(3-1)!

5! / 3! (2)!

5*4*3*2 / 3*2*2

120 / 12 = 10.









Avatar of Mr_Fulano

ASKER

Hello Kyle, I apologize. I added additional information to my question just as you were answering it. - I added :

I want to do this through code to produce the sets. I do not want to use the Combinations formula, which uses Factorials. I know how to do that calculation, but what I need is to produce the actual combinations or "sets" themselves and be able to visually see the difference, not just know a final result amount.  I use the formula to know if my result  is correct or not.  

I know how to find the combinations result - the final number. What I need to do is to produce each combination.  As in my examples (i.e. If myArray = {1,2,3} there are 10 possible combinations. Those being : {1, 1, 1} , {1, 1, 3}. {2, 2, 3}, {1, 2, 2}, {1, 3, 3}, {1, 2, 3}, {3, 3, 3}, {2, 3, 3}, {1, 1, 2} and {2, 2, 2}.)

Thank you anyway!!!
You want combinations, not permutations. Check.

Lets assume myArray = { 1, 2 } There are only 3 possible combinations of the array's elements. Those being: {1, 1} , {1,2} and {2,2}.
Not really.
As long as you want combinations. What you describe are combinations with repetition. (( n over k)) possible sets, where as n is the number of elements in A and k is the number of elements per subset.
Which brings us back to not really. All combinations with repetition are {},{1}, {2}, {1,1}, {1,2}, {2,2} from your array.

But let us assume you're after a certain k-subset type, then you construct your sets by creating a tree by recursion, e.g. ((4 over 2)):

User generated image
I can see it in my head, below is my pseudo code, but can be further refined.

Pseudo code:

1)  Create an index for each slot:
Let's say a list<int> called counter where counter[0] = first slot, counter[1] = second slot etc.  Where the size of the list is the number of slots you need.

2)  Init the indexes of all to 0.

3) The combination is the list of each of the numbers at their current index.  so
     
set[counter[0]].ToString() + set[counter[1]].ToString() + ....

Open in new window


4)  Add one to the right most counter , then:
 
    
if (counter[index] == the set size)
{
       counter[index] = 0; 
    
       if (index > 0) 
      {
           counter[index-1]++;
     
           
            //recheck index-1 for the same condition.

      //After you are done incrementing to the left as needed.
      //only go back to the numbers that are the same or below the number you are currently on.
          //eg: 1 needs to process 1,2,3,4.  2 only needs to process 2,3,4
           counter[index] = counter[0];  
     }
       else
          //complete.
}

Open in new window


Hope that helps, take a look and let me know.






Hello Ste5an, as you comment:

All combinations with repetition are {},{1}, {2}, {1,1}, {1,2}, {2,2} from your array.
.    ---->    I only want {1,2}, {1.1} and {2,2}.

No... I want to include ALL the elements in the combinations. And yes, I want combinations, which is a N-Choose-R type of problem with NO duplicate sets (i.e. {1,2} and {2,1}), but I don't wan to just the result. I would like help with the code. How would you go about coding that, if you would please?

Thanks.
ASKER CERTIFIED SOLUTION
Avatar of Kyle Abrahams, PMP
Kyle Abrahams, PMP
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hello Kyle, wow... very interesting. Let me give it a glance later tonight and I will let you know. Very nice work finding this!
Just a note you could also make this an extension method by adding the "this" keyword.  Don't have access to a compiler so this is untested.

static IEnumerable<String> CombinationsWithRepetition(this IEnumerable<int> input, int length)
{
    if (length <= 0)
        yield return "";
    else
    {
        foreach(var i in input)
            foreach(var c in input.CombinationsWithRepetition(length-1))
                yield return i.ToString() + c;
    }
}

Open in new window


Usage:
var myArray = new int[]{1,2,3};
foreach (var c in myArray.CombinationsWithRepetition(3))
    Console.WriteLine (c);

Open in new window

Hello Kyle, I can definitely work with the article you provided. I have to make some modifications, but its very workable. Thank you again!!!
Avatar of pepr
pepr

The Kyle's solution does not work. It does the combinations with repetitions; however, you do not want the combinations with repetitions. For {1, 2} as input, it would produced both {1, 2} and {2, 1} -- and you do not want that.
I also don't get it. The function returns the permutations, not the combinations.

I agree with ste5an and pepr! Kyle's solution does NOT work the way you want it! 


How about this one instead:

        static IEnumerable<IEnumerable<T>> GetPermutations<T>(this IEnumerable<T> items, int count)
        {             int i = 0;             foreach (var item in items)             {                 if (count == 1)                     yield return new T[] { item };                 else                 {                     foreach (var result in GetPermutations(items.Skip(i), count - 1))                         yield return new T[] { item }.Concat(result);                 }                         ++i;             }         }

Open in new window


Tested with:

            var list = new List<string> { "1", "2" , "3" };
            var result = GetPermutations(list, 3);             foreach (var perm in result)             {                 foreach (var c in perm)                 {                     Debug.Write(c + " ");                 }             Debug.WriteLine("");             }            

Open in new window


As you know Python, and it is easier to play with, here is the code that may work. The idea explained below:

step_cnt = 0        # only for debugging -- comment it out

def my_combinations(sorted_elements, ge_elements, length):
    global step_cnt             # only for debugging -- comment it out  
    step_cnt += 1               # only for debugging -- comment it out
    print('step:', step_cnt)    # only for debugging -- comment it out
    
    if length <= 0:
        yield [];
    else:
        for e in sorted_elements:
            building_elements = ge_elements[e]
            for lst in my_combinations(building_elements, ge_elements, length-1):
                yield [e] + lst;
                

if __name__ == '__main__':
    # The input elements in whatever order -- a set is formed, the duplicate values
    # are removed.
    ##elements = { 3, 2, 1, 5, 7, 1, 1, 1 }
    elements = { 2, 1 }
    
    # The sorted list of unique elements.
    sorted_elements = sorted(elements)
    
    # Calculate the sublists for each element (n loops where n is equqle
    # to the number of elements).
    ge_elements = {}
    for e in sorted_elements:
        ge_elements[e] = [x for x in sorted_elements if x >= e]
    
    # Start the recursive call with the sorted_elements, pass the auxiliary
    # dictionary with the precomputed sublists of elements for building the tail.
    solution_no = 0     # just to display
    for lst in my_combinations(sorted_elements, ge_elements, len(elements)):
         solution_no += 1
         print(f'Solution {solution_no}: {lst}')

Open in new window


For the {1, 2} it prints:
e:\__Python\Mr_Fulano\ee29256746>py c.py
step: 1
step: 2
step: 3
Solution 1: [1, 1]
step: 4
Solution 2: [1, 2]
step: 5
step: 6
Solution 3: [2, 2]

Open in new window


For the set of elements  = { 2, 1, 4, 5 } it prints:

e:\__Python\Mr_Fulano\ee29256746>py c.py
step: 1
step: 2
step: 3
step: 4
step: 5
Solution 1: [1, 1, 1, 1]
step: 6
Solution 2: [1, 1, 1, 2]
step: 7
Solution 3: [1, 1, 1, 4]
step: 8
Solution 4: [1, 1, 1, 5]
step: 9
step: 10
Solution 5: [1, 1, 2, 2]
step: 11
Solution 6: [1, 1, 2, 4]
step: 12
Solution 7: [1, 1, 2, 5]
step: 13
step: 14
Solution 8: [1, 1, 4, 4]
step: 15
Solution 9: [1, 1, 4, 5]
step: 16
step: 17
Solution 10: [1, 1, 5, 5]
step: 18
step: 19
...
Solution 33: [4, 4, 5, 5]
step: 64
step: 65
step: 66
Solution 34: [4, 5, 5, 5]
step: 67
step: 68
step: 69
step: 70
Solution 35: [5, 5, 5, 5]

Open in new window


The key idea is that you want to eliminate the cases that differ only in the order of elements. In other words, you can pick one special case of those that are considered to be equal. And one of them that is easy to understand, to check... is the one where the elements are sorted.

So, you can prepare the list of sorted elements that is basically sorted sequence of all elements from the input set. (You really do not want to repeat the values in the input; so, defining the set of elements you get rid of duplicated elements. Then you create the sorted list of unique elements.)

Now, the number of steps when generating the cases is reduced by the extra condition, that the elements in the result must be sorted. It means, that the partial result (the prefix) is also sorted. Moreover, all elements that can be used to build the tail of the resulting list consist only of elements that are greater or equal to the last element of the partial solution.

Having the sorted list of all elements, I can pre-compute the dictionary of sublists for each element as a key. (Say, having 4 unique element, I get 4 sublists.) And only the sublist is used in the next recursive call. In other words, the tree of all candidates is pruned early a lot. For example, once I start with the biggest element, the only way to finish the case is to repeat that element to the wanted length.

Update: The precalculation of dictionary of sublists can be improved...
    # Calculate the sublists for each element (n loops where n is equqle
    # to the number of elements).
    ge_elements = {}
    for i, e in enumerate(sorted_elements):
        ge_elements[e] = sorted_elements[i:]

Open in new window


For the ten elements in the input you get 92378 solutions, and to find all you need 184756 steps.

Solution 92376: [9, 9, 10, 10, 10, 10, 10, 10, 10, 10]
step: 184738
step: 184739
step: 184740
step: 184741
step: 184742
step: 184743
step: 184744
step: 184745
step: 184746
Solution 92377: [9, 10, 10, 10, 10, 10, 10, 10, 10, 10]
step: 184747
step: 184748
step: 184749
step: 184750
step: 184751
step: 184752
step: 184753
step: 184754
step: 184755
step: 184756
Solution 92378: [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]

Open in new window

Agreed it doesn't work.

Needed a minor tweak.  Below is the corrected code.

public static IEnumerable<String> CombinationsWithRepetition(this IEnumerable<int> input, int length)
	{
		if (length <= 0)
			yield return "";
		else
		{
					
			int index = 0;
			foreach(var i in input)
			{
				foreach(var c in input.Skip(index).CombinationsWithRepetition(length-1))
					yield return i.ToString() + c;
				index ++;
			}
		}
	}

}

Open in new window


dotNetfiddle here:
https://dotnetfiddle.net/1LIVTq

@Kyle: Your last "tweak" is actually nothing else what I posted aka a clone!

@Alex, the approach is the same - but they are different (try yours with a count of 0).  

FWIW my answer was completed without looking at your solution.  Became aware of the problem from the OPs other question:
https://www.experts-exchange.com/questions/29256789/Need-help-understanding-the-logic-in-the-code-herein.html

and jumped straight into the .Net Fiddle to figure it out.

Hence why the tweak is an adjustment to the code I had posted as the original accepted solution.



The C# solution rewritten directly from the Python above (can probably be improved, I am not that good in C#:

using System;
using System.Collections.Generic;

namespace e // Note: actual namespace depends on the project name.
{
    class Program
    {
        private static int stepCnt = 0; // only for debugging -- comment it out

        static public IEnumerable<List<int>> myCombinations(IEnumerable<int> sortedElements, Dictionary<int, List<int>> sublists, int length)
        {
            //Console.WriteLine($"step: {++stepCnt}");  // only for debugging -- comment it out

            if (length <= 0)
            {
                yield return new List<int>();
            }
            else
            {
                foreach (int e in sortedElements)
                {
                    List<int> otherElements = sublists[e];
                    foreach (List<int> lst in myCombinations(otherElements, sublists, length - 1))
                    {
                        List<int> result = new List<int> { e }; // the single-element list
                        result.AddRange(lst);
                        yield return result;
                    }
                }
            }
        }

        static void Main(string[] args)
        {
            // The input elements in whatever order -- a set is formed, the duplicate values
            // are removed.
            var elements = new HashSet<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

            // The sorted list of unique elements.
            var arr = elements.ToArray();
            Array.Sort(arr);
            List<int> sortedElements = arr.ToList<int>();

            // Calculate the sublists for each element (n loops where n is equal
            // to the number of elements).
            var sublists = new Dictionary<int, List<int>>();
            for (int i = 0; i < arr.Length; ++i)
                sublists[arr[i]] = new List<int>(arr.Skip(i));

            // Start the recursive call with the sortedElements, pass the auxiliary
            // dictionary with the precomputed sublists of elements for building the tail.
            int solutionNo = 0;     // just to display
            foreach (var lst in myCombinations(sortedElements, sublists, sortedElements.Count))
            {
                Console.Write($"Solution {++solutionNo}: [");
                foreach (int e in lst)
                    Console.Write($"{e}, ");
                Console.WriteLine("]");
            }
        }
    }
}

Open in new window

The correct observation by Alex is that the "special case" can just follow the original order of elements. So, one only have to be careful to use unique set of input elements. The sublists need not to be precalculated, because it is easy to get the slice also in C# (the .Skip().
Hello folks... there a lot of confusion here. YES, you are all correct. Kyles link produces PERMUTATIONS. But that's OK... because as I said in an earlier post, I then pass those permutations to a HashSet, which only allows ONE permutation of the set. So, {1,2} is allowed, but not {2,1}, because {2,1} is a permutation of {1,2}.

So, the code works to the point where it will produce the permutations in a much more efficient manner than the "brute-force" method I was attempting at first, which is where Kyle helped me. I don't believe Kyle ever intended to imply that the code would produce only unique "combinations", which are groupings without any permutations.

HOWEVER, there is an even deeper problem here, where the code does NOT work, that I'm not sure any of you have picked up on. -- Since the code produced "strings" it cannot do a two digit integer, such as 10 or 11 or 23, it treats them as a Char  you end up with 0 or 1 or 3 for the previous examples, so that is where the code REALLY fails.

I will review all your comments in deeper detail tonight and reply according if I have questions, but if you wish to give the "double digit" integer issue a try, maybe go at that. I'd be interested in your comments.

Thank you for your comments.

The problem with permutations is not that you can later pick the candidates through HashSet. The problem is the quantity of candidates. Well, it depends on how many unique values you have to form the results.

For the two-digit integers, see the Python solution or my later C#. However, the Alex and Kyle solutions are more C#'ish, better, using simpler interface. So, the best would be to adopt their solution with replacing the strings by List<int> -- as shown in my solution.

Since the code produced "strings" it cannot do a two digit integer, such as 10 or 11 or 23, it treats them as a Char  you end up with 0 or 1 or 3 for the previous examples, so that is where the code REALLY fails. 

My snippet should work with those values, too ;-)


using System;
using System.Collections; using System.Collections.Generic; using System.Linq; using System.Diagnostics; namespace Playground {     static class Program     {         static IEnumerable<IEnumerable<T>> GetPermutations<T>(this IEnumerable<T> items, int count)         {             int i = 0;             foreach (var item in items)             {                 if (count == 1)                     yield return new T[] { item };                 else                 {                     foreach (var result in GetPermutations(items.Skip(i), count - 1))                         yield return new T[] { item }.Concat(result);                 }                         ++i;             }         }         public static void Main(string[] args)         {             var list = new List<string> { "10", "11" , "23" };             var result = GetPermutations(list, 3);             foreach (var perm in result)             {                 foreach (var c in perm)                 {                     Debug.Write(c + " ");                 }             Debug.WriteLine("");             }                     }     } }

Open in new window


Hello ALL, very good work by all of you. I had a bit more time after dinner to review all your comments in more detail and I have some additional comments below.

1). I HAVE to use C#, I know a little bit of Python, but I am using the Unity3D platform for my project and Unity3D only allows for C# and not Python. So, C# is my only option.

2). You are all correct. The original post by Kyle produced Permutations, but that's OK. I said to Kyle earlier... I can work with those to get to "All the possible combinations."

3). I'm not sure why Pepr stated
The problem with permutations is not that you can later pick the candidates through HashSet. The problem is the quantity of candidates.
I don't see how a permutation is not just a re-ordered combination. I also don't see why a HashSet can't simply strip out all the "combinations" and exclude all the "permutations"... unless I'm missing something, which I hope you can explain Pepr, I will continue to consider a permutation a re-ordered combination (i.e. {1,2} and {2,1}) They contains the same elements, just in a different order and that is where the HashSet plays its role.

4). As I mentioned earlier, the original code that Kyle found online didn't work for integers above 9. So, I worked for 1 through 9, but began truncating the forward part of the double digit integers, because of the Char issue.... (as a side note, I don't understand why someone would use <strings> to work on <ints>, but to each their own)...  I see that a couple of you worked out that glitch and I am eager to try out your code.

5). In the end, YES... what I want is "ALL the possible combinations" with no duplicate sets or duplicate sets (i.e. {2,2,2,2} and {2,2,2,2}. However, my approach was - if I get ALL the permutations and then filter out the combinations (the re-ordered) sets I guarantee myself a positive result, meaning a correct result. I used the mathematical Factorial N-Choose-R formula to make sure my coded results matched the mathematically calculated result - and it did in terms of end-values (i.e 10-Choose-4 = 715) and so on.

So, in the end you have ALL been very instructive and I thank you ALL who contributed with such hard work. I don't take your efforts lightly. I think folks like YOU are the ones that make sites like EE such valuable placed to learn and share information.

In closing, I would like to say thank you for having in there with me and helping me with my project -- which is NOT finished yet, so I may have some followup questions, which I will post under separate cover.  I hope to see you all again for such a lively discussion.


Ad 3) Think about a situation when the elements are numerals and you use 10 of them -- simply 0, 1,  2, 3, 4, 5, 6, 7, 8, 9. When generating permutations of 10 elements according the requirement, you can think of them as about all numbers 0000000000, 0000000001, ..., 9999999999. That is 10^10 results. It is a lot. This was for illustration. Say, each result is implemented as one list. Then you use HashSet to remove the order. Actually, there is 92378 solutions and you have to keep them in some structure to have them when removing duplicates.

When using the algorithm that does not generate duplicates, you need only 184756 steps to have the same result. And that is much less than 10^10. This way it runs about 54 000 faster than the naive solution.

If you use only few elements (say 3), you cannot see the difference. (The difference depends on the size of the problem -- the number of elements.) You may think that the simpler solution is better -- and it can be. So, you have to decide based on other requirements.
Hello Pepr, ...hmmm interesting!  OK, I see your point. In my scenario, I do not use 0. However, I can use 1 - 10 or more. So, your point may still apply.  For the most part, I would be considering "usually" less than 10 items, but I don't know if someone will want to try 25 one day and that would cause huge number crunching. So, I will keep your concept in mind.

Thank you for getting back to me with your observations.