asked on
Find ALL possible combinations of the elements in an Array.
Hello, I am trying to determine an elegant and efficient manner to find all the possible combinations of the elements in an array of Integers.
I want to do this through code to produce the sets. I do not want to use the Combinations formula, which uses Factorials. I know how to do that calculation, but what I need is to produce the actual combinations or "sets" themselves and be able to visually see the difference, not just know a final result amount. I use the formula to know if my result is correct or not.
The main rule is that combinations, or "sets" cannot be in a different order. Thus {1, 2} and {2, 1} are the same set, but in a different order. Its not a new combination, its simply a set "ordered in a different way."
Lets assume myArray = { 1, 2 }
There are only 3 possible combinations of the array's elements. Those being: {1, 1} , {1,2} and {2,2}.
If myArray = {1,2,3} there are 10 possible combinations. Those being : {1, 1, 1} , {1, 1, 3}. {2, 2, 3}, {1, 2, 2}, {1, 3, 3}, {1, 2, 3}, {3, 3, 3}, {2, 3, 3}, {1, 1, 2} and {2, 2, 2}.
And so on...
I currently have some code that simply takes each element of the array and produces random combinations. Thus, I simply iterate through a FOR Loop and keep adding the random combinations to a HashSet, which of course, accepts only unique / discreet combinations and ultimately, through somewhat of a "brute-force" approach, I arrive at a maximum number of possible combinations,. It works, but it's a bad approach.
There are two main problems with my approach... One, the more elements I add to the array the longer it takes, because even when I arrive at "the maximum", I am still producing random combination to fulfill the requirements of the FOR Loop. Secondly, another downside is, the number of iterations though the loop is also nothing more than a guess. The larger the number of elements in the array, the more I have to increase my repetitions (or guess) (i.e. 100, 1,000, 10,000, 1,000,000, etc). This is extremely time-consuming, because even if I found the maximum at 889 repetitions, I still have to go all the way through to - say 10,000 iterations. If nothing else, its inefficient. Its also ineffective, because if I guess too low at the number of repetitions, I may miss out on other possible combinations. - Again... its a very bad approach.
There has to be a more elegant and efficient manner to find all possible combinations of a series of number (integers).
If anyone here know of a way, please share. It would be greatly appreciated.
Thank you.
ASKER
I want to do this through code to produce the sets. I do not want to use the Combinations formula, which uses Factorials. I know how to do that calculation, but what I need is to produce the actual combinations or "sets" themselves and be able to visually see the difference, not just know a final result amount. I use the formula to know if my result is correct or not.
I know how to find the combinations result - the final number. What I need to do is to produce each combination. As in my examples (i.e. If myArray = {1,2,3} there are 10 possible combinations. Those being : {1, 1, 1} , {1, 1, 3}. {2, 2, 3}, {1, 2, 2}, {1, 3, 3}, {1, 2, 3}, {3, 3, 3}, {2, 3, 3}, {1, 1, 2} and {2, 2, 2}.)
Thank you anyway!!!
Lets assume myArray = { 1, 2 } There are only 3 possible combinations of the array's elements. Those being: {1, 1} , {1,2} and {2,2}.Not really.
As long as you want combinations. What you describe are combinations with repetition. (( n over k)) possible sets, where as n is the number of elements in A and k is the number of elements per subset.
Which brings us back to not really. All combinations with repetition are {},{1}, {2}, {1,1}, {1,2}, {2,2} from your array.
But let us assume you're after a certain k-subset type, then you construct your sets by creating a tree by recursion, e.g. ((4 over 2)):
Pseudo code:
1) Create an index for each slot:
Let's say a list<int> called counter where counter[0] = first slot, counter[1] = second slot etc. Where the size of the list is the number of slots you need.
2) Init the indexes of all to 0.
3) The combination is the list of each of the numbers at their current index. so
set[counter[0]].ToString() + set[counter[1]].ToString() + ....
4) Add one to the right most counter , then:
if (counter[index] == the set size)
{
counter[index] = 0;
if (index > 0)
{
counter[index-1]++;
//recheck index-1 for the same condition.
//After you are done incrementing to the left as needed.
//only go back to the numbers that are the same or below the number you are currently on.
//eg: 1 needs to process 1,2,3,4. 2 only needs to process 2,3,4
counter[index] = counter[0];
}
else
//complete.
}
Hope that helps, take a look and let me know.
ASKER
All combinations with repetition are {},{1}, {2}, {1,1}, {1,2}, {2,2} from your array.. ----> I only want {1,2}, {1.1} and {2,2}.
No... I want to include ALL the elements in the combinations. And yes, I want combinations, which is a N-Choose-R type of problem with NO duplicate sets (i.e. {1,2} and {2,1}), but I don't wan to just the result. I would like help with the code. How would you go about coding that, if you would please?
Thanks.
ASKER
static IEnumerable<String> CombinationsWithRepetition(this IEnumerable<int> input, int length)
{
if (length <= 0)
yield return "";
else
{
foreach(var i in input)
foreach(var c in input.CombinationsWithRepetition(length-1))
yield return i.ToString() + c;
}
}
Usage:
var myArray = new int[]{1,2,3};
foreach (var c in myArray.CombinationsWithRepetition(3))
Console.WriteLine (c);
ASKER
I agree with ste5an and pepr! Kyle's solution does NOT work the way you want it!
How about this one instead:
static IEnumerable<IEnumerable<T>> GetPermutations<T>(this IEnumerable<T> items, int count)
{
int i = 0;
foreach (var item in items)
{
if (count == 1)
yield return new T[] { item };
else
{
foreach (var result in GetPermutations(items.Skip(i), count - 1))
yield return new T[] { item }.Concat(result);
}
++i;
}
}
Tested with:
var list = new List<string> { "1", "2" , "3" };
var result = GetPermutations(list, 3);
foreach (var perm in result)
{
foreach (var c in perm)
{
Debug.Write(c + " ");
}
Debug.WriteLine("");
}
step_cnt = 0 # only for debugging -- comment it out
def my_combinations(sorted_elements, ge_elements, length):
global step_cnt # only for debugging -- comment it out
step_cnt += 1 # only for debugging -- comment it out
print('step:', step_cnt) # only for debugging -- comment it out
if length <= 0:
yield [];
else:
for e in sorted_elements:
building_elements = ge_elements[e]
for lst in my_combinations(building_elements, ge_elements, length-1):
yield [e] + lst;
if __name__ == '__main__':
# The input elements in whatever order -- a set is formed, the duplicate values
# are removed.
##elements = { 3, 2, 1, 5, 7, 1, 1, 1 }
elements = { 2, 1 }
# The sorted list of unique elements.
sorted_elements = sorted(elements)
# Calculate the sublists for each element (n loops where n is equqle
# to the number of elements).
ge_elements = {}
for e in sorted_elements:
ge_elements[e] = [x for x in sorted_elements if x >= e]
# Start the recursive call with the sorted_elements, pass the auxiliary
# dictionary with the precomputed sublists of elements for building the tail.
solution_no = 0 # just to display
for lst in my_combinations(sorted_elements, ge_elements, len(elements)):
solution_no += 1
print(f'Solution {solution_no}: {lst}')
For the {1, 2} it prints:
e:\__Python\Mr_Fulano\ee29256746>py c.py
step: 1
step: 2
step: 3
Solution 1: [1, 1]
step: 4
Solution 2: [1, 2]
step: 5
step: 6
Solution 3: [2, 2]
For the set of elements = { 2, 1, 4, 5 } it prints:
e:\__Python\Mr_Fulano\ee29256746>py c.py
step: 1
step: 2
step: 3
step: 4
step: 5
Solution 1: [1, 1, 1, 1]
step: 6
Solution 2: [1, 1, 1, 2]
step: 7
Solution 3: [1, 1, 1, 4]
step: 8
Solution 4: [1, 1, 1, 5]
step: 9
step: 10
Solution 5: [1, 1, 2, 2]
step: 11
Solution 6: [1, 1, 2, 4]
step: 12
Solution 7: [1, 1, 2, 5]
step: 13
step: 14
Solution 8: [1, 1, 4, 4]
step: 15
Solution 9: [1, 1, 4, 5]
step: 16
step: 17
Solution 10: [1, 1, 5, 5]
step: 18
step: 19
...
Solution 33: [4, 4, 5, 5]
step: 64
step: 65
step: 66
Solution 34: [4, 5, 5, 5]
step: 67
step: 68
step: 69
step: 70
Solution 35: [5, 5, 5, 5]
The key idea is that you want to eliminate the cases that differ only in the order of elements. In other words, you can pick one special case of those that are considered to be equal. And one of them that is easy to understand, to check... is the one where the elements are sorted.
So, you can prepare the list of sorted elements that is basically sorted sequence of all elements from the input set. (You really do not want to repeat the values in the input; so, defining the set of elements you get rid of duplicated elements. Then you create the sorted list of unique elements.)
Now, the number of steps when generating the cases is reduced by the extra condition, that the elements in the result must be sorted. It means, that the partial result (the prefix) is also sorted. Moreover, all elements that can be used to build the tail of the resulting list consist only of elements that are greater or equal to the last element of the partial solution.
Having the sorted list of all elements, I can pre-compute the dictionary of sublists for each element as a key. (Say, having 4 unique element, I get 4 sublists.) And only the sublist is used in the next recursive call. In other words, the tree of all candidates is pruned early a lot. For example, once I start with the biggest element, the only way to finish the case is to repeat that element to the wanted length.
Update: The precalculation of dictionary of sublists can be improved...
# Calculate the sublists for each element (n loops where n is equqle
# to the number of elements).
ge_elements = {}
for i, e in enumerate(sorted_elements):
ge_elements[e] = sorted_elements[i:]
For the ten elements in the input you get 92378 solutions, and to find all you need 184756 steps.
Solution 92376: [9, 9, 10, 10, 10, 10, 10, 10, 10, 10]
step: 184738
step: 184739
step: 184740
step: 184741
step: 184742
step: 184743
step: 184744
step: 184745
step: 184746
Solution 92377: [9, 10, 10, 10, 10, 10, 10, 10, 10, 10]
step: 184747
step: 184748
step: 184749
step: 184750
step: 184751
step: 184752
step: 184753
step: 184754
step: 184755
step: 184756
Solution 92378: [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
Needed a minor tweak. Below is the corrected code.
public static IEnumerable<String> CombinationsWithRepetition(this IEnumerable<int> input, int length)
{
if (length <= 0)
yield return "";
else
{
int index = 0;
foreach(var i in input)
{
foreach(var c in input.Skip(index).CombinationsWithRepetition(length-1))
yield return i.ToString() + c;
index ++;
}
}
}
}
dotNetfiddle here:
https://dotnetfiddle.net/1LIVTq
@Kyle: Your last "tweak" is actually nothing else what I posted aka a clone!
FWIW my answer was completed without looking at your solution. Became aware of the problem from the OPs other question:
https://www.experts-exchange.com/questions/29256789/Need-help-understanding-the-logic-in-the-code-herein.html
and jumped straight into the .Net Fiddle to figure it out.
Hence why the tweak is an adjustment to the code I had posted as the original accepted solution.
using System;
using System.Collections.Generic;
namespace e // Note: actual namespace depends on the project name.
{
class Program
{
private static int stepCnt = 0; // only for debugging -- comment it out
static public IEnumerable<List<int>> myCombinations(IEnumerable<int> sortedElements, Dictionary<int, List<int>> sublists, int length)
{
//Console.WriteLine($"step: {++stepCnt}"); // only for debugging -- comment it out
if (length <= 0)
{
yield return new List<int>();
}
else
{
foreach (int e in sortedElements)
{
List<int> otherElements = sublists[e];
foreach (List<int> lst in myCombinations(otherElements, sublists, length - 1))
{
List<int> result = new List<int> { e }; // the single-element list
result.AddRange(lst);
yield return result;
}
}
}
}
static void Main(string[] args)
{
// The input elements in whatever order -- a set is formed, the duplicate values
// are removed.
var elements = new HashSet<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
// The sorted list of unique elements.
var arr = elements.ToArray();
Array.Sort(arr);
List<int> sortedElements = arr.ToList<int>();
// Calculate the sublists for each element (n loops where n is equal
// to the number of elements).
var sublists = new Dictionary<int, List<int>>();
for (int i = 0; i < arr.Length; ++i)
sublists[arr[i]] = new List<int>(arr.Skip(i));
// Start the recursive call with the sortedElements, pass the auxiliary
// dictionary with the precomputed sublists of elements for building the tail.
int solutionNo = 0; // just to display
foreach (var lst in myCombinations(sortedElements, sublists, sortedElements.Count))
{
Console.Write($"Solution {++solutionNo}: [");
foreach (int e in lst)
Console.Write($"{e}, ");
Console.WriteLine("]");
}
}
}
}
ASKER
So, the code works to the point where it will produce the permutations in a much more efficient manner than the "brute-force" method I was attempting at first, which is where Kyle helped me. I don't believe Kyle ever intended to imply that the code would produce only unique "combinations", which are groupings without any permutations.
HOWEVER, there is an even deeper problem here, where the code does NOT work, that I'm not sure any of you have picked up on. -- Since the code produced "strings" it cannot do a two digit integer, such as 10 or 11 or 23, it treats them as a Char you end up with 0 or 1 or 3 for the previous examples, so that is where the code REALLY fails.
I will review all your comments in deeper detail tonight and reply according if I have questions, but if you wish to give the "double digit" integer issue a try, maybe go at that. I'd be interested in your comments.
Thank you for your comments.
For the two-digit integers, see the Python solution or my later C#. However, the Alex and Kyle solutions are more C#'ish, better, using simpler interface. So, the best would be to adopt their solution with replacing the strings by List<int> -- as shown in my solution.
Since the code produced "strings" it cannot do a two digit integer, such as 10 or 11 or 23, it treats them as a Char you end up with 0 or 1 or 3 for the previous examples, so that is where the code REALLY fails.
My snippet should work with those values, too ;-)
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Diagnostics;
namespace Playground
{
static class Program
{
static IEnumerable<IEnumerable<T>> GetPermutations<T>(this IEnumerable<T> items, int count)
{
int i = 0;
foreach (var item in items)
{
if (count == 1)
yield return new T[] { item };
else
{
foreach (var result in GetPermutations(items.Skip(i), count - 1))
yield return new T[] { item }.Concat(result);
}
++i;
}
}
public static void Main(string[] args)
{
var list = new List<string> { "10", "11" , "23" };
var result = GetPermutations(list, 3);
foreach (var perm in result)
{
foreach (var c in perm)
{
Debug.Write(c + " ");
}
Debug.WriteLine("");
}
}
}
}
ASKER
1). I HAVE to use C#, I know a little bit of Python, but I am using the Unity3D platform for my project and Unity3D only allows for C# and not Python. So, C# is my only option.
2). You are all correct. The original post by Kyle produced Permutations, but that's OK. I said to Kyle earlier... I can work with those to get to "All the possible combinations."
3). I'm not sure why Pepr stated
The problem with permutations is not that you can later pick the candidates through HashSet. The problem is the quantity of candidates.I don't see how a permutation is not just a re-ordered combination. I also don't see why a HashSet can't simply strip out all the "combinations" and exclude all the "permutations"... unless I'm missing something, which I hope you can explain Pepr, I will continue to consider a permutation a re-ordered combination (i.e. {1,2} and {2,1}) They contains the same elements, just in a different order and that is where the HashSet plays its role.
4). As I mentioned earlier, the original code that Kyle found online didn't work for integers above 9. So, I worked for 1 through 9, but began truncating the forward part of the double digit integers, because of the Char issue.... (as a side note, I don't understand why someone would use <strings> to work on <ints>, but to each their own)... I see that a couple of you worked out that glitch and I am eager to try out your code.
5). In the end, YES... what I want is "ALL the possible combinations" with no duplicate sets or duplicate sets (i.e. {2,2,2,2} and {2,2,2,2}. However, my approach was - if I get ALL the permutations and then filter out the combinations (the re-ordered) sets I guarantee myself a positive result, meaning a correct result. I used the mathematical Factorial N-Choose-R formula to make sure my coded results matched the mathematically calculated result - and it did in terms of end-values (i.e 10-Choose-4 = 715) and so on.
So, in the end you have ALL been very instructive and I thank you ALL who contributed with such hard work. I don't take your efforts lightly. I think folks like YOU are the ones that make sites like EE such valuable placed to learn and share information.
In closing, I would like to say thank you for having in there with me and helping me with my project -- which is NOT finished yet, so I may have some followup questions, which I will post under separate cover. I hope to see you all again for such a lively discussion.
When using the algorithm that does not generate duplicates, you need only 184756 steps to have the same result. And that is much less than 10^10. This way it runs about 54 000 faster than the naive solution.
If you use only few elements (say 3), you cannot see the difference. (The difference depends on the size of the problem -- the number of elements.) You may think that the simpler solution is better -- and it can be. So, you have to decide based on other requirements.
ASKER
Thank you for getting back to me with your observations.
https://www.cs.sfu.ca/~ggbaker/zju/math/perm-comb-more.html#:~:text=If%20we%20are%20selecting%20an,1)%20ways%20to%20do%20so.
Open in new window
N = number of items in your set.R = Total slots in the combination.
(3 + 3 - 1)! / 3!(3-1)!
5! / 3! (2)!
5*4*3*2 / 3*2*2
120 / 12 = 10.