Link to home
Start Free TrialLog in
Avatar of kostermw
kostermw

asked on

Need an algorithm to create a so called factorial design (not permutations, but sort of combinations)

Imagine having a number of factors (e.g. X,Y,Z) each having a set of values. As an example:

X1,X2,X3,X4 (factor 1 having 4 values)
Y1,Y2           (factor 2 having 2 values)
Z1,Z2,Z3      (factor 3 having 3 values)

This means we can have 4 x 2 x 3 = 24 different combinations. To give the first 6:
X1,Y1,Z1
X1,Y2,Z1
X1,Y1,Z2
X1,Y2,Z2
X1,Y1,Z3
X1,Y2,Z3 etc.etc.

Also Z1,X1,Y1 is regarded the same as X1,Y1,Z1 thus I am looking for unique combinations (I think that this is called 'no permutations')

Now, I tried to devise a generic algorithm that creates the full design set. So the number of Factors and Values is differntly every time. It seemed simple at first glance, but I can't get a grip on it. There's plenty of (statistical and scientific)  background material on the subject, but I cannot find a practical example anywhere ...

Can anyone give me a start or a link to a good example?
Avatar of kostermw
kostermw

ASKER

Hi sunnycoder,

As you can read NO PERMUTATIONS
From what I understand you don't want to genrate permutation at all (in fact you consider differnet permutation of same factor as identical).
So, just to udnerstand the problem better, why simple nested loops are not satisfactory to you?

for x=1 to 4
  for y=1 to 2
    for z=1 to 3
      print x,y,z
    endfor
  endfor
endfor

Or supposing you have a variable number n of factors, and, say, cardinality[0..n-1] express how many values are possible for each factor,  you can do the following.
Initialize a vector factor[] of size n to 0 (assuming you count values starting form 0 to cardinality[i]-1)

do
  print array factor
  i=0
  do
    factor[i]++
    bool carry=(factor[i]==cardinaliy[i])
    if(carry) factor[i++]=0
  loop while (carry and i<n)
loop while not carry
   
 
>> Can anyone give me a start or a link to a good example?

What language are you working in?

Idle_Mind
Dear Ibertacco,

In your example the iterations indeed have a fixed number of factors wheraeas in my problem, both the number of factors and the number of values is variable. Thus thus it would require a construct with a variable number of loops and the loopcounter is variable too.

The latter will definitely cause "late binding" errors, as you cannot have a variable as a loop-counter e.g.

dim a, b as integer

a = Cint(textbox1.text)   'user input
for a = 1 to b
  print a*b
next a
Dear Idle_Mind

My language of preference is Visual Basic.Net, but any language (even plain English) will take me closer to a solution. Pseudo code would also be very usefull here!
kostermw, my second example should work fine. Why don't you think so?
In this case I just start taking the first value of each factor (the "Initialize a vector factor[]  of size n to 0" part), then for each loop iteration, I move to the next value of the first factor. When I complete the first factor, I move to the enxt value of the second factor and start again with the first value for the first factor. And so on. What do you mean with "late binding errors" ?

Myabe my pseudo code is not clear enough?

As a side question: what do you want to print/generate? Just the product of all the values?
kostermw,

Do you only pick one value from each factor for each combination?

Idle_Mind
Here's some very simple code to show how you might do it with arbitrary vectors of terms.  This one uses a fixed array, so you'd have to change the fixed dimension array to a vector of vectors.  The only change would be to the first parameter of "ExpandTerms".  Also, I used strings for illustration.  You could easily use numeric values instead.  To experiment, change the array line to:

        Dim Factors(,) As String = {{"I1", "I2", "I3"}, {"J1", "J2", "J3"}}

to see the difference in the output.

-----------------------------------------
 
   Private Sub Terms()
        '  This is just the driver for "ExpandTerms" which does the real work.
        Dim Factors(,) As String = {{"I1", "I2"}, {"J1", "J2"}, {"K1", "K2"}, {"L1", "L2"}}
        Dim term As String = New String("")
        ExpandTerms(Factors, term, 0)
    End Sub

    Private Sub ExpandTerms(ByVal factors(,) As String, ByVal term As String, ByVal level As Integer)
        If level <= factors.GetUpperBound(0) Then
            For i As Integer = 0 To factors.GetUpperBound(1)
                Dim nextTerm As String = term + "*" + factors(level, i)
                ExpandTerms(factors, nextTerm, level + 1)
            Next
        Else
                System.Console.WriteLine(term)
        End If
    End Sub
SOLUTION
Avatar of drichards
drichards

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
lbertacco,

Im looking into your second example, and indeed do not fully understand your (pseudo) code yet. It's probably the syntax. Also I do not now the word Cardinality (I am Dutch) Can you elaborate a bit on it?

Late binding errors occur if you compile VB code with option strict on, which simply disallows late binding. The compiler wants to now the upper and lower bound values of the loop variable in advance.

for x = [lowbound] to [highbound]
next

but this can be solved by either setting option strict off, or using a do/loop construct which allows to use a comparison with variables

do
loop until [lowbound] = [highbound]

>>As a side question: what do you want to print/generate? Just the product of all the values?
Yes, the product. These are so called (factorial) experimental designs. E.g you have factor Temperature (40,60) and factor Pressure (100,110,120). Then you do experiments using the design and measure e.g. Rinse Effect.

Exp1= Temp 40, Pressure 100
Exp2= Temp 60, Pressure 100
Exp3= Temp 40, Pressure 110 etc. etc.

My experiment sets are however a bit bigger (hundreds of experiments) because having many factors with a few values (it is factorial!)



ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
drichards,

Your code looks promising. I need to adapt to your coding style (declarations mixed with statements ...)

2 comments for now:
-This uses recursion. Dont' say it is bad, but it may be risky giving the sizes of my collections (overflow)
-Late binding (another thing I try to avoid)

My first reaction is that this code will achieve what I want. Also instead of arrays, I could use ArrayLists which have a dynamic size by design
kostermw,
2 things:
a) cardinality means "number of elements". So if e.g. our second factor can have two values (say 40 and 100) then  it has cardinality = 2
b) My algorithm, as I presented it,  doesn't print e.g. "Exp2= Temp 60, Pressure 100" but only "Combination 2: 1, 0" meaning take the second value (index 1) for the first factor , and the first value (index 0) for the second factor. Of course you can change it to use these indices to take values from arrays and obtain output you want.

The difference between my algorithm and drichards is that mine is itertative while his is recursive. In general terms, recursive solutions are usually simpler while iterative are faster.
If you just order found the factors smallest to largest.
e.g.
5 * 3 * 2 * 4  would go to:  2 * 3 * 4 * 5

Do a sort on all the factors
e.g.
1 * 3 * 5
2 * 3 * 4 * 5
2 * 3 * 4 * 5
2 * 3 * 4 * 5
4 * 5 * 6

Remove duplicates.
1 * 3 * 5
2 * 3 * 4 * 5
4 * 5 * 6

You have te result you want
Ibertacco, Drichards

Well, Drichards solution is pretty elegant, but indeed runs into memory trouble. E.g. 10 factors with only 3 values for each factor gives about 60,000 combinations (3^10). Add an extra value and it goes bang! The recursion is so memory expensive that it causes out-of-memory pretty quickly (have 1 Gig of [virtual] memory and it runs out). Nevertheless, the algorithm does what it must do!

I haven't looked into Ibertacco's iterative example yet, but will come back on that tomorrow soon, as I have to go now. If it fits my problem he will earn the points, but I will also reward Drichards with some points because his code is very instructive and inspiring.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
This function will get the prime factors of any number. There is only one combination of prime factors for any number.
Once you have the prime factors other factors can be found by just multiplying some of the prime factors together.
JR

Private Function PrimeFactor(ByVal iNumberToFactor As Long) As String
   
    Dim i As Long
    PrimeFactor = iNumberToFactor 'Initialise
    For i = 2 To Fix(Sqr(iNumberToFactor))
        If iNumberToFactor Mod i = 0 Then
            iNumberToFactor = iNumberToFactor / i
            PrimeFactor = i & " * " & PrimeFactor(iNumberToFactor)
            Exit For
        End If        
    Next i
   
End Function
Ibertacco,
Mind boggling at first, your approach is slick and ingenious. The code is indeed not self explaining (as Idle_Mind already observed). However, I do not understand his remark about you counting back.

For me it took a while to grasp that, i = index to the factor, and factor(i) = index to value of that factor. Once I understood that indexing mechanism, it was clear that this code is versatile. And it is fast!

Idle_Mind's solution is a little less usable for me, as it assumes that the factor values are numeric. Nevertheless, I see a few clever ideas in his (neat) code which can be of use.

JR2003,
You sorting trick is not a straight forward approach but more a sort of work-around of factorial mathematics. Also I don't get what you want me to do with the PrimeFactor function.

Point rewarding:
So I like to reward Ibertacco 350 points, and give 75 to both Idle_Mind and Drichards (seems 500 is the max I can give away ...) All of you, thanks for your inspiring help!







kostermw,

I purposely wrote the FactorCombinations class so that it only returns combinations of whatever was placed in the ArrayLists contained within.  You can put anything in them, simple data types or instances of classes, and the result will be those same objects in all the possible combinations.   All the calculations are done outside the class at the form level, seperating the combination generation from the data manipulation.

I realize your acutal problem is probably much more complex than what you have presented here which is why I wrote the FactorCombinations class with simply generating the combos in mind.

Idle_Mind
Hi Idle_Mind
Agree, I jumped into conclusions too quickly. It's simply the user interface (input part) that dictated me to enter integers. Changing the type of 'value' into string bypasses this. What I like is that you have build in events. With larger sets it can take a while for the algorithm to do it's work. Then the events come in very handy!

Your solution is as good as Ibertacco's but I have a small preference for his approach. Mostly becaue I find the double loop construction with the carry so clever (and efficient).

Can only give the points away once, and thought this was a fair distribution (350,75,75). No hard feelings I hope?

Martin Koster
No hard feelings at all!

Ibertacco's method is definitely the most compact and effiecient.  Perhaps you could place his algorithm into my framework...

Just noticed your comment about the counting direction:
>> However, I do not understand his remark about you counting back. (wards)

Lets say we are couting in binary.

000 = 0
001 = 1
010 = 2
011 = 3
100 = 4
101 = 5
110 = 6
111 = 7

Notice that the rightmost digits (the least significant) change before the leftmost (most significant bits).  This is how we count normally.

Ibertacco's algorithm changes the leftmost digits before the rightmost, producing this sequence instead:

000 = 0
100 = 4
010 = 2
110 = 6
001 = 1
101 = 5
011 = 3
111 = 7

It produces the same set of combinations, just in a different order.  The resulting numbers are not technically "backards" per say, just the way in which he sequences the digits and propogates the "carry over" is backwards.

As the difference in algorithms shows, it is much simpler to count as Ibertacco has done.  His method is completely counter-intuitive to everything we are taught in grade school about what each digit in a number represents however.  This is why I think it was so hard to understand the code at first.

What planet did you learn to count on Ibertacco?    =)

Regards,

Idle_Mind
kostermw
I think I was confused by the term "factor", to me a factor is a whole number that divides into another number leaving no remainder.
I'm still not really the wiser as to what your "factors" mean. Is it some branch of mathematics?
JR