Link to home
Start Free TrialLog in
Avatar of mloricha
mloricha

asked on

Calculate unique combinations

I have a table of alpha/numeric identifiers (eg ABCD123, EFGH5678 ...).  I want to find all unique combinations of these identifiers if only four identifiers can be used in each combination.  
Avatar of imitchie
imitchie
Flag of New Zealand image

Is this a mathematical question? You should add it to that zone.
Also, do you just want to know how many identifiers you can accommodate?
We start with the fact that you must have 1 Aplha, 1 Numeric at least, so that's 26x10.  Now, we can add either a blank or another alpha to the left, or blanks or numeric to the right. So, the total unique ids is

 27x27x27x26x10x11x11x11
Avatar of mloricha
mloricha

ASKER

Hi imitchie,

Sorry, I probably didn't make the question clear.  I need a way to generate all possible UNIQUE combinations in sets of four from a list I have (preferably using MS Access or else Excel).  For example, I know that finding unique combinations of two letters from a choice of ABC will give:  AB, AC and BC.  In my case, I want to find all unique combinations, for example, of four letters from a set of 26 letters.
By unique, do you mean AC is same as CA?
ASKER CERTIFIED SOLUTION
Avatar of byundt
byundt
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
There are 14950 combinations wherein no letter is repeated within a code, and
CAT = TAC = ACT, etc
To check on first, a loop is set up to randomly select a number between 65 and 90, and the character equivalent of that number is checked to see if it in the existing string.  If so, don't increment counter.  otherwise, increment count (of letters in string) and add the letter to the string.
The unique combination of 4 letters is accomplished by calculating a CRC for each letter, and then adding the 4 CRCs together, and placing that value into an indexed field with no dupes.

I created a table (e in this instance) with fields Identifier(text) and newCRC(double)

and then run the following code from a module:
Function buildId()
On Error Resume Next

Dim rs As DAO.Recordset
Dim ident As String
Dim ltr As Integer
Dim i As Integer
Dim j As Integer
Dim vcrc As Double
Set rs = CurrentDb.OpenRecordset("e")
Do Until DCount("*", "e") = 14951
i = 1
Do Until i = 5
Randomize
ltr = Int(((90 - 65 + 1) * Rnd) + 65)
'Debug.Print ltr
If InStr(ident, Chr(ltr)) > 0 Then
i = i
Else
i = i + 1
vcrc = vcrc + CalcCRC32(Chr(ltr))
ident = ident & Chr(ltr)
End If
Loop
rs.AddNew
rs!identifier = ident
rs!newCRC = vcrc
rs.Update
ident = ""
vcrc = 0
i = 1
j = j + 1
Loop
End Function

Function ltrs()
For i = 65 To 90
Debug.Print Chr(i) & "/" & CalcCRC32(Chr(i))
Next i
Debug.Print "-----------------------"
End Function

Function factorial2()
Dim x As Double
Dim y As Double
For i = 1 To 3
x = x * (i * (i + 1))
Debug.Print i + 1 & "/" & x & "/" & i * (i + 1)
Next i
Debug.Print x

End Function

You will need this function also.
Public Function CalcCRC32(str As String) As Long
Dim i As Long
Dim j As Long
Dim Limit As Long
Dim CRC As Long
Dim Temp1 As Long
Dim Temp2 As Long
Dim CRCTable(0 To 255) As Long
 
  Limit = &HEDB88320
  For i = 0 To 255
    CRC = i
    For j = 8 To 1 Step -1
      If CRC < 0 Then
        Temp1 = CRC And &H7FFFFFFF
        Temp1 = Temp1 \ 2
        Temp1 = Temp1 Or &H40000000
      Else
        Temp1 = CRC \ 2
      End If
      If CRC And 1 Then
        CRC = Temp1 Xor Limit
      Else
        CRC = Temp1
      End If
    Next j
    CRCTable(i) = CRC
  Next i
  Limit = Len(str) 'UBound(ByteArray)
  CRC = -1
  For i = 1 To Limit
    If CRC < 0 Then
      Temp1 = CRC And &H7FFFFFFF
      Temp1 = Temp1 \ 256
      Temp1 = (Temp1 Or &H800000) And &HFFFFFF
    Else
      Temp1 = (CRC \ 256) And &HFFFFFF
    End If
    Temp2 = Asc(Mid(str, i, 1)) ' ByteArray(I)   ' get the byte
    Temp2 = CRCTable((CRC Xor Temp2) And &HFF)
    CRC = Temp1 Xor Temp2
  Next i
  CRC = CRC Xor &HFFFFFFFF
  CalcCRC32 = CRC
End Function
It takes awhile to run, because fewer and fewer non-duplicates are generated for each loop.
Combination number came from
http://mathforum.org/dr.math/faq/faq.comb.perm.html

       n!
n_C_k = ----------
            k!(n - k)!

                 26!
26_C_4 = ---------
               4!(22!)

4.03291461126606E+26
---------------------
24*1.12400072777761E+21
I take that back:  It wil take a LONG while to run.  (I admit it is not well-designed, and does not take into account any niceties like building in alphabetical order, i.e. ABCD, ABCE, ABCF, etc.)
I have a table of alpha/numeric identifiers (eg ABCD123, EFGH5678 ...).  I want to find all unique combinations of these identifiers if only four identifiers can be used in each combination.  

 In my case, I want to find all unique combinations, for example, of four letters from a set of 26 letters.

So is it 4 alpha + 4 numeric, and is AB123CD considered valid?
Or is it only 4 from A-Z0-9 combined?
Absolutely BRILLIANT!!!!  Worked on first run - you've just saved me weeks of work, many thanks.  cheers, Lisa
Lisa,
In your actual problem, how many different items were in the list? How long did the code take to run? I'm curious because I had taken such a drastically different approach from jerryb30.

Thanks for the grade!
Brad
Mine took a long time. I tried a few variations, but the task of trying to find fewer and fewer combinations which did not yet exist took longer and longer.  I am thinking of another approach, since it is an interesting programming exercise, but for now it is about an hour for the conditions I stated.
jerryb30,
For 26 letters, my laptop took 55 seconds to list all 14,950 combinations. I hope Lisa wasn't looking for 26 letters and 10 digits.

Since the original question referred to "alphanumeric identifiers", I thought there might be fewer than 26 of them.

Brad
jerryb30,
And by using array transfer, I was able to generate all 14,950 combinations in about one second.

Sub UniqueCombinations()
Dim vTokens As Variant
Dim i1 As Long, i2 As Long, i3 As Long, i4 As Long, j As Long, n As Long, nCombinations As Long
Application.ScreenUpdating = False
vTokens = Array("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", _
    "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z")
'List as many characters/strings as you like
n = UBound(vTokens)
nCombinations = Application.Combin(UBound(vTokens) - LBound(vTokens) + 1, 4)
ReDim v(1 To nCombinations, 1 To 1)
For i1 = LBound(vTokens) To n - 3
    For i2 = i1 + 1 To n - 2
        For i3 = i2 + 1 To n - 1
            For i4 = i3 + 1 To n
                j = j + 1
                v(j, 1) = vTokens(i1) & vTokens(i2) & vTokens(i3) & vTokens(i4)
            Next
        Next
    Next
Next
[A1].Resize(nCombinations).Value = v
Application.ScreenUpdating = True
End Sub


Brad
Brad:
Ah, a goal!
I did admit bad design.  I once had a program (written by another) which took 48 hours to run.  Once the programmer profiled it and took another logic approach, it took about 40 minutes. The time was worth the effort. If your process only takes 55 sec for what should be a one-time event, it might not be worth it. But, if the possible combos exceed 65k (pre-Office 2007), then Access might be the only way to go. So you did come up with the same number of possible combos?
1 second. Impressive.
Jerry,
The one second surprised me, too. And that was to list everything from "abcd" to "wxyz", 14,950 values in all.

I see that you have a Genius certificate in Access, a program I barely know how to turn on. The carpenter whose only tool was a hammer, saw every problem as a nail. I am equally impaired when it comes to databases, using Excel where it works and running for shelter when it doesn't.

Brad

2 seconds

Sub gen()
    Cells(1, 1) = Now
    r = 2
    For a = 65 To 90
        For b = a To 90
            For c = b To 90
                For d = c To 90
                    Cells(r, 1) = Chr(a) & Chr(b) & Chr(c) & Chr(d)
                    r = r + 1
                Next d
            Next c
        Next b
    Next a
    Cells(r, 1) = Now
End Sub
>>The carpenter whose only tool was a hammer, saw every problem as a nail.
I just heard this adage this year, and now have learned its application. As with you, I am a one-trick pony, but can hopefully learn. Interesting exchange.
imitchie,
Won't your algorithm produce results like "AAAA" and "ZZZZ"?

Also, your computer/version of Excel is much faster than mine. Using Excel 2003 and a Centrion P1500, your code took 137 seconds for me. Are you using Excel 2007?
Brad
Yup, to all questions! I specifically didn't do
for b = a+1 to 90
etc
just to produce the entire range.
And yes, 2007 is faster because I didn't have to turn off ScreenUpdates explicitly.. among other things
imitchie,
I can match your 2 seconds when I use Excel 2007. That's quite an improvement just by changing Excel versions. I may just start using it.

Brad
byundt,

Hi Brad,

I had 26, 34 and 55 items in lists respectively (each item in the list followed this protocol - abcd123 - ie:  4 letters followed by 3 digits).  When I ran the macro using your code all unique combos were generated in:

26 -  1.5 secs (14,950 unique combinations)
34 - 7 secs (46,376 unique combinations)
55 - 26 seconds (341,055 unique combinations)

A quick note, I needed to be able to 'split' the four combo line across four columns (text to columns) , so added ", " & to line 12 of your code - worked like a dream.

Once again - many thanks.  You have no idea how much work you've saved me - That answer was worth thousands - not 500!!
byundt.
Brad,
below is the code for the 26 combos (illustrates the alpha/numeric I was talking about.  

cheers, Lisa.  


Sub UniqueCombinations()
Dim vTokens As Variant
Dim i1 As Long, i2 As Long, i3 As Long, i4 As Long, j As Long, n As Long
Application.ScreenUpdating = False
vTokens = Array("acct101", "bdcd123", "cded555", "dasd698", "eedf256", "ffhf555", "eedu256", "eedy256", "eedt256", "eedr256", "eede256", "eedw256", "eedq256", "eedm256", "eedfn56", "eedb256", "eedv256", "eedc256", "eedx256", "eedf256", "eedz256", "eedf251", "eedf252", "eedf253", "eedf254", "eedf255") 'List as many characters/strings as you like
n = UBound(vTokens)
For i1 = LBound(vTokens) To n - 3
    For i2 = i1 + 1 To n - 2
        For i3 = i2 + 1 To n - 1
            For i4 = i3 + 1 To n
                j = j + 1
                Cells(j, 1) = vTokens(i1) & "," & vTokens(i2) & "," & vTokens(i3) & "," & vTokens(i4)
            Next
        Next
    Next
Next
Application.ScreenUpdating = True
End Sub

Open in new window