mloricha
asked on
Calculate unique combinations
I have a table of alpha/numeric identifiers (eg ABCD123, EFGH5678 ...). I want to find all unique combinations of these identifiers if only four identifiers can be used in each combination.
We start with the fact that you must have 1 Aplha, 1 Numeric at least, so that's 26x10. Now, we can add either a blank or another alpha to the left, or blanks or numeric to the right. So, the total unique ids is
27x27x27x26x10x11x11x11
27x27x27x26x10x11x11x11
ASKER
Hi imitchie,
Sorry, I probably didn't make the question clear. I need a way to generate all possible UNIQUE combinations in sets of four from a list I have (preferably using MS Access or else Excel). For example, I know that finding unique combinations of two letters from a choice of ABC will give: AB, AC and BC. In my case, I want to find all unique combinations, for example, of four letters from a set of 26 letters.
Sorry, I probably didn't make the question clear. I need a way to generate all possible UNIQUE combinations in sets of four from a list I have (preferably using MS Access or else Excel). For example, I know that finding unique combinations of two letters from a choice of ABC will give: AB, AC and BC. In my case, I want to find all unique combinations, for example, of four letters from a set of 26 letters.
By unique, do you mean AC is same as CA?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
There are 14950 combinations wherein no letter is repeated within a code, and
CAT = TAC = ACT, etc
To check on first, a loop is set up to randomly select a number between 65 and 90, and the character equivalent of that number is checked to see if it in the existing string. If so, don't increment counter. otherwise, increment count (of letters in string) and add the letter to the string.
The unique combination of 4 letters is accomplished by calculating a CRC for each letter, and then adding the 4 CRCs together, and placing that value into an indexed field with no dupes.
I created a table (e in this instance) with fields Identifier(text) and newCRC(double)
and then run the following code from a module:
Function buildId()
On Error Resume Next
Dim rs As DAO.Recordset
Dim ident As String
Dim ltr As Integer
Dim i As Integer
Dim j As Integer
Dim vcrc As Double
Set rs = CurrentDb.OpenRecordset("e ")
Do Until DCount("*", "e") = 14951
i = 1
Do Until i = 5
Randomize
ltr = Int(((90 - 65 + 1) * Rnd) + 65)
'Debug.Print ltr
If InStr(ident, Chr(ltr)) > 0 Then
i = i
Else
i = i + 1
vcrc = vcrc + CalcCRC32(Chr(ltr))
ident = ident & Chr(ltr)
End If
Loop
rs.AddNew
rs!identifier = ident
rs!newCRC = vcrc
rs.Update
ident = ""
vcrc = 0
i = 1
j = j + 1
Loop
End Function
Function ltrs()
For i = 65 To 90
Debug.Print Chr(i) & "/" & CalcCRC32(Chr(i))
Next i
Debug.Print "-----------------------"
End Function
Function factorial2()
Dim x As Double
Dim y As Double
For i = 1 To 3
x = x * (i * (i + 1))
Debug.Print i + 1 & "/" & x & "/" & i * (i + 1)
Next i
Debug.Print x
End Function
You will need this function also.
Public Function CalcCRC32(str As String) As Long
Dim i As Long
Dim j As Long
Dim Limit As Long
Dim CRC As Long
Dim Temp1 As Long
Dim Temp2 As Long
Dim CRCTable(0 To 255) As Long
Limit = &HEDB88320
For i = 0 To 255
CRC = i
For j = 8 To 1 Step -1
If CRC < 0 Then
Temp1 = CRC And &H7FFFFFFF
Temp1 = Temp1 \ 2
Temp1 = Temp1 Or &H40000000
Else
Temp1 = CRC \ 2
End If
If CRC And 1 Then
CRC = Temp1 Xor Limit
Else
CRC = Temp1
End If
Next j
CRCTable(i) = CRC
Next i
Limit = Len(str) 'UBound(ByteArray)
CRC = -1
For i = 1 To Limit
If CRC < 0 Then
Temp1 = CRC And &H7FFFFFFF
Temp1 = Temp1 \ 256
Temp1 = (Temp1 Or &H800000) And &HFFFFFF
Else
Temp1 = (CRC \ 256) And &HFFFFFF
End If
Temp2 = Asc(Mid(str, i, 1)) ' ByteArray(I) ' get the byte
Temp2 = CRCTable((CRC Xor Temp2) And &HFF)
CRC = Temp1 Xor Temp2
Next i
CRC = CRC Xor &HFFFFFFFF
CalcCRC32 = CRC
End Function
CAT = TAC = ACT, etc
To check on first, a loop is set up to randomly select a number between 65 and 90, and the character equivalent of that number is checked to see if it in the existing string. If so, don't increment counter. otherwise, increment count (of letters in string) and add the letter to the string.
The unique combination of 4 letters is accomplished by calculating a CRC for each letter, and then adding the 4 CRCs together, and placing that value into an indexed field with no dupes.
I created a table (e in this instance) with fields Identifier(text) and newCRC(double)
and then run the following code from a module:
Function buildId()
On Error Resume Next
Dim rs As DAO.Recordset
Dim ident As String
Dim ltr As Integer
Dim i As Integer
Dim j As Integer
Dim vcrc As Double
Set rs = CurrentDb.OpenRecordset("e
Do Until DCount("*", "e") = 14951
i = 1
Do Until i = 5
Randomize
ltr = Int(((90 - 65 + 1) * Rnd) + 65)
'Debug.Print ltr
If InStr(ident, Chr(ltr)) > 0 Then
i = i
Else
i = i + 1
vcrc = vcrc + CalcCRC32(Chr(ltr))
ident = ident & Chr(ltr)
End If
Loop
rs.AddNew
rs!identifier = ident
rs!newCRC = vcrc
rs.Update
ident = ""
vcrc = 0
i = 1
j = j + 1
Loop
End Function
Function ltrs()
For i = 65 To 90
Debug.Print Chr(i) & "/" & CalcCRC32(Chr(i))
Next i
Debug.Print "-----------------------"
End Function
Function factorial2()
Dim x As Double
Dim y As Double
For i = 1 To 3
x = x * (i * (i + 1))
Debug.Print i + 1 & "/" & x & "/" & i * (i + 1)
Next i
Debug.Print x
End Function
You will need this function also.
Public Function CalcCRC32(str As String) As Long
Dim i As Long
Dim j As Long
Dim Limit As Long
Dim CRC As Long
Dim Temp1 As Long
Dim Temp2 As Long
Dim CRCTable(0 To 255) As Long
Limit = &HEDB88320
For i = 0 To 255
CRC = i
For j = 8 To 1 Step -1
If CRC < 0 Then
Temp1 = CRC And &H7FFFFFFF
Temp1 = Temp1 \ 2
Temp1 = Temp1 Or &H40000000
Else
Temp1 = CRC \ 2
End If
If CRC And 1 Then
CRC = Temp1 Xor Limit
Else
CRC = Temp1
End If
Next j
CRCTable(i) = CRC
Next i
Limit = Len(str) 'UBound(ByteArray)
CRC = -1
For i = 1 To Limit
If CRC < 0 Then
Temp1 = CRC And &H7FFFFFFF
Temp1 = Temp1 \ 256
Temp1 = (Temp1 Or &H800000) And &HFFFFFF
Else
Temp1 = (CRC \ 256) And &HFFFFFF
End If
Temp2 = Asc(Mid(str, i, 1)) ' ByteArray(I) ' get the byte
Temp2 = CRCTable((CRC Xor Temp2) And &HFF)
CRC = Temp1 Xor Temp2
Next i
CRC = CRC Xor &HFFFFFFFF
CalcCRC32 = CRC
End Function
It takes awhile to run, because fewer and fewer non-duplicates are generated for each loop.
Combination number came from
http://mathforum.org/dr.math/faq/faq.comb.perm.html
n!
n_C_k = ----------
k!(n - k)!
26!
26_C_4 = ---------
4!(22!)
4.03291461126606E+26
---------------------
24*1.12400072777761E+21
http://mathforum.org/dr.math/faq/faq.comb.perm.html
n!
n_C_k = ----------
k!(n - k)!
26!
26_C_4 = ---------
4!(22!)
4.03291461126606E+26
---------------------
24*1.12400072777761E+21
I take that back: It wil take a LONG while to run. (I admit it is not well-designed, and does not take into account any niceties like building in alphabetical order, i.e. ABCD, ABCE, ABCF, etc.)
I have a table of alpha/numeric identifiers (eg ABCD123, EFGH5678 ...). I want to find all unique combinations of these identifiers if only four identifiers can be used in each combination.
In my case, I want to find all unique combinations, for example, of four letters from a set of 26 letters.
So is it 4 alpha + 4 numeric, and is AB123CD considered valid?
Or is it only 4 from A-Z0-9 combined?
In my case, I want to find all unique combinations, for example, of four letters from a set of 26 letters.
So is it 4 alpha + 4 numeric, and is AB123CD considered valid?
Or is it only 4 from A-Z0-9 combined?
ASKER
Absolutely BRILLIANT!!!! Worked on first run - you've just saved me weeks of work, many thanks. cheers, Lisa
Lisa,
In your actual problem, how many different items were in the list? How long did the code take to run? I'm curious because I had taken such a drastically different approach from jerryb30.
Thanks for the grade!
Brad
In your actual problem, how many different items were in the list? How long did the code take to run? I'm curious because I had taken such a drastically different approach from jerryb30.
Thanks for the grade!
Brad
Mine took a long time. I tried a few variations, but the task of trying to find fewer and fewer combinations which did not yet exist took longer and longer. I am thinking of another approach, since it is an interesting programming exercise, but for now it is about an hour for the conditions I stated.
jerryb30,
For 26 letters, my laptop took 55 seconds to list all 14,950 combinations. I hope Lisa wasn't looking for 26 letters and 10 digits.
Since the original question referred to "alphanumeric identifiers", I thought there might be fewer than 26 of them.
Brad
For 26 letters, my laptop took 55 seconds to list all 14,950 combinations. I hope Lisa wasn't looking for 26 letters and 10 digits.
Since the original question referred to "alphanumeric identifiers", I thought there might be fewer than 26 of them.
Brad
jerryb30,
And by using array transfer, I was able to generate all 14,950 combinations in about one second.
Sub UniqueCombinations()
Dim vTokens As Variant
Dim i1 As Long, i2 As Long, i3 As Long, i4 As Long, j As Long, n As Long, nCombinations As Long
Application.ScreenUpdating = False
vTokens = Array("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", _
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z")
'List as many characters/strings as you like
n = UBound(vTokens)
nCombinations = Application.Combin(UBound( vTokens) - LBound(vTokens) + 1, 4)
ReDim v(1 To nCombinations, 1 To 1)
For i1 = LBound(vTokens) To n - 3
For i2 = i1 + 1 To n - 2
For i3 = i2 + 1 To n - 1
For i4 = i3 + 1 To n
j = j + 1
v(j, 1) = vTokens(i1) & vTokens(i2) & vTokens(i3) & vTokens(i4)
Next
Next
Next
Next
[A1].Resize(nCombinations) .Value = v
Application.ScreenUpdating = True
End Sub
Brad
And by using array transfer, I was able to generate all 14,950 combinations in about one second.
Sub UniqueCombinations()
Dim vTokens As Variant
Dim i1 As Long, i2 As Long, i3 As Long, i4 As Long, j As Long, n As Long, nCombinations As Long
Application.ScreenUpdating
vTokens = Array("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", _
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z")
'List as many characters/strings as you like
n = UBound(vTokens)
nCombinations = Application.Combin(UBound(
ReDim v(1 To nCombinations, 1 To 1)
For i1 = LBound(vTokens) To n - 3
For i2 = i1 + 1 To n - 2
For i3 = i2 + 1 To n - 1
For i4 = i3 + 1 To n
j = j + 1
v(j, 1) = vTokens(i1) & vTokens(i2) & vTokens(i3) & vTokens(i4)
Next
Next
Next
Next
[A1].Resize(nCombinations)
Application.ScreenUpdating
End Sub
Brad
Brad:
Ah, a goal!
I did admit bad design. I once had a program (written by another) which took 48 hours to run. Once the programmer profiled it and took another logic approach, it took about 40 minutes. The time was worth the effort. If your process only takes 55 sec for what should be a one-time event, it might not be worth it. But, if the possible combos exceed 65k (pre-Office 2007), then Access might be the only way to go. So you did come up with the same number of possible combos?
Ah, a goal!
I did admit bad design. I once had a program (written by another) which took 48 hours to run. Once the programmer profiled it and took another logic approach, it took about 40 minutes. The time was worth the effort. If your process only takes 55 sec for what should be a one-time event, it might not be worth it. But, if the possible combos exceed 65k (pre-Office 2007), then Access might be the only way to go. So you did come up with the same number of possible combos?
1 second. Impressive.
Jerry,
The one second surprised me, too. And that was to list everything from "abcd" to "wxyz", 14,950 values in all.
I see that you have a Genius certificate in Access, a program I barely know how to turn on. The carpenter whose only tool was a hammer, saw every problem as a nail. I am equally impaired when it comes to databases, using Excel where it works and running for shelter when it doesn't.
Brad
The one second surprised me, too. And that was to list everything from "abcd" to "wxyz", 14,950 values in all.
I see that you have a Genius certificate in Access, a program I barely know how to turn on. The carpenter whose only tool was a hammer, saw every problem as a nail. I am equally impaired when it comes to databases, using Excel where it works and running for shelter when it doesn't.
Brad
2 seconds
Sub gen()
Cells(1, 1) = Now
r = 2
For a = 65 To 90
For b = a To 90
For c = b To 90
For d = c To 90
Cells(r, 1) = Chr(a) & Chr(b) & Chr(c) & Chr(d)
r = r + 1
Next d
Next c
Next b
Next a
Cells(r, 1) = Now
End Sub
Sub gen()
Cells(1, 1) = Now
r = 2
For a = 65 To 90
For b = a To 90
For c = b To 90
For d = c To 90
Cells(r, 1) = Chr(a) & Chr(b) & Chr(c) & Chr(d)
r = r + 1
Next d
Next c
Next b
Next a
Cells(r, 1) = Now
End Sub
>>The carpenter whose only tool was a hammer, saw every problem as a nail.
I just heard this adage this year, and now have learned its application. As with you, I am a one-trick pony, but can hopefully learn. Interesting exchange.
I just heard this adage this year, and now have learned its application. As with you, I am a one-trick pony, but can hopefully learn. Interesting exchange.
imitchie,
Won't your algorithm produce results like "AAAA" and "ZZZZ"?
Also, your computer/version of Excel is much faster than mine. Using Excel 2003 and a Centrion P1500, your code took 137 seconds for me. Are you using Excel 2007?
Brad
Won't your algorithm produce results like "AAAA" and "ZZZZ"?
Also, your computer/version of Excel is much faster than mine. Using Excel 2003 and a Centrion P1500, your code took 137 seconds for me. Are you using Excel 2007?
Brad
Yup, to all questions! I specifically didn't do
for b = a+1 to 90
etc
just to produce the entire range.
And yes, 2007 is faster because I didn't have to turn off ScreenUpdates explicitly.. among other things
for b = a+1 to 90
etc
just to produce the entire range.
And yes, 2007 is faster because I didn't have to turn off ScreenUpdates explicitly.. among other things
imitchie,
I can match your 2 seconds when I use Excel 2007. That's quite an improvement just by changing Excel versions. I may just start using it.
Brad
I can match your 2 seconds when I use Excel 2007. That's quite an improvement just by changing Excel versions. I may just start using it.
Brad
ASKER
byundt,
Hi Brad,
I had 26, 34 and 55 items in lists respectively (each item in the list followed this protocol - abcd123 - ie: 4 letters followed by 3 digits). When I ran the macro using your code all unique combos were generated in:
26 - 1.5 secs (14,950 unique combinations)
34 - 7 secs (46,376 unique combinations)
55 - 26 seconds (341,055 unique combinations)
A quick note, I needed to be able to 'split' the four combo line across four columns (text to columns) , so added ", " & to line 12 of your code - worked like a dream.
Once again - many thanks. You have no idea how much work you've saved me - That answer was worth thousands - not 500!!
Hi Brad,
I had 26, 34 and 55 items in lists respectively (each item in the list followed this protocol - abcd123 - ie: 4 letters followed by 3 digits). When I ran the macro using your code all unique combos were generated in:
26 - 1.5 secs (14,950 unique combinations)
34 - 7 secs (46,376 unique combinations)
55 - 26 seconds (341,055 unique combinations)
A quick note, I needed to be able to 'split' the four combo line across four columns (text to columns) , so added ", " & to line 12 of your code - worked like a dream.
Once again - many thanks. You have no idea how much work you've saved me - That answer was worth thousands - not 500!!
ASKER
byundt.
Brad,
below is the code for the 26 combos (illustrates the alpha/numeric I was talking about.
cheers, Lisa.
Brad,
below is the code for the 26 combos (illustrates the alpha/numeric I was talking about.
cheers, Lisa.
Sub UniqueCombinations()
Dim vTokens As Variant
Dim i1 As Long, i2 As Long, i3 As Long, i4 As Long, j As Long, n As Long
Application.ScreenUpdating = False
vTokens = Array("acct101", "bdcd123", "cded555", "dasd698", "eedf256", "ffhf555", "eedu256", "eedy256", "eedt256", "eedr256", "eede256", "eedw256", "eedq256", "eedm256", "eedfn56", "eedb256", "eedv256", "eedc256", "eedx256", "eedf256", "eedz256", "eedf251", "eedf252", "eedf253", "eedf254", "eedf255") 'List as many characters/strings as you like
n = UBound(vTokens)
For i1 = LBound(vTokens) To n - 3
For i2 = i1 + 1 To n - 2
For i3 = i2 + 1 To n - 1
For i4 = i3 + 1 To n
j = j + 1
Cells(j, 1) = vTokens(i1) & "," & vTokens(i2) & "," & vTokens(i3) & "," & vTokens(i4)
Next
Next
Next
Next
Application.ScreenUpdating = True
End Sub
Also, do you just want to know how many identifiers you can accommodate?