# How do I create a large array of unique strings?

Hello!  What i need is a function that creates a large array of unique strings.  The array will need to be around 30,000 elements.  each string in each element needs to be unique and composed of A-Za-z0-9
so lets just say each string was something like:

A1F3kj87DE1385    //something like that.  i'd like each string to not be over 32 chars but if that isn't possible then it's okay.  Basically i'll have another loop in my program that adds things to a file and one part of each line in the file will be one of these unique strings in this array.

Can someone provide me with a function that will achieve this goal?
thanks!!
###### Who is Participating?

Commented:
you can also use a collection in place of an array.
``````dim somearray(30000) as string
for x = 0 to 30000
somearray(x) = RandString
next x

Public Function RandString(Optional parmLen As Long = 30) As String
Const chars As String = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
Const charsLen As Long = 62     '=Len(chars)
Dim strTemp As String
Dim lngPosn As Long
strTemp = Space(parmLen)
For lngPosn = 1 To parmLen
Mid\$(strTemp, lngPosn, 1) = Mid\$(chars, Int(Rnd * charsLen) + 1, 1)
Next
RandString = strTemp
End Function
``````
0

Middle School Assistant TeacherCommented:
Do they have to be "random"...or would you be happy with a sequence of values that "count" using your specified characters?

Try this out...just make a Form with a Button and a TextBox.  Click the TextBox repeatedly and see what happens to the value:
``````Option Explicit

Text1.Text = "A1F3kj87DE1385"
End Sub

Private Sub Command1_Click()
Text1.Text = NextRevision(Text1.Text)
End Sub

Private Function NextRevision(Optional ByVal curRevision As String) As String
Dim i As Integer
Dim char As String
Dim chars As String
Dim index As Integer
Dim charArray() As String

' modify the line below to include the characters you
' want in the sequence
chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

' blank revision passed in, return first letter in sequence
curRevision = Trim(UCase(curRevision))
If curRevision = "" Then
NextRevision = Left(chars, 1)
Exit Function
End If

' make sure current revision has only valid characters in it
' otherwise...return first letter in sequence
For i = 1 To Len(curRevision)
char = Mid(curRevision, i, 1)
If InStr(chars, char) = 0 Then
MsgBox "Returning: " & Left(chars, 1), vbCritical, "Invalid Starting Revision: " & curRevision
NextRevision = Left(chars, 1)
Exit Function
End If
Next i

' build a character array from the current revision
ReDim charArray(Len(curRevision) - 1)
For i = 1 To Len(curRevision)
charArray(i - 1) = Mid(curRevision, i, 1)
Next i

' see if the just the last letter needs to be "incremented"
char = charArray(UBound(charArray))
index = InStr(chars, char)
If index < Len(chars) Then
charArray(UBound(charArray)) = Mid(chars, index + 1, 1)
NextRevision = Join(charArray, "")
Exit Function
End If

' last letter "rolled over"
' propagate the "carry over" as far as it needs to go
charArray(UBound(charArray)) = Left(chars, 1)
For i = UBound(charArray) - 1 To 0 Step -1
char = charArray(i)
index = InStr(chars, char)
If index < Len(chars) Then
index = index + 1
charArray(i) = Mid(chars, index, 1)
NextRevision = Join(charArray, "")
Exit Function
Else
charArray(i) = Left(chars, 1)
End If
Next i

' "carry over" propagated all the way thru sequence
' add first letter of sequence to the beginning of the revision
NextRevision = Left(chars, 1) & Join(charArray, "")
End Function
``````
0

Author Commented:
that's pretty cool.  honestly i'd rather it be random.
like some function i throw in a loop like:

for x = 0 to 30000
somearray() = unique_function()  //which makes somearray() have 30,000 unique strings

next x

your code is nifty though and i like the idea!!  i have some code i was using but i have some problems with it...let me see if i can get it..
0

Author Commented:
Try this code below and you'll see that the file that it writes "num.txt" contains duplicates.  after it creates the file just use a text editor like editpad pro or something and sort the results Ascending.

you can then show the duplicates in editpad pro by putting this in the searchbox at the bottom:

^(.*)(\r?\n\1)+\$  click "regular expression" at right bottom on editpad pro and it will highlight the duplicates.

Here's what i don't understand about this, if i don't put the values into an array (in this case the array is called id1) and lets say i just print the value from num1 = GenerateCode(32), then i don't get duplicates!  i don't understand that at all!
``````Option Explicit
Dim num As Long
Dim num1 As String
Dim id1(2000) As String
For num = 0 To 2000

num1 = GenerateCode(32)
id1(num) = num1
num1 = ""
Next num
Dim item As Variant
For Each item In id1
Dim nFileNum3 As Integer
nFileNum3 = FreeFile
Open App.Path & "\num.txt" For Append As nFileNum3
Print #nFileNum3, item
Close nFileNum3
Next
End
end sub

Public Function GenerateCode(length As Integer)
Dim strInputString As String
Dim intLength As Integer
Dim intNameLength As Integer
Dim strName As String
Dim intStep As Integer
Dim intRnd As Integer
Call Randomize
strInputString = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" 'these are the characters which will be in the password

intLength = Len(strInputString)

intNameLength = length 'edit this according to how long u want ur password to be

'Call Randomize ' jus to make it random :D
' Call Randomize

strName = ""

For intStep = 0 To intNameLength
intRnd = Int((intLength * Rnd) + 1)
strName = strName & Mid(strInputString, intRnd, 1)
Next

GenerateCode = strName
End Function
``````
0

Author Commented:
if i just do this below then i don't get duplicates....makes no sense...but i can't do this because i need the values in an array..
``````Dim num As Long
Dim num1 As String
Dim id1(2000) As String
For num = 0 To 2000
num1 = GenerateCode(32)
Dim nFileNum4 As Integer
nFileNum4 = FreeFile
Open App.Path & "\num.txt" For Append As nFileNum4
Print #nFileNum4, num1
Close nFileNum4
id1(num) = num1
num1 = ""
Next num
``````
0

Middle School Assistant TeacherCommented:
You will get duplicates with that code...if you're not then it is just a matter of not generating enough values and you just got lucky.  =)

0

Author Commented:
a collection would be fine as well...as long as i can pull a value out of the collection while in another loop you know?

what is it about that code that causes there to be duplicates?  can that code be modified so it wouldn't have them?
0

Middle School Assistant TeacherCommented:
Your GenerateCode() function simply generates a string of random characters of a specified length.  It has no knowledge of what codes were generated BEFORE.  Given a small sample set, it is possible to generate a list with no repeats but there is no guarantee that it will be so.

To make a list with NO repeats you have to check ALL of the values already in the list to make sure you don't add a repeat.

There are two generic algorithms to do this:

(1) Brute Force - Generate a value.  Is it in the list?  Yes - Pick another value.  No - Add it to the List.  Repeat until you have the desired number of values.  Obviously as your list gets bigger and bigger it will take longer and longer to determine if a value has already been "chosen".  An advantage of the Collection object over an Array is that you can add items with a Key and then quickly tell if a given Key already Exists in the Collection.  (A Dictionary object is even  better in this respect.)

(2) Shuffle - Generate a list of "ordered" values.  Iterate over the list and swap each item with a random item from another slot in the list.  This is just like a card deck shuffle.
0

Author Commented:
i see.  i ran aikimark's code and i don't see any duplicates.
do you all think this would be a reliable way to do this?
``````Private Sub Form_Load()
Dim somearray(30000) As String
Dim j As Long
For j = 0 To 30000
somearray(j) = RandString
Next j
MsgBox UBound(somearray)
Dim item1 As Variant
For Each item1 In somearray
Dim nFileNum5 As Integer
nFileNum5 = FreeFile
Open App.Path & "\num.txt" For Append As nFileNum5
Print #nFileNum5, item1
Close nFileNum5
Next
End
``````
0

Author Commented:
Idle_Mind:

why though would i not get any duplicates when i just run the loop and write the file bypassing putting the values into an array?
0

Middle School Assistant TeacherCommented:
Simply put...you got lucky!

Standby...I'll post code that generates a collection with no duplicates.
0

Author Commented:
Idle_Mind:  hehe, well honestly, i've never had any dup's when it wasn't in an array!! : )
0

Commented:
@linuxrox

The VB Rnd function has a period of 2^24, so you can expect to get 559240 unique 30-character strings.  You will get more unique strings if their lengths are reduced.

If you need more than that:
There are 64k unique seeds available via the Randomize statement and you can start the (PRNG) sequence at some offset other than the first pseudo-random number as long as the offset is not equal to your string output length.

Reference (my analysis of the VB PRNG):
http://www.15seconds.com/issue/051110.htm
0

Middle School Assistant TeacherCommented:
Here is a sample that prevents repeat codes:
``````Option Explicit

Private chars As String
Private charsLen As Integer
Private codes As New Collection

Call Randomize
chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" 'these are the characters which will be in the password
charsLen = Len(chars)
End Sub

Private Sub Command1_Click()
Set codes = New Collection
Call GenCodes

' print a small sample
Dim i As Integer
For i = 1 To 10
Debug.Print codes(i)
Next i
End Sub

Public Sub GenCodes()

Dim i As Long
Dim code As String
For i = 1 To 30000
GetAnotherCode:
code = GenerateCode(32)
DoEvents
Next
Exit Sub

Debug.Print "Collision! Generating New Code..."
Resume GetAnotherCode
End Sub

Public Function GenerateCode(ByVal length As Integer) As String
If length > 0 Then
While Len(GenerateCode) <> length
GenerateCode = GenerateCode & Mid(chars, Int(charsLen * Rnd + 1), 1)
Wend
End If
End Function
``````
0

Commented:
If you want to ensure that you have no duplicates, you can use a collection, whose items' key property are unique.
``````Dim somearray(30000) As String
Dim colThing As New Collection
Dim j As Long
Dim strTemp As String
On Error Resume next
j = 0
Do
strTemp = RandString
If Err = 0 Then
somearray(j) = strTemp
j = + 1
Else
Err.Clear
End If
Loop Until J > 30000
``````
0

Author Commented:
This is great guys!  i truly appreciate the help on this!! very nice jobs!!
you guys rock!
0

Commented:
@linuxrox

Thanks for the points and the interesting question.

* If you are using a collection to guarantee string uniqueness, you might as well use the string to hold the strings as well, replacing the somearray.

* If speed is a consideration, you might use the code in the snippet.  It is about 25% faster than the code in the accepted comment.
``````Public Function RandString_Fast(Optional parmLen As Long = 30) As String
Static chars(0 To 61) As Byte
Static b() As Byte
Static lngPosn As Long
Static lngLoop As Long
If chars(0) = 0 Then
For lngLoop = 0 To 9
chars(lngLoop) = Asc(lngLoop)
Next
lngPosn = 10
For lngLoop = Asc("a") To Asc("z")
chars(lngPosn) = lngLoop
lngPosn = lngPosn + 1
Next
For lngLoop = Asc("A") To Asc("Z")
chars(lngPosn) = lngLoop
lngPosn = lngPosn + 1
Next
lngLoop = 62
'Note: move the ReDim statement outside this If...EndIf
'       construct if different lengths will be requested
'       in a single execution
ReDim b(0 To parmLen - 1)
End If

For lngPosn = 0 To parmLen - 1
b(lngPosn) = chars(Int(Rnd * lngLoop))
Next

RandString_Fast = StrConv(b, vbUnicode)
End Function
``````
0

Commented:
I get another performance boost (35% better than accepted solution) with:

Dim lngPosn As Long
Dim lngLoop As Long
0

Author Commented:
AWESOME!!!  performance boosts rock!!! thanks once again!!
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.