Solved

How do I create a large array of unique strings?

Posted on 2008-10-15
19
431 Views
Last Modified: 2013-11-05
Hello!  What i need is a function that creates a large array of unique strings.  The array will need to be around 30,000 elements.  each string in each element needs to be unique and composed of A-Za-z0-9
so lets just say each string was something like:

A1F3kj87DE1385    //something like that.  i'd like each string to not be over 32 chars but if that isn't possible then it's okay.  Basically i'll have another loop in my program that adds things to a file and one part of each line in the file will be one of these unique strings in this array.

Can someone provide me with a function that will achieve this goal?
thanks!!
0
Comment
Question by:linuxrox
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 9
  • 5
  • 5
19 Comments
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 22726830
Do they have to be "random"...or would you be happy with a sequence of values that "count" using your specified characters?

Try this out...just make a Form with a Button and a TextBox.  Click the TextBox repeatedly and see what happens to the value:
Option Explicit
 
Private Sub Form_Load()
    Text1.Text = "A1F3kj87DE1385"
End Sub
 
Private Sub Command1_Click()
    Text1.Text = NextRevision(Text1.Text)
End Sub
 
Private Function NextRevision(Optional ByVal curRevision As String) As String
    Dim i As Integer
    Dim char As String
    Dim chars As String
    Dim index As Integer
    Dim charArray() As String
    
    ' modify the line below to include the characters you
    ' want in the sequence
    chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    
    ' blank revision passed in, return first letter in sequence
    curRevision = Trim(UCase(curRevision))
    If curRevision = "" Then
        NextRevision = Left(chars, 1)
        Exit Function
    End If
    
    ' make sure current revision has only valid characters in it
    ' otherwise...return first letter in sequence
    For i = 1 To Len(curRevision)
        char = Mid(curRevision, i, 1)
        If InStr(chars, char) = 0 Then
            MsgBox "Returning: " & Left(chars, 1), vbCritical, "Invalid Starting Revision: " & curRevision
            NextRevision = Left(chars, 1)
            Exit Function
        End If
    Next i
    
    ' build a character array from the current revision
    ReDim charArray(Len(curRevision) - 1)
    For i = 1 To Len(curRevision)
        charArray(i - 1) = Mid(curRevision, i, 1)
    Next i
    
    ' see if the just the last letter needs to be "incremented"
    char = charArray(UBound(charArray))
    index = InStr(chars, char)
    If index < Len(chars) Then
        charArray(UBound(charArray)) = Mid(chars, index + 1, 1)
        NextRevision = Join(charArray, "")
        Exit Function
    End If
                
    ' last letter "rolled over"
    ' propagate the "carry over" as far as it needs to go
    charArray(UBound(charArray)) = Left(chars, 1)
    For i = UBound(charArray) - 1 To 0 Step -1
        char = charArray(i)
        index = InStr(chars, char)
        If index < Len(chars) Then
            index = index + 1
            charArray(i) = Mid(chars, index, 1)
            NextRevision = Join(charArray, "")
            Exit Function
        Else
            charArray(i) = Left(chars, 1)
        End If
    Next i
    
    ' "carry over" propagated all the way thru sequence
    ' add first letter of sequence to the beginning of the revision
    NextRevision = Left(chars, 1) & Join(charArray, "")
End Function

Open in new window

0
 

Author Comment

by:linuxrox
ID: 22726951
that's pretty cool.  honestly i'd rather it be random.
like some function i throw in a loop like:

for x = 0 to 30000
somearray() = unique_function()  //which makes somearray() have 30,000 unique strings

next x

your code is nifty though and i like the idea!!  i have some code i was using but i have some problems with it...let me see if i can get it..
0
 

Author Comment

by:linuxrox
ID: 22727000
Try this code below and you'll see that the file that it writes "num.txt" contains duplicates.  after it creates the file just use a text editor like editpad pro or something and sort the results Ascending.

you can then show the duplicates in editpad pro by putting this in the searchbox at the bottom:

^(.*)(\r?\n\1)+$  click "regular expression" at right bottom on editpad pro and it will highlight the duplicates.

Here's what i don't understand about this, if i don't put the values into an array (in this case the array is called id1) and lets say i just print the value from num1 = GenerateCode(32), then i don't get duplicates!  i don't understand that at all!
Option Explicit
Private Sub Form_Load()
Dim num As Long
Dim num1 As String
Dim id1(2000) As String
For num = 0 To 2000
 
num1 = GenerateCode(32)
id1(num) = num1
num1 = ""
Next num
Dim item As Variant
For Each item In id1
Dim nFileNum3 As Integer
nFileNum3 = FreeFile
Open App.Path & "\num.txt" For Append As nFileNum3
Print #nFileNum3, item
Close nFileNum3
Next
End
end sub
 
Public Function GenerateCode(length As Integer)
Dim strInputString As String
Dim intLength As Integer
Dim intNameLength As Integer
Dim strName As String
Dim intStep As Integer
Dim intRnd As Integer
Call Randomize
   strInputString = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" 'these are the characters which will be in the password
  
   intLength = Len(strInputString)
  
   intNameLength = length 'edit this according to how long u want ur password to be
  
   'Call Randomize ' jus to make it random :D
  ' Call Randomize
  
   strName = ""
  
   For intStep = 0 To intNameLength
       intRnd = Int((intLength * Rnd) + 1)
       strName = strName & Mid(strInputString, intRnd, 1)
   Next
  
   GenerateCode = strName
End Function

Open in new window

0
Revamp Your Training Process

Drastically shorten your training time with WalkMe's advanced online training solution that Guides your trainees to action.

 

Author Comment

by:linuxrox
ID: 22727014
if i just do this below then i don't get duplicates....makes no sense...but i can't do this because i need the values in an array..
Dim num As Long
Dim num1 As String
Dim id1(2000) As String
For num = 0 To 2000
num1 = GenerateCode(32)
Dim nFileNum4 As Integer
nFileNum4 = FreeFile
Open App.Path & "\num.txt" For Append As nFileNum4
Print #nFileNum4, num1
Close nFileNum4
id1(num) = num1
num1 = ""
Next num

Open in new window

0
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 22727022
You will get duplicates with that code...if you're not then it is just a matter of not generating enough values and you just got lucky.  =)

Do you really need an Array?  How about a Collection instead?
0
 

Author Comment

by:linuxrox
ID: 22727036
a collection would be fine as well...as long as i can pull a value out of the collection while in another loop you know?

what is it about that code that causes there to be duplicates?  can that code be modified so it wouldn't have them?
0
 
LVL 46

Accepted Solution

by:
aikimark earned 250 total points
ID: 22727045
you can also use a collection in place of an array.
dim somearray(30000) as string
for x = 0 to 30000
   somearray(x) = RandString
next x
 
 
Public Function RandString(Optional parmLen As Long = 30) As String
    Const chars As String = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    Const charsLen As Long = 62     '=Len(chars)
    Dim strTemp As String
    Dim lngPosn As Long
    strTemp = Space(parmLen)
    For lngPosn = 1 To parmLen
        Mid$(strTemp, lngPosn, 1) = Mid$(chars, Int(Rnd * charsLen) + 1, 1)
    Next
    RandString = strTemp
End Function

Open in new window

0
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 22727074
Your GenerateCode() function simply generates a string of random characters of a specified length.  It has no knowledge of what codes were generated BEFORE.  Given a small sample set, it is possible to generate a list with no repeats but there is no guarantee that it will be so.

To make a list with NO repeats you have to check ALL of the values already in the list to make sure you don't add a repeat.

There are two generic algorithms to do this:

(1) Brute Force - Generate a value.  Is it in the list?  Yes - Pick another value.  No - Add it to the List.  Repeat until you have the desired number of values.  Obviously as your list gets bigger and bigger it will take longer and longer to determine if a value has already been "chosen".  An advantage of the Collection object over an Array is that you can add items with a Key and then quickly tell if a given Key already Exists in the Collection.  (A Dictionary object is even  better in this respect.)

(2) Shuffle - Generate a list of "ordered" values.  Iterate over the list and swap each item with a random item from another slot in the list.  This is just like a card deck shuffle.
0
 

Author Comment

by:linuxrox
ID: 22727090
i see.  i ran aikimark's code and i don't see any duplicates.
do you all think this would be a reliable way to do this?
Private Sub Form_Load()
Dim somearray(30000) As String
Dim j As Long
For j = 0 To 30000
   somearray(j) = RandString
Next j
MsgBox UBound(somearray)
Dim item1 As Variant
For Each item1 In somearray
Dim nFileNum5 As Integer
nFileNum5 = FreeFile
Open App.Path & "\num.txt" For Append As nFileNum5
Print #nFileNum5, item1
Close nFileNum5
Next
End

Open in new window

0
 

Author Comment

by:linuxrox
ID: 22727105
Idle_Mind:

why though would i not get any duplicates when i just run the loop and write the file bypassing putting the values into an array?
0
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 22727117
Simply put...you got lucky!

Standby...I'll post code that generates a collection with no duplicates.
0
 

Author Comment

by:linuxrox
ID: 22727133
Idle_Mind:  hehe, well honestly, i've never had any dup's when it wasn't in an array!! : )
0
 
LVL 46

Expert Comment

by:aikimark
ID: 22727148
@linuxrox

The VB Rnd function has a period of 2^24, so you can expect to get 559240 unique 30-character strings.  You will get more unique strings if their lengths are reduced.

If you need more than that:
There are 64k unique seeds available via the Randomize statement and you can start the (PRNG) sequence at some offset other than the first pseudo-random number as long as the offset is not equal to your string output length.

Reference (my analysis of the VB PRNG):
http://www.15seconds.com/issue/051110.htm
0
 
LVL 86

Assisted Solution

by:Mike Tomlinson
Mike Tomlinson earned 250 total points
ID: 22727168
Here is a sample that prevents repeat codes:
Option Explicit
 
Private chars As String
Private charsLen As Integer
Private codes As New Collection
 
Private Sub Form_Load()
    Call Randomize
    chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" 'these are the characters which will be in the password
    charsLen = Len(chars)
End Sub
 
Private Sub Command1_Click()
    Set codes = New Collection
    Call GenCodes
    
    ' print a small sample
    Dim i As Integer
    For i = 1 To 10
        Debug.Print codes(i)
    Next i
End Sub
 
Public Sub GenCodes()
    On Error GoTo AlreadyExists
    
    Dim i As Long
    Dim code As String
    For i = 1 To 30000
GetAnotherCode:
        code = GenerateCode(32)
        codes.Add code, code
        DoEvents
    Next
    Exit Sub
    
AlreadyExists:
    Debug.Print "Collision! Generating New Code..."
    Resume GetAnotherCode
End Sub
 
Public Function GenerateCode(ByVal length As Integer) As String
    If length > 0 Then
        While Len(GenerateCode) <> length
            GenerateCode = GenerateCode & Mid(chars, Int(charsLen * Rnd + 1), 1)
        Wend
    End If
End Function

Open in new window

0
 
LVL 46

Expert Comment

by:aikimark
ID: 22727174
If you want to ensure that you have no duplicates, you can use a collection, whose items' key property are unique.
Dim somearray(30000) As String
Dim colThing As New Collection
Dim j As Long
Dim strTemp As String
On Error Resume next
j = 0 
Do
   strTemp = RandString
   colThing.Add "a",strTemp
   If Err = 0 Then
       somearray(j) = strTemp
       j = + 1
   Else
       Err.Clear
   End If
Loop Until J > 30000

Open in new window

0
 

Author Closing Comment

by:linuxrox
ID: 31506545
This is great guys!  i truly appreciate the help on this!! very nice jobs!!
you guys rock!
0
 
LVL 46

Expert Comment

by:aikimark
ID: 22734461
@linuxrox

Thanks for the points and the interesting question.

* If you are using a collection to guarantee string uniqueness, you might as well use the string to hold the strings as well, replacing the somearray.

* If speed is a consideration, you might use the code in the snippet.  It is about 25% faster than the code in the accepted comment.
Public Function RandString_Fast(Optional parmLen As Long = 30) As String
    Static chars(0 To 61) As Byte
    Static b() As Byte
    Static lngPosn As Long
    Static lngLoop As Long
    If chars(0) = 0 Then
        For lngLoop = 0 To 9
            chars(lngLoop) = Asc(lngLoop)
        Next
        lngPosn = 10
        For lngLoop = Asc("a") To Asc("z")
            chars(lngPosn) = lngLoop
            lngPosn = lngPosn + 1
        Next
        For lngLoop = Asc("A") To Asc("Z")
            chars(lngPosn) = lngLoop
            lngPosn = lngPosn + 1
        Next
        lngLoop = 62
        'Note: move the ReDim statement outside this If...EndIf
        '       construct if different lengths will be requested
        '       in a single execution
        ReDim b(0 To parmLen - 1)
    End If
    
    For lngPosn = 0 To parmLen - 1
        b(lngPosn) = chars(Int(Rnd * lngLoop))
    Next
    
    RandString_Fast = StrConv(b, vbUnicode)
End Function

Open in new window

0
 
LVL 46

Expert Comment

by:aikimark
ID: 22736106
I get another performance boost (35% better than accepted solution) with:

    Dim lngPosn As Long
    Dim lngLoop As Long
0
 

Author Comment

by:linuxrox
ID: 22736240
AWESOME!!!  performance boosts rock!!! thanks once again!!
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

You can of course define an array to hold data that is of a particular type like an array of Strings to hold customer names or an array of Doubles to hold customer sales, but what do you do if you want to coordinate that data? This article describes…
If you need to start windows update installation remotely or as a scheduled task you will find this very helpful.
As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
Get people started with the utilization of class modules. Class modules can be a powerful tool in Microsoft Access. They allow you to create self-contained objects that encapsulate functionality. They can easily hide the complexity of a process from…
Suggested Courses
Course of the Month8 days, 11 hours left to enroll

617 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question