Solved

How do I create a large array of unique strings?

Posted on 2008-10-15
19
410 Views
Last Modified: 2013-11-05
Hello!  What i need is a function that creates a large array of unique strings.  The array will need to be around 30,000 elements.  each string in each element needs to be unique and composed of A-Za-z0-9
so lets just say each string was something like:

A1F3kj87DE1385    //something like that.  i'd like each string to not be over 32 chars but if that isn't possible then it's okay.  Basically i'll have another loop in my program that adds things to a file and one part of each line in the file will be one of these unique strings in this array.

Can someone provide me with a function that will achieve this goal?
thanks!!
0
Comment
Question by:linuxrox
  • 9
  • 5
  • 5
19 Comments
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 22726830
Do they have to be "random"...or would you be happy with a sequence of values that "count" using your specified characters?

Try this out...just make a Form with a Button and a TextBox.  Click the TextBox repeatedly and see what happens to the value:
Option Explicit
 

Private Sub Form_Load()

    Text1.Text = "A1F3kj87DE1385"

End Sub

 

Private Sub Command1_Click()

    Text1.Text = NextRevision(Text1.Text)

End Sub

 

Private Function NextRevision(Optional ByVal curRevision As String) As String

    Dim i As Integer

    Dim char As String

    Dim chars As String

    Dim index As Integer

    Dim charArray() As String

    

    ' modify the line below to include the characters you

    ' want in the sequence

    chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

    

    ' blank revision passed in, return first letter in sequence

    curRevision = Trim(UCase(curRevision))

    If curRevision = "" Then

        NextRevision = Left(chars, 1)

        Exit Function

    End If

    

    ' make sure current revision has only valid characters in it

    ' otherwise...return first letter in sequence

    For i = 1 To Len(curRevision)

        char = Mid(curRevision, i, 1)

        If InStr(chars, char) = 0 Then

            MsgBox "Returning: " & Left(chars, 1), vbCritical, "Invalid Starting Revision: " & curRevision

            NextRevision = Left(chars, 1)

            Exit Function

        End If

    Next i

    

    ' build a character array from the current revision

    ReDim charArray(Len(curRevision) - 1)

    For i = 1 To Len(curRevision)

        charArray(i - 1) = Mid(curRevision, i, 1)

    Next i

    

    ' see if the just the last letter needs to be "incremented"

    char = charArray(UBound(charArray))

    index = InStr(chars, char)

    If index < Len(chars) Then

        charArray(UBound(charArray)) = Mid(chars, index + 1, 1)

        NextRevision = Join(charArray, "")

        Exit Function

    End If

                

    ' last letter "rolled over"

    ' propagate the "carry over" as far as it needs to go

    charArray(UBound(charArray)) = Left(chars, 1)

    For i = UBound(charArray) - 1 To 0 Step -1

        char = charArray(i)

        index = InStr(chars, char)

        If index < Len(chars) Then

            index = index + 1

            charArray(i) = Mid(chars, index, 1)

            NextRevision = Join(charArray, "")

            Exit Function

        Else

            charArray(i) = Left(chars, 1)

        End If

    Next i

    

    ' "carry over" propagated all the way thru sequence

    ' add first letter of sequence to the beginning of the revision

    NextRevision = Left(chars, 1) & Join(charArray, "")

End Function

Open in new window

0
 

Author Comment

by:linuxrox
ID: 22726951
that's pretty cool.  honestly i'd rather it be random.
like some function i throw in a loop like:

for x = 0 to 30000
somearray() = unique_function()  //which makes somearray() have 30,000 unique strings

next x

your code is nifty though and i like the idea!!  i have some code i was using but i have some problems with it...let me see if i can get it..
0
 

Author Comment

by:linuxrox
ID: 22727000
Try this code below and you'll see that the file that it writes "num.txt" contains duplicates.  after it creates the file just use a text editor like editpad pro or something and sort the results Ascending.

you can then show the duplicates in editpad pro by putting this in the searchbox at the bottom:

^(.*)(\r?\n\1)+$  click "regular expression" at right bottom on editpad pro and it will highlight the duplicates.

Here's what i don't understand about this, if i don't put the values into an array (in this case the array is called id1) and lets say i just print the value from num1 = GenerateCode(32), then i don't get duplicates!  i don't understand that at all!
Option Explicit

Private Sub Form_Load()

Dim num As Long

Dim num1 As String

Dim id1(2000) As String

For num = 0 To 2000
 

num1 = GenerateCode(32)

id1(num) = num1

num1 = ""

Next num

Dim item As Variant

For Each item In id1

Dim nFileNum3 As Integer

nFileNum3 = FreeFile

Open App.Path & "\num.txt" For Append As nFileNum3

Print #nFileNum3, item

Close nFileNum3

Next

End

end sub
 

Public Function GenerateCode(length As Integer)

Dim strInputString As String

Dim intLength As Integer

Dim intNameLength As Integer

Dim strName As String

Dim intStep As Integer

Dim intRnd As Integer

Call Randomize

   strInputString = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" 'these are the characters which will be in the password

  

   intLength = Len(strInputString)

  

   intNameLength = length 'edit this according to how long u want ur password to be

  

   'Call Randomize ' jus to make it random :D

  ' Call Randomize

  

   strName = ""

  

   For intStep = 0 To intNameLength

       intRnd = Int((intLength * Rnd) + 1)

       strName = strName & Mid(strInputString, intRnd, 1)

   Next

  

   GenerateCode = strName

End Function

Open in new window

0
 

Author Comment

by:linuxrox
ID: 22727014
if i just do this below then i don't get duplicates....makes no sense...but i can't do this because i need the values in an array..
Dim num As Long

Dim num1 As String

Dim id1(2000) As String

For num = 0 To 2000

num1 = GenerateCode(32)

Dim nFileNum4 As Integer

nFileNum4 = FreeFile

Open App.Path & "\num.txt" For Append As nFileNum4

Print #nFileNum4, num1

Close nFileNum4

id1(num) = num1

num1 = ""

Next num

Open in new window

0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 22727022
You will get duplicates with that code...if you're not then it is just a matter of not generating enough values and you just got lucky.  =)

Do you really need an Array?  How about a Collection instead?
0
 

Author Comment

by:linuxrox
ID: 22727036
a collection would be fine as well...as long as i can pull a value out of the collection while in another loop you know?

what is it about that code that causes there to be duplicates?  can that code be modified so it wouldn't have them?
0
 
LVL 45

Accepted Solution

by:
aikimark earned 250 total points
ID: 22727045
you can also use a collection in place of an array.
dim somearray(30000) as string

for x = 0 to 30000

   somearray(x) = RandString

next x
 
 

Public Function RandString(Optional parmLen As Long = 30) As String

    Const chars As String = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

    Const charsLen As Long = 62     '=Len(chars)

    Dim strTemp As String

    Dim lngPosn As Long

    strTemp = Space(parmLen)

    For lngPosn = 1 To parmLen

        Mid$(strTemp, lngPosn, 1) = Mid$(chars, Int(Rnd * charsLen) + 1, 1)

    Next

    RandString = strTemp

End Function

Open in new window

0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 22727074
Your GenerateCode() function simply generates a string of random characters of a specified length.  It has no knowledge of what codes were generated BEFORE.  Given a small sample set, it is possible to generate a list with no repeats but there is no guarantee that it will be so.

To make a list with NO repeats you have to check ALL of the values already in the list to make sure you don't add a repeat.

There are two generic algorithms to do this:

(1) Brute Force - Generate a value.  Is it in the list?  Yes - Pick another value.  No - Add it to the List.  Repeat until you have the desired number of values.  Obviously as your list gets bigger and bigger it will take longer and longer to determine if a value has already been "chosen".  An advantage of the Collection object over an Array is that you can add items with a Key and then quickly tell if a given Key already Exists in the Collection.  (A Dictionary object is even  better in this respect.)

(2) Shuffle - Generate a list of "ordered" values.  Iterate over the list and swap each item with a random item from another slot in the list.  This is just like a card deck shuffle.
0
 

Author Comment

by:linuxrox
ID: 22727090
i see.  i ran aikimark's code and i don't see any duplicates.
do you all think this would be a reliable way to do this?
Private Sub Form_Load()

Dim somearray(30000) As String

Dim j As Long

For j = 0 To 30000

   somearray(j) = RandString

Next j

MsgBox UBound(somearray)

Dim item1 As Variant

For Each item1 In somearray

Dim nFileNum5 As Integer

nFileNum5 = FreeFile

Open App.Path & "\num.txt" For Append As nFileNum5

Print #nFileNum5, item1

Close nFileNum5

Next

End

Open in new window

0
Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

 

Author Comment

by:linuxrox
ID: 22727105
Idle_Mind:

why though would i not get any duplicates when i just run the loop and write the file bypassing putting the values into an array?
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 22727117
Simply put...you got lucky!

Standby...I'll post code that generates a collection with no duplicates.
0
 

Author Comment

by:linuxrox
ID: 22727133
Idle_Mind:  hehe, well honestly, i've never had any dup's when it wasn't in an array!! : )
0
 
LVL 45

Expert Comment

by:aikimark
ID: 22727148
@linuxrox

The VB Rnd function has a period of 2^24, so you can expect to get 559240 unique 30-character strings.  You will get more unique strings if their lengths are reduced.

If you need more than that:
There are 64k unique seeds available via the Randomize statement and you can start the (PRNG) sequence at some offset other than the first pseudo-random number as long as the offset is not equal to your string output length.

Reference (my analysis of the VB PRNG):
http://www.15seconds.com/issue/051110.htm
0
 
LVL 85

Assisted Solution

by:Mike Tomlinson
Mike Tomlinson earned 250 total points
ID: 22727168
Here is a sample that prevents repeat codes:
Option Explicit
 

Private chars As String

Private charsLen As Integer

Private codes As New Collection
 

Private Sub Form_Load()

    Call Randomize

    chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" 'these are the characters which will be in the password

    charsLen = Len(chars)

End Sub
 

Private Sub Command1_Click()

    Set codes = New Collection

    Call GenCodes

    

    ' print a small sample

    Dim i As Integer

    For i = 1 To 10

        Debug.Print codes(i)

    Next i

End Sub
 

Public Sub GenCodes()

    On Error GoTo AlreadyExists

    

    Dim i As Long

    Dim code As String

    For i = 1 To 30000

GetAnotherCode:

        code = GenerateCode(32)

        codes.Add code, code

        DoEvents

    Next

    Exit Sub

    

AlreadyExists:

    Debug.Print "Collision! Generating New Code..."

    Resume GetAnotherCode

End Sub
 

Public Function GenerateCode(ByVal length As Integer) As String

    If length > 0 Then

        While Len(GenerateCode) <> length

            GenerateCode = GenerateCode & Mid(chars, Int(charsLen * Rnd + 1), 1)

        Wend

    End If

End Function

Open in new window

0
 
LVL 45

Expert Comment

by:aikimark
ID: 22727174
If you want to ensure that you have no duplicates, you can use a collection, whose items' key property are unique.
Dim somearray(30000) As String

Dim colThing As New Collection

Dim j As Long

Dim strTemp As String

On Error Resume next

j = 0 

Do

   strTemp = RandString

   colThing.Add "a",strTemp

   If Err = 0 Then

       somearray(j) = strTemp

       j = + 1

   Else

       Err.Clear

   End If

Loop Until J > 30000

Open in new window

0
 

Author Closing Comment

by:linuxrox
ID: 31506545
This is great guys!  i truly appreciate the help on this!! very nice jobs!!
you guys rock!
0
 
LVL 45

Expert Comment

by:aikimark
ID: 22734461
@linuxrox

Thanks for the points and the interesting question.

* If you are using a collection to guarantee string uniqueness, you might as well use the string to hold the strings as well, replacing the somearray.

* If speed is a consideration, you might use the code in the snippet.  It is about 25% faster than the code in the accepted comment.
Public Function RandString_Fast(Optional parmLen As Long = 30) As String

    Static chars(0 To 61) As Byte

    Static b() As Byte

    Static lngPosn As Long

    Static lngLoop As Long

    If chars(0) = 0 Then

        For lngLoop = 0 To 9

            chars(lngLoop) = Asc(lngLoop)

        Next

        lngPosn = 10

        For lngLoop = Asc("a") To Asc("z")

            chars(lngPosn) = lngLoop

            lngPosn = lngPosn + 1

        Next

        For lngLoop = Asc("A") To Asc("Z")

            chars(lngPosn) = lngLoop

            lngPosn = lngPosn + 1

        Next

        lngLoop = 62

        'Note: move the ReDim statement outside this If...EndIf

        '       construct if different lengths will be requested

        '       in a single execution

        ReDim b(0 To parmLen - 1)

    End If

    

    For lngPosn = 0 To parmLen - 1

        b(lngPosn) = chars(Int(Rnd * lngLoop))

    Next

    

    RandString_Fast = StrConv(b, vbUnicode)

End Function

Open in new window

0
 
LVL 45

Expert Comment

by:aikimark
ID: 22736106
I get another performance boost (35% better than accepted solution) with:

    Dim lngPosn As Long
    Dim lngLoop As Long
0
 

Author Comment

by:linuxrox
ID: 22736240
AWESOME!!!  performance boosts rock!!! thanks once again!!
0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

Introduction While answering a recent question (http://www.experts-exchange.com/Q_27402310.html) in the VB classic zone, I wrote some VB code in the (Office) VBA environment, rather than fire up my older PC.  I didn't post completely correct code o…
You can of course define an array to hold data that is of a particular type like an array of Strings to hold customer names or an array of Doubles to hold customer sales, but what do you do if you want to coordinate that data? This article describes…
Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…
This lesson covers basic error handling code in Microsoft Excel using VBA. This is the first lesson in a 3-part series that uses code to loop through an Excel spreadsheet in VBA and then fix errors, taking advantage of error handling code. This l…

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now