Hi all
I'll try to keep this as simple as I can. I am working on a piece of code that generates 4 random numbers (call them a, b, c and d), which always appear in this order. Numbers a through c are random numbers between 01 and 99, and d is between 0 and 9. The code builds one string by appending the 4 numbers as each is generated. For example, if the first string is 1947299, then a = 19, b = 47, c = 29 and d = 9.
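To make the construction concrete, here is a minimal Python sketch of it. The function name `make_string`, the constant name, and the zero-padding of a, b and c to two digits are assumptions for illustration (the "01 and 99" range suggests padding); the actual code may differ.

```python
import random

def make_string(rng=random):
    """Build one string from a, b, c (1-99, zero-padded) and d (0-9)."""
    a = rng.randint(1, 99)   # randint is inclusive at both ends
    b = rng.randint(1, 99)
    c = rng.randint(1, 99)
    d = rng.randint(0, 9)
    # Assumption: a, b and c are always written as two digits.
    return f"{a:02d}{b:02d}{c:02d}{d}"

# Number of distinct strings this scheme can produce: 99 * 99 * 99 * 10.
TOTAL_COMBINATIONS = 99 ** 3 * 10

print(make_string(), TOTAL_COMBINATIONS)
```

Under these assumptions there are a little under 10 000 000 possible strings.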
As each string is created it is entered into the database. The program then generates another string (constructed the same way as the first, again randomly generating each number in the sequence) and compares it to all of the strings already in the database.
Writing the program is not my problem. My problem is that I would have thought the chance of the same string (which, remember, is built from 4 randomly generated numbers) appearing twice would be fairly remote (say about 1 in 10 000 000, which would be great), because for the problem at hand I can't have any one sequence appearing more than once in a run of, say, 5000 strings.
When I sat down and ran some tests, I was blown away by how often a randomly created string (constructed as described above) appears more than once.
For example, I ran 500 test cases. Each test case reran a script that generated strings and kept going until it produced one that had already been generated. Of those 500 tests, 3 ended with fewer than 100 strings generated before a duplicate was created. A further 50 ended with fewer than 500 strings generated before a duplicate was created, and the next 165 ended with fewer than 1000 strings generated before a duplicate was created.
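The test procedure described here could be sketched roughly as follows in Python. This is only an illustration: a `set` stands in for the database, and the names `make_string`, `strings_until_duplicate` and `run_trials` are hypothetical, not the names used in the real script.

```python
import random

def make_string(rng):
    """One string: a, b, c in 1-99 (zero-padded, an assumption) and d in 0-9."""
    a, b, c = (rng.randint(1, 99) for _ in range(3))
    d = rng.randint(0, 9)
    return f"{a:02d}{b:02d}{c:02d}{d}"

def strings_until_duplicate(rng):
    """Generate strings until one repeats; return how many unique ones came first."""
    seen = set()  # stands in for the database table
    while True:
        s = make_string(rng)
        if s in seen:
            return len(seen)
        seen.add(s)

def run_trials(trials=500, seed=None):
    """Repeat the experiment and collect the count from each test case."""
    rng = random.Random(seed)
    return [strings_until_duplicate(rng) for _ in range(trials)]

if __name__ == "__main__":
    counts = run_trials(trials=500)
    for limit in (100, 500, 1000, 2500, 5000):
        below = sum(1 for c in counts if c < limit)
        print(f"fewer than {limit}: {below} of {len(counts)} test cases")
```

Passing a fixed `seed` makes a run repeatable, which helps when comparing results between random number generators.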
In short, NONE of these test cases managed to get above 5000 strings before finding a duplicate. In fact, only 25% of the test cases got above 2500 strings, with the others falling under 2500. One would think that if you had a die with 10 000 000 sides, rolled it 4 times and got the numbers a, b, c and d, the chances of getting this exact same sequence again anytime soon would be pretty remote.
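For comparison with the measurements, the chance of at least one repeat among n strings can be computed directly. The sketch below assumes every one of the 99 * 99 * 99 * 10 possible strings is equally likely on each draw, which is an idealisation of the generator rather than a statement about the actual code; the function name `prob_duplicate` is made up for illustration.

```python
TOTAL_COMBINATIONS = 99 ** 3 * 10  # possible strings under the stated ranges

def prob_duplicate(n, total=TOTAL_COMBINATIONS):
    """Probability that n uniformly random strings contain at least one repeat."""
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th string must avoid the i distinct strings already drawn.
        p_all_distinct *= (total - i) / total
    return 1.0 - p_all_distinct

if __name__ == "__main__":
    for n in (100, 500, 1000, 2500, 5000):
        print(f"n = {n}: P(duplicate) = {prob_duplicate(n):.4f}")
```

Note that this asks whether any two of the n strings match each other, which is a different question from whether one particular string turns up again.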
So does anyone have any idea how to write a piece of code that does the above with a minimum of 5000 strings created before a duplicate appears?
Hope you can help
ant