Alternatives to rand() or mt_rand() - interested in perspectives.

I have an app that I'm developing.  It's essentially a voting app - where user submissions are reviewed/voted upon/etc.

On the server side, when I originally began developing it, I figured the most democratic way of approaching submission selection would be through a randomized function like rand() or mt_rand(). In practice, however, both are turning out to be pretty crappy solutions, since the picks don't come out evenly spread. We currently have about 300 submissions, and as we're doing testing it seems like 40-50 of those entries are disproportionately "randomly" selected relative to all of the other ones.

So I'm thinking about approaching it a different way. I see two options, each of which has its potential pitfalls.

#1 - return to the app an array of all possible submissions - and have the app just iterate through them.
Pros of this: Every entry - provided the user keeps progressing - has a shot of being reviewed.
Cons of this: The array could potentially get quite large...which would mean at some point I'd have to segment it up (maybe only return a list of 100 possible candidates for review later on).
#2 - Instead of returning the list of possible candidates - handle the selection via sessions.
Pros of this: Every entry - provided the user keeps progressing - has a shot of being reviewed.
Cons of this: A huge number of open sessions possible.

What would you do in such a scenario?  I've read that large numbers of sessions being open simultaneously can cause memory problems...  How big of a concern ought that be?  Or - is there a different way you can picture to approach this issue?  :)

Thanks!
erzoolander Asked:
Ray Paseur Commented:
"The array could potentially get quite large..."
That would seem to be a good problem, indicative of high popularity, right?

Let me put together a little script to show some ways of thinking about randomization.  If you're getting too much predictability from the PHP rand() and mt_rand() functions you may want to consider using shuffle() instead.
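
As a rough illustration of the shuffle() idea combined with option #1 from the question, here is a minimal sketch that shuffles the list of submission IDs once and then hands it out in pages, so that every entry comes up exactly once before anything repeats. The function names build_deck() and deck_page(), the page size of 100, and the numeric IDs are assumptions made for the example, not anything from the app.

```php
<?php
// Hypothetical helper: shuffle the submission IDs once to form a "deck".
function build_deck(array $ids): array
{
    shuffle($ids);      // shuffle() reorders the array in place and reindexes it
    return $ids;
}

// Hypothetical helper: return one page of the deck (page numbers start at 0).
function deck_page(array $deck, int $page, int $size = 100): array
{
    // array_slice() returns an empty array once the offset runs past the end
    return array_slice($deck, $page * $size, $size);
}

$deck  = build_deck(range(1, 300));  // 300 made-up submission IDs
$first = deck_page($deck, 0);        // first 100 IDs of the shuffled order
$last  = deck_page($deck, 2);        // final 100 IDs
```

Only the current page has to be sent to the app at any one time, which keeps the response small even if the full list grows.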
Ray Paseur Commented:
Please see: http://iconoun.com/demo/temp_errzoolander.php

I don't see any great difference in the randomization from these three algorithms, and randomization may not be your best approach. If your goal is to distribute the attention of the voting public evenly across the user submissions, a database table of user submissions would be useful. The table could contain a count of how many times each submission has been exposed for voting. You can SELECT from this table and order the result set by the number of exposures. As each submission is presented for voting, increment its number of exposures by one. This will help ensure that no accidental extra attention is given to a subset of the submissions, since it will always be choosing the submissions that have been seen the least.
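
A minimal sketch of that selection rule, using a plain PHP array as a stand-in for the database table so it can run on its own. The function name next_submission() and the id => count layout are assumptions for the example; in a real app the same steps would presumably be a SELECT ordered by the exposure counter followed by an UPDATE.

```php
<?php
// $exposures maps submission id => number of times it has been shown.
// It is an in-memory stand-in for the table described above.
function next_submission(array &$exposures): ?int
{
    if (empty($exposures)) {
        return null;                        // nothing to show
    }
    $min = min($exposures);                 // lowest exposure count so far
    // Every id currently tied for the fewest exposures
    $candidates = array_keys($exposures, $min, true);
    // Break ties at random so the lowest id does not always go first
    $id = $candidates[array_rand($candidates)];
    $exposures[$id]++;                      // count this presentation
    return $id;
}

$exposures = array_fill(1, 5, 0);           // five made-up submissions, never shown
for ($i = 0; $i < 10; $i++) {
    next_submission($exposures);
}
// After 10 picks every one of the 5 submissions has been shown exactly twice.
```

Because the pick always comes from the least-seen group, the counts can never drift more than one apart, whatever the random tie-break does.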

Does that make sense for your needs?

<?php // demo/temp_errzoolander.php
/**
 * http://www.experts-exchange.com/questions/28706188/Alternatives-to-Rand-or-mt-rand-interested-in-perspectives.html
 *
 * Show some ways of observing randomization
 */
error_reporting(E_ALL);
echo '<pre>';

// A SHORT RANGE OF DATA ELEMENTS
function data_elements()
{
    $letters = range('A', 'Z');
    $numbers = array_fill(0, count($letters), 0);
    $samples = array_combine($letters, $numbers);
    return $samples;
}

echo PHP_EOL . PHP_EOL . 'Using rand()';
$ltr = range('A', 'Z');
$dat = data_elements();
$num = 1;
$max = 2600;
while ($num <= $max)
{
    $ptr = rand(0, 25);
    $dat[$ltr[$ptr]]++;
    $num++;
}
foreach($dat as $key => $val)
{
    echo PHP_EOL;
    echo "<b>$key</b> ";
    echo str_pad($val, 4, ' ', STR_PAD_LEFT);
    echo ' ';
    echo str_repeat('*', $val);
}

echo PHP_EOL . PHP_EOL . 'Using mt_rand()';
$ltr = range('A', 'Z');
$dat = data_elements();
$num = 1;
$max = 2600;
while ($num <= $max)
{
    $ptr = mt_rand(0, 25);
    $dat[$ltr[$ptr]]++;
    $num++;
}
foreach($dat as $key => $val)
{
    echo PHP_EOL;
    echo "<b>$key</b> ";
    echo str_pad($val, 4, ' ', STR_PAD_LEFT);
    echo ' ';
    echo str_repeat('*', $val);
}

echo PHP_EOL . PHP_EOL . 'Using shuffle()';
$ltr = range('A', 'Z');
$dat = data_elements();
$num = 1;
$max = 2600;
while ($num <= $max)
{
    $xyz = $ltr;
    shuffle($xyz);
    $ptr = array_pop($xyz);
    $dat[$ptr]++;
    $num++;
}
foreach($dat as $key => $val)
{
    echo PHP_EOL;
    echo "<b>$key</b> ";
    echo str_pad($val, 4, ' ', STR_PAD_LEFT);
    echo ' ';
    echo str_repeat('*', $val);
}

echo '</pre>';
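
When reading the three histograms, it may help to know how much unevenness a perfectly fair picker would produce anyway. The sketch below is an addition, not part of the demo script: it treats each letter's count as a binomial variable and computes the expected count and standard deviation. The function name expected_spread() is made up for the example.

```php
<?php
// Expected count and standard deviation for a single bucket when $draws
// uniform picks are spread over $buckets equally likely choices.
function expected_spread(int $draws, int $buckets): array
{
    $p    = 1 / $buckets;                    // chance of hitting one bucket
    $mean = $draws * $p;                     // binomial mean
    $sd   = sqrt($draws * $p * (1 - $p));    // binomial standard deviation
    return ['mean' => $mean, 'sd' => $sd];
}

$spread = expected_spread(2600, 26);         // same numbers as the demo script
printf("mean %.1f, sd %.1f\n", $spread['mean'], $spread['sd']);
// prints "mean 100.0, sd 9.8"
```

So with 2600 draws over 26 letters, bars anywhere from roughly 80 to 120 stars are ordinary for a fair generator, and some letters will look favored in any single run.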

erzoolander (Author) Commented:
Actually that's a really good idea (counting the number of times each element has been presented, and then just using the select order by that number to pick). I got so wrapped up in the predictability of rand()/etc. that this didn't even occur to me.

Gracias - you rock ;)

E
Ray Paseur Commented:
Thanks for the points and thanks for using -E -- it's a great question! ~Ray
erzoolander (Author) Commented:
As an addendum in the event anyone else uses this type of solution - it occurred to me that new entries need to be "caught up" on their view count, or else they will get extraordinary precedence/priority. Think of a case where everyone else has received 100 impressions and the new one sits at zero.

If there's only one person using the app they'll see that same entry 100 times.

What I've done is give each new entry an initial view count equal to the highest one so far. That way they're caught up.
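
A minimal sketch of that catch-up step, again with a plain array standing in for the table. The name add_submission() and the id => count layout are assumptions for the example.

```php
<?php
// $exposures maps submission id => number of times it has been shown.
function add_submission(array &$exposures, int $id): void
{
    // Start the newcomer at the current highest count (0 if the list is empty)
    // so it does not monopolize the least-seen slot.
    $exposures[$id] = empty($exposures) ? 0 : max($exposures);
}

$exposures = [1 => 100, 2 => 98, 3 => 100];  // made-up counts
add_submission($exposures, 4);
// $exposures[4] is now 100
```

Starting at the maximum puts the new entry at the back of the line; starting at the current minimum instead would show it sooner without letting it jump far ahead. Which is fairer probably depends on the app.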
Ray Paseur Commented:
"If there's only one person using the app they'll see that same entry 100 times."
A common design for something like this is a junction table that allows clients to see all the resources, but when the client has already voted, the entry is marked as "read" or similar.  Most email uses this sort of design.  This allows the client to ask, "What's new?" and get a good answer without having to look at the same entry many times.
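
A minimal sketch of the junction-table idea, with nested arrays standing in for the tables. The names next_unseen() and mark_seen(), and the user id => [submission id => true] layout, are assumptions for the example rather than a known schema.

```php
<?php
// $exposures: submission id => times shown.
// $seen: stand-in for the junction table, user id => [submission id => true].
function next_unseen(array $exposures, array $seen, int $userId): ?int
{
    $already = $seen[$userId] ?? [];
    // Drop every submission this user has already voted on
    $unseen = array_diff_key($exposures, $already);
    if (empty($unseen)) {
        return null;            // nothing new for this user
    }
    asort($unseen);             // fewest exposures first, keys preserved
    reset($unseen);
    return key($unseen);        // id of the least-seen unseen submission
}

// Record that a user has voted on a submission.
function mark_seen(array &$seen, int $userId, int $submissionId): void
{
    $seen[$userId][$submissionId] = true;
}

$exposures = [1 => 5, 2 => 3, 3 => 7];      // made-up counts
$seen      = [];
$pick      = next_unseen($exposures, $seen, 42);   // 2, the least-seen entry
mark_seen($seen, 42, $pick);
$pick      = next_unseen($exposures, $seen, 42);   // 1, since 2 is now "read"
```

With this in place a single user never sees the same entry twice, regardless of how the exposure counts compare.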