wazdaka
asked on
Getting a value from Standard Dev, and Mean
Hi all,
I have the Mean and Standard Deviation of a variable. This variable is cost, and it will vary for each entity. Using the mean and SD, can someone tell me how I would generate a random number from a lognormal distribution?
For example, if the SD is 1 and the mean is 50, I would expect the results to fall (mainly) between 49 and 51. (I understand that this isn't always true by the very nature of the curve.)
I am making the tool in Java, which has a method nextGaussian() that returns a random number from a normal distribution between 0 and 1.
Many thanks,
H
ASKER CERTIFIED SOLUTION
nextGaussian() should give you a normal curve centered on zero with SD = 1.
Multiplying by [Your SD] spreads (or shrinks) the width of the curve.
Adding [Your Mean] shifts the curve up to where you want it centered.
You may have to check for negative values: if your mean is 5 and your SD is 1, nextGaussian() will return values < -5 (that's negative 5) roughly once in every few million calls, and the shifted result is then negative.
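The scale-and-shift described above can be sketched in Java roughly as follows. The class and method names are made up for illustration, and redrawing on a negative result is just one possible guard, assuming the mean sits well above zero so that redraws are rare. Note that this samples a normal distribution, as the answer describes.

```java
import java.util.Random;

// Hypothetical helper class; the names are illustrative, not from the thread.
final class NormalCostSampler {

    // Scale the standard normal draw by sd, then shift it by mean.
    static double sample(Random rng, double mean, double sd) {
        return mean + sd * rng.nextGaussian();
    }

    // Assumed guard: redraw until the value is non-negative.
    // Only sensible when the mean is well above zero.
    static double sampleNonNegative(Random rng, double mean, double sd) {
        double value;
        do {
            value = sample(rng, mean, sd);
        } while (value < 0.0);
        return value;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L); // fixed seed so runs are repeatable
        for (int i = 0; i < 5; i++) {
            System.out.println(sampleNonNegative(rng, 50.0, 1.0));
        }
    }
}
```

With a mean of 50 and an SD of 1, about two thirds of the printed values should land between 49 and 51.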
ASKER
Perfect, thank you very much. I should always be shifting the curve up enough not to worry about negatives, but I'll take the absolute value of the result just in case.
Thanks again
H
A lower-level approach, more work but more flexible, is to generate a uniformly distributed random number and then project it onto the Gaussian distribution to find the corresponding Gaussian coordinate. This isn't too hard to do:
Generate a uniformly distributed number u where 0 <= u < 1
Set the number of standard deviations you want to cover in your Gaussian: Nsd = 40
Set a left bound for the range of your Gaussian distribution: gl = m - sd*Nsd
Set a right bound for the range of your Gaussian distribution: gr = m + sd*Nsd
Set the number of intervals to divide the Gaussian distribution into: N = 200
Set the initial coordinate to g = gl
Set the interval to dg = (gr - gl)/N
Write a subroutine G(g) that returns the height of the Gaussian (its probability density) at the coordinate value g, given the mean and standard deviation
Now start at the left bound and begin to sum the value of the Gaussian curve at coordinate g = gl. Increment g by dg and accumulate G(g) into the sum. So far we have sum = G(gl) + G(gl + dg) ...
Continue this summation until sum*dg is greater than or equal to u (your uniformly distributed random number). The last value of g used is the Gaussian coordinate sought.
The idea here is that integrating the Gaussian distribution up to g gives the probability of a coordinate less than or equal to g occurring. We simply sum here as our approximate integration.
This technique can be applied to any distribution you need, not just the Gaussian.
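A minimal Java sketch of the summation steps above might look like this. The class and method names are hypothetical, and the grid used in main (6 standard deviations, 2000 intervals) is my own choice for finer steps, not the Nsd and N values quoted in the steps.

```java
import java.util.Random;

// Hypothetical class name; a sketch of the summation approach, not a tuned implementation.
final class GaussianBySummation {

    // G(g): height of the normal probability density at coordinate g.
    static double density(double g, double mean, double sd) {
        double z = (g - mean) / sd;
        return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    // Walk right from the left bound, accumulating G(g), until sum*dg >= u.
    static double sample(double u, double mean, double sd, double nSd, int n) {
        double gl = mean - sd * nSd;   // left bound
        double gr = mean + sd * nSd;   // right bound
        double dg = (gr - gl) / n;     // interval width
        double sum = 0.0;
        for (int i = 0; i <= n; i++) {
            double g = gl + i * dg;
            sum += density(g, mean, sd);
            if (sum * dg >= u) {
                return g;
            }
        }
        return gr; // reached only if rounding keeps sum*dg just below u
    }

    public static void main(String[] args) {
        Random rng = new Random(7L); // fixed seed so runs are repeatable
        for (int i = 0; i < 5; i++) {
            double u = rng.nextDouble(); // uniform in [0, 1)
            System.out.println(sample(u, 50.0, 1.0, 6.0, 2000));
        }
    }
}
```

The result is only as fine as dg, so a coarse grid produces visibly quantised values. Swapping density for another probability density gives samples from that distribution instead.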
ASKER