I'm generating a dataset, and as I do so I calculate its standard deviation. I want to know how to determine when my dataset hits the Normal Distribution mark, that is, where about 68% of the data lies within one standard deviation of the mean, 95% within two, and 99.7% within three.
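To make that check concrete, here is a minimal sketch of how the fractions could be measured on a sample. The function name `fraction_within` is my own, and I'm assuming the population standard deviation (`statistics.pstdev`) is the one being tracked.

```python
import statistics

def fraction_within(data, k):
    """Return the fraction of values lying within k standard deviations of the mean."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)  # assumption: population standard deviation
    if sigma == 0:
        # Every value equals the mean, so all of them are trivially "within".
        return 1.0
    return sum(abs(x - mu) <= k * sigma for x in data) / len(data)
```

Calling `fraction_within(values, 1)`, `fraction_within(values, 2)` and `fraction_within(values, 3)` gives the three numbers to compare against 0.68, 0.95 and 0.997.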
Wikipedia defines the Normal distribution as having a variance of 1. I don't see the logic in this statement. My dataset starts with some number of integers all equal to the mean. It then loops through a function, slowly and randomly expanding the values and recalculating the standard deviation. This goes on until the standard deviation reaches some pre-defined limit.
An example would be having 100 integers. Say the mean is 100, so at the start all the values are set to 100. We then loop with these steps:
1) Randomly select some of the integers for movement outward from the mean, both positive and negative (the values also have a range of 100, so they cannot move beyond 0 or 200).
2) Recalculate the current standard deviation. If the deviation exceeds my limit, the loop breaks.
Graphing the results I get nice distributions, but they definitely change based on the standard deviation threshold. What I want to know is how to calculate where my standard deviation threshold should be in order to get the "Normal Distribution" as specified above.
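For reference, the loop described above can be sketched roughly as follows. This is only an illustration, not my exact code: the name `expand_until`, the per-step selection probability `move_prob`, the step size of 1, and the `max_iters` safety cap are all assumptions made for the example.

```python
import random
import statistics

def expand_until(threshold, n=100, mean=100, lo=0, hi=200,
                 move_prob=0.1, step=1, max_iters=100_000, seed=None):
    """Spread n integers away from `mean` until their std dev exceeds `threshold`."""
    rng = random.Random(seed)
    values = [mean] * n  # every value starts at the mean
    for _ in range(max_iters):  # assumed cap so an unreachable threshold cannot loop forever
        # Step 1: randomly pick some values and push them outward from the mean.
        for i, v in enumerate(values):
            if rng.random() >= move_prob:
                continue  # this value was not selected on this pass
            if v == mean:
                direction = rng.choice((-1, 1))  # at the mean: pick a side at random
            else:
                direction = 1 if v > mean else -1  # otherwise keep moving away
            # Clamp so a value never leaves the allowed range [lo, hi].
            values[i] = min(hi, max(lo, v + direction * step))
        # Step 2: recalculate the standard deviation and stop once it exceeds the limit.
        if statistics.pstdev(values) > threshold:
            break
    return values
```

For example, `expand_until(10, seed=1)` returns 100 integers between 0 and 200 whose standard deviation has just passed 10.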