What is the best representation of the distribution of data


This is a very basic question but I totally forgot high school stuff :/

What is the best way to represent a set of random data? (random numbers)

Average? (Although this is basic, I am willing to do more calculation if that gives a better fit.)

d-glitch Commented:
Which month is which?  It matters.

The sales of fruit are typically seasonal, with lots of interest and activity around Fall Harvest time.

The best way to predict next year's sales is to use this year's sales directly, month by month.

Unless you have other information (there was a drought or a forest fire, or an applesauce factory started operation down the street), there is nothing else to do.
There is no single "best" way.

The answer will depend on what you are trying to represent and what you want to do with it.
Shanan212Author Commented:
OK, the figures are for one year.

Say the figures are
Apples     Oranges
12         9
2          66
232        5258
23         555
53         68

I am trying to use the figures as a forecaster for next year.

So I multiply a price figure by the average number of apples to get revenue (but the average does not seem like the best option, IMO).

Standard statistics would be mean (average) and standard deviation (spread around the average).

You can do these calculations (as well as generate random numbers to test) in Excel.
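As a rough illustration of those two statistics outside Excel, here is a minimal Python sketch using the five apple and orange figures posted earlier in the thread. The `summarize` helper is a name made up for this example, and the median is included only as an extra point of comparison, since a single large month pulls the mean up a lot.

```python
import statistics

# Monthly unit counts as posted earlier in the thread.
apples = [12, 2, 232, 23, 53]
oranges = [9, 66, 5258, 555, 68]

def summarize(values):
    """Return the mean, median, and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample (n - 1) standard deviation
    }

apple_summary = summarize(apples)
orange_summary = summarize(oranges)
print("apples: ", apple_summary)
print("oranges:", orange_summary)
```

For these figures the standard deviation comes out larger than the mean, which is one way of seeing that the average alone says little about any individual month.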
What do those numbers mean?

Numbers of apples, cost of apples, ...?

What does the series of numbers mean?  Is it a function of time?   In weeks, or months, ... ?
Shanan212Author Commented:
Average        Standard Dev
2.702068966    1.409153394
9.055578427    1.381641658
12.96546873    1.605291381
17.35004248    1.305662328
22.85169104    1.052316898
36.86032306    7.809687104
71.86956522    17.55706286
127.856833     11.12366758

They are a function of months (monthly sales, in number of units sold). Should I use the average to forecast next year's data? Or is there a better figure I can derive? I am concerned because the numbers are so far apart.

The above are the actual figures.
"if possible to get the best fit"
Best fit to what?
"What is the best way to represent a set of random data? (random numbers)"
If you are dealing with random numbers, there is no way to predict the next numbers.
Shanan212Author Commented:
When I say best fit, I mean the single figure that best represents a month's sales of apples.
If you have several years' worth of data, you could average it month by month and maybe get a better prediction.

If the sales are changing over the course of years because of dietary trends or a growing population, you might want to do month by month linear regression.

But you really need to have real numbers.
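To make the two options above concrete, here is a small Python sketch for a single calendar month. The sales figures are HYPOTHETICAL placeholders, not data from this thread, and `fit_line` is a helper written for this example (a plain least-squares line); you would repeat the same steps for each month with your real numbers.

```python
from statistics import mean

# HYPOTHETICAL placeholder data: units sold in the same calendar month
# (say, October) over four consecutive years. Replace with real figures.
years = [1, 2, 3, 4]
october_sales = [100, 110, 120, 130]

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Option 1: assume no trend and predict the multi-year average for the month.
average_forecast = mean(october_sales)

# Option 2: assume a steady trend and extrapolate the fitted line to year 5.
slope, intercept = fit_line(years, october_sales)
trend_forecast = slope * 5 + intercept

print("average forecast:", average_forecast)
print("trend forecast:  ", trend_forecast)
```

With only a handful of years per month, the fitted slope is very sensitive to one unusual year, so the trend forecast is best treated as a rough guide.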