• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 707
  • Last Modified:

Best method/formula/statistic to compare two sets of numbers.

I have two items, itemA and itemB, and experts who each choose which item is the best. However, I am more interested in quality than quantity. For each expert I have a past prediction success rate: the percentage of the time that expert was "correct", from 0 to 100%.

I want to find out the best method/formula/statistic to compare the two items.

Examples (items, expert success percentage)

ItemA: 0.8, 0.75, 0.4 - 3 votes, average 0.65
ItemB: 0.9, 0.77, 0.58, 0.5, 0.8, 0.9, 0.4, 0.5, 0.3 - 9 votes, average ~0.63

Because itemB had more votes, its average was pulled down, even though it had the most successful experts voting for it.

The solution should not require me to define "success", for example by setting a cutoff of 0.7 and then averaging or counting only experts above 0.7.

This example is simple; in reality the number of experts varies a lot more, for example 120 versus 2000 experts for the two items.
Asked by: surfsideinternet
1 Solution
 
aburr commented:
If you want to give more weight to "successful" experts, you will have to define "success".
Having defined success, you can then weight the more successful responses.
In general, the more votes, the more confidence.
 
surfsideinternet (author) commented:
In this case, more votes does not mean more confidence. Assume the people voting have to vote for one item or the other, and that one item is simply better known than the other. More votes would then suggest popularity or recognition rather than the "best".

Is there any way to weight by the success percentage without penalizing an item for having more voters who are not that successful? I guess I would define success as having a higher success percentage: of two voters, one with a success rate of 75% is better than one with 65%. But they could just as easily be 35% and 25%, so there is no way to define success as X%. More important is the difference in the *number of votes*.
 
aburr commented:
I find it difficult to know exactly what you know and what you want to do,
but there is a way to weight the votes according to the percent success rate (which you already know).
-
Multiply each vote by the percent success rate, i.e. if the rate is 0.67, multiply that vote by 67 so that that particular expert is given 67 votes. Be sure to divide the vote total by the whole number of votes cast, not just the number of people voting.
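
One way to read the suggestion above, sketched in Python: every expert's success rate is averaged, but each rate is weighted by itself, so a 0.9 expert counts more than a 0.3 expert. The function name `weighted_score` and the reading of "vote" as "the expert's success rate" are assumptions made for illustration, not something stated in the thread.

```python
def weighted_score(rates):
    """Average of the success rates, each weighted by the rate itself.

    Giving an expert with rate r a block of 100*r votes, each worth r,
    and dividing by the total number of votes cast is the same as
    sum(r*r) / sum(r): the factor of 100 cancels out.
    """
    total_weight = sum(rates)
    if total_weight == 0:
        raise ValueError("need at least one expert with a non-zero rate")
    return sum(r * r for r in rates) / total_weight


item_a = [0.8, 0.75, 0.4]
item_b = [0.9, 0.77, 0.58, 0.5, 0.8, 0.9, 0.4, 0.5, 0.3]

print(round(weighted_score(item_a), 3))
print(round(weighted_score(item_b), 3))
```

With the example data both items land very close to 0.70, so the low-rated voters on itemB no longer drag its score down the way they do with a plain average.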

 
rkursem commented:
You could also calculate the standard deviation and use that as a measure of consistency. A low standard deviation means the expert provides equally good answers.

To address your problem, you might want to consider removing outliers, e.g. the most extreme 5% at both ends (best and worst).
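
A minimal sketch of both ideas, using Python's standard `statistics` module. It assumes the standard deviation and the trimming are applied to the list of success rates behind one item; the helper name `trim_extremes` and the 20% fraction in the example are hypothetical choices for illustration (with only 3 or 9 votes, a 5% trim removes nothing).

```python
import statistics


def trim_extremes(rates, fraction=0.05):
    """Drop the lowest and highest `fraction` of the values (rounded down)."""
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    ordered = sorted(rates)
    k = int(len(ordered) * fraction)  # number of values cut from each end
    return ordered[k:len(ordered) - k]


item_a = [0.8, 0.75, 0.4]
item_b = [0.9, 0.77, 0.58, 0.5, 0.8, 0.9, 0.4, 0.5, 0.3]

# Sample standard deviation: spread of the rates around their mean.
print(round(statistics.stdev(item_a), 3))

# Mean of itemB after cutting one value from each end (20% of 9, rounded down).
print(round(statistics.mean(trim_extremes(item_b, 0.2)), 3))
```

Note that the standard deviation measures spread, not quality, so on its own it does not rank the items; it only says how much the voters behind an item differ from each other.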
 
NerdsOfTech (Technology Scientist) commented:
Let's try a formula:
(number of votes at or above the item's average / count) = score
A score of 0.50 or more is a PASS; otherwise FAIL.

ItemA: 0.8, 0.75, 0.4 - 3 votes, average 0.65
(2 above / 3) = 0.67 PASS

ItemB: 0.9, 0.77, 0.58, 0.5, 0.8, 0.9, 0.4, 0.5, 0.3 - 9 votes, average ~0.63
(4 above / 9) = 0.44 FAIL

ItemC: 0.9, 0.8, 0.7, 0.6, 0.5 - average 0.7
(3 above / 5) = 0.60 PASS

ItemD: 0.3 - average 0.3
(1 above / 1) = 1.00 PASS

ItemE: 0.9, 0.8 - average 0.85
(1 above / 2) = 0.50 PASS

ItemF: 0.7, 0.3, 0.1 - average 0.37
(1 above / 3) = 0.33 FAIL
 
NerdsOfTech (Technology Scientist) commented:
In my scenario you could apply a weighting factor to the "above" side,
for example a linear x1.2. Keep in mind that with the weight the score can exceed 1.00.

ItemA: 0.8, 0.75, 0.4 - 3 votes, average 0.65
(2 above * 1.2 / 3) =
2.4 / 3 = 0.80 PASS

ItemB: 0.9, 0.77, 0.58, 0.5, 0.8, 0.9, 0.4, 0.5, 0.3 - 9 votes, average ~0.63
(4 above * 1.2 / 9) =
4.8 / 9 = 0.53 PASS
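
The scoring rule from these two posts can be sketched as follows. The function name `above_average_score`, the `factor` parameter, and the small tolerance used when comparing a rate with the average are assumptions added for illustration; `factor=1.0` gives the unweighted version.

```python
def above_average_score(rates, factor=1.0):
    """Fraction of votes at or above the item's own average, times `factor`."""
    if not rates:
        raise ValueError("need at least one vote")
    average = sum(rates) / len(rates)
    # Tolerance so a rate equal to the average is not lost to float rounding.
    above = sum(1 for r in rates if r >= average - 1e-9)
    return above * factor / len(rates)


item_a = [0.8, 0.75, 0.4]
item_b = [0.9, 0.77, 0.58, 0.5, 0.8, 0.9, 0.4, 0.5, 0.3]

for name, rates in (("ItemA", item_a), ("ItemB", item_b)):
    plain = above_average_score(rates)
    boosted = above_average_score(rates, factor=1.2)
    print(name, round(plain, 2), round(boosted, 2))
```

Because the count is still divided by the total number of votes, an item with many low-rated voters keeps a lower score than one with few voters, whatever factor is used.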
 
surfsideinternet (author) commented:
aburr: yes, weighting the votes gives itemA .698 and itemB .697, which makes them nearly identical in this case.

rkursem: it has been a while since I used std dev. itemA: avg .65, dev .217; itemB: avg .63, dev .211. I thought dev was the range from the mean, but then shouldn't itemA's dev be .25 (.65 - .4 = .25)? I am not sure how std dev would help me choose or rank the items.

NerdsOfTech: itemB is still getting "punished" for the low-rated experts voting for it, both in the (/ count) portion and in determining the average to compare with.
 
NerdsOfTech (Technology Scientist) commented:
I agree with aburr's approach. It weights each vote in proportion to the expert's success rate, which I think is what you are after.

 
NerdsOfTech (Technology Scientist) commented:
This allows immediate reward vs. immediate punishment in the result.
 
NerdsOfTech (Technology Scientist) commented:
formula:
( 
 (a_1*(a_1*100))+(a_2*(a_2*100))+...(a_n*(a_n*100))
)
 
/ 

(
 (a_1+a_2+a_n)*100
)

 
NerdsOfTech (Technology Scientist) commented:
:)
( 
 (a_1*(a_1*100))+(a_2*(a_2*100))+...(a_n*(a_n*100))
)
 
/ 

(
 (a_1+a_2+...a_n)*100
)
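
Written out in standard notation (a restatement of the formula above, with $a_i$ standing for the success rate of the $i$-th expert voting for the item), the factor of 100 cancels and the score reduces to a sum of squares over a plain sum:

```latex
\text{score}
  = \frac{\sum_{i=1}^{n} a_i \,(100\, a_i)}{100 \sum_{i=1}^{n} a_i}
  = \frac{\sum_{i=1}^{n} a_i^{2}}{\sum_{i=1}^{n} a_i}
```

For ItemA this gives $(0.64 + 0.5625 + 0.16) / 1.95 = 1.3625 / 1.95 \approx 0.70$.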

 
surfsideinternet (author) commented:
The answer didn't fully give me what I was looking for, but it was the closest.