Solved

Popularity algorithm

Posted on 2004-08-26
8
442 Views
Last Modified: 2008-03-03
Hi all,

I have created a small web page for my benefit that will list shortcuts to my favourite websites.  So far so good!  It has a database backend which stores various data including the date and time each time I access a site listed on the page.  The reason for this is that I want to alter the order of the links in each category so that my most popular links appear higher up the list.  Of course I could just count the number of times I have accessed each site but this has two flaws that I can see (perhaps more!):

1.  If I add a new site that I access a lot, it would still appear lower down than some more established, less-popular sites

2. A certain site may become less popular over time but remain nearer the top of the list despite me not accessing so often.

Obviously I could simply count the number of times the sites are accessed in a given period, e.g. over the last 4 weeks, but I was wondering if there is a more efficient algorithm?  I am sure a similar algorithm must exist as this could be used in a wide variety of circumstances, e.g. most played songs in a personal juke-box.

So if you know of any algorithm that solves this, and perhaps gives added weight to more recent 'accesses', and if there are any other issues I should consider, please let me know,

Cheers,

Nev
0
Comment
Question by:nevillestyles
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 3
  • 2
8 Comments
 
LVL 27

Expert Comment

by:d-glitch
ID: 11903317
You could increment the popularity score by one every time you visit a site.

And then let the score decay over time:  Once a day (or once  a week) multiply all the scores by 0.95 (or some other fraction).

That would give you a weighted average with a memory of 20 days (or 20 weeks).      0.95  =  1 - 1/20
0
 

Author Comment

by:nevillestyles
ID: 11903550
Thanks for that, it makes a lot of sense.  But can I run an example by you.  There are 2 sites initially, namely S1 and S2.  After one day their scores are as follows:

S1 - 20
S2 - 40

They are multiplied by 0.95 to get:
S1 - 19
S2 - 38

This continues but the scores are still cumulative so although this seems fair for sites that are added at the same time, it does not allow for newly added sites.  

This is certainly along the right lines but I think it needs a little modification.

Cheers,

Nev
0
 
LVL 27

Accepted Solution

by:
d-glitch earned 75 total points
ID: 11903796
You have to let things develop over a longer period of time.

Say the initial scores for sites (A,  B,  C)  on Day_1 are  (20, 10, 0)

You never visit A again.  You isit B once  a day.  You visit C twice a day.

The scores will evolve like this.  It takes 8 days for the algorithm to realize that A isn't your favorite any more.
You can make it faster or slower by changeing the .95 up or down.

Day      A      B      C

1      20      10      0
2      19.00      10.50      2.00
3      18.05      10.98      3.90
4      17.15      11.43      5.71
5      16.29      11.85      7.42
6      15.48      12.26      9.05
7      14.70      12.65      10.60
8      13.97      13.02      12.07
9      13.27      13.37      13.46
10      12.60      13.70      14.79
11      11.97      14.01      16.05
12      11.38      14.31      17.25
13      10.81      14.60      18.39
14      10.27      14.87      19.47
15      9.75      15.12      20.49
16      9.27      15.37      21.47
17      8.80      15.60      22.39
18      8.36      15.82      23.28
19      7.94      16.03      24.11
20      7.55      16.23      24.91
21      7.17      16.42      25.66
22      6.81      16.59      26.38
23      6.47      16.76      27.06
24      6.15      16.93      27.71
25      5.84      17.08      28.32
26      5.55      17.23      28.90
27      5.27      17.36      29.46
28      5.01      17.50      29.99
29      4.76      17.62      30.49
30      4.52      17.74      30.96
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:nevillestyles
ID: 11903917
Thanks for that, I was being lazy by not going further with my example!  That solution will work a treat!
0
 
LVL 13

Expert Comment

by:SteH
ID: 11904208
Already finished but got another solution you could consider:

Add not 1 for each visit but
exp (- (days since visit)).
The exponent will be ~1 for recently visited sites and going towards 0 for visits long ago.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 11904728
SteH's solution is pretty good.       For one thing it updates automatically.  So you don't have to do the daily decay adjustment.

                                                 It also saves information long term.  With my solution all the sites you stop visiting eventually
                                                 windup tied at zero.

                                                 Try simulating the algorithm in Excel.  Some combination of the two might work out best.


                                           
                                           

0
 

Author Comment

by:nevillestyles
ID: 11912096
Thanks to you both, I have now got a good system that seems to work a treat.  

Is it possible to give SteH some points for your contribution?
0
 
LVL 13

Expert Comment

by:SteH
ID: 11912158
You can post a new question "points for SteH" were you refer to this one. Only requirement of EE for this is that the points for this Q and the new Q added are 500 or less.
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Article by: Nicole
This is a research brief on the potential colonization of humans on Mars.
When we purchase storage, we typically are advertised storage of 500GB, 1TB, 2TB and so on. However, when you actually install it into your computer, your 500GB HDD will actually show up as 465GB. Why? It has to do with the way people and computers…
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…

697 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question