Popularity algorithm

Posted on 2004-08-26
Medium Priority
Last Modified: 2008-03-03
Hi all,

I have created a small web page for my benefit that will list shortcuts to my favourite websites.  So far so good!  It has a database backend which stores various data including the date and time each time I access a site listed on the page.  The reason for this is that I want to alter the order of the links in each category so that my most popular links appear higher up the list.  Of course I could just count the number of times I have accessed each site but this has two flaws that I can see (perhaps more!):

1.  If I add a new site that I access a lot, it would still appear lower down than some more established, less-popular sites

2. A certain site may become less popular over time but remain nearer the top of the list despite me not accessing so often.

Obviously I could simply count the number of times the sites are accessed in a given period, e.g. over the last 4 weeks, but I was wondering if there is a more efficient algorithm?  I am sure a similar algorithm must exist as this could be used in a wide variety of circumstances, e.g. most played songs in a personal juke-box.

So if you know of any algorithm that solves this, and perhaps gives added weight to more recent 'accesses', and if there are any other issues I should consider, please let me know,


Question by:nevillestyles
  • 3
  • 3
  • 2
LVL 27

Expert Comment

ID: 11903317
You could increment the popularity score by one every time you visit a site.

And then let the score decay over time:  Once a day (or once  a week) multiply all the scores by 0.95 (or some other fraction).

That would give you a weighted average with a memory of 20 days (or 20 weeks).      0.95  =  1 - 1/20

Author Comment

ID: 11903550
Thanks for that, it makes a lot of sense.  But can I run an example by you.  There are 2 sites initially, namely S1 and S2.  After one day their scores are as follows:

S1 - 20
S2 - 40

They are multiplied by 0.95 to get:
S1 - 19
S2 - 38

This continues but the scores are still cumulative so although this seems fair for sites that are added at the same time, it does not allow for newly added sites.  

This is certainly along the right lines but I think it needs a little modification.


LVL 27

Accepted Solution

d-glitch earned 300 total points
ID: 11903796
You have to let things develop over a longer period of time.

Say the initial scores for sites (A,  B,  C)  on Day_1 are  (20, 10, 0)

You never visit A again.  You isit B once  a day.  You visit C twice a day.

The scores will evolve like this.  It takes 8 days for the algorithm to realize that A isn't your favorite any more.
You can make it faster or slower by changeing the .95 up or down.

Day      A      B      C

1      20      10      0
2      19.00      10.50      2.00
3      18.05      10.98      3.90
4      17.15      11.43      5.71
5      16.29      11.85      7.42
6      15.48      12.26      9.05
7      14.70      12.65      10.60
8      13.97      13.02      12.07
9      13.27      13.37      13.46
10      12.60      13.70      14.79
11      11.97      14.01      16.05
12      11.38      14.31      17.25
13      10.81      14.60      18.39
14      10.27      14.87      19.47
15      9.75      15.12      20.49
16      9.27      15.37      21.47
17      8.80      15.60      22.39
18      8.36      15.82      23.28
19      7.94      16.03      24.11
20      7.55      16.23      24.91
21      7.17      16.42      25.66
22      6.81      16.59      26.38
23      6.47      16.76      27.06
24      6.15      16.93      27.71
25      5.84      17.08      28.32
26      5.55      17.23      28.90
27      5.27      17.36      29.46
28      5.01      17.50      29.99
29      4.76      17.62      30.49
30      4.52      17.74      30.96
Build your data science skills into a career

Are you ready to take your data science career to the next step, or break into data science? With Springboard’s Data Science Career Track, you’ll master data science topics, have personalized career guidance, weekly calls with a data science expert, and a job guarantee.


Author Comment

ID: 11903917
Thanks for that, I was being lazy by not going further with my example!  That solution will work a treat!
LVL 13

Expert Comment

ID: 11904208
Already finished but got another solution you could consider:

Add not 1 for each visit but
exp (- (days since visit)).
The exponent will be ~1 for recently visited sites and going towards 0 for visits long ago.
LVL 27

Expert Comment

ID: 11904728
SteH's solution is pretty good.       For one thing it updates automatically.  So you don't have to do the daily decay adjustment.

                                                 It also saves information long term.  With my solution all the sites you stop visiting eventually
                                                 windup tied at zero.

                                                 Try simulating the algorithm in Excel.  Some combination of the two might work out best.



Author Comment

ID: 11912096
Thanks to you both, I have now got a good system that seems to work a treat.  

Is it possible to give SteH some points for your contribution?
LVL 13

Expert Comment

ID: 11912158
You can post a new question "points for SteH" were you refer to this one. Only requirement of EE for this is that the points for this Q and the new Q added are 500 or less.

Featured Post

A proven path to a career in data science

At Springboard, we know how to get you a job in data science. With Springboard’s Data Science Career Track, you’ll master data science  with a curriculum built by industry experts. You’ll work on real projects, and get 1-on-1 mentorship from a data scientist.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

This article covers the basics of data encryption, what it is, how it works, and why it's important. If you've ever wondered what goes on when you "encrypt" data, you can look here to build a good foundation for your personal learning.
There's never been a better time to become a computer scientist. Employment growth in the field is expected to reach 22% overall by 2020, and if you want to get in on the action, it’s a good idea to think about at least minoring in computer science …
This is a video describing the growing solar energy use in Utah. This is a topic that greatly interests me and so I decided to produce a video about it.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…
Suggested Courses

600 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question