?
Solved

Popularity algorithm

Posted on 2004-08-26
8
Medium Priority
?
504 Views
Last Modified: 2008-03-03
Hi all,

I have created a small web page for my benefit that will list shortcuts to my favourite websites.  So far so good!  It has a database backend which stores various data including the date and time each time I access a site listed on the page.  The reason for this is that I want to alter the order of the links in each category so that my most popular links appear higher up the list.  Of course I could just count the number of times I have accessed each site but this has two flaws that I can see (perhaps more!):

1.  If I add a new site that I access a lot, it would still appear lower down than some more established, less-popular sites

2. A certain site may become less popular over time but remain nearer the top of the list despite me not accessing so often.

Obviously I could simply count the number of times the sites are accessed in a given period, e.g. over the last 4 weeks, but I was wondering if there is a more efficient algorithm?  I am sure a similar algorithm must exist as this could be used in a wide variety of circumstances, e.g. most played songs in a personal juke-box.

So if you know of any algorithm that solves this, and perhaps gives added weight to more recent 'accesses', and if there are any other issues I should consider, please let me know,

Cheers,

Nev
0
Comment
Question by:nevillestyles
  • 3
  • 3
  • 2
8 Comments
 
LVL 27

Expert Comment

by:d-glitch
ID: 11903317
You could increment the popularity score by one every time you visit a site.

And then let the score decay over time:  Once a day (or once  a week) multiply all the scores by 0.95 (or some other fraction).

That would give you a weighted average with a memory of 20 days (or 20 weeks).      0.95  =  1 - 1/20
0
 

Author Comment

by:nevillestyles
ID: 11903550
Thanks for that, it makes a lot of sense.  But can I run an example by you.  There are 2 sites initially, namely S1 and S2.  After one day their scores are as follows:

S1 - 20
S2 - 40

They are multiplied by 0.95 to get:
S1 - 19
S2 - 38

This continues but the scores are still cumulative so although this seems fair for sites that are added at the same time, it does not allow for newly added sites.  

This is certainly along the right lines but I think it needs a little modification.

Cheers,

Nev
0
 
LVL 27

Accepted Solution

by:
d-glitch earned 300 total points
ID: 11903796
You have to let things develop over a longer period of time.

Say the initial scores for sites (A,  B,  C)  on Day_1 are  (20, 10, 0)

You never visit A again.  You isit B once  a day.  You visit C twice a day.

The scores will evolve like this.  It takes 8 days for the algorithm to realize that A isn't your favorite any more.
You can make it faster or slower by changeing the .95 up or down.

Day      A      B      C

1      20      10      0
2      19.00      10.50      2.00
3      18.05      10.98      3.90
4      17.15      11.43      5.71
5      16.29      11.85      7.42
6      15.48      12.26      9.05
7      14.70      12.65      10.60
8      13.97      13.02      12.07
9      13.27      13.37      13.46
10      12.60      13.70      14.79
11      11.97      14.01      16.05
12      11.38      14.31      17.25
13      10.81      14.60      18.39
14      10.27      14.87      19.47
15      9.75      15.12      20.49
16      9.27      15.37      21.47
17      8.80      15.60      22.39
18      8.36      15.82      23.28
19      7.94      16.03      24.11
20      7.55      16.23      24.91
21      7.17      16.42      25.66
22      6.81      16.59      26.38
23      6.47      16.76      27.06
24      6.15      16.93      27.71
25      5.84      17.08      28.32
26      5.55      17.23      28.90
27      5.27      17.36      29.46
28      5.01      17.50      29.99
29      4.76      17.62      30.49
30      4.52      17.74      30.96
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 

Author Comment

by:nevillestyles
ID: 11903917
Thanks for that, I was being lazy by not going further with my example!  That solution will work a treat!
0
 
LVL 13

Expert Comment

by:SteH
ID: 11904208
Already finished but got another solution you could consider:

Add not 1 for each visit but
exp (- (days since visit)).
The exponent will be ~1 for recently visited sites and going towards 0 for visits long ago.
0
 
LVL 27

Expert Comment

by:d-glitch
ID: 11904728
SteH's solution is pretty good.       For one thing it updates automatically.  So you don't have to do the daily decay adjustment.

                                                 It also saves information long term.  With my solution all the sites you stop visiting eventually
                                                 windup tied at zero.

                                                 Try simulating the algorithm in Excel.  Some combination of the two might work out best.


                                           
                                           

0
 

Author Comment

by:nevillestyles
ID: 11912096
Thanks to you both, I have now got a good system that seems to work a treat.  

Is it possible to give SteH some points for your contribution?
0
 
LVL 13

Expert Comment

by:SteH
ID: 11912158
You can post a new question "points for SteH" were you refer to this one. Only requirement of EE for this is that the points for this Q and the new Q added are 500 or less.
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Foreword (May 2015) This web page has appeared at Google.  It's definitely worth considering! https://www.google.com/about/careers/students/guide-to-technical-development.html How to Know You are Making a Difference at EE In August, 2013, one …
This article covers the basics of data encryption, what it is, how it works, and why it's important. If you've ever wondered what goes on when you "encrypt" data, you can look here to build a good foundation for your personal learning.
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…
I've attached the XLSM Excel spreadsheet I used in the video and also text files containing the macros used below. https://filedb.experts-exchange.com/incoming/2017/03_w12/1151775/Permutations.txt https://filedb.experts-exchange.com/incoming/201…
Suggested Courses

809 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question