monoceros
asked on
Data Structure and approach for recommendations system
Hi,
I am working on a simple recommendations system. Conceptually, the bare-bones structure (using relational database terms because that's what I am familiar with) is:
fact_table:
score
recommender_id
item_id
I then want to, for a given recommender_id, rank all other recommender_ids by their similarity (e.g. if recommender_id 1 scored item_id 1 as 10 and item_id 100 as 0, then another recommender scoring item 1 as 9 and item 100 as 1 will rank higher than a recommender scoring 10 for item 1 and 8 for item 100).
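To make the ranking rule concrete, here is a minimal Java sketch. It assumes similarity is measured as the mean absolute difference over items both recommenders have scored (smaller means more similar); that measure is an illustrative choice, not something fixed by the question, and the class and method names (`SimilaritySketch`, `distance`, `rank`) are hypothetical. It needs Java 9 or later for `Map.of`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SimilaritySketch {

    // Mean absolute score difference over the items both recommenders scored.
    // Assumption: recommenders with no shared items are treated as infinitely far apart.
    static double distance(Map<Integer, Double> a, Map<Integer, Double> b) {
        double total = 0.0;
        int shared = 0;
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                total += Math.abs(e.getValue() - other);
                shared++;
            }
        }
        return shared == 0 ? Double.POSITIVE_INFINITY : total / shared;
    }

    // Returns all other recommender ids, most similar (smallest distance) first.
    // Assumes the target id is present in the scores map.
    static List<Integer> rank(int target, Map<Integer, Map<Integer, Double>> scores) {
        Map<Integer, Double> mine = scores.get(target);
        List<Integer> others = new ArrayList<>();
        for (Integer id : scores.keySet()) {
            if (id != target) {
                others.add(id);
            }
        }
        others.sort(Comparator.comparingDouble((Integer id) -> distance(mine, scores.get(id))));
        return others;
    }

    public static void main(String[] args) {
        // The example from the question: recommender 1 versus two candidates.
        Map<Integer, Map<Integer, Double>> scores = new HashMap<>();
        scores.put(1, Map.of(1, 10.0, 100, 0.0));
        scores.put(2, Map.of(1, 9.0, 100, 1.0));   // distance (1 + 1) / 2 = 1.0
        scores.put(3, Map.of(1, 10.0, 100, 8.0));  // distance (0 + 8) / 2 = 4.0
        System.out.println(rank(1, scores));       // prints [2, 3]
    }
}
```

With this measure, the recommender who scored 9 and 1 lands ahead of the one who scored 10 and 8, matching the ordering described above. Other measures (cosine similarity, Pearson correlation) would plug into `distance` the same way.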
I expect this to be quite a large sparse matrix, possibly tens of thousands on each axis.
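As a rough illustration of what "sparse" means for storage, the sketch below keeps only the (recommender_id, item_id) pairs that actually have a score, using a map of maps rather than a dense two-dimensional array. The class name `SparseScores` and its methods are hypothetical, and this is one possible in-memory layout under those assumptions, not a recommendation of a specific tool.

```java
import java.util.HashMap;
import java.util.Map;

class SparseScores {
    // recommender_id -> (item_id -> score); unscored pairs take no space.
    private final Map<Integer, Map<Integer, Double>> byRecommender = new HashMap<>();
    private int count = 0;

    // Stores one fact_table row; re-scoring the same pair overwrites the old score.
    void put(int recommenderId, int itemId, double score) {
        Double previous = byRecommender
                .computeIfAbsent(recommenderId, k -> new HashMap<>())
                .put(itemId, score);
        if (previous == null) {
            count++;
        }
    }

    // Returns null when the pair was never scored, which is distinct from a score of 0.
    Double get(int recommenderId, int itemId) {
        Map<Integer, Double> row = byRecommender.get(recommenderId);
        return row == null ? null : row.get(itemId);
    }

    // Number of stored scores, i.e. the number of non-empty matrix cells.
    int storedScores() {
        return count;
    }

    public static void main(String[] args) {
        SparseScores scores = new SparseScores();
        scores.put(1, 1, 10.0);
        scores.put(1, 100, 0.0);
        scores.put(2, 1, 9.0);
        System.out.println(scores.get(1, 100));     // prints 0.0
        System.out.println(scores.get(2, 100));     // prints null
        System.out.println(scores.storedScores());  // prints 3
    }
}
```

Memory use grows with the number of scores actually recorded, not with the product of the two axes, which is what keeps a matrix of tens of thousands by tens of thousands manageable when most cells are empty.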
I would like to get the approach right from the start and would like recommendations for how to best structure the data and which language / tools to use.
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Thanks - that's eloquent and looks straightforward (and happens to be in Java, which I do know)! For some reason I was blinkered by the view that "this is math, it's big(ish) data, therefore I need one of the fancier newer languages" (partly driven by a desire to learn them!), but this is clean and can get me to an MVP without any steep language learning curve.
Cheers,
Mono.