Edited by Shaquil, 07 August 2015 - 06:47 AM.
Posted 17 May 2013 - 02:57 AM
Omae Wa Mou Shindeiru
Posted 19 May 2013 - 03:37 AM
I think you should largely abandon the focus on categories and genres. If users are rating things, then go for some variant of the Netflix model, which looks for correlations between ratings for different products, and uses those correlations to drive recommendations (e.g. people who like X and dislike Y usually like Z; this user likes X and dislikes Y; therefore this user will probably like Z). In short, in your case, users should be rating books, not categories of book.
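To make the "people who like X and dislike Y usually like Z" idea concrete, here is a minimal sketch of an item-to-item correlation recommender. It is not Netflix's actual algorithm, just one simple variant of the idea; the `ratings` data, the 1-5 scale, and the `pearson` and `predict` helpers are all made up for illustration.

```python
from math import sqrt

# Hypothetical ratings: user -> {book: rating on a 1-5 scale}
ratings = {
    "ann":  {"X": 5, "Y": 1, "Z": 5},
    "bob":  {"X": 4, "Y": 2, "Z": 4},
    "cara": {"X": 1, "Y": 5, "Z": 2},
    "dave": {"X": 2, "Y": 4, "Z": 1},
    "eve":  {"X": 5, "Y": 1},  # has not rated Z yet
}

def pearson(a, b):
    """Pearson correlation between books a and b over users who rated both."""
    common = [u for u in ratings if a in ratings[u] and b in ratings[u]]
    if len(common) < 2:
        return 0.0  # not enough shared data to say anything
    xs = [ratings[u][a] for u in common]
    ys = [ratings[u][b] for u in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def predict(user, book):
    """Guess a rating: the user's mean plus correlation-weighted deviations."""
    mine = ratings[user]
    mean = sum(mine.values()) / len(mine)
    num = den = 0.0
    for other, r in mine.items():
        w = pearson(book, other)
        num += w * (r - mean)
        den += abs(w)
    return mean if den == 0 else mean + num / den

# eve likes X and dislikes Y, so the sketch predicts she will rate Z highly.
print(predict("eve", "Z"))
```

Note that nothing here mentions genres: Z is recommended purely because its ratings track X's and run opposite to Y's.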
Shaquil, on 17 May 2013 - 07:06, said:
LorenzoGatti, on 17 May 2013 - 02:04, said:
A recommendation system shouldn't be based on categories; work at the level of individual SKUs. For example, Amazon is very explicit about what it proposes: "customers who bought/browsed this also bought/browsed ...". Recommend the items which are correlated in the real world, not in your abstract and arbitrary category system.
Relax. Amazon's recommendation engine sucks particularly because it only considers what you and other people bought, but has no idea why you bought them. There are many, many factors that contribute to bulk/related purchases that could have nothing to do with an individual user's tastes. The only correlation that matters is in the customer's mind; the real world doesn't matter. Working at the level of individual SKUs has got to be the craziest idea I've ever heard. You do realize that all 15 different Anniversary Editions of Atlas Shrugged are the same book but different SKUs?
Anyway, I'm looking for advice/a point in the right direction on designing the system in general. I came up with one method that I'm going to go with for now, though I don't like it:
When users sign up for the service, they indicate things about themselves: favorite genres of music, movies, and books, their job, their favorite celebrities, etc. Using that info, points are attributed to the appropriate categories of books. When the user loads the main page, they'll see 10 recommendations. Every recommendation is calculated individually as a random draw: the system adds up all the points, and each category's chance of being chosen is equal to its share of the total points. When a category wins, the system chooses a book from that category based on other rules that are irrelevant for now. On the next recommendation, the category that won the previous one loses a certain number of points (its original points stay the same in the database, but the working value used in the system is decreased), and all losing categories gain the same number of points. The random drawing is then done again. The point is that categories with 0 points will never get the first recommendation, and will rarely get any of the next 4 or 5, but might possibly grab the 10th spot at the end. A category with 0 points should have a chance to win because 0 doesn't mean "dislike," it just means "has expressed no interest." Negative points mean dislike.
I recognize that the more categories there are, the more drastically the system will alter the percentages for each subsequent "drawing," but that's why I'm looking for input/other ideas, because it's far from ideal.
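The drawing scheme described above could be sketched roughly as follows. The function name `draw_categories`, the `step` size, and the sample categories are hypothetical. The sketch also assumes that categories with negative stored points ("dislike") are never eligible, which the description leaves open.

```python
import random

def draw_categories(points, n_draws=10, step=1.0, rng=None):
    """Pick n_draws categories, re-weighting after every draw."""
    rng = rng or random.Random()
    # Assumption: disliked (negative) categories are excluded entirely.
    # The stored points are copied so the database values stay untouched.
    working = {c: float(p) for c, p in points.items() if p >= 0}
    names = list(working)
    picks = []
    for _ in range(n_draws):
        # A category's chance equals its share of the total working points;
        # working values that dip below zero are treated as zero.
        weights = [max(working[c], 0.0) for c in names]
        if sum(weights) == 0:
            break  # nothing left to draw from
        winner = rng.choices(names, weights=weights, k=1)[0]
        picks.append(winner)
        # The winner loses `step` points; every loser gains `step` points.
        for c in names:
            working[c] += -step if c == winner else step
    return picks

stored = {"scifi": 10, "history": 5, "poetry": 0, "horror": -3}
print(draw_categories(stored, rng=random.Random(0)))
```

With N eligible categories, each draw adds (N - 2) * step to the total, so the more categories there are, the faster the original proportions get flattened. Scaling `step` down as N grows is one possible way to soften that.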
Posted 19 May 2013 - 03:53 PM
I'm going to say that a genre/category system is fine. Sure, it will have some disadvantages relative to a direct product correlation system, but the fact is that correlations are only useful for objects you have some data on. Genres help you extrapolate that data to new products or products that haven't sold any copies yet, while restricting it to pairs of products a user has rated will limit the useful data to the most popular products - often the exact opposite of what you want.
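As a rough illustration of that extrapolation, the sketch below scores a book nobody has rated yet by falling back on the user's average rating for its genre. The book titles, the genre labels, the helper names, and the neutral `default` of 3.0 are all assumptions made for the example.

```python
def genre_means(user_ratings, book_genre):
    """Average one user's ratings per genre."""
    totals = {}
    for book, rating in user_ratings.items():
        genre = book_genre[book]
        total, count = totals.get(genre, (0.0, 0))
        totals[genre] = (total + rating, count + 1)
    return {g: total / count for g, (total, count) in totals.items()}

def score_new_book(book, user_ratings, book_genre, default=3.0):
    """Score an unrated book from the user's taste for its genre.

    Falls back to `default` when the user has rated nothing in that genre.
    """
    means = genre_means(user_ratings, book_genre)
    return means.get(book_genre[book], default)

# Hypothetical catalogue and one user's ratings (1-5 scale).
book_genre = {
    "Dune": "scifi",
    "Neuromancer": "scifi",
    "Emma": "romance",
    "Brand New Space Opera": "scifi",   # no ratings from anyone yet
    "Brand New Western": "western",     # genre this user has never rated
}
user_ratings = {"Dune": 5, "Neuromancer": 4, "Emma": 2}

print(score_new_book("Brand New Space Opera", user_ratings, book_genre))
print(score_new_book("Brand New Western", user_ratings, book_genre))
```

A real system would probably blend this genre prior with item correlations once a book has collected enough ratings, rather than choosing one or the other.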