The present system
The present user rating system, visible under every post as a number, was created to solve a set of problems:
- How do users distinguish the people that should be listened to from the people that shouldn't?
- How do we identify users who are contributing to the site and community?
- How do we identify users who are detracting from the site and community?
These problems were all solvable, but they required a lot of time investment and effort. We wanted to shift away from solutions that relied on users and moderators spending lots of time watching site activity. The solution was to recruit the entire userbase to help solve the problem, by giving everybody a means to indicate who should and shouldn't be listened to. That, in turn, needed some kind of balancing to determine which people were good judges, which is why higher-rated users have a larger effect on the ratings of others than lower-rated users.
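To make the "higher-rated users count for more" idea concrete, here's a minimal sketch. The actual formula isn't given in this post, so `vote_weight`, `apply_vote` and the `baseline` of 1000 are all hypothetical; the only thing taken from the text is that a rater's influence grows with their own rating.

```python
def vote_weight(rater_rating, baseline=1000.0):
    """Hypothetical weighting: influence scales linearly with the rater's own rating."""
    return max(rater_rating, 0.0) / baseline


def apply_vote(target_rating, rater_rating, direction):
    """Return the target's new rating; direction is +1 (rate up) or -1 (rate down)."""
    return target_rating + direction * vote_weight(rater_rating)


# A 2000-rated user moves the target twice as far as a 1000-rated user would.
up = apply_vote(1000.0, rater_rating=2000.0, direction=+1)
down = apply_vote(1000.0, rater_rating=500.0, direction=-1)
```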
It's true that in general, the rating system has worked. The top-rated users are, pretty much uniformly, good contributors to the site. The lowest-rated users are generally incoherent, in(s)ane, and unwanted - though I think that exceptions exist. And users do pay some attention to the ratings of those they read, though only around 1% of registered user accounts actually filter out posts with ratings below a given threshold.
We do definitely see some undesirable behaviour. For example:
- People getting upset about their rating dropping a few points and posting threads about it. This wouldn't happen if people were less sensitive, of course, but we have to face the fact that they are this sensitive. It doesn't help that there's not much one can tell those people except "be nicer."
- Bandwagoning - people voting somebody down partly because they've got a low rating, and That's What This Thread Is All About Anyway. Group dynamics can be bizarre at times.
- Great technical contributors ending up with low(er) ratings because they got a bit ranty in the Lounge, and then being ignored in technical discussions.
- Similarly, people who are really funny in Lounge threads get high ratings, and then when posting in technical threads perhaps get given more authority and credit than they're due.
- People who get low ratings can have trouble recovering that rating, partly because people aren't inclined to vote low-rated users up, and if the filters are in play then their posts won't even get seen. This usually leads to the low-rated poster either creating a new account (which is a policy violation) or just leaving the site altogether. Sometimes they'll stay and just not care about their rating, but whether or not they care doesn't change the fact that we then have a user who is making positive contributions but has a low rating.
At the heart of the current rating system's design rest a few fundamental assumptions. Firstly, it assumes that if a user is good in any one way recognised by the community, then they're good in all ways - or at least are smart enough to disclaim themselves in areas where they're not good. Secondly, it assumes that users will fully consider a user and the contributions they've made to the site as a whole before rationally rating them. Thirdly, it assumes that users have good ideas about how to respond to changes in their rating - that they don't just keep doing exactly what they've been doing (albeit with an added air of bafflement and indignation) expecting a different result.
The system also encourages a bad philosophical assumption on the part of the user: that something is right because a particular person said so. Smart users won't read the ratings in this way; but some users will, when given two answers to their question, pick the answer from the higher-rated user because the user is higher-rated rather than because the answer is better.
None of these assumptions are good. They're true enough of the time that we can point to some corroborating accounts and say, "look, the system works!" but that doesn't tell us whether the system works as well as it could do.
I'm the highest-rated user on this site, so it's not something I consider lightly [grin] but in V5 I'm planning to replace the present rating system with an approach that is less susceptible - albeit not totally immune - to the above problems.
The V5 Rating Strategy
Tagging
The first problem I set out to solve was this: How do we make the rating better convey the ways or areas in which a person is good?
The solution to this one seemed fairly obvious. A mechanism by which users can express their support of a person in arbitrary, user-defined categories? Sounds like a job for tagging to me! By letting users tag users as another kind of site content, we go from having a single rating axis, to as many axes as you want - be they subject-area tags like 'Python' or 'object oriented,' or style tags like 'funny' or 'friendly.' Reconciling the different ways users tag content is already something the tagging engine has to do.
Immediately this also defeats the assumption that 'good in one area == good in all areas.' It becomes very easy to identify when a user is participating in something that matches their tags - i.e. when they're talking about what they're good at.
Thanks
How do we defeat the second assumption - that users will think long and hard before selecting tags for a user? In reality, people don't do that - they read one post, have a strong reaction to it, and then rate accordingly; they don't go "well, this post is obnoxious, but maybe the guy's just having a bad day. I'll check out his other stuff to be sure." If we embrace the strong-reaction-to-a-single-post idea instead of denying it, what we get is: Let people express that reaction with a single click, and then aggregate those reactions to get a feeling for where the user is most well received.
The way this'll be implemented will be via a 'thanks' button on every content item that a user can contribute to. It lets you express that strong reaction quickly. Then, over time, the posts that a user is 'thanked' for will start to contribute their tags to the user - if the user receives lots of 'thanks' in threads that are tagged 'Python performance pygame' for example, then they'll start to acquire those tags themselves. This also gives users more feedback on what they're doing right.
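As a rough sketch of how 'thanks' could turn into user tags: every time a post is thanked, the tags of the thread it sits in get counted towards the author. The function name `tags_from_thanks` and the input shape (one list of thread tags per 'thanks' received) are assumptions for illustration, not the V5 design.

```python
from collections import Counter


def tags_from_thanks(thanked_thread_tags):
    """Count the tags of every thread in which a user's post was thanked.

    thanked_thread_tags: one list of thread tags per 'thanks' received.
    """
    profile = Counter()
    for tags in thanked_thread_tags:
        profile.update(set(tags))  # each tag counts once per 'thanks'
    return profile


thanks = [
    ["python", "performance", "pygame"],
    ["python", "pygame"],
    ["python", "networking"],
]
profile = tags_from_thanks(thanks)
print(profile.most_common(2))  # [('python', 3), ('pygame', 2)]
```

A real version would presumably need a threshold before a tag shows up on the user at all, so that a single 'thanks' in an oddly tagged thread doesn't stick.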
Will there be a 'No thanks!' button? I'm not sure, but I think probably not. If you don't like a contribution, just don't thank the author. If it's really necessary, you can still tag the author explicitly, or even report the post to a moderator.
Decay
How do we deal with the fact that a user's expertise will change over time? Maybe they were a game programming guru 10 years ago, but they've not kept up and their advice is out of date now. This is a fairly simple one, actually: have tags 'decay' over time. Tags that are still frequently applied to a user will 'refresh' and will decay more slowly than tags that aren't. This also solves the 'idiot' problem - how to handle people tagging each other as 'idiot' - because if the user stops being an idiot, the tag will fade away; and it mitigates the lack of a 'no thanks' button, because posting without receiving thanks will cause your tags to fade away.
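One simple way to model this is exponential decay with a refresh on each new application. This is a sketch under assumptions: the 180-day half-life, the `UserTag` class and its method names are made up for illustration. Note that the half-life here is fixed; a frequently applied tag lasts longer only because its weight keeps getting topped up, which is one possible reading of 'decays more slowly'.

```python
from dataclasses import dataclass

HALF_LIFE_DAYS = 180.0  # assumed; no decay rate is specified


@dataclass
class UserTag:
    weight: float = 0.0
    last_update_day: float = 0.0

    def current_weight(self, today):
        """Weight after exponential decay since the last update."""
        elapsed = today - self.last_update_day
        return self.weight * 0.5 ** (elapsed / HALF_LIFE_DAYS)

    def apply(self, today, amount=1.0):
        """Refresh the tag: decay it to today, then add the new application."""
        self.weight = self.current_weight(today) + amount
        self.last_update_day = today


guru = UserTag()
guru.apply(today=0)
idle = guru.current_weight(today=360)       # two half-lives, no refresh: 0.25
guru.apply(today=180)                       # tag applied again halfway through
refreshed = guru.current_weight(today=360)  # (0.5 + 1.0) * 0.5 = 0.75
```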
Getting input
How do we get people to actually use this stuff? That's one of the bigger problems with tagging in general. Step one is to make things as easy to use as possible - single-click to 'thanks' a post, two clicks to get to adding more complex tags. Step two is to get users to at least tag their own stuff; users will be encouraged to 'self-assess' by tagging themselves, to tag their own threads and entries, and so on. Step three is to incentivize. Now, there's a limited amount we can do here - we're not about to start paying people to tag content. What you saw in my last post, though, was the 'badges' system in userboxes; what we can quite easily do is grant a badge to people who tagged 100 content items in the past month, or something like that.
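The badge check itself is cheap to express. A minimal sketch, assuming tagging activity is available as `(content_id, timestamp)` pairs and treating 'the past month' as 30 days; `earns_tagging_badge` and both constants are hypothetical names.

```python
from datetime import datetime, timedelta

BADGE_THRESHOLD = 100               # "100 content items", from the example above
BADGE_WINDOW = timedelta(days=30)   # "the past month", approximated as 30 days


def earns_tagging_badge(tag_events, now):
    """tag_events: iterable of (content_id, timestamp) pairs for one user."""
    cutoff = now - BADGE_WINDOW
    # Count distinct items, so tagging the same thread 100 times doesn't qualify.
    recent_items = {cid for cid, ts in tag_events if ts >= cutoff}
    return len(recent_items) >= BADGE_THRESHOLD


now = datetime(2009, 1, 31)
busy = [(i, now - timedelta(days=1)) for i in range(100)]
lapsed = [(i, now - timedelta(days=60)) for i in range(100)]
spammy = [(1, now - timedelta(days=1))] * 100
```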
Using the output
Lastly, how do we help users find the best possible content, instead of wasting their time with incoherent in(s)anity - without encouraging them to trust an answer just because it's from a highly rated user? This is a balancing act to be sure, because most of the time the best content is produced by the high-rated users.
The first trick here is to make the way that ratings are displayed subtle; no more four-digit numbers on each post. Instead, we're considering things like changing the background colour of the post, or the thickness of the post border, to indicate when a user is strongly aligned with (tagged the same way as) a thread. Making the display subtle in this way will still make the post stand out a little in the thread, without providing such a clear and definitive thing that people can get overexcited about.
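'Alignment' needs some kind of measure behind it. One candidate, sketched below, is cosine similarity between the user's tag weights and the thread's tag set, mapped onto a small range of border widths. This is an assumption rather than a settled design: `alignment`, `border_px` and the 1-4 pixel range are illustrative, and tag weights are assumed to be non-negative.

```python
import math


def alignment(user_tags, thread_tags):
    """Cosine similarity between a user's tag weights and a thread's tag set.

    user_tags: dict of tag -> non-negative weight; thread_tags: iterable of tags.
    Returns a value between 0 and 1.
    """
    thread = set(thread_tags)
    if not user_tags or not thread:
        return 0.0
    dot = sum(w for tag, w in user_tags.items() if tag in thread)
    user_norm = math.sqrt(sum(w * w for w in user_tags.values()))
    if user_norm == 0:
        return 0.0
    return dot / (user_norm * math.sqrt(len(thread)))


def border_px(score, max_px=4):
    """Map an alignment score onto a border thickness from 1 to max_px pixels."""
    return 1 + round(score * (max_px - 1))


on_topic = alignment({"python": 1.0, "pygame": 1.0}, ["python", "pygame"])
off_topic = alignment({"funny": 5.0}, ["python"])
```

Because the output is a handful of border widths rather than a number, two users with slightly different scores will often look identical, which is rather the point.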
What we will probably display clearly on a post is the number of times it's been thanked (perhaps only within the past X weeks). This makes the number that people latch onto be about individual posts, rather than about users, and that's a lot safer - posts are easier to talk about without people taking things personally.
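The 'perhaps only within the past X weeks' part could be as simple as the sketch below. `recent_thanks` and its `weeks` parameter are hypothetical; passing `weeks=None` gives the all-time count.

```python
from datetime import datetime, timedelta


def recent_thanks(thanks_timestamps, now, weeks=None):
    """Count thanks on a post; if weeks is given, only those inside that window."""
    if weeks is None:
        return len(thanks_timestamps)
    cutoff = now - timedelta(weeks=weeks)
    return sum(1 for ts in thanks_timestamps if ts >= cutoff)


now = datetime(2009, 3, 1)
stamps = [now - timedelta(days=d) for d in (1, 10, 40, 400)]
all_time = recent_thanks(stamps, now)
last_four_weeks = recent_thanks(stamps, now, weeks=4)
```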
The second trick is to use the information on a broader level to bias search results. When you're searching for content on a particular topic, the search can elevate threads that have good alignment, or that have lots of 'thanked' posts in them. This is still sort of acting on the idea that content will be right 'because a smart person said it,' but by elevating it to the per-thread level instead of the per-post level, lower-rated users will still have a good opportunity to point out when the higher-rated user isn't making sense.
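Here's a sketch of what 'biasing' might look like: the text-search relevance stays the main signal, and alignment and thanked posts only apply a bounded multiplier on top. Everything here is assumed for illustration, including the `Thread` fields, the weights, and the cap on how many thanked posts count.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    title: str
    relevance: float     # base score from the text search, assumed between 0 and 1
    alignment: float     # how well contributors' tags match the thread, 0 to 1
    thanked_posts: int   # number of posts in the thread that received thanks


def boosted_score(thread, alignment_weight=0.3, thanks_weight=0.05, thanks_cap=10):
    """Scale base relevance by a bounded boost from alignment and thanks."""
    thanks = min(thread.thanked_posts, thanks_cap)
    boost = 1.0 + alignment_weight * thread.alignment + thanks_weight * thanks
    return thread.relevance * boost


def rank(threads):
    """Order threads from highest to lowest boosted score."""
    return sorted(threads, key=boosted_score, reverse=True)


plain = Thread("plain but relevant", relevance=0.8, alignment=0.0, thanked_posts=0)
aligned = Thread("well aligned", relevance=0.7, alignment=1.0, thanked_posts=4)
popular = Thread("popular but off-topic", relevance=0.1, alignment=1.0, thanked_posts=100)
results = [t.title for t in rank([plain, aligned, popular])]
```

Using a multiplier with a cap means a hugely thanked thread that barely matches the query still lands near the bottom, while a well-aligned thread can overtake a slightly more relevant one.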
You'll notice I've not talked about 5-star ratings at all so far. We're still deciding exactly how they'll be integrated. The advantage that 5-star ratings offer is that they are coarse. Tagging a thread with particular tokens might capture what the thread is about, but maybe you just want to convey some overall impression that the thread is awesome (or terrible), without figuring out exactly which tags would express that. Star ratings might also be more applicable to, say, gallery entries. They've got their fair share of problems, of course, as comments on my previous post about the rating UI pointed out. We'll have to do some more thinking about them.
Conclusion
The new system doesn't quite solve the problems that the original rating system set out to solve. Instead, it focuses on the deeper problems of how to get the best content into your hands as quickly as possible and how to describe users; they're harder problems, naturally, but I think more worthwhile.
So, what do you think? I expect that quite a lot of people might have strong feelings about this topic [smile]
I love the idea of tags for individuals. I don't know about the aged tags. Yes, there is a point where someone's skills are not as up to date as others, but there are some skills which don't change that much. There are some new techniques, but for the most part C itself hasn't changed much. Just the libraries and tools wrapped around it. I guess it would just matter how fast those tags decay.
I like the flexibility and crowd-sourcing of this system. Yeah, it probably will get a fair amount of chatter as people assign random tags (sux0rsHard!) assuming you allow arbitrary tags. I would consider either moderating tags themselves or allowing people to ignore tags under a certain usage (only used by X different people or something like that).