Most people don’t realize that Wikipedia is as good as it is partly because it has a legion of machine learning bots snuffing out bad content 24/7.
I wonder if the level of online discourse on Less Wrong, Reddit, or even mainstream places like NYTimes, ESPN, or YouTube could be greatly improved by more adaptive, semi-automated filtering in general.
For instance, on Less Wrong, a few different bots/algorithms could pre-score the karma a post is predicted to get. The scores would be aggregated, and anything below a certain threshold would be held back (hidden from everyone except the poster) until moderators manually review it. Before releasing a post, mods would vote on its likely karma score, and it would go live only if most mods vote positive. Mods would then have their own karma on the line: karma would be added to or removed from their score based on how large the differential between the bots’ predictions and their judgements turned out to be. And if you see a way to earn more karma with some automated rule, add your own bot to the system and collect the karma.
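A minimal sketch of that pipeline, assuming a simple mean for aggregating bot scores and a majority vote among mods; all function names, thresholds, and the differential-based reward rule are illustrative assumptions, not an actual Less Wrong API:

```python
# Hypothetical sketch of the proposed moderation pipeline.
from statistics import mean

RELEASE_THRESHOLD = 0  # posts predicted below this are held for review


def aggregate_bot_scores(bot_predictions):
    """Combine per-bot karma predictions into one score (simple mean)."""
    return mean(bot_predictions)


def triage(bot_predictions, threshold=RELEASE_THRESHOLD):
    """Release a post immediately, or hold it for moderator review."""
    score = aggregate_bot_scores(bot_predictions)
    return "release" if score >= threshold else "hold_for_review"


def review(mod_votes):
    """Mods vote on likely karma; release only if most vote positive."""
    positive = sum(1 for v in mod_votes if v > 0)
    return "release" if positive > len(mod_votes) / 2 else "reject"


def mod_karma_delta(mod_vote, actual_karma, bot_score):
    """Put mod karma on the line: a mod gains karma when their karma
    estimate is closer to the realized karma than the bots' was, and
    loses karma when the bots beat them."""
    return abs(bot_score - actual_karma) - abs(mod_vote - actual_karma)
```

For example, a post the bots score at a mean of -2 would be held, shown to mods, and released only on a majority-positive vote; once its real karma is known, each mod's score shifts by how much they out-predicted (or under-predicted) the bots.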
12 Responses to “This machine kills trolls”
February 28
Michael Witbrock: Louie Helm, as I understand it, the NYT already does automated curation to some extent. Evan Sandhaus, can you elaborate at all?
February 28
Matthew Graves: My roommate and I were just talking about a predictive karma bot for LW.
My suspicion is that it’s primarily good at filtering trolls, but not very good at telling the difference between low positive karma posts and high positive karma posts, and the kinds of high positive karma posts it *can* identify are not the ones that you want to encourage. (“This looks like an HPMOR joke; +30!”)
February 28
Alexei Andreev: Yeah, there is an underlying assumption that often gets violated: that upvotes correlate with the thing we want to promote in the online community.
February 28
Jonathan Weissman: Maybe separate karma, which tracks insightfulness/accuracy/usefulness, from a flagging system for content that is uncivil or off topic and should be removed. Stack Exchange sites use a system like that.
February 28
Louie Helm: This also doesn’t remove concern trolls, since they get fed karma for their “loyal opposition”.
February 28
Tarn Somervell Fletcher: I think that karma for main posts is much more predictive of value than karma for comments. I also think that a predictive karma bot would be worse than not having one. Its main utility would be against outright trolls, which LW doesn’t seem to have a huge problem with.
February 28
Nick Pisarro: I’ve played with this sort of thing quite a few times. It’s nearly impossible to label good content, and much easier to label really bad content (the latter is what the bots in the article are doing). Good content often has countless indistinguishable counterparts sitting at 1 vote, with a few that won out due to unpredictable network effects, so you end up predicting close to 1 vote for everything. What makes content interesting is often relevance to some background knowledge. Popularity is black-swanny.
February 28
Abram Demski: The goal is to estimate the quality of posts, and the available information includes the post text and who upvoted it. In response to Nick Pisarro’s observation, I suggest that part of the problem is confusing quality prediction with upvote prediction. Ideally, posts with little voting activity would be treated as unknown. (Information on how many visits a post has gotten could help here.) To avoid aggregating dumbness (such as upvoting HPMOR jokes), Louie Helm’s suggestion of using mods as a gold standard seems appropriate. An alternative would be to offer each user a custom view, using their own upvotes as their personal gold standard. This would be more scalable (suited to Facebook-size sites rather than LW-size sites), since it allows many niche groups to be accommodated.
February 28
Abram Demski: (I am not claiming any of this really solves the problem of “easy to find bad, hard to find good”, especially when the quality of the best material rests on nuanced argument rather than easy-to-spot word inclusion.)
February 28
Nick Pisarro: Actually, now that I think of it, I once wrote a machine-learning-based bot to find quality Less Wrong-ish content and post it to Reddit. I could share it with anyone who’s interested.
February 28
Abram Demski: Interested.
March 5
Nick Pisarro: Uploaded to GitHub: https://github.com/grandinquisitor/autobrigenbot