Tuesday, December 16, 2008

Users lie

When we had a tech support operation, our former product dictator trained the tech support people with the maxim: “users lie.” I thought it was a bit extreme, but then this manager liked extreme rhetoric for effect.

The underlying truth is that when handling tech support calls, the information from users is not 100% reliable. (As someone now on the other side again, I can certainly see that.) Users may leave something out or tell the story slightly inaccurately. Or they might deliberately leave out relevant information:

Q: When did your cell phone stop working?
A: I dunno, it just won’t turn on anymore.
instead of
Q: When did your cell phone stop working?
A: When I dropped it on the concrete sidewalk.
(I would have used the example of dropping in water, but cellphone companies now test for that.)

Michael Mace highlights another, curious example of deliberate misrepresentation — political retaliation over Proposition 8 (the initiative repealing the June court decision legalizing gay marriage). In this case, users can lie when generating user-generated content.

The why is pretty simple: Opponents of Prop 8 want to retaliate against any individual or business that supported Proposition 8. The example he uses is a Mexican restaurant where the manager gave $100 to Prop 8 and then gay marriage advocates retaliated on Yelp. (Of course, this might extend to other social or political controversies, such as a non-union grocery store.)

It seems to me there are at least three cases:
  1. Customers who complain about a business but then give the numerical rating that the store otherwise deserved. While irritating, this does provide additional information that might be important to some customers. The risk, of course, is if the sample of activist-complainers is biased: pro gay marriage types provide information but pro life (anti abortion) types do not.
  2. Customers who complain about the business policies and then give a dishonest rating of the quality of the firm’s goods or services in a deliberate attempt to lower the numeric scores. Mike shows a lot of this going on in this case.
  3. Customers who completely lie; the hyperbole of this example that Mike quotes is emblematic:
     “I’ve been here a few times, and this is without a doubt the worst place I know of in California. Do not go here unless you don’t mind bad food, high prices, a horrible vibe, and can turn your back on risks of food poisoning and human rights and health code violations!”
The more you try to stamp out #1 and #2 (which are recognizable), the more you get #3 (which is harder, if not impossible, to recognize).

Mike writes not about the implications for restaurants, but for the reliability of user rating sites. The problem is, such sites were never all that reliable in the first place. For example, I went searching for a hotel for a possible (now unlikely) Hawai‘i trip, and found a few places where the only rating was an anonymous 5-star with no comments; presumably these are posted by the owners or their friends. In fact, hotel sites in general seem to have unreliable ratings.

Yelp could go for weighting the contributions of users (useful/not useful) as does Amazon, but then this could become an ideological war of fellow travelers vs. enemies. Look at the reviews of any politically or socially controversial book on Amazon: any book (or comment) that takes a stand will become a battleground for ideologies rather than the merits of the book.
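
To make the useful/not-useful weighting concrete, here is a minimal sketch of one way it could work. This is my own illustration, not how Yelp or Amazon actually compute anything: the `Review` fields, the `helpfulness_weight` smoothing and the `prior_votes` value are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Review:
    stars: int       # 1-5 rating given by the reviewer
    useful: int      # "useful" votes from other users
    not_useful: int  # "not useful" votes from other users


def helpfulness_weight(review, prior_votes=2):
    """Smoothed share of 'useful' votes (hypothetical scheme).

    A review with no votes gets weight 0.5; prior_votes is an assumed
    smoothing constant, so the weight is always strictly positive.
    """
    total = review.useful + review.not_useful
    return (review.useful + prior_votes / 2) / (total + prior_votes)


def weighted_rating(reviews):
    """Average star rating, weighting each review by its helpfulness."""
    if not reviews:
        return None  # nothing to average
    weights = [helpfulness_weight(r) for r in reviews]
    return sum(w * r.stars for w, r in zip(weights, reviews)) / sum(weights)
```

With one 5-star review voted useful eight times and one 1-star review voted not useful eight times, the plain average is 3.0 but the weighted score is 4.6. The same arithmetic cuts both ways: whichever camp organizes more votes controls the score, which is exactly the battleground problem.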

Some e-commerce sites will only allow you to rate what you’ve bought (and they can verify that). Although not applicable to independent ratings sites, that does reduce abuse. However, the reduction in quantity means little or no feedback on thinly-traded goods and services.

In the end, this is exactly the same problem as Wikipedia or any other site built upon user-generated content. The site designers generally assume altruistic self-disclosure, and fall down (if not fail miserably) at any systematic source of bias in the evaluations. The goal of encouraging maximal coverage through maximal participation is in direct conflict with having control over the quality. The problem is magnified by fragmentation of contributions across many sites, thus increasing the pressure to attract contributions (of unknown quality) at all costs.

The incentives (and energy) to lie are stronger than those to catch the lie. Manual catching systems don’t scale; attempts to use peers to discover bias will only work if the peers also are free from bias (they aren’t). Algorithms might work for a while, but those who game the system will discover ways to make their bias more difficult to detect.

The problem will only get worse: as crowd sourcing and its impact become better understood, efforts to manipulate it will become better organized, more common and probably more subtle (and thus hard to detect). It’s possible (by no means assured) that a decade from now crowd sourcing will prove to be a noble experiment that failed.

I think the end result will be to bring us back full circle, to where we were a decade ago: I will trust only the feedback of someone I know, either personally or a brand name reviewer or analyst who has proven reliable (by my standards) in the past. I’m not sure how that helps me find a restaurant in Los Angeles or Scottsdale, although my standards for under $10 restaurants are pretty lax.
