The Sins and Penalties of the Reputation Economy

Jeffrey Pfeffer 2015-06-10
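From drivers and passengers on ride-hailing apps to doctors, teachers, and restaurants, almost everyone and everything is now being rated. But this so-called “reputation economy” has many problems. Ratings do matter, yet consumer ratings are often inaccurate, and rating systems frequently encourage the wrong behavior. Even in the reputation economy, “buyer beware” remains sound advice.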

    斯坦福大学MBA毕业生、OwnerListens公司的联合创始人阿迪•比坦告诉我,目前企业主要采取了两种类型的策略:“白帽”和“黑帽”。“白帽”战略一般会先找到最满意的顾客,然后鼓励他们在热门网站上撰写评价。“黑帽”战略即给竞争对手甚至是潜在的竞争对手写差评、扣帽子。比如有人在Yelp上只给了芝加哥名厨格拉罕姆•艾略特的三明治和冰淇淋店1星评价,那人还说,当他走到那家店时发现它关门了,结果毁掉了“他原本惬意的散步心情”。但事实上,当时根本还没有到那家店的营业时间。艾略特举这个例子来说明差评究竟可以没有下限到什么程度。他对Yelp的评价用词基本上不适合发表。

    除了虚假和不实评分以外,口碑经济还有其它更多的问题。为了吸引顾客的评价,企业有时会采取没有任何用处,甚至有害的方法来“刷好评”。

    再次以教师的评分为例。大家都知道这样一个潜规则:教师要想获得更高评分,其中一种方法,便是给有评分权的学生更高的分数。正是由于“要好评”心理的作祟,这种现象已经成了高校的流行病,同时使考试成绩越发失去了学生学习成果和能力衡量指标的意义。我们不知道更高的考试成绩是否一定会带来更高的教师评分,但单单是这种心态本身就会影响教师的行为。

    这种行为无非是一种互惠主义的体现——我帮了你(比如给你一个好成绩),然后你再帮我(比如给我一个比较高的评分)。反正人类天生就有与人为善的习性,大家也不愿意被别人当成一个难说话或者讨厌的人。这让人不禁去想,就像教师与学生、打车软件的司机与乘客之间的互惠关系一样,如果交易双方都可以给对方打分,会是什么样子。

    科技媒体TechCrunch的一篇文章指出,eBay在2008年通过评价体系改革消除了买卖双方相互评价的可能。文章还指出,同样一套房子在Airbnb(该网站允许相互评价)上的评分要比在TripAdvisor上高出14%(不允许相互评价)。该文章指出:“在不匿名的社会环境中,人们都想在别人眼里留下好印象,不愿意说别人的坏话,因为谁都不想让别人觉得自己是个老是在抱怨的人,或老是在唠唠叨叨。”阿迪•比坦也指出Uber上司机的得分都太高了,他认为这也是互利评价的缘故,因为你只有给别人一个极为正面的评价,别人才会投桃报李给你一个正面的评价。

    除此之外,“口碑经济”还有可能导致更加严重的问题。比如医生为了获得病人的好评,经常会给病人做一些不必要的诊断测试,或是给病人开抗生素或强效止痛药,特别是当病人主动要求的时候,而不管病人需不需要,有没有用。也就是说,评分或对评分的预期改变了医生的治疗方法。“在南卡罗莱纳医疗协会2012年的一项调查中,半数受访医生表示,由于面临着需要提高患者满意度的压力,很多医生不当地为病人开了抗生素或麻醉剂。”患者评分的流行与滥用麻醉剂之间是否存在某种关系,也是个值得观察的问题。

    有办法解决这个问题吗?

    虚假评价,特别是那些比较极端和简单的虚假评价,是可以通过统计方法检测出来的,只是目前技术还不完美。经济学家布莱恩•雅各布茨和史蒂芬•列维在一篇著名的论文中指出:“出乎意料的测试成绩波动和可疑的答案模式”,可以用来检测教师是否为了提高学生的分数而弄虚作假。正如我上文指出的那样,Yelp、亚马逊和谷歌等网络公司都在努力消除虚假评价,比如通过构建算法筛查出可疑行为等。

    亚马逊采取了验证网购评价者身份等策略,从而提高了水军发布虚假信息的成本和难度。

    企业招聘新人和考核绩效(这两件事的本质也是评估)时的做法提供了另一种有效的解决方案,即标准化的产品或服务评价指标。米其林和普通食客对同一家餐厅的评价之所以非常悬殊,是因为米其林的员工有一套更加正规的评价标准,以及一套确保这些标准能被严格遵循的流程。

    比坦的公司旨在帮助各种类型的企业获得实时的顾客反馈,先发制人地解决服务问题,阻止负面评价。比坦为我们提出了两条建议。她指出,如果人们不能匿名发虚假信息的话,他们就不大愿意那样做了,所以身份验证可能是个有用的办法。另外,由于很多明显的原因,你的朋友和熟人提供的信息一般比陌生人更加有用和可信。不过在这个问题上,“也有些数据显示,在北美,大多数人更相信网上的评论,而不是他们的朋友。”这真不是一个好习惯。

    Adi Bittan, a former Stanford MBA student and co-founder of OwnerListens, told me that companies use two types of strategies: “white hat” and “black hat” approaches. “White-hat” strategies entail moves such as figuring out who your most satisfied customers are and then encouraging them, and even making it easier for them, to write reviews on popular websites. “Black-hat” strategies involve disparaging competitors, or maybe even future competitors. In one particularly notorious example, Chicago celebrity chef Graham Elliot’s “highly anticipated and oft-delayed gourmet sandwich/soft serve shop” got a 1-star review on Yelp from a prospective patron who said his “otherwise pleasant walk” was ruined by going to the establishment and finding that it was closed. The café had not even opened for business at that point. Elliot, whose opinions of Yelp are essentially unprintable, took this as an example of how bad reviews can get.

    There are more problems with the reputation economy beyond just manipulated and inaccurate ratings. The prospect of customer reviews can induce behaviors designed to increase customer ratings in ways that are not useful and are sometimes harmful.

    Returning to teacher ratings, there is a common belief, supported by at least some evidence, that one way to achieve higher ratings is for instructors to give the students who are doing the ratings higher grades. This belief produces the now-endemic grade inflation in higher education and also makes grades less meaningful as indicators of student achievement or ability. It’s unclear if higher grades produce higher teacher ratings, but the belief that this relationship holds nonetheless affects instructor behavior.

    This behavior is all about reciprocity: I help you out (for instance, by giving you a good grade) and you help me out (for instance, by giving me a high rating). It also reflects the natural human tendency to be nice and the associated desire not to be perceived as negative or difficult. These ideas raise the question of what happens when, as with teachers or Uber drivers, the counterparties in a transaction rate each other.

    An article in TechCrunch noted that eBay dispensed with reciprocal reviews in 2008. It also reported on a study that found that the identical property was rated 14% higher on Airbnb, which uses reciprocal ratings, than on TripAdvisor, which does not. That same piece noted: “People want to look good in social settings in which people’s identities are not anonymous, people tend to shy away from saying bad things because they don’t want to be the one who seems like a constant complainer or never-ending nagger.” The average Uber driver score is too high, according to Bittan, who believes that reciprocal reviews create incentives to be overly positive in order to get a positive review in return.

    And there are more serious problems than just giving higher grades or higher ratings to encourage others to help you out in return. Doctors seeking higher patient ratings are more willing to order unnecessary diagnostic tests or to prescribe antibiotics or potent painkillers even when they are not needed or helpful, particularly if patients request them. In other words, reviews, or the prospect of being reviewed, change treatment: “In a 2012 survey by the South Carolina Medical Association, half of the physicians surveyed said that pressure to improve patient satisfaction led them to inappropriately prescribe antibiotics or narcotics.” It would be interesting to see if there is a relationship, both over time and across settings, between the prevalence of patient reviews and the growing problem of opiate abuse.

    Is there any way out of this problem?

    Cheating, particularly in its extreme or least sophisticated forms, can be detected statistically, albeit imperfectly. Economists Brian Jacob and Steven Levitt, in a famous paper, showed that “unexpected test score fluctuations and suspicious patterns of answers” could be used to detect teachers who cheat to artificially raise their students’ scores. As I noted above, Yelp, Amazon, and Google, among others, are all working to eliminate fake reviews, including by building algorithms to highlight suspicious activity.

    Amazon’s identification of reviews from verified purchasers, along with related strategies, helps to raise the cost and difficulty of flooding sites with bogus information.

    The world of assessing job candidates and doing performance appraisals, both forms of rating, offers another useful solution: provide standardized product or service dimensions for evaluation. One reason Michelin’s and diners’ ratings differ is that the Michelin employees have a more standardized set of criteria to evaluate restaurants and a process to ensure that those standards are used.

    Bittan, whose company was established to help provide businesses of all sizes with real-time customer feedback, preemptively solve service issues, and head off negative reviews, made two other suggestions. She noted that people are less likely to engage in deception if they can’t do so anonymously, so requiring people to identify who they are might help. And she noted that, for many obvious reasons, your friends and even acquaintances are more likely to provide useful and honest information than are others. However, in this regard, “some data show that a good majority of people in North America believe and trust online reviews more than they trust their friends’ opinions.” Bad decision.
