From Justin Bieber to Data Scientists: How Twitter Became a Prominent Field of Study

Erika Fry, September 1, 2014
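Since Twitter was founded, scholars of all kinds have flocked to the microblogging platform, not to post but to do research. To academia, Twitter holds the richest, and perhaps an unprecedented, dataset. It amounts to a virtual petri dish of real-time data, drawing scholars from every discipline into a wide variety of studies.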

    The story begins in the not-too-distant past with computer scientists. Even more than most academics, computer scientists need data, and for years they have mined whatever odd and interesting datasets have come their way. The Enron emails, for example (some 600,000 messages belonging to 158 Enron employees, made public by the Federal Energy Regulatory Commission after its investigation of the company), became popular fodder in the field after they were released in 2003.

    Social media may seem an obvious next frontier for data-minded academics, but when computer scientist Jennifer Golbeck first started studying such platforms in 2003 (she was inspired by MySpace), it was not considered particularly promising or serious work. Colleagues in her highly technical field dismissed it as “social science”; and in the nascent universe of online social networks, the largest was a hook-up site with a community of 20 million members called AdultFriendFinder.

    Golbeck, a Ph.D. student at the time, saw greater potential in such platforms: “There was so much interesting computing to be done,” she says. But she was still battling to convince computer science departments of this when she completed her degree in 2005.

    Now a professor at the University of Maryland, College Park, Golbeck heads up the school’s Human-Computer Interaction Lab and continues to study what can be learned about humans and relationships using social media. Her prolific output has included papers on “the sense and structure of community on YouTube,” how Congressional representatives use Twitter, and the dynamics of the human-pet relationship (across many platforms). That work makes her much in demand: her TED talk, “The Curly Fries Conundrum: Why social media likes say more than you might think,” has been viewed 1.2 million times since October 2013.

    Eytan Adar, now an assistant professor of information and computer science at the University of Michigan, was another pioneer. Years ago he used blogs to study how memes spread and, in 2007, he co-founded the International Conference on Weblogs and Social Media in an effort to build community among researchers doing similar work. That year, the event drew 145 people, offered talks like “Building Trust on Corporate Blogs” and “Social Browsing on Flickr,” and featured Ev Williams, the founder of a then-fledgling start-up called Twitter, as the keynote speaker. (Like Twitter, the conference has grown a lot since then.)

    The first academics to study Twitter tended to be computer scientists like Golbeck and Adar, who had both the savvy to understand Twitter and the tech skills to collect and manipulate its data, as well as physicists and information science and communications scholars who were particularly interested in network effects. Research from those early years tended to focus on Twitter—statistical analyses of how and for what the service was used. Then came more sophisticated studies focused on the mechanics of Twitter: the study of things like “unfollow dynamics,” “transient crowd discovery,” or “patterns in Twitter intra-topic user and message clustering.” Later to the party were social scientists, like Emery, who dreamt up applications for the data—predicting the outcome of elections, for instance, or elucidating the narcissism of Twitter’s college-aged users—but tended to be less technically adept at collecting and manipulating it. (As a result, a number of interdisciplinary research efforts—like those that take place in Golbeck’s lab—have sprung up.)

    According to the study “What do people study when they study Twitter?”, the number of Twitter-focused papers grew from 3 in 2007 to 8 in 2008 to 36 in 2009, and has risen considerably since then.
