

David Meyer



OpenAI CEO Sam Altman. Photo: Stefano Guidi/Getty Images



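The article goes on to add: “Sources say this coming advancement is significant. Several OpenAI staff have been telling friends they are both jazzed and spooked by recent progress.” Those sources apparently come from “the U.S. government and leading AI companies.”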



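Yesterday, Sam Altman also tweeted: “We have some very cool stuff for you.” I’ve asked OpenAI whether it is the company that’s about to reveal “Ph.D.-level super-agents” and have received no response. But The Information reports that OpenAI could launch an agentic system called Operator, which can autonomously execute tasks on the user’s behalf, as soon as this month.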


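First, an introduction to FrontierMath: a set of math benchmarks curated by Epoch AI to test how well large AI models reason through math problems. To avoid test problems that are already in a model’s training data, FrontierMath contains only “new and unpublished” math problems. The results are somewhat disappointing: Epoch AI says mainstream models such as OpenAI’s GPT-4 and Google’s Gemini solve fewer than 2% of the problems correctly. In a public demo, only OpenAI’s newly released o3 model scored a shade over 25%.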

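The problem is that OpenAI also funded the development of FrontierMath and asked Epoch AI to keep this confidential until o3 was unveiled. An Epoch AI contractor then complained in a LessWrong post that the mathematicians who contributed problems had been kept in the dark and had no idea of the link between OpenAI and FrontierMath. Only after the post gained traction did Epoch AI associate director Tamay Besiroglu publicly apologize, saying that terms in OpenAI’s contract had left Epoch AI unable to disclose the relationship earlier.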



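Marcus said: “OpenAI should be more transparent about what the business arrangements were [with Epoch AI] and the extent to which they were given a competitive advantage and the extent to which they trained directly or indirectly on materials they had access to and the extent to which they used data augmentation techniques on information they had access to. If they are not transparent, we should not take them seriously.”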




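Whistleblower targets Amazon’s Covariant “acquihire.” Covariant AI is a company that develops AI for logistics robots. An anonymous shareholder and former employee recently complained to U.S. authorities about problems with Amazon’s deal with the company. Amazon announced last August that it had hired three Covariant founders and a quarter of its staff, while taking a nonexclusive license to its AI models. According to the Washington Post, the whistleblower claims the acquihire deal was worth $380 million, more than three times the threshold for notifying antitrust regulators, yet Amazon never filed a notification. The whistleblower also says the deal’s terms limited the licenses Covariant could sell to other companies. An Amazon spokesperson responded: “Covariant continues to serve its dozens of customers, and because Amazon is licensing Covariant technology on a non-exclusive basis, Covariant is free to license its technology to other companies.”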

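Metropolis buys Oosto. Oosto, an Israeli AI facial recognition company formerly known as AnyVision, has found a buyer. According to TechCrunch, a company called Metropolis will acquire Oosto for $125 million in stock. Metropolis is an AI company that helps parking operators offer checkout-free payment. Oosto had previously raised $380 million from investors. Oosto is a controversial company, partly because many people are uneasy about facial recognition, and partly because the Israeli government used its software to surveil Palestinians in the West Bank.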




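Meta claims a Babel Fish-level breakthrough. The Babel Fish is a peculiar creature in The Hitchhiker’s Guide to the Galaxy: put one in your ear and you can understand what other species are saying. Meta’s researchers recently released a “Massively Multilingual and Multimodal Machine Translation” system, or SEAMLESSM4T for short. It can translate spoken conversation into other languages without first converting the speech to text and back to speech. According to the researchers, SEAMLESSM4T is far better at rejecting background noise than comparable systems.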





March 10-13: Human [X] conference, Las Vegas

March 17-20: Nvidia GTC, San Jose

April 9-11: Google Cloud Next, Las Vegas



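First, Hangzhou-based DeepSeek released its DeepSeek V3 model just before Christmas; some consider it the best open-source AI tool currently available. DeepSeek’s R1 model was used in training V3. DeepSeek says R1 can just about match OpenAI’s o1 on math, coding, and reasoning tasks. Benchmarks suggest the company is not exaggerating: the model has become a strong competitor to o1, and it is much cheaper to run.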

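DeepSeek has now open-sourced a version of R1 called R1-Zero. R1-Zero ran into problems such as “endless repetition, poor readability, and language mixing,” but R1 apparently no longer has them. Perhaps because both models are so large, DeepSeek has also transferred their knowledge to versions of Meta’s Llama and Alibaba’s Qwen models, and open-sourced those as well.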

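Meanwhile, China’s Moonshot AI has just released Kimi k1.5, a model that can reason over both text and vision modalities and that Moonshot also says is comparable to o1. A new version of the model will reportedly soon power its Kimi chatbot. (Fortune China)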






OpenAI may or may not be about to release something big and agentic.

According to a rather breathless Axios article on Sunday, an unidentified company is preparing “Ph.D.-level super-agents” that would be “a true replacement for human workers.” No names are named, but the article prominently notes that OpenAI CEO Sam Altman will give Trump administration officials a closed-door briefing at the end of the month.

It goes on to add: “Sources say this coming advancement is significant. Several OpenAI staff have been telling friends they are both jazzed and spooked by recent progress.” Those sources apparently come from “the U.S. government and leading AI companies.”

There’s more than a whiff of hype about all this. But Altman is no fan of such things, he claims. Addressing the separate but perhaps connected issue of OpenAI’s efforts to achieve “artificial general intelligence” (definitions differ, but this usually means AI with human- or superhuman-level capabilities), the CEO tweeted yesterday that “Twitter hype is out of control again” and “we are not gonna deploy AGI next month, nor have we built it.”

If he’s so anti-hype, Altman might want to take himself aside for tweeting, less than three weeks ago: “I have always wanted to write a six-word story. Here it is: Near the singularity; unclear which side.” A story, sure, but it also came across as a strong hint. (“The singularity” is a term referring to the inflection point where AI surpasses human intelligence.)

In yesterday’s tweet, Altman promised “We have some very cool stuff for you.” I’ve asked OpenAI whether it is the company that’s about to reveal “Ph.D.-level super-agents” and have received no response. But The Information reports that OpenAI will launch an agentic system called Operator, which can autonomously execute tasks on the user’s behalf, as soon as this month.

Whatever OpenAI does release, people should scrutinize it very closely, because the company has in recent days been caught up in a bit of a benchmarking scandal that raises questions about its performance claims.

The benchmark in question is FrontierMath, which was used in the demonstration of OpenAI’s flagship o3 model a month back. Curated by Epoch AI, FrontierMath contains only “new and unpublished” math problems, which is supposed to avoid the issue of a model being asked to solve problems that were included in its training dataset. Epoch AI says models such as OpenAI’s GPT-4 and Google’s Gemini only manage scores of less than 2%. In its demo, o3 scored a shade over 25%.

Problem is, it turns out that OpenAI funded the development of FrontierMath and apparently instructed Epoch AI not to tell anyone about this until the day of o3’s unveiling. After an Epoch AI contractor used a LessWrong post to complain that mathematicians contributing to the dataset had been kept in the dark about the link, Epoch associate director Tamay Besiroglu apologized, saying OpenAI’s contract had left the company unable to disclose the funding earlier.

“We acknowledge that OpenAI does have access to a large fraction of FrontierMath problems and solutions, with the exception of a unseen-by-OpenAI hold-out set that enables us to independently verify model capabilities,” Besiroglu wrote. “However, we have a verbal agreement that these materials will not be used in model training.”

OpenAI has not yet responded to a question about whether it nonetheless used its FrontierMath access when training o3—but its critics aren’t holding back. “The public presentation of o3 from a scientific perspective was manipulative and disgraceful,” the notable AGI skeptic Gary Marcus told my colleague Jeremy Kahn in Davos yesterday, adding that the presentation was “deliberately structured to make it look like they were closer to AGI than they actually are.

“OpenAI should be more transparent about what the business arrangements were [with Epoch AI] and the extent to which they were given a competitive advantage and the extent to which they trained directly or indirectly on materials they had access to and the extent to which they used data augmentation techniques on information they had access to,” Marcus said. “If they are not transparent, we should not take them seriously.”

That’s something to bear in mind over the coming weeks. And with that, here’s more on what has been a very busy few days on the AI news front.


Trump scraps Biden’s AI order. On his first day back in office, President Donald Trump scrapped dozens of his predecessor’s policies, among them Biden’s 2023 Executive Order on Safe, Secure, and Trustworthy Development and Use of AI. Much of that particular order has already been carried out, such as the creation of an AI Safety Institute within the National Institute of Standards and Technology (NIST). But Trump’s move does mean that AI companies will no longer have to give the U.S. government safety-test results before releasing new models. It also means that the U.S. now has no significant federal AI rules, creating an enormous disparity with the EU in particular, and perhaps setting the stage for future EU-U.S. clashes over the issue of AI safety.

Whistleblower targets Amazon’s Covariant acquihire. An unnamed shareholder and former employee of Covariant AI, a company that makes AI for logistics robots, has complained to the U.S. authorities about Amazon’s recent deal with the company. As it announced last August, Amazon hired three Covariant founders and a quarter of its staff, while taking a nonexclusive license for its models. Per the Washington Post, the whistleblower claims the acquihire deal was worth $380 million, over three times the threshold for notifying antitrust regulators (a notification that never happened), and also that its terms limited the licenses that Covariant could sell to others. An Amazon spokesperson responded: “Covariant continues to serve its dozens of customers, and because Amazon is licensing Covariant technology on a non-exclusive basis, Covariant is free to license its technology to other companies.”

Metropolis buys Oosto. Oosto, the Israeli AI facial recognition firm formerly known as AnyVision, has found a buyer. Metropolis, an AI company that helps parking operators provide checkout-free payment experiences, will pay $125 million in stock for Oosto, according to TechCrunch. Oosto had raised some $380 million from investors. Oosto/AnyVision was a controversial outfit, partly because many people are generally uneasy about facial recognition, but also because the Israeli government used its software to surveil West Bank Palestinians.

British government details extensive AI plans. The U.K.’s Labour government said last week that it would “mainline AI into the veins” of the country’s economy, and now it has detailed how the country’s public services will embrace the new technology. As part of an announcement around the digitization of services and better sharing of data between agencies, the government announced an AI toolkit for civil servants. The package is dubbed “Humphrey,” a witty reference to the classic TV show Yes Minister. The kit includes tools for rapidly parsing responses to public consultations, drawing on decades of parliamentary debate to “better manage bills” (reportedly by predicting how legislation will be received by lawmakers), and summarizing policies and laws.


Google pits Titans against transformers. There’s a lot of buzz around a new neural-network architecture that Google researchers have just announced. The Titans architecture provides the possibility of long-term, persistent neural memory that can act in concert with more short-term memory, of the sort that is associated with the transformer architecture that underpins today’s LLMs. This would be useful for building agents. According to Google’s researchers, the new architecture is “more effective” than transformers when it comes to “common-sense reasoning” and other tasks, specifically when it comes to handling large amounts of information. However, the big question now is what the compute requirements look like.

Meta claims Babel Fish breakthrough. Meta’s researchers have announced a system called Massively Multilingual and Multimodal Machine Translation, or SEAMLESSM4T, that can translate spoken words into other languages without the need to convert the recording to text and back again (though it can do that too). They suggest this is a big step towards the creation of something like the Babel Fish, a universal translator (and fish) that makes it possible for characters in Douglas Adams’s Hitchhiker’s Guide to the Galaxy to communicate with other species. According to the researchers, SEAMLESSM4T is far better at rejecting background noise than comparable systems.


Feb. 10-11: AI Action Summit, Paris, France

March 3-6: MWC, Barcelona

March 7-15: SXSW, Austin

March 10-13: Human [X] conference, Las Vegas

March 17-20: Nvidia GTC, San Jose

April 9-11: Google Cloud Next, Las Vegas


Reasoning models flourish in China. In the push for better AI “reasoning” models, all eyes are currently on China thanks to a couple of notable announcements.

First up: DeepSeek-R1. Hangzhou-based DeepSeek released its V3 model, currently considered by some to be the best open-source AI model out there (sorry, Meta), just before Christmas. R1 was used to train V3, and DeepSeek claims R1 can just about match OpenAI’s o1 “across math, code, and reasoning tasks.” Benchmarking suggests this is true, making R1 a serious competitor to o1 that is much cheaper to run.

DeepSeek has now open-sourced a version of R1 called R1-Zero, which it says “encounters challenges such as endless repetition, poor readability, and language mixing,” as well as R1 itself, which apparently doesn’t. Perhaps because both are enormous, it has also transferred (or “distilled”) knowledge from them to versions of Meta’s Llama and Alibaba’s Qwen models, and open-sourced those too.

Meanwhile, China’s Moonshot AI just announced Kimi k1.5, a model that can reason over both text and vision modalities, and that Moonshot also claims is comparable to o1. It says the new version of the model will soon power its popular Kimi chatbot.



