Big tech enters the race: Baidu, Alibaba, and ByteDance vie for a say in agents

In May 2025, the Sequoia Capital AI Summit concluded in San Francisco. The gathering brought together 150 founders, scientists, and investors from the world’s top AI companies, including OpenAI co-founder Sam Altman and Google chief scientist Jeff Dean, and a consensus gradually took shape: the next round of AI will sell outcomes, not tools. Sequoia backed this seemingly abrupt claim with a staged explanation: AI will move from selling tools, to collaborating with users, and finally to selling software as an outcome.

How does one get from selling tools to selling results? The deeper significance of this Sequoia summit is perhaps that it underscored the value of agents. Silicon Valley giants then set off the first wave of acceleration. Microsoft CEO Satya Nadella announced in his keynote speech: “We have entered the era of AI agents and are witnessing how AI systems can help us solve problems in new ways.” OpenAI CEO Sam Altman announced a new Codex agent for developers, saying that “this could be the biggest change in the history of programming.”

Across the ocean in China, the big tech companies appear to have reached the same judgment. Judging from the actions of ByteDance, Baidu, and Alibaba, the leading internet giants have all accelerated their rollout of agent applications. According to insiders, ByteDance already has seven teams racing to build agents; Baidu unveiled Xinxiang at its recent Create conference, billing it as an agent that is ready to use out of the box; and at Alibaba, Quark has been positioned internally as a “super agent.”

Beyond general-purpose agents, the major companies are also accelerating on vertical agents: Alibaba’s Fliggy Wen Yi Wen (“Ask”), Baidu’s Faxingbao, and others are expanding as well.

Agents, the second sure bet after large models, have set off an accelerating race. The variables that will ultimately decide the battle are not only the depth of each company’s ecosystem but also mindshare and user habits. This means that in the end only a few players, such as Alibaba, Baidu, ByteDance, and Tencent, may be left standing. Whether Baidu and Alibaba, which played their cards first, can seize the opportunity is worth examining.

The breakout year for agents is also a sprint year

What first woke up the big companies was clearly Manus. The Chinese agent team behind it, backed by ZhenFund, launched Manus abruptly in early March 2025 and drew worldwide attention from the start of its closed beta. Buzz phrases such as “100,000 yuan for an invitation code” and “codes are impossible to get” set nerves on edge. For the first time, mainstream opinion realized that agents built on mainstream large models could deliver such powerful practical functionality and user experience. The big companies’ moves appear to have accelerated from that point.

ByteDance exemplifies the saturation attack. According to a LatePost report, around the time Manus broke out in early March, ByteDance had at least five teams developing different agent products, some of them internal tools. By the end of April, the number had reached seven. Also at the end of April, ByteDance’s Flow team unveiled Coze Space, positioned as “the best place for users and AI agents to work together,” and opened it for closed beta testing. It is built on the company’s self-developed Doubao large models (such as Doubao 1.5 Pro), supports the Model Context Protocol (MCP), and can call components such as Feishu multi-dimensional tables, AutoNavi Maps, and image tools.

Baidu’s direction is evident from Robin Li’s remarks. At the Baidu Create conference in April, Li stated plainly that “the ultimate value of AI lies in real-world applications, and the agent is the core carrier that connects model capabilities with user needs.”

Alibaba has not let up either. Its strength in foundation models, together with the added weight of Quark and DingTalk, gives it the means to compete on both the enterprise (to-B) and consumer (to-C) sides.

From competing on large models to competing on agents, what are the big companies actually fighting for? In short, it is the change in interaction mode that follows accelerating technology, and the contest for the ecosystem-level entry point.

A typical analogy: in the large-model era, when DeepSeek appeared, its technological lead won it a huge user base without any marketing spend, and Tencent grabbed its ticket to the large-model era by integrating DeepSeek into Yuanbao. Agents today follow the same logic. The underlying technology has taken shape, and whoever turns it into a product fastest gains a crushing lead over the market.

The Sequoia sessions also singled out one direction: in the enterprise market, the entry point that breaks out first will not necessarily be a general-purpose large model but an intelligent OS for a vertical field, such as Harvey (law) or OpenEvidence (medicine), because these products speak the industry’s language and understand its real needs. Of the two, OpenEvidence is the more familiar. Investor Zhu Xiaohu has mentioned in several interviews that, as an AI diagnostic aid designed for doctors, it quickly caught on among American physicians through accurate clinical support and an innovative business model.

The potential of agents is therefore not limited to one or two applications; more importantly, agents are an ecosystem-level entry point. From this perspective, the agent ecosystem, vertical agents, and general-purpose agents (Xinxiang, Coze Space, and Quark) all look like ground that must be contested.

What decides the winner?

“Manus was able to succeed, and we can see that it uses the Claude 3.5 model,” the founder of Metaso once remarked on a podcast. To some extent this reflects a consensus within the industry that an agent is only as good as the model underneath it, and the most obvious example comes from ByteDance’s Coze team.

LatePost quoted the team as saying: “Coze is an open platform, and if a domestic large model is better than Doubao, we will actively use it as well.” When the Coze team was developing Coze Space, the Doubao deep-thinking model had not yet been released. The team first considered DeepSeek-R1, but testing showed that its tool-calling ability fell short.

The team later compared six Chinese large models and settled on a mix of models built around Doubao 1.5 Pro, because Doubao performed best at instruction following, tool calling, and multimodal processing, and its very low inference cost can support calls at scale.


This view is widely shared in the industry. Li Guangmi of Shixiang Technology acknowledged on a podcast that pre-training may become important again, and that the capability of the large model determines the capability of agents and everything else built on it.

On this measure, Baidu, Alibaba, and ByteDance are almost evenly matched. Baidu won a partnership with Apple; to some extent, Apple, as the largest entry point in the smartphone industry, has thereby endorsed Baidu’s model capabilities. Alibaba has Qwen, arguably the best open-source model. ByteDance’s products built on the Doubao model have long led consumer app downloads, and its strength is equally notable. Heavy investment from all three has made the competition ever more intense.

From the perspective of agent applications, ecosystem depth is also key to judging the winner. The essence of an agent is to “allow AI to truly complete tasks autonomously,” so the ability to invoke applications is likewise decisive.

Baidu has opened up compatibility for its large models and the Qianfan development platform, and Baidu Maps, Library (Wenku), Netdisk, and Comate have also launched MCP Servers. ByteDance, by contrast, is relatively conservative and prefers to become a new agent factory itself. Alibaba gathers everything into Quark, where the “super box” concept is another way of invoking capabilities.

Data shared by industry practitioners in March suggest that Alibaba’s ecosystem advantage is clear. According to the report, the most frequently called MCP servers in China include AutoNavi Maps, Notion, Alipay, and MiniMax. AutoNavi Maps has become one of the most-called services by offering map functions that cover every scenario, including geocoding, reverse geocoding, IP positioning, weather queries, and route planning for cycling, walking, and driving.

Of course, many variables go into assessing ecosystem depth, but Alibaba’s lead appears to be widening.

In form, an agent is still a kind of social product, and on the social front Tencent undoubtedly has a strong advantage. Tencent President Martin Lau has said: “Within the WeChat ecosystem, I think we have the opportunity to create a very unique Agent, that is, AI is connected to content unique to the WeChat ecosystem, including social, communication and community capabilities, as well as the content ecosystem, such as official accounts and video accounts, as well as millions of mini programs. In fact, you have access to a variety of information, as well as the ability to transact and operate many different vertical applications.”

Its unique social ecosystem makes Tencent, and a WeChat agent in particular, a force that cannot be ignored.

Cost is another key to whether agents can complete the transition. On March 18, The Information reported that Manus’s current product is constrained by both server capacity and high operating costs. According to two people with direct knowledge of the situation, Manus uses models from the AI company Anthropic and pays Anthropic an average of $2 for each task completed.
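To put the reported figure in perspective, the arithmetic can be sketched as below. Only the $2 average per task comes from the report; the usage levels, the subscription price, and the function names are hypothetical and chosen purely for illustration.

```python
# Average model cost per completed task, as reported by The Information.
COST_PER_TASK_USD = 2.0


def monthly_model_cost(tasks_per_day: float, days: int = 30) -> float:
    """Model spend for one user over a month at a flat per-task cost."""
    return tasks_per_day * days * COST_PER_TASK_USD


def break_even_tasks(subscription_usd: float) -> float:
    """Tasks per month a flat subscription covers before model cost exceeds it."""
    return subscription_usd / COST_PER_TASK_USD


if __name__ == "__main__":
    # Hypothetical usage levels, not figures from the report.
    for tasks_per_day in (1, 5, 20):
        cost = monthly_model_cost(tasks_per_day)
        print(f"{tasks_per_day} tasks/day -> ${cost:.2f} per month")

    # Hypothetical subscription price, used only to show the break-even idea.
    print(break_even_tasks(40.0), "tasks covered by a $40 plan")
```

At a flat per-task cost, spend grows linearly with usage, which is why heavy users are the expensive ones under a fixed subscription.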

In summary, foundation-model capability, ecosystem depth, and cost have all become keys to whether an agent can stand out, and for now it is hard to identify a clear leader.

Agents have not yet reached their “GPT moment”

Although the race among major companies and breakout products such as Manus have sent the agent sector’s popularity soaring, the industry still appears far from a disruptive tipping point comparable to GPT’s. From technological maturity and business-model execution to user mindshare, agents still have multiple gaps to cross.

First, the core capabilities of today’s agents still depend heavily on large models, and the models themselves have significant limitations. A CSDN blog post pointed out that the planning ability of large models tends to break down on multi-step tasks: a bank transfer, for example, requires more than a dozen steps, and the model often fails because the logical chain breaks.
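One way to see why long chains are fragile is a back-of-the-envelope calculation. The sketch below assumes, as a simplification not taken from the source, that every step succeeds independently and with the same probability; the per-step rates are hypothetical inputs, not measurements.

```python
def chain_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that every step of a task succeeds.

    Simplifying assumption: steps are independent and equally reliable,
    so the overall rate is the per-step rate raised to the number of steps.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # Hypothetical per-step reliabilities for a 12-step task.
    for rate in (0.99, 0.95, 0.90):
        overall = chain_success_rate(rate, 12)
        print(f"per-step {rate:.0%} -> 12-step task {overall:.1%}")
```

Under these assumptions, a 12-step task with 95% per-step reliability completes only about 54% of the time, so small per-step errors compound quickly.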

At the same time, although the major companies have all launched agent platforms, ecosystem integration remains fragmented, and each company going its own way has produced inconsistent tool-calling interfaces. Different agents invoke functions in markedly different ways, so developers have to adapt their tools repeatedly.
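What repeated adaptation looks like in practice can be sketched as follows. Both payload shapes here are invented for illustration and do not describe any real platform’s interface; the point is only that each additional format needs its own adapter before it reaches a common internal representation.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Common internal representation of a tool invocation."""
    name: str
    arguments: dict[str, Any]


def from_platform_a(payload: dict[str, Any]) -> ToolCall:
    # Hypothetical format A: {"tool": "...", "params": {...}}
    return ToolCall(name=payload["tool"], arguments=dict(payload["params"]))


def from_platform_b(payload: dict[str, Any]) -> ToolCall:
    # Hypothetical format B: {"function": {"name": "...", "args_json": "..."}}
    function = payload["function"]
    return ToolCall(name=function["name"],
                    arguments=json.loads(function["args_json"]))


if __name__ == "__main__":
    call_a = from_platform_a(
        {"tool": "geocode", "params": {"address": "Beijing"}}
    )
    call_b = from_platform_b(
        {"function": {"name": "geocode",
                      "args_json": '{"address": "Beijing"}'}}
    )
    # The same request arrives in two shapes and normalizes to one.
    print(call_a == call_b)
```

A shared protocol such as MCP aims to remove this per-platform adapter layer by standardizing the request shape itself.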

Although the “software as an outcome” concept put forward at the Sequoia summit is widely accepted, the path to implementing it remains unclear. Today’s agents mostly charge by subscription or per call, which is still tool thinking at heart. Manus’s high cost, for example, makes it hard to bring to small and medium-sized enterprises, and while Baidu’s Xinxiang is billed as ready to use, enterprise users care more about whether it can actually raise sales conversion rates or cut operating costs.

In addition, although exploration in vertical fields is showing early promise, scaling will take time. Alibaba’s Fliggy has performed notably in travel scenarios, and Baidu’s Faxingbao has accumulated cases in the legal field, but industry penetration of these vertical agents remains low. Users do not yet trust agents, especially in scenarios involving sensitive data, where enterprises prefer to keep manual review.

The user experience of today’s agents is sharply polarized. In vertical scenarios, products such as Lovart in the design field have closed the full loop from demand to delivery by integrating industry knowledge bases and multimodal output, letting designers iterate directly on the layered files they generate and raising efficiency several times over. General-purpose agents such as Manus, however, are still clumsy at complex tasks: generated design drafts may have small fonts or overlapping elements, for example, and need frequent manual adjustment.

More importantly, there is a gap between what users expect of agents and what agents can actually do. Some users mistakenly believe that agents can fully replace humans, when in fact agents still need human intervention for tasks such as parsing ambiguous instructions and controlling task boundaries.

The agent boom confirms AI’s leap from technology to application, but a true “GPT moment” is still a long way off.

The current competition is essentially a fight for ecosystem positioning and the right to define scenarios, and the big companies’ moves look like groundwork for the ecosystem wars to come. The final outcome will be decided not only by the speed of technical iteration but also by depth of understanding of industry pain points and the courage to innovate on business models. When agents are woven into daily life like water, electricity, and gas, the real transformation will have arrived.

References:

LatePost, “ByteDance AI Re-entrepreneurship: Independent Organization and a Whole-Chain Saturation Attack”

New Cortex Newthings, “How Wu Yongming Managed Alibaba in the Year and a Half After Taking Charge”

Tencent Technology, “Microsoft Released 50 New Things in One Night to Build an ‘Eden’ of Interconnected Agents”

AI In-depth Researcher, “Only Talking About Survival: AI Agent Countdown of 730 Days, 3 Silicon Valley ‘Dissidents’ Give 3 Ways to Survive”

AI In-depth Researcher, “Sequoia AI Summit Closed-Door for 6 Hours, Consensus of 150 Founders Emerges: AI No Longer Sells Tools, but Sells Profits”

China Entrepreneur, “AI’s New Battlefield: Tencent Bets on Agents”
