The AI Agents That Go Beyond Imagination

In the Excel World Championship, a brutal gauntlet of functions, Shortcut turns in its answer sheet in 10 minutes; top foreign trade salespeople cannot keep up with Agentforce’s round-the-clock “flattery”; Polaris has pushed diagnostic accuracy to 99%, winning approval from nurses and doctors alike. This article uses four real cases that go beyond imagination to show that agents do not replace humans but free us from repetition, late nights, and massive amounts of data, letting the boundaries of cognition keep expanding.

“In the mid-5th century, a little-known Christian poet died, and the year of his death happens to be the end date of a reconstruction of an ancient environment. What is the name of this scientific chronology?”

Faced with such an obscure question, even the most senior scholars would probably be stumped. Neither the poet’s name nor the chronology is given, traditional search engines are useless here, and the two seemingly unrelated clues are like two grains of sand in the sea, leaving no obvious place to start.

Yet for all its difficulty, an agent called WebSailor can quickly lock in the correct answer through cross-verification: the poet is Synesius of Cyrene, the scientific chronology is “PAGES 2k”, and the year is 414.

One can’t help but be shocked: when did AI evolve this far?

Just half a year ago, agents were generally regarded as more toy than tool. Invitations to most products’ closed betas were hard to come by, yet actual performance frequently fell flat.

Despite those unsatisfactory early results, agents have evolved rapidly. Today, in professional fields such as marketing and healthcare, their performance has even surpassed the human level.

Today, let’s look at the agents from the first half of the year that have exceeded our earlier imagination.

1. A set of World Championship problems, answered in 10 minutes

Faced with World Championship-level financial modeling questions, even experienced analysts often need hours of deduction and verification. If I told you that someone could produce an accurate answer in 10 minutes, would you believe it?

A task this complex may leave even the best large model on the market at a loss. But an agent called Shortcut completed it in just 10 minutes, with an accuracy rate of more than 80% and at 10 times the speed of humans.

How hard is the Excel World Championship?

The championship is officially endorsed by Microsoft and operated by the FMWC Organizing Committee. Its tasks cover complex functions, Power Query, dynamic arrays, Monte Carlo simulations, and more, and contestants describe it as “the most brutal battlefield of functions”. The contestants come from all over the world, and most are investment bank data analysts, financial modeling directors at the Big Four accounting firms, and former Microsoft MVPs.

This year’s questions, which also served as Shortcut’s debut test, are themed on the 30th anniversary of “World of Warcraft” and require contestants to complete more than 20 related spreadsheet operations within 40 minutes. Participants have to build formulas such as VLOOKUP and INDEX-MATCH by hand to create precise links through a maze of complex data.
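To make the lookup formulas mentioned above concrete, here is a minimal Python sketch of what an exact-match INDEX-MATCH does. The function name `index_match` and the sample item data are hypothetical illustrations; they are not taken from the competition case or from Shortcut.

```python
def index_match(lookup_value, match_column, return_column):
    """Mimic Excel's =INDEX(return_column, MATCH(lookup_value, match_column, 0))."""
    try:
        # Exact-match search, like MATCH(..., 0)
        position = match_column.index(lookup_value)
    except ValueError:
        # Excel would display #N/A here; this sketch returns None instead
        return None
    return return_column[position]


# Hypothetical sample data for illustration only
item_names = ["Sword", "Shield", "Potion"]
item_prices = [120, 80, 15]

print(index_match("Shield", item_names, item_prices))
print(index_match("Bow", item_names, item_prices))
```

Unlike VLOOKUP, this pattern does not require the lookup column to sit to the left of the return column, which is one reason modelers often prefer it.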

Here, Shortcut not only overcomes the limits of traditional AI models on data volume, but also largely avoids the pain point of hallucinated output. Faced with massive amounts of disordered data and highly deterministic function rules, it quickly understands the task requirements and provides precise solutions like an experienced analyst. The questions would normally take human players 1 to 2 hours to complete, but Shortcut handed in its answer sheet in just 10 minutes.

According to the development team, Shortcut supports natural-language instructions and can handle complex tasks such as financial modeling, analysis of 5,000-row CSV files, data visualization, and even pixel art creation. Its core capabilities include smart fill, automatic error troubleshooting, and cross-table analysis, making it an all-rounder in the field of Excel.

Finance staff who see a tool like this may well call it a savior.

What the finance department dreads most is the endless stream of data, tables, and documents. Early AI, however, was constrained by token limits and hallucinations: it could not process hundreds or thousands of records, and a single wrong decimal point or punctuation mark could cause immeasurable losses for a company. This left the public with the impression that AI could not solve practical problems.

Shortcut’s arrival breaks this deadlock and offers a new way to address the pain point.

After all, entering and proofreading 5,000 rows of CSV data line by line can take nearly a week. Although Shortcut can still make mistakes when charting complex functions, even taking over simple data collation alone can save finance staff their increasingly scarce time and energy.

2. With a click of the mouse, the close rate jumps to one half?

In the foreign trade industry, a sales team may only be able to push its close rate from 10% to 15%. But one company quietly pulled this number up to 50%, not through frantic overtime or sheer headcount, but through an invisible sales trump card.

Competitors think the company has hired a master salesperson, and customers think they are making their own decisions. In fact, they may long since have walked into a gentle trap carefully laid by the agent.

Data shows that a traditional salesperson’s order completion rate is generally 10% to 15%. An agent called Agentforce, however, has an order completion rate of 50%. Since its launch in 2024, it has closed more than 8,000 orders.

What stings salespeople most is that this agent not only has a high close rate, but its deal sizes are not small either, reaching the seven-figure dollar level. Had a salesperson signed these large orders personally, the commission would be at least four figures. Instead, even the most seasoned sales champion has to wonder why painstakingly cultivated skills and scripts were undercut by an agent that came out of nowhere.

First, humans who need rest cannot compete with machines that run around the clock. There is a saying in cross-border trade that whoever stays up later makes more money. Time differences have given foreign trade its day and night shifts, yet no one can stay at their post 24 hours a day and persuade the customer at the exact moment he decides to place an order. Agentforce can. It works like a digital sales system that never gets tired, handling thousands of conversations concurrently on a 24×7 basis and reducing the need for human agents by 30% to 60%.

Second, uniform, stereotyped sales scripts are no match for tailored “flattery”. Why do customers often fail to realize that it was an AI that tempted them into placing an order? Because in the 21st century, nothing flatters better than AI. Traditional sales depend on people: salespeople judge customer intent from experience, are affected by mood and fatigue, and find it hard to tailor their words to each customer’s taste. Agentforce, by contrast, can analyze behavioral traces such as website browsing and email interactions in real time, identify high-intent prospects, and automatically adjust its wording through sentiment analysis to improve subsequent conversion rates.

Third, a native speaker is no match for an AI that is fluent in foreign languages and encyclopedic in knowledge. With AI around, speaking a foreign language is no longer much of an advantage. Agentforce’s training corpus reportedly spans 17 languages and covers 740,000 official Salesforce documents and metadata records. Relying on Salesforce’s industry-grade data lake, with a total volume of 200 to 300 PB, Agentforce achieves contextual depth and domain accuracy far beyond similar products, significantly reducing the risk of hallucinations and providing more reliable results.

We have reason to believe that agent salespeople will eventually enter every field of trade. Whether the deal involves bulk commodities or small transactions, their close rates will keep rising and their reach will keep widening.

3. Diagnostic accuracy above 99%, more reliable than some human doctors

Do you dare to take the medicine prescribed by AI?

We all know that AI has entered many fields, and medical care is no exception, but most people may still be afraid to take medicine prescribed directly by AI. After all, small differences in dosage can lead to addiction, and small deviations in a medication regimen can cause serious side effects.

But if I told you that AI doctors even exceed professionals in diagnostic accuracy, would you believe it?

In the United States, a medical agent called Polaris can give patients real medication advice with an accuracy rate of more than 99%, far higher than the 81% average for registered nurses in the United States. Moreover, the drugs and follow-up advice recommended by the agent earn a patient approval rate close to 90%. This means that AI is not only more accurate than humans, but in some cases even more trusted by patients.

How does an agent manage this? The answer lies in collaboration and cross-validation among multiple agents.

Polaris is made up of three agents rather than a single model making decisions on its own. For example, when a patient asks about the side effects of a drug, the laboratory agent retrieves the latest clinical trial data for the drug to ensure that the information is based on authoritative medical research; the drug agent checks the patient’s medication history and allergy records to avoid potential drug interaction risks; and the main agent synthesizes the analysis of the first two to generate a final recommendation and mark its confidence level.
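The three-agent division of labor described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Polaris’s actual implementation: the function names, the placeholder drugs `drug_a` and `drug_b`, the reference data, and the confidence scores are all invented for the example.

```python
from dataclasses import dataclass

# Made-up reference data for illustration only (not real medical information)
TRIAL_SIDE_EFFECTS = {"drug_a": ["nausea", "headache"]}
KNOWN_INTERACTIONS = {frozenset({"drug_a", "drug_b"})}


@dataclass
class Recommendation:
    text: str
    confidence: float  # arbitrary illustrative score between 0 and 1


def lab_agent(drug):
    """Return side effects recorded in the toy trial data, or an empty list."""
    return TRIAL_SIDE_EFFECTS.get(drug, [])


def drug_agent(drug, patient):
    """Return warnings based on the patient's allergies and current medications."""
    warnings = []
    if drug in patient["allergies"]:
        warnings.append(f"allergy to {drug}")
    for current in patient["medications"]:
        if frozenset({drug, current}) in KNOWN_INTERACTIONS:
            warnings.append(f"interaction between {drug} and {current}")
    return warnings


def main_agent(drug, patient):
    """Combine both sub-agents' findings into one recommendation."""
    side_effects = lab_agent(drug)
    warnings = drug_agent(drug, patient)
    if warnings:
        return Recommendation("Escalate to a clinician: " + "; ".join(warnings), 0.9)
    if not side_effects:
        return Recommendation("No trial data found; escalate to a clinician.", 0.3)
    return Recommendation("Reported side effects: " + ", ".join(side_effects), 0.8)


patient = {"allergies": [], "medications": ["drug_b"]}
print(main_agent("drug_a", patient))
```

The point of the split is that each check can fail independently: the main agent only answers directly when the trial data exists and the patient-specific check raises no warnings.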

In order to further ensure medication safety and patient welfare, more than 6,500 nurses and 500 doctors participated in the final safety evaluation, helping the system obtain FDA-approved medical AI patents.

In the UAE, Polaris has reportedly been integrated into the digital system of Burjeel Medical Group. Across more than 1.85 million real patient interaction tests, Polaris 3.0 achieved a clinical accuracy rate of 99.38% and a patient satisfaction score of 8.95/10.

It should be noted, however, that Polaris can only provide consultation plans and medical advice for conditions with established treatments and documented cases, and cannot take part directly in drug research and development. In other words, medical agents emphasize diagnostic accuracy in routine cases rather than R&D and innovation. To some extent, then, Polaris can only play a role in clinical practice and cannot take on cutting-edge work such as drug development for rare diseases, because in a setting like a hospital, where life comes first, safety must be the priority. Agents still have a long way to go before they can compete with professional doctors.

It is not hard to see that in just one year, agents have steadily exceeded people’s imagination. Their development trajectory shows a clear trend: agents are moving from concept to practice, from the laboratory into our daily work and life. They are not cold machines, but are gradually becoming the right-hand assistants of professionals in many fields. WebSailor saves researchers from drowning in vast amounts of literature, Shortcut frees the hands of finance staff, Agentforce becomes the secret weapon of sales teams, and Polaris is the second brain of healthcare workers.

The most valuable thing about these agents is that they are not meant to replace humans. By compensating for the limits of human efficiency, memory, and computing power, they let us devote more energy to the areas that truly need human intelligence. Just as telescopes extended the human field of view, these agent tools are expanding the boundaries of our cognition.

In the foreseeable future, each of us may have one or more agents as assistants: an agent mentor who helps us learn new things, an agent secretary who manages our schedules, an agent doctor who looks after our health, an agent partner who creates content, and more. But like all the great tools in history, they will not replace us. They will make us stronger and eventually become part of human capability.
