Against the backdrop of today's rapid progress in artificial intelligence, Nvidia founder and CEO Jensen Huang delivered a landmark speech at the VivaTech conference in Paris, France. He not only laid out the latest advances in AI technology but also offered a forward-looking prediction: humanoid robots will become one of the largest industries of the future.
Content source: Jensen Huang's GTC-style keynote at the VivaTech conference in Paris, France, on June 11.
On June 11, Nvidia founder and CEO Jensen Huang appeared at the VivaTech conference in Paris, France, and delivered his iconic GTC keynote.
Huang once again delivered a long speech spanning AI, computing architecture, humanoid robots and the future of industry, announcing that “the new industrial revolution driven by AI has arrived”.
Using a two-ton, roughly $3 million GB200 system as his centerpiece, Huang tied together key concepts such as AI factories, agents, humanoid robots, digital twins and quantum computing, arguing that AI is not merely a tool but a new core of productivity.
Compared with past keynotes, this one was more engineering-oriented and systematic, weaving software and hardware, the ecosystem and national infrastructure into a single grand industrial picture. It showcased the ambition of NVIDIA's technology stack while sketching a path for how AI will be deployed across society over the next decade.
The following is the text of Jensen Huang's speech, lightly edited and abridged.
1. AI has entered the Agent era
Agentic AI is a very important development.
As you know, at first, when using pre-trained models, people said, “It hallucinates.” “It makes up content.” “It doesn’t have access to the latest news and data.”
All these complaints, you know: why is it trying to work out arithmetic in its head, counting numbers and adding them together? Why doesn't it just use a calculator?
So for all the abilities we associate with intelligence, everyone could find something to criticize, and the criticism was fair, because everyone has an intuitive sense of how intelligence ought to work.
But these technologies have been developed and built all over the world, and they are now coming together: retrieval-augmented generation, web search, multimodal understanding. You can read PDFs, visit websites, look at images and text, listen to audio, watch videos, and then bring all of that understanding into your context.
The model can now understand prompts about almost anything.
You can even say, "I'm going to ask you a question, but start from this image," or "Start from this text, then answer the question or do what I ask." It then reasons, plans, and evaluates its own work.
All of these capabilities are now integrated, and you can see them everywhere in the market. Agentic AI is real. Agentic intelligence is a huge leap beyond one-shot AI, and one-shot AI is the necessary foundation that lets us teach agents how to be agents.
You need a certain knowledge base and reasoning ability to be teachable. Pre-training is about making AI teachable. Post-training, reinforcement learning, supervised learning, human demonstrations, providing context, generative AI: all of these are converging into today's agentic AI.
Let’s look at an example. It is built on Perplexity, an artificial intelligence search engine.
AI agents are digital assistants.
Given a prompt, they reason and break the problem down into a multi-step plan. They use the right tools, collaborate with other agents, and draw on context held in memory to carry out tasks correctly on NVIDIA-accelerated systems.
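As a rough illustration of that loop (reason, plan, call a tool, keep memory), here is a minimal sketch in Python. The `call_llm` function and the tool registry are hypothetical placeholders, not Perplexity's or NVIDIA's actual APIs.

```python
# Minimal agent loop: reason, optionally call a tool, observe, repeat.
def call_llm(messages):
    # Placeholder: in practice this calls a hosted chat model.
    # Here it finishes immediately so the sketch is runnable as-is.
    return "FINAL: (replace call_llm with a real model to get real answers)"

TOOLS = {
    "web_search": lambda query: f"search results for {query!r}",          # stub
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),      # toy only
}

def run_agent(task, max_steps=5):
    memory = [
        {"role": "system",
         "content": "Plan step by step. Reply 'TOOL:<name>:<input>' to use a "
                    "tool, or 'FINAL:<answer>' when done."},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = call_llm(memory)                          # reason / plan
        memory.append({"role": "assistant", "content": reply})
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        if reply.startswith("TOOL:"):
            _, name, arg = reply.split(":", 2)            # act with a tool
            observation = TOOLS.get(name, lambda a: "unknown tool")(arg)
            memory.append({"role": "user", "content": f"Observation: {observation}"})
    return "step budget exhausted"

print(run_agent("Estimate startup costs for a food truck in Paris."))
```

The structure is the point: each pass through the loop is another full model call with the growing memory as context, which is exactly why agentic workloads generate so many more tokens than a one-shot chatbot.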
It all starts with a simple prompt. Let’s ask Perplexity to help open a food truck in Paris.
First, the Perplexity agent reasons and plans through prompts, then calls in other agents to help solve each step using multiple tools.
Market researchers read reviews and reports to spot trends and analyze competitive markets.
Based on this research, a concept designer explores local ingredients, comes up with a menu with estimated preparation times, and studies color palettes to generate a brand identity. A financial planner then uses Monte Carlo simulation to predict profitability and growth trajectories (a minimal sketch of such a simulation follows this example).
An operations planner creates a launch schedule covering every detail, from purchasing equipment to obtaining the right permits.
A marketing agent develops a launch plan that includes social media campaigns and even writes an interactive website with maps, menus, and online ordering.
Each agent’s work is compiled into a final proposal.
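To make the financial-planning step concrete, here is a minimal, hedged sketch of what a Monte Carlo profitability simulation might look like. Every number below is a made-up assumption for illustration, not anything stated in the keynote.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims = 100_000

# Hypothetical distributions for a Paris food truck's first year.
daily_customers = rng.poisson(lam=120, size=n_sims)              # customers per day
avg_ticket      = rng.normal(loc=12.0, scale=2.0, size=n_sims)   # EUR per order
food_cost_ratio = rng.uniform(0.28, 0.38, size=n_sims)           # share of revenue
fixed_costs     = 260_000                                        # EUR/year: truck lease, wages, permits

open_days = 300
revenue   = daily_customers * avg_ticket * open_days
profit    = revenue * (1 - food_cost_ratio) - fixed_costs

print(f"median annual profit: EUR {np.median(profit):,.0f}")
print(f"probability of being profitable: {(profit > 0).mean():.1%}")
```

Each simulated year draws one set of assumptions and computes the resulting profit; the distribution over 100,000 draws gives the probability of success rather than a single point estimate, which is why an agent would reach for this technique.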
It all starts with a simple prompt. A prompt like that might have generated only a few hundred tokens with the original chatbots.
But now, when that same prompt is handed to an agent to solve the problem, it has to generate 10,000 times more tokens.
That is why the Grace Blackwell system is needed (the rack-scale system that behaves like one giant GPU), and why we need far greater performance gains from one generation of systems to the next. This is how Perplexity builds its agents. Every company will have to build its own.
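To put the 10,000x figure in perspective, a quick back-of-envelope calculation; the per-query numbers here are illustrative assumptions, not NVIDIA's.

```python
chatbot_tokens_per_query = 300        # a one-shot answer: a few hundred tokens
agent_multiplier         = 10_000     # reasoning, tool calls, multi-agent chatter
agent_tokens_per_query   = chatbot_tokens_per_query * agent_multiplier

# To keep an interactive feel (answer within ~10 seconds), throughput must scale too.
tokens_per_second_needed = agent_tokens_per_query / 10
print(f"{agent_tokens_per_query:,} tokens per query "
      f"-> ~{tokens_per_second_needed:,.0f} tokens/s for a 10-second answer")
```

Three million tokens per query at interactive latency is the kind of demand that motivates the generational performance jump Huang describes.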
This is great: you will recruit agents from platforms such as OpenAI, Gemini (Google's family of AI models), Microsoft Copilot, Perplexity, and Mistral (a startup founded by former DeepMind and Meta researchers to build large general-purpose models). There will also be agents tailored to you. They might help you plan a vacation or do some research, and so on.
However, if you are running a company, you will need dedicated agents with access to specialized tools and specialized skills. So the question is, how do you build these agents?
Therefore, we have created a platform for you. We’ve created a framework and a suite of tools for you to use, as well as a large group of partners to help you achieve your goals.
It all starts at the bottom layer: the reasoning-model capabilities I mentioned earlier. NVIDIA's NeMo and Nemotron reasoning large language models are world-class.
We have NeMo Retriever, a multimodal, semantic search engine. Unbelievable. We built a blueprint, a working demo, basically a general-purpose agent. We call it AI-Q, AIQ.
At the top, we have a set of tools that let you take an agent, a general-purpose agent, curate data to teach it, evaluate it, set safety guardrails, fine-tune it with supervision and reinforcement learning, carry it all the way through deployment, and keep it safe and secure.
This toolkit and these libraries are integrated into the AIOps ecosystem; you can also download them directly from our website, but they are mainly available through AIOps partners. On top of this, you can build your own dedicated agents.
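As a small sketch of how these microservices are typically consumed, NIM endpoints expose an OpenAI-compatible API, so a dedicated agent can call a hosted Nemotron-style model with the standard `openai` client. The endpoint URL and model name below are illustrative assumptions; check NVIDIA's catalog for the actual identifiers for your deployment.

```python
from openai import OpenAI  # pip install openai

# NIM microservices speak the OpenAI-compatible chat API; the base_url and
# model id below are placeholders, substitute the ones for your deployment.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="YOUR_NVIDIA_API_KEY",
)

response = client.chat.completions.create(
    model="nvidia/llama-3.1-nemotron-70b-instruct",   # example model id
    messages=[
        {"role": "system", "content": "You are a planning agent for a retail company."},
        {"role": "user", "content": "Draft a three-step plan to audit our inventory data."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```

Because the interface is the same whether the NIM runs in a public cloud, a regional cloud, or on-premises, the agent code does not change when the deployment location does.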
Now the question is, how do you deploy all of this? As I mentioned earlier, NVIDIA computing is available in the public clouds. There are also regional clouds from what we call NCPs, NVIDIA Cloud Partners; here, for example, Mistral.
You may have a private cloud for security needs and data reasons.
So how do you run all of this? The pieces may sit in different places, because these are microservices: AI services that can talk to one another, and they can obviously do so over the network.
So, how do you deploy all these microservices? Now, we have a great system.
I'm excited to announce this for you: DGX Lepton. What you see here is a variety of different clouds.
Here is Lambda Cloud (Lambda's GPU cloud), AWS (Amazon Web Services, the world's most comprehensive and widely used cloud platform), you know. This could be your own developer machine, your own system: a DGX workstation. NeBS, Yotta (a data-infrastructure company), Nscale (a UK AI cloud service provider). It could be AWS, or it could be GCP (Google Cloud Platform). NVIDIA's architecture is everywhere.
So you can decide where to run your model.
You deploy it through a super cloud, so it’s a cloud within a cloud.
Once you get it up and running, once you deploy those NIMs into Lepton, it hosts and runs them on the various clouds of your choice. One model architecture, deployed once, runs anywhere. You can even run it on this tiny machine.
This is my favorite little machine.
We built an AI supercomputer in 2016. It was called DGX-1.
It was the first version of all the technologies I just mentioned: eight Volta GPUs interconnected via NVLink, the high-speed bus and communication protocol developed by NVIDIA. We spent billions of dollars to build it, and on the day we announced DGX-1, there were no customers, no interest, no applause.
So we built it anyway. Thankfully, a young company, a non-profit startup in San Francisco, was so happy to see the computer that they said, "Can we get one?" I was like, "Oh my God, we're selling one."
But then I found out it was a nonprofit. So I put a DGX-1 in my car and drove it to San Francisco. The name of that company was OpenAI.
Imagine you're in Lepton. It's right there in your browser. You develop an AI agent and you want to run part of it on AWS and part of it somewhere else, you know, in a regional cloud.
You use Lepton, deploy your Helm Chart, and it magically appears here.
So we're doing this with Lepton, and here's what comes next: Hugging Face and NVIDIA have connected Hugging Face and Lepton together.
So whenever you train a model on Hugging Face, if you want to deploy it to Lepton, or directly to Spark, that's fine.
Just one click. Whether you're doing training or inference, we're now connected to Hugging Face, and Lepton will help you decide where to deploy.
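The "train on Hugging Face, deploy anywhere" flow starts from an ordinary Hub checkpoint. A minimal sketch of the Hugging Face side using the `transformers` library; the model id is just an example, and the one-click deployment to Lepton itself happens in the product, not in this code.

```python
from transformers import pipeline  # pip install transformers torch

# Load a small open checkpoint from the Hugging Face Hub; a model you have
# fine-tuned and pushed to the Hub would be referenced the same way.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

out = generator(
    "List three things to check before deploying a model to production:",
    max_new_tokens=80,
)
print(out[0]["generated_text"])
```

The same checkpoint identifier is what a hosted deployment would reference, which is what makes the "train here, run there" handoff a one-step operation.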
2. AI industrial revolution and digital twin
This is a factory digital twin that Mercedes-Benz built in Omniverse, NVIDIA's graphics and simulation platform based on NVIDIA RTX and Pixar's Universal Scene Description (USD).
This is the digital twin of a warehouse that Schaeffler built in Omniverse.
These are your train stations here in France: a digital twin of their stations built in Omniverse.
This is Toyota building a digital twin of their warehouse in Omniverse.
When you build these warehouses and factories in Omniverse, you can design them, plan them, and change them.
It works for greenfield sites and brownfield sites alike. You can simulate a change before physically moving or adjusting anything, instead of discovering afterward that the result is not optimal.
So the ability to digitize everything as a digital twin is incredible. But why does a digital twin have to look photorealistic? Why does it have to obey the laws of physics?
The reason is that, ultimately, we want digital twins in which robots can learn how to operate as robots. Robots perceive the world through photons, and in the twin those photons are generated by Omniverse.
The robot needs to interact with the physical world so it knows whether it is doing the right thing and can learn to do it right. That is why these digital twins must look real and behave realistically. Do you understand? That's why Omniverse was built.
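Omniverse scenes are described in OpenUSD, which can be authored programmatically. Here is a minimal sketch using the open-source `pxr` (usd-core) Python bindings; the layout is a toy stand-in for a real warehouse twin, not an actual customer scene.

```python
from pxr import Usd, UsdGeom, Gf  # pip install usd-core

# Author a toy "warehouse" USD stage: a floor and a row of shelf placeholders.
stage = Usd.Stage.CreateNew("warehouse_twin.usda")
UsdGeom.SetStageUpAxis(stage, UsdGeom.Tokens.z)

UsdGeom.Xform.Define(stage, "/Warehouse")

floor = UsdGeom.Cube.Define(stage, "/Warehouse/Floor")
floor.GetSizeAttr().Set(1.0)
UsdGeom.XformCommonAPI(floor.GetPrim()).SetScale(Gf.Vec3f(50.0, 30.0, 0.1))

for i in range(5):  # five shelving units along one aisle
    shelf = UsdGeom.Cube.Define(stage, f"/Warehouse/Shelf_{i}")
    xf = UsdGeom.XformCommonAPI(shelf.GetPrim())
    xf.SetTranslate(Gf.Vec3d(-20.0 + i * 8.0, 0.0, 1.0))
    xf.SetScale(Gf.Vec3f(1.0, 6.0, 2.0))

stage.GetRootLayer().Save()
print("wrote warehouse_twin.usda with", len(list(stage.Traverse())), "prims")
```

A real twin would attach physics properties, materials, and sensor models to prims like these; the point here is only that the twin is an openly described, scriptable scene rather than a closed rendering.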
This is a digital twin of a fusion reactor. This is an extremely complex instrument, as you know: without artificial intelligence, the next generation of fusion reactors would not be possible.
We are announcing today that we will be building the world’s first industrial AI cloud here in Europe. I’m going to announce – yes.
This industrial AI cloud is, of course, a large pool of computing resources; there are a great many computers in it. But its requirements for performance and security are fundamentally different. I'll tell you more about it on Friday.
Today I'm just giving you a teaser. But this industrial cloud will be used for design and simulation. In a virtual wind tunnel, you can simply drive the car in and see how it performs.
Opening doors, opening windows, changing designs, all operations are carried out in real time.
As you know, we’ve been here for a long time. NVIDIA has been around for 33 years. We first came to Europe at a time when workstation and product digitalization was on the rise.
We are now in the era of the digital twin revolution, with a nearly two-trillion-dollar ecosystem in Europe that we work with and have the privilege of supporting.
And now a new revolution is taking place.
As you know, everything that moves will be robotic. Everything that moves will be powered by artificial intelligence. And cars are the next most obvious area.
NVIDIA builds AI supercomputers for training models and AI supercomputers for Omniverse digital twins. We also build AI supercomputers that go into the robots themselves.
Whether in the cloud, in Omniverse, or in the car, we offer the full technology stack: the computer itself and the operating system that runs on it.
The computer is high-speed and sensor-rich, and it has to be functionally safe: under no circumstances can it fail completely. The safety requirements are therefore extremely high.
Now we have an incredible model running on top of it: a transformer model.
It is a reasoning model that takes sensor input: you tell it where you want to go and it takes you there. It receives pixels as input and generates a planned path as output. So it is a transformer-based generative AI model.
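A hedged sketch of the idea, not NVIDIA's actual driving stack: a tiny PyTorch transformer that consumes a sequence of camera-feature tokens plus a goal embedding and directly emits future trajectory waypoints. All dimensions and names are arbitrary.

```python
import torch
import torch.nn as nn

class ToyDrivePlanner(nn.Module):
    """Toy transformer: sensor tokens + goal embedding -> future waypoints."""

    def __init__(self, d_model=128, n_waypoints=16):
        super().__init__()
        self.goal_proj = nn.Linear(3, d_model)           # (goal x, goal y, desired speed)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.waypoint_head = nn.Linear(d_model, n_waypoints * 2)  # (x, y) per step
        self.n_waypoints = n_waypoints

    def forward(self, sensor_tokens, goal):
        # sensor_tokens: (batch, n_tokens, d_model) camera/lidar features
        # goal:          (batch, 3) where the rider wants to go
        goal_token = self.goal_proj(goal).unsqueeze(1)
        x = torch.cat([goal_token, sensor_tokens], dim=1)
        x = self.encoder(x)
        return self.waypoint_head(x[:, 0]).view(-1, self.n_waypoints, 2)

planner = ToyDrivePlanner()
waypoints = planner(torch.randn(1, 64, 128), torch.tensor([[12.0, 3.0, 8.0]]))
print(waypoints.shape)  # torch.Size([1, 16, 2]): planned (x, y) path points
```

The real systems are vastly larger and fuse many sensors, but the shape of the problem is the same: perception tokens in, a generated trajectory out.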
A billion cars are on the road, each driving an average of 10,000 miles a year: on the order of ten trillion miles in total. The future of autonomous driving is clearly huge, and it will be powered and supported by artificial intelligence.
This is the next big opportunity, and we are working with so many large and exceptional companies around the world to make it possible. Safety is always at the heart of all our work related to autonomous driving.
We are very proud of our HALOS system. It spans the chip architecture, chip and system design, the operating system, the AI models, and our software development methodology: how we test, how we train the models, the data we feed them, and how we evaluate them.
NVIDIA's HALOS system and our autonomous-driving safety team and capabilities are recognized worldwide. This computer was the first software-defined car computer: the world's first fully software-defined, AI-powered stack for autonomous vehicles. We've been doing this for almost a decade, and I'm very proud of this capability, which is recognized worldwide.
The changes taking place in the automotive industry are also playing out in another emerging industry.
As I mentioned before, if AI can perceive and reason, and if it can generate video from a prompt, generate text and images, and generate the steering path I just described, why can't it also generate locomotion and joint motion?
Therefore, the fundamental ability of AI to revolutionize one of the hardest problems in robotics is on the horizon.
Humanoid robots will become a reality. We now know how to build these things, train these things, and operate these things.
Humanoid robots could become one of the largest industries ever, and it will require companies that know how to make things, and make them with extraordinary capability. That describes Europe: many of the world's manufacturing industries are based here. I think this will be a huge opportunity.
Let’s say there are a billion robots in the world. Having a billion robots is a very reasonable idea. So, why hasn’t this happened yet? The reason is simple.
Today, programming robots is too complex. Only the largest companies can afford to install them: to teach the robot, program it to perform exactly the right motions, and keep it properly caged for safety. That's why it is the world's largest car companies that are equipped with robots.
They are large enough, and the work is repetitive enough, that the scale justifies deploying robots in those factories. But that is not the case for almost all small and medium-sized businesses, whether mom-and-pop shops, restaurants, stores or warehouses.
Just as with agentic AI, where the NeMo toolkit lets agents learn by being taught, we now have the same kind of toolkit for robots.
Here, too, NVIDIA's approach is built on a three-layer stack. We built this computer, called Thor. The dev kit looks roughly like this: a completely self-contained robotics computer.
The development kit sits on your desktop. These are the sensors, and inside is the Thor chip, a small supercomputer.
This is the Thor processor. Above it sits an operating system designed for robots. On top of that, a transformer model takes in sensor data and instructions and turns them into trajectories and motion control for the arm joints and, of course, the leg joints.
The biggest challenge for humanoid robots today is that the data required for training is very, very difficult to obtain.
So the question is how do you do that? The solution to this problem is to go back to Omniverse, a digital twin world that follows the laws of physics.
This is incredible work that we are doing. We developed computers to simulate these robots and train them.
A large number of humanoid robot companies are being founded around the world. They all see a huge opportunity to revolutionize this new field; you could call it a new kind of device, and it is progressing very quickly. Their robots learn in a virtual world that obeys the laws of physics.
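A hedged illustration of why simulation helps with the data problem: domain randomization generates many training episodes with varied physics parameters, so a policy sees far more diversity than real-robot data collection could affordably provide. The simulator below is just a toy one-dimensional stand-in for Omniverse or any physics engine; the structure (randomize parameters, roll out, record observation-action pairs) is the point.

```python
import numpy as np

rng = np.random.default_rng(42)

def rollout(mass, friction, steps=50, dt=0.02):
    """Toy 1-D physics stand-in: push a block toward a target with a P-controller."""
    pos, vel, target = 0.0, 0.0, 1.0
    data = []
    for _ in range(steps):
        action = 8.0 * (target - pos) - 2.0 * vel       # scripted "expert" controller
        accel = (action - friction * vel) / mass
        vel += accel * dt
        pos += vel * dt
        data.append(((pos, vel, target), action))        # (observation, action) pair
    return data

dataset = []
for _ in range(1_000):                                   # 1,000 randomized episodes
    mass = rng.uniform(0.5, 3.0)                         # randomized dynamics per episode
    friction = rng.uniform(0.1, 1.5)
    dataset.extend(rollout(mass, friction))

print(f"{len(dataset):,} (observation, action) pairs generated for imitation learning")
```

A policy trained across the randomized dynamics is more likely to transfer to the real robot, whose true mass and friction lie somewhere inside the randomized range.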
An industrial revolution has begun.
The next wave of artificial intelligence has begun.
Grek is a perfect example of what is possible at this stage of robotics: teach the robot the skills it needs to operate, run the simulations, and, of course, an incredible robot is now standing in front of us. We have physical robots, and we have information robots, which we call agents.
The next wave of AI has begun: an explosion of inference workloads that will grow essentially exponentially. The number of people using these reasoning systems has gone from eight million to eight hundred million, a hundredfold increase in just a few years.
The number of tokens generated per prompt, as I mentioned earlier, has gone from a few hundred to many thousands, and of course we are now using AI more than ever.
So, we need a computer designed specifically for thinking, designed for reasoning, and that’s Blackwell, a thinking machine.
These Blackwells will go into a new kind of data center, essentially an AI factory, designed to do one thing: these AI factories generate tokens, the tokens that feed your applications.
What is truly incredible, and what I am excited to see, is that Europe is fully committed to AI. The AI infrastructure built here will grow by an order of magnitude in the coming years.
3. The inflection point of quantum computing and CUDA-Q
Quantum computing is at an inflection point.
In 1995, a quantum error-correction algorithm was invented. In 2023, nearly 30 years later, Google demonstrated the world's first logical qubit (a logical qubit is a core concept in quantum computing, designed to overcome the noise and decoherence that physical qubits suffer in real-world conditions).
In the years since, the number of logical qubits (each built from a large number of error-corrected physical qubits) has kept increasing.
The number of logical qubits is starting to grow, and, much like Moore's Law, I can easily expect it to increase tenfold every five years, a hundredfold every decade.
These logical qubits will have better error correction capabilities: more robust, more performant, more resilient, and of course will continue to be scalable. Quantum computing is reaching an inflection point.
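Written out, that hedged projection is simple exponential growth rather than a measured law. With $N_0$ logical qubits today and $t$ measured in years:

$$N(t) \approx N_0 \cdot 10^{t/5}, \qquad N(5) = 10\,N_0, \qquad N(10) = 100\,N_0$$

The tenfold-per-five-years and hundredfold-per-decade figures in the speech are just the two sample points of this curve.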
In the next few years, or at least in the next generation of supercomputers, each will be assigned a quantum processing unit (QPU), and the QPU will be connected to the GPU.
The QPU performs the quantum computation, of course, while the GPUs handle preprocessing, control, and error correction, where the computation is extremely intensive, as well as post-processing.
Between these two architectures, just as GPUs once accelerated CPUs, QPUs will now work together with GPUs to drive the next generation of computing.
Today we are announcing that our entire quantum algorithm stack is now accelerated on the Grace Blackwell 200. The acceleration is incredible.
We work with the computing, communications, and quantum computing industries in a variety of ways.
One way is to use cuQuantum, a software development kit that accelerates quantum computing, to simulate qubits and the algorithms that run on quantum computers. Basically, it uses classical computers to simulate or emulate quantum computers.
At the other end, and crucially important, is CUDA-Q: essentially a new CUDA (CUDA is the parallel computing platform and programming model designed and developed by NVIDIA), extending CUDA into the quantum-classical realm.
In this way, applications developed on CUDA-Q can run in simulation before quantum computers arrive, and run cooperatively with them once they do: a quantum-classical approach to accelerated computing.
Today we are announcing that CUDA-Q is available for Grace Blackwell.
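A minimal CUDA-Q example in Python, preparing and sampling a Bell state. The `nvidia` target selects the GPU-accelerated simulator backend (availability depends on your install and hardware); the same kernel is meant to target real QPUs as they arrive.

```python
import cudaq  # pip install cudaq

cudaq.set_target("nvidia")          # GPU-accelerated state-vector simulator

@cudaq.kernel
def bell():
    qubits = cudaq.qvector(2)
    h(qubits[0])                    # put the first qubit in superposition
    x.ctrl(qubits[0], qubits[1])    # entangle the pair with a CNOT
    mz(qubits)                      # measure both qubits

counts = cudaq.sample(bell, shots_count=1000)
print(counts)                        # expect roughly half '00' and half '11'
```

The same program can be pointed at different targets without rewriting the kernel, which is the "write once, run simulated now, run hybrid later" idea behind CUDA-Q.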
The ecosystem here is extremely rich, and of course Europe has a deep heritage in science and supercomputing expertise.
It is not surprising to see progress in quantum computing here. In the coming years, we will see a truly wonderful turning point.
In 2012, we worked with developers on a new class of algorithms called deep learning, which led to the AlexNet (a deep convolutional neural network) big bang of AI. Over the roughly fifteen years since, AI has made incredibly rapid progress.
The first wave of artificial intelligence is perception, which allows computers to recognize and understand information.
The second wave of artificial intelligence is the generative artificial intelligence that most of us have been talking about in the past five years or so. It is multimodal, meaning that the AI is able to learn both images and language.
So, you can prompt it with language and it will generate images. AI’s multimodal capabilities and ability to translate and generate content are driving a revolution in generative AI. Generative AI, the ability to generate content, is crucial for our productivity.
We are starting a new wave of artificial intelligence. Over the past few years, we have witnessed tremendous advancements in AI capabilities. Fundamentally, intelligence is the task of understanding, perceiving, reasoning, and planning:
How to solve the problem and then perform the task. Perception, reasoning, planning, the basic cycle of intelligence. It allows us to apply some previously learned rules to solve problems we have never seen before.
This is why smart people are considered smart: they can break a complex problem down step by step, reason about how to solve it, perhaps do some research, maybe learn something new or ask for help, and use tools to work through the problem step by step.
Everything I just described is basically possible today through so-called agentic AI. I'll show you more shortly.
Now, generative power is being applied to motion: not generating video, images, or text, but generating motor skills, the ability to walk, reach for something, and use tools.
Giving AI a physical form is essentially robotics. These are the foundational technologies for agents, the information robots, and for embodied AI, the physical robots: two fundamental capabilities that have now arrived.
The era of AI is truly exciting. But it all started with GeForce. GeForce brought computer graphics, the first accelerated computing application we ever developed.
The evolution of computer graphics has been incredible. GeForce brought CUDA to the world, enabling machine learning and AI researchers to advance deep learning.
Subsequently, deep learning revolutionized computer graphics technology, allowing us to take computer graphics to a whole new level.
This is the all-new GeForce: it weighs two tons, maybe two and a half. It consists of 1.2 million parts and costs about $3 million. It is manufactured across 150 factories, with 200 technology partners working with us to make it possible.
Roughly $40 billion of R&D went into it, and we are now moving toward the GB300, which is already fully in production.
This machine is designed to be a thinking machine. The so-called thinking machine means that it can reason. It has a plan. It spends a lot of time talking to itself, just like you.
We spend most of our time generating words in our minds, and before we express them, we generate images in our minds. A thinking machine is, in fact, the architectural goal of the Grace Blackwell design. It is designed to be one huge GPU.
There is a good reason why I use this analogy. GeForce is a GPU, and so is the GB200, which is a huge virtual GPU.
Moore's Law and semiconductor physics can deliver only about a doubling of performance every three to five years. What we need is a 30 to 40x improvement, because the reasoning model is talking to itself. How can we achieve a 30 to 40x performance improvement in a single generation?
It's no longer one-shot, ChatGPT-style inference, but a reasoning model.
When the model thinks to itself, it generates more tokens. It breaks the problem down step by step, reasons, tries different paths: maybe a chain of thought, maybe a tree of thought. It reflects on its answers.
When you watch these reasoning models work, they reflect on an answer and ask, "Is this a good answer? Can I do better?" And then they say, "Yes, I can do better."
Then they go back and think about it again. These thinking models, these reasoning models, achieve amazing results, but that requires far more computing power.
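A hedged sketch of why reflection multiplies token counts: each "think again" pass is another full model call, so a handful of reflection rounds easily turns a few hundred output tokens into many thousands. `call_llm` is the same kind of hypothetical placeholder used earlier.

```python
def call_llm(messages):
    # Hypothetical model call; returns (text, tokens_generated).
    return "draft answer ...", 800

def reflect_and_answer(question, rounds=4):
    total_tokens = 0
    answer, used = call_llm([{"role": "user", "content": question}])
    total_tokens += used
    for _ in range(rounds):
        critique, used = call_llm(
            [{"role": "user", "content": f"Critique this answer:\n{answer}"}])
        total_tokens += used
        answer, used = call_llm(
            [{"role": "user", "content": f"Improve the answer using:\n{critique}"}])
        total_tokens += used
    return answer, total_tokens

_, tokens = reflect_and_answer("Plan a factory retooling schedule.")
print(f"tokens generated with reflection: {tokens:,}")   # vs ~800 for a one-shot answer
```

With four critique-and-revise rounds, nine model calls replace one, roughly a 9x increase in generated tokens before any tool use or multi-agent collaboration is added on top.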
We now know for sure that AI is a software that has the potential to revolutionize every industry.
It can do these amazing things; that much we know. We also know that the way we work with AI is fundamentally different from the way we worked with hand-written software.
Machine learning software is developed differently and operates differently. The architecture of the system, the architecture of the software: completely different. The way the network works is completely different. The way to access storage is completely different.
So we know that this technology can do different things: incredible things, it’s smart. We also know that it is developed fundamentally differently: it requires new computers.
What is really interesting is what this means for countries, for enterprises, and for society. This is a phenomenon we noticed almost a decade ago, and now everyone is starting to realize it:
In fact, these AI data centers are not data centers at all, not data centers in the traditional sense, where files are stored so that you can retrieve them.
These facilities do not store our files. They have only one job: generating tokens, generating intelligence.
These AI factories look like data centers because there are a lot of computers in them.
No one really thinks of their data center as a revenue-generating facility. When I say that, everyone says, "Yes, I think you're right." Nobody thinks of a data center as a revenue-generating facility.
But they do see their factory, the car factory, as a revenue-generating facility: they can't wait to build another one, because every time you build a factory, revenue grows quickly. You can make more things for more people.
These AI factories are revenue-generating facilities built to manufacture tokens, and those tokens can be turned into productive intelligence across many industries. AI factories are therefore becoming part of a nation's infrastructure.
That's why you see me traveling around the world talking to heads of state: because they all want AI factories. They all want AI to be part of their infrastructure. They want AI to be a growth industry for them.
This is really far-reaching. What we are talking about is a new industrial revolution: every industry is affected, and a new industry is being born.
Just as electricity was first described and presented as a technology and only later understood to be an enormous industry in its own right, the same happened with information, the industry we now call the Internet.
Both of these transformed many industries and became part of our infrastructure. Now we have a new industry, the artificial intelligence industry, and it is becoming part of a new infrastructure: intelligence infrastructure. Every country, every society, every company will depend on it.