Summary

Gary Shapiro, CEO of the Consumer Technology Association, opened CES 2025 emphasizing the transformative power of technology in addressing global challenges and creating opportunities. He introduced Jensen Huang, CEO of NVIDIA, who highlighted NVIDIA...

Transcript

Speaker 18:07 - 18:30

CES isn't just about what's next. It's about what's possible. And when technology and humanity intersect, the answer is anything. That's because tech doesn't just solve challenges. It transforms them into opportunities. It helps us move smarter.

Speaker 18:30 - 18:57

Live healthier. And experience the world in ways we never thought possible. We are not just here for a tech event. We're here to connect. To solve. To discover. Together. Tech isn't just advancing. It's uniting. Bringing us closer to an autonomous future. Connecting us to better care. Making life more connected, more dynamic, and more human.

Speaker 18:57 - 19:13

Today's challenges demand bold solutions, and CES is where they start to take shape. Breakthroughs in sustainability, advancements to help feed our growing world. This week isn't just a stage for breakthroughs, it's the spark for discovery.

Speaker 19:14 - 19:41

Every screen, every pixel, every part of the tech you will see here showcases the extraordinary potential of human ingenuity meeting technological power. Now we begin this celebration of what connects us, has the power to solve our greatest challenges, and offers endless possibilities we're yet to discover. Right here, right now. The world is watching. So let's dive in.

Speaker 19:47 - 20:12

Ladies and gentlemen, welcome to CES 2025. I'm Gary Shapiro, CEO and Vice Chairman of the Consumer Technology Association, the producer of CES, and I am so thrilled to kick off this show with a keynote by one of the most consequential companies in the world.

Speaker 20:12 - 20:32

NVIDIA exemplifies the cutting-edge innovation we celebrate at CES, and founder and CEO Jensen Huang is a true visionary, demonstrating the power of ideas, technology, and conviction to drive innovation and reshape our industry and our society.

Speaker 20:34 - 20:57

I always like to say that if I'd listened a little bit more closely the last time Jensen spoke at a CTA event, I could have retired already. But over the past three decades, he has established NVIDIA as a force driving change across the globe in industries ranging from healthcare to automotive and entertainment.

Speaker 20:58 - 21:22

Today, NVIDIA is pioneering breakthroughs in AI and accelerated computing that touch nearly every person and every business. Thanks to his leadership, NVIDIA's innovations enable advanced chatbots, robots, software-defined vehicles, huge virtual worlds, hyper-synchronized factory floors, and so much more.

Speaker 21:22 - 21:39

Huang has been named the world's best CEO by Fortune and The Economist, as well as one of Time Magazine's 100 most influential people in the world. But you know, the fact is, like for all of us in this room, our success and his success was not preordained.

Speaker 21:41 - 22:10

Jensen started out working at a Denny's as a dishwasher and a busboy, so be nice to them in the future. And he said that the lessons he's learned there, the value of hard work, humility, and hospitality, are what helped him keep the faith and persevere through some of NVIDIA's early challenges. In just a few minutes, we'll hear from NVIDIA founder and CEO Jensen Huang on his unwavering vision of the future and where we're headed next.

Speaker 22:11 - 22:14

Stay tuned and have a great CES.

Speaker 26:38 - 27:06

This is how intelligence is made. A new kind of factory, generator of tokens, the building blocks of AI. Tokens have opened a new frontier, the first step into an extraordinary world where endless possibilities are born.

Speaker 27:11 - 27:40

Tokens transform words into knowledge and breathe life into images. They turn ideas into videos and help us safely navigate any environment. Tokens teach robots to move like the masters, inspire new ways to celebrate our victories,

Speaker 27:40 - 28:10

And give us peace of mind when we need it most. They bring meaning to numbers to help us better understand the world around us.

Speaker 28:13 - 28:40

Predict the dangers that surround us. And find cures for the threats within us. Tokens can bring our visions to life,

Speaker 28:48 - 29:16

And restore what we've lost. Zachary, I got my voice back, buddy. They help us move forward. One small step at a time. And one giant leap. Together.

Speaker 29:26 - 29:48

And here is where it all begins. Welcome to the stage, NVIDIA founder and CEO, Jensen Huang.

Speaker 29:57 - 30:26

Welcome to CES! Are you excited to be in Las Vegas? Do you like my jacket? I thought I'd go the other way from Gary Shapiro. I'm in Las Vegas after all. If this doesn't work out, if all of you object, well, just get used to it. I really think you have to let this sink in.

Speaker 30:28 - 30:54

In another hour or so, you're going to feel good about it. Well, welcome to NVIDIA. In fact, you're inside NVIDIA's digital twin. And we're going to take you to NVIDIA. Ladies and gentlemen, welcome to NVIDIA. You're inside our digital twin.

Speaker 30:58 - 31:26

Everything here is generated by AI. It has been an extraordinary journey, extraordinary year, and it started in 1993. Ready, go! With NV1, we wanted to build computers that can do things that normal computers couldn't. And NV1 made it possible to have a game console in your PC.

Speaker 31:27 - 31:57

Our programming architecture was called UDA. Missing the letter C until a little while later, but UDA, Unified Device Architecture. And the first developer for UDA and the first application that ever worked on UDA was Sega's Virtua Fighter. Six years later, we invented, in 1999, the programmable GPU. And it started

Speaker 31:58 - 32:26

20 years, 20 plus years of incredible advance in this incredible processor called the GPU. It made modern computer graphics possible. And now, 30 years later, Sega's Virtua Fighter is completely cinematic. This is the new Virtua Fighter project that's coming. I just can't wait. Absolutely incredible.

Speaker 32:27 - 32:55

Six years after that, six years after 1999, we invented CUDA so that we could explain or express the programmability of our GPUs to a rich set of algorithms that could benefit from it. CUDA initially was difficult to explain, and it took years, in fact. It took approximately six years. Somehow, six years later,

Speaker 32:56 - 33:24

Six years later or so, 2012, Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton discovered CUDA, used it to process AlexNet, and the rest of it is history. AI has been advancing at an incredible pace since. It started with perception AI. We now can understand images and words and sounds.

Speaker 33:25 - 33:52

To generative AI, we can generate images and texts and sounds. And now, agentic AI, AIs that can perceive, reason, plan, and act. And then the next phase, some of which we'll talk about tonight, physical AI, 2012. Now magically, 2018, something happened that was pretty incredible.

Speaker 33:54 - 34:23

Google's Transformer was released as BERT, and the world of AI really took off. Transformers, as you know, completely changed the landscape for artificial intelligence. In fact, it completely changed the landscape for computing altogether. We recognized properly that AI was not just a new application with a new business opportunity, but AI

Speaker 34:23 - 34:51

More importantly, machine learning enabled by transformers was going to fundamentally change how computing works. And today, computing is revolutionized in every single layer, from hand coding, instructions that run on CPUs to create software tools that humans use. We now have machine learning that creates and optimizes neural networks

Speaker 34:52 - 35:17

That processes on GPUs and creates artificial intelligence. Every single layer of the technology stack has been completely changed. An incredible transformation in just 12 years. Well, we can now understand information of just about any modality. Surely you've seen text and images and sounds and things like that.

Speaker 35:17 - 35:38

But not only can we understand those, we can understand amino acids, we can understand physics. We understand them, we can translate them, and generate them. The applications are just completely endless. In fact, almost any AI application that you see out there, what modality is the input that it learned from?

Speaker 35:39 - 36:07

What modality of information did it translate to? And what modality of information is it generating? If you ask these three fundamental questions, just about every single application could be inferred. And so when you see application after applications that are AI driven, AI native, at the core of it, this fundamental concept is there. Machine learning has changed how every application is going to be built, how computing will be done.

Speaker 36:07 - 36:37

And the possibilities beyond. Well, GPUs, GeForce, in a lot of ways, all of this with AI is the house that GeForce built. GeForce enabled AI to reach the masses. And now, AI is coming home to GeForce. There are so many things that you can't do without AI. Let me show you some of it now.

Speaker 38:11 - 38:32

That was real-time computer graphics. No computer graphics researcher, no computer scientist would have told you that it is possible for us to ray trace every single pixel at this point.

Speaker 38:33 - 38:58

Ray tracing is a simulation of light. The amount of geometry that you saw was absolutely insane. It would have been impossible without artificial intelligence. There are two fundamental things that we did. We used, of course, programmable shading and ray traced acceleration to produce incredibly beautiful pixels. But then we have artificial intelligence be conditioned

Speaker 38:59 - 39:26

Be controlled by that pixel to generate a whole bunch of other pixels. Not only is it able to generate other pixels spatially, because it's aware of what the colors should be, it has been trained on a supercomputer back in NVIDIA, and so the neural network that's running on the GPU can infer and predict the pixels that we did not render. Not only can we do that, it's called DLSS,

Speaker 39:27 - 39:55

The latest generation of DLSS also generates beyond frames. It can predict the future, generating three additional frames for every frame that we calculate. What you saw, if we just said four frames of what you saw, because we're gonna render one frame and generate three. If I said four frames at full HD, 4K, that's 33 million pixels or so. Out of that 33 million pixels,

Speaker 39:56 - 40:24

We computed only two. It is an absolute miracle that we can computationally, computationally using programmable shaders and our ray traced engine, ray tracing engine, to compute two million pixels and have AI predict all of the other 33. And as a result, we're able to render at incredibly high performance because AI does a lot less computation.
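
The pixel counts quoted in this segment can be checked with a few lines of Python. This is only a sketch: the 4K output size is standard, but the 1920x1080 internal render resolution is an assumption chosen because it matches the "two million" figure; the talk does not state the actual render resolution. `dlss_pixel_budget` is a hypothetical helper name, not an NVIDIA API.

```python
UHD_4K = (3840, 2160)           # output resolution named in the talk
ASSUMED_RENDER = (1920, 1080)   # assumption: roughly "two million" rendered pixels


def pixel_count(resolution):
    """Number of pixels in a (width, height) resolution."""
    width, height = resolution
    return width * height


def dlss_pixel_budget(frames_per_group=4, rendered_frames=1,
                      output_res=UHD_4K, render_res=ASSUMED_RENDER):
    """Return (displayed pixels, computed pixels, computed fraction).

    One frame in each group is rendered at the lower resolution; the
    remaining pixels and frames are assumed to be generated by the network.
    """
    displayed = frames_per_group * pixel_count(output_res)
    computed = rendered_frames * pixel_count(render_res)
    return displayed, computed, computed / displayed


displayed, computed, fraction = dlss_pixel_budget()
print(f"displayed: {displayed:,}  computed: {computed:,}  fraction: {fraction:.4f}")
```

Under these assumptions, about 33 million pixels are displayed per group of four frames, about 2 million are computed by shading and ray tracing, and the network infers the remaining fifteen sixteenths.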

Speaker 40:25 - 40:55

It takes, of course, an enormous amount of training to produce that. But once you train it, the generation is extremely efficient. So this is one of the incredible capabilities of artificial intelligence. And that's why there's so many amazing things that are happening. We used GeForce to enable artificial intelligence. And now artificial intelligence is revolutionizing GeForce. Everyone, today we're announcing our next generation, the RTX

Speaker 40:55 - 40:58

Blackwell family, let's take a look.

Speaker 41:56 - 42:24

Here it is, our brand new GeForce RTX 50 series Blackwell architecture. The GPU is just a beast. 92 billion transistors, 4,000 TOPS, four petaflops of AI, three times higher than the last generation Ada. And we need all of it to generate those pixels that I showed you.

Speaker 42:25 - 42:51

380 ray tracing teraflops so that we could, for the pixels that we have to compute, compute the most beautiful image you possibly can. And of course, 125 shader teraflops. There is actually a concurrent shader teraflops as well as an integer unit of equal performance. So two dual shaders. One is for floating point, one is for integer. G7 memory.

Speaker 42:51 - 43:18

From Micron, 1.8 terabytes per second, twice the performance of our last generation, and we now have the ability to intermix AI workloads with computer graphics workloads. And one of the amazing things about this generation is the programmable shader is also able to now process neural networks. So the shader is able to carry these neural networks, and as a result, we invented

Speaker 43:18 - 43:44

Neural texture compression and neural material shading. As a result of that, you get these amazingly beautiful images that are only possible because we use AIs to learn the texture, learn the compression algorithm, and as a result, get extraordinary results. Okay, so this is the brand new RTX Blackwell 5090.

Speaker 43:48 - 44:13

Now, even the mechanical design is a miracle. Look at this, it's got two fans. This whole graphics card is just one giant fan. You know, so the question is, where's the graphics card? Is it literally this big? The voltage regulator design is state-of-the-art, incredible design. The engineering team did a great job. So here it is, thank you.

Speaker 44:19 - 44:49

Okay, so those are the speeds and feeds, so how does it compare? Well, this is RTX 4090. I know, I know many of you have one. I know it. Look, it's $1,599. It is one of the best investments you could possibly make. For $1,599, you bring it home to your $10,000

Speaker 44:50 - 45:08

PC Entertainment Command Center. Isn't that right? Don't tell me that's not true. Don't be ashamed. It's liquid cooled. Fancy lights all over it. You lock it when you leave.

Speaker 45:10 - 45:30

It's the modern home theater. It makes perfect sense. And now for $1,500 and $1,599, you get to upgrade that and turbocharge the living daylights out of it. Well, now with the Blackwell family, RTX 5070, 4090 performance at $549.

Speaker 45:38 - 46:08

Impossible without artificial intelligence. Impossible without the four TOPS, four teraops of AI tensor cores. Impossible without the G7 memories. Okay, so 5070, 4090 performance, $549, and here's the whole family. Starting from 5070, all the way up to 5090. 5090, twice the performance of a 4090. Starting...

Speaker 46:08 - 46:38

Of course, we're producing a very large scale availability starting January. Well, it is incredible, but we managed to put these gigantic performance GPUs into a laptop. This is a 5070 laptop. For $1299, this 5070 laptop has a 4090 performance. I think there's one here somewhere. Let me show you this.

Speaker 46:40 - 47:02

This is a, look at this thing. Here, let me, here. There's only so many pockets. Ladies and gentlemen, Janine Paul. So can you imagine, you get this incredible graphics card here, Blackwell. We're gonna shrink it and put it in there. Does that make any sense?

Speaker 47:04 - 47:32

Well, you can't do that without artificial intelligence. And the reason for that is because we're generating most of the pixels using our tensor cores. So we ray trace only the pixels we need, and we generate using artificial intelligence all of the other pixels we have. As a result, the energy efficiency is just off the charts. The future of computer graphics is neural rendering, the fusion of artificial intelligence and computer graphics. And what's really amazing,

Speaker 47:35 - 47:54

Oh, here we go, thank you. This is a surprisingly kinetic keynote. And what's really amazing is the family of GPUs we're gonna put in here. And so the 5090, the 5090 will fit into a laptop, a thin laptop. That last laptop was 14.

Speaker 47:54 - 48:22

14.9 millimeters, you got a 5080, 5070 Ti, and 5070. Okay, so ladies and gentlemen, the RTX Blackwell family. Well, GeForce brought AI to the world, democratized AI. Now AI has come back.

Speaker 48:23 - 48:45

Let's talk about artificial intelligence. Let's go to somewhere else at NVIDIA. This is literally our office. This is literally NVIDIA's headquarters. Okay, so let's talk about AI.

Speaker 48:46 - 49:12

[The industry] is chasing and racing to scale artificial intelligence. And the scaling law is a powerful model. It's an empirical law that has been observed and demonstrated by researchers and industry over several generations. And the scaling law says that the more data you have,

Speaker 49:12 - 49:40

The training data that you have, the larger model that you have, and the more compute that you apply to it therefore, the more effective or the more capable your model will become. And so the scaling law continues. What's really amazing is that now we're moving towards, of course, and the internet is producing about twice the amount of data every single year as it did last year. I think in the next couple of years we'll produce

Speaker 49:41 - 50:02

Humanity will produce more data than all of humanity has ever produced since the beginning. And so we're still producing a gigantic amount of data, and it's becoming multimodal. Video and images and sound, all of that data could be used to train the fundamental knowledge, the foundational knowledge of an AI.

Speaker 50:03 - 50:25

But there are, in fact, two other scaling laws that have now emerged. And it's somewhat intuitive. The second scaling law is post-training scaling law. Post-training scaling law uses technologies, techniques like reinforcement learning human feedback. Basically, the AI produces and generates answers.

Speaker 50:25 - 50:52

Based on a human query, the human then of course gives a feedback. It's much more complicated than that, but that reinforcement learning system with a fair number of very high quality prompts causes the AI to refine its skills. It could fine tune its skills for particular domains. It could be better at solving math problems, better at reasoning, so on and so forth. And so it's essentially like

Speaker 50:53 - 51:22

Having a mentor or having a coach give you feedback after you're done going to school. And so you get tests, you get feedback, you improve yourself. We also have reinforcement learning AI feedback, and we have synthetic data generation. These techniques are rather akin to, if you will, self practice. You know the answer to a particular problem, and you continue to try it until you get it right.

Speaker 51:22 - 51:30

And so an AI could be presented with a very complicated and a difficult problem that is verifiable functionally.

Speaker 51:30 - 51:54

And it has an answer that we understand, maybe proving a theorem, maybe solving a geometry problem. And so these problems would cause the AI to produce answers, and using reinforcement learning, it would learn how to improve itself. That's called post-training. Post-training requires an enormous amount of computation, but the end result produces incredible models.

Speaker 51:55 - 52:24

We now have a third scaling law, and this third scaling law has to do with what's called test time scaling. Test time scaling is basically when you're being used. When you're using the AI, the AI has the ability to now apply a different resource allocation. Instead of improving its parameters, now it's focused on deciding how much computation to use to produce the answers it wants to produce.

Speaker 52:25 - 52:55

Reasoning is a way of thinking about this. Long thinking is a way to think about this. Instead of a direct inference or one-shot answer, you might reason about it. You might break down the problem into multiple steps. You might generate multiple ideas and evaluate. Your AI system would evaluate which one of the ideas that you generated was the best one. Maybe it solves the problem step by step, so on and so forth. And so now test time scaling has proven to be incredibly effective.
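
One simple form of the "generate multiple ideas and evaluate" strategy described here is best-of-N sampling. The sketch below is a toy under stated assumptions, not how any production model works: `propose` is a hypothetical stand-in for one sampled model answer (a random guess at the square root of 2025), and `score` is a hypothetical verifier that rates each guess.

```python
import random


def propose(rng):
    """Stand-in for one sampled model answer: a guess at sqrt(2025)."""
    return rng.randint(1, 100)


def score(answer, target=2025):
    """Toy verifier: higher is better, and 0 means the answer is exact."""
    return -abs(answer * answer - target)


def best_of_n(n, seed=0):
    """Sample n candidate answers and keep the one the verifier rates highest.

    A larger n spends more inference-time compute on the same question.
    """
    rng = random.Random(seed)
    candidates = [propose(rng) for _ in range(n)]
    return max(candidates, key=score)


for n in (1, 4, 16, 64):
    answer = best_of_n(n)
    print(n, answer, score(answer))
```

Because the seed is fixed, each larger run contains the smaller run's candidates, so the best score can only stay the same or improve as `n` grows. That is the trade the segment describes: more computation per query in exchange for a better answer.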

Speaker 52:55 - 53:06

You're watching this sequence of technology and all of these scaling laws emerge as we see incredible achievements from ChatGPT to

Speaker 53:07 - 53:34

From o1 to o3, and now Gemini Pro, all of these systems are going through this journey step by step by step of pre-training to post-training to test time scaling. Well, the amount of computation that we need, of course, is incredible. And we would like, in fact, we would like, in fact, that society has the ability to scale the amount of computation to produce more and more novel and better intelligence.

Speaker 53:34 - 53:57

Intelligence, of course, is the most valuable asset that we have, and it can be applied to solve a lot of very challenging problems. And so, scaling law. It's driving enormous demand for NVIDIA computing. It's driving enormous demand for this incredible chip we call Blackwell. Let's take a look at Blackwell. Well, Blackwell is in full production.

Speaker 53:59 - 54:15

It is incredible what it looks like. So first of all, every single cloud service provider now have systems up and running. We have systems here from about 15, excuse me,

Speaker 54:15 - 54:37

15 computer makers. It's being made in about 200 different SKUs, 200 different configurations. They're liquid-cooled, air-cooled, x86, NVIDIA Grace CPU versions, NVLink 36x2, NVLink 72x1, a whole bunch of different types of systems so that we can accommodate just about every single data center in the world.

Speaker 54:37 - 55:04

Well, these systems are being currently manufactured in some 45 factories. It tells you how pervasive artificial intelligence is and how much the industry is jumping onto artificial intelligence in this new computing model. Well, the reason why we're driving it so hard is because we need a lot more computation. And it's very clear, it's very clear that

Speaker 55:07 - 55:29

Janine? You know, it's hard to tell. You don't ever want to reach your hands into a dark place. Hang on a second. Is this a good idea? All right.

Speaker 55:45 - 56:15

Wait for it. Wait for it. I thought I was worthy. Apparently, Mjolnir didn't think I was worthy. All right. This is my show and tell. This is a show and tell. So this NVLink system, this right here, this NVLink system,

Speaker 56:15 - 56:44

This is GB200 NVLink72. It is one and a half tons. 600,000 parts. Approximately equal to 20 cars. 120 kilowatts. It has a spine behind it that connects all of these GPU together. Two miles of copper cable.

Speaker 56:46 - 57:09

5,000 cables. This is being manufactured in 45 factories around the world. We build them, we liquid cool them, we test them, we disassemble them, ship them in parts to the data centers because it's one and a half tons. We reassemble it outside the data centers and install them. The manufacturing is insane.

Speaker 57:09 - 57:38

But the goal of all of this is because the scaling laws are driving computing so hard that this level of computation, Blackwell over our last generation, improves the performance per watt by a factor of four. Performance per watt by a factor of four, performance per dollar by a factor of three. That basically says that in one generation, we reduce the cost of training these models by a factor of three.

Speaker 57:38 - 58:08

Or if you wanna increase the size of your model by a factor of three, it's about the same cost. But the important thing is this. These are generating tokens that are being used by all of us when we use ChatGPT or when we use Gemini, use our phones in the future. Just about all of these applications are gonna be consuming these AI tokens. And these AI tokens are being generated by these systems. And every single data center is limited by power. And so if the perf per watt of Blackwell,

Speaker 58:08 - 58:38

Is four times our last generation, then the revenue that could be generated, the amount of business that could be generated in the data center is increased by a factor of four. And so these AI factory systems really are factories today. Now the goal of all of this is so that we can create one giant chip. The amount of computation we need is really quite incredible. And this is basically one giant chip. If we would have had to build a chip, one, here we go.
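
The power-limited argument made across these segments reduces to two proportionalities, sketched below. The function names and the example figures (a 1 MW budget, one token per joule, a baseline cost of 300) are hypothetical placeholders; only the factors of four and three come from the talk.

```python
def tokens_per_second(power_budget_watts, tokens_per_joule):
    """Throughput of a power-limited data center.

    Watts are joules per second, so watts times tokens-per-joule
    gives tokens per second.
    """
    return power_budget_watts * tokens_per_joule


def training_cost(baseline_cost, perf_per_dollar_gain):
    """Cost of the same training run after a performance-per-dollar improvement."""
    return baseline_cost / perf_per_dollar_gain


POWER_BUDGET = 1_000_000.0  # hypothetical fixed 1 MW facility
previous = tokens_per_second(POWER_BUDGET, tokens_per_joule=1.0)
blackwell = tokens_per_second(POWER_BUDGET, tokens_per_joule=4.0)  # 4x perf per watt

print(blackwell / previous)       # throughput ratio at the same power
print(training_cost(300.0, 3.0))  # same run at 3x perf per dollar
```

With the power budget held constant, output scales directly with performance per watt, which is why the talk ties efficiency to the revenue a data center can generate.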

Speaker 58:39 - 58:49

Sorry, you guys. You see that? That's cool. Look at that, disco lights in here.

Speaker 58:49 - 59:18

If we had to build this as one chip, obviously this would be the size of the wafer. But this doesn't include the impact of yield. It would have to be probably three or four times the size. But what we basically have here is 72 Blackwell GPUs or 144 dies. This one chip here is 1.4 exaflops. The world's largest supercomputer, fastest supercomputer only recently, this entire room supercomputer only recently achieved an exaflop plus.

Speaker 59:18 - 59:44

This is 1.4 exaflops of AI floating point performance. It has 14 terabytes of memory, but here's the amazing thing. The memory bandwidth is 1.2 petabytes per second. That's basically the entire internet traffic that's happening right now. The entire world's internet traffic is being processed across these chips.
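
Dividing the rack-level figures quoted on stage by the 72 GPUs gives a rough per-GPU share. This is plain arithmetic on the numbers in the talk, not an official per-GPU specification, and `RackSpec` and `per_gpu` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackSpec:
    """Rack-level figures as quoted in the keynote."""
    gpus: int = 72
    dies: int = 144
    ai_flops: float = 1.4e18          # 1.4 exaflops
    memory_bytes: float = 14e12       # 14 terabytes
    memory_bandwidth: float = 1.2e15  # 1.2 petabytes per second


def per_gpu(rack):
    """Even split of the rack totals across its GPUs."""
    return {
        "dies": rack.dies // rack.gpus,
        "petaflops": rack.ai_flops / rack.gpus / 1e15,
        "memory_gb": rack.memory_bytes / rack.gpus / 1e9,
        "bandwidth_tb_s": rack.memory_bandwidth / rack.gpus / 1e12,
    }


share = per_gpu(RackSpec())
for name, value in share.items():
    print(f"{name}: {value:.1f}")
```

An even split works out to two dies, roughly 19 petaflops, about 194 GB of memory, and close to 17 TB/s of memory bandwidth per GPU.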

Speaker 59:45 - 01:00:14

Okay, and we have 130 trillion transistors in total, 2,592 CPU cores, whole bunch of networking. And so these, I wish I could do this, I don't think I will. So these are the Blackwells. These are our ConnectX networking chips. These are the NVLink, and we're trying to pretend about the NVLink spine

Speaker 01:00:14 - 01:00:44

But that's not possible. And these are all of the HBM memories, 14 terabytes of HBM memory. This is what we're trying to do. And this is the miracle of the Blackwell system. The Blackwell dies right here. It is the largest single chip the world's ever made. But yet, the miracle is really, in addition to that, this is the Grace Blackwell system. Well, the goal of all of this, of course, is so that we can, thank you, thanks.

Speaker 01:00:48 - 01:01:05

Boy, is there a chair I could sit down for a second? Can I have a Michelob Ultra?

Speaker 01:01:17 - 01:01:46

How is it possible that we're in the Michelob Ultra Stadium? It's like coming to NVIDIA and we don't have a GPU for you. So we need an enormous amount of computation because we wanna train larger and larger models. And these inferences used to be one inference, but in the future, the AI is gonna be talking to itself. It's gonna be thinking.

Speaker 01:01:46 - 01:01:54

It's gonna be internally reflecting, processing. So today, when the tokens are being generated at you, so long as it's coming out,

Speaker 01:01:55 - 01:02:24

At 20 or 30 tokens per second, it's basically as fast as anybody can read. However, in the future, and right now with GPT-o1, with the new Gemini Pro and the o1, o3 models, they're talking to themselves, reflecting, they're thinking. And so as you can imagine, the rate at which the tokens could be ingested is incredibly high. And so we need the token generation rates to go way up.
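
A back-of-the-envelope sketch shows why internal "thinking" tokens push the required generation rate up. The token counts below are hypothetical; only the 30 tokens per second reading rate comes from the talk.

```python
def response_seconds(thinking_tokens, answer_tokens, tokens_per_second):
    """Time to produce a full response when hidden thinking tokens come first."""
    return (thinking_tokens + answer_tokens) / tokens_per_second


def required_rate(thinking_tokens, answer_tokens, target_seconds):
    """Generation rate needed to finish the whole response within a time budget."""
    return (thinking_tokens + answer_tokens) / target_seconds


ANSWER_TOKENS = 300      # hypothetical visible answer length
THINKING_TOKENS = 3000   # hypothetical hidden reasoning length

direct = response_seconds(0, ANSWER_TOKENS, 30)
reasoning = response_seconds(THINKING_TOKENS, ANSWER_TOKENS, 30)
print(direct, reasoning)
print(required_rate(THINKING_TOKENS, ANSWER_TOKENS, direct))
```

With these placeholder counts, a response that took 10 seconds as a direct answer takes 110 seconds at the same rate once reasoning is added, and matching the original 10 seconds needs an elevenfold higher token rate.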

Speaker 01:02:24 - 01:02:39

And we also have to drive the cost way down simultaneously so that the quality of service can be extraordinary, the cost to customers can continue to be low, and AI will continue to scale. And so that's the fundamental purpose, the reason why we created NVLink.

Speaker 01:02:39 - 01:03:04

Well, one of the most important things that's happening in the world of enterprise is agentic AI. Agentic AI basically is a perfect example of test time scaling. AI is a system of models. Some of it is understanding, interacting with the customer, interacting with the user. Some of it is maybe retrieving information, retrieving information from storage, a semantic AI system like a RAG.

Speaker 01:03:04 - 01:03:24

Maybe it's going on to the internet, maybe it's studying a PDF file, and so it might be using tools, it might be using a calculator. And it might be using a generative AI to generate charts and such. And it's taking the problem you gave it, breaking it down step by step, and it's iterating through all these different models.
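
The step-by-step, tool-using behaviour described here can be illustrated with a very small loop. Everything in this sketch is a hypothetical stand-in: the plan is hard-coded rather than produced by a model, `retrieve` is a dictionary lookup rather than a real retrieval system, and `calculate` is a sum rather than a general calculator.

```python
def retrieve(query, store):
    """Stand-in retrieval step: exact-key lookup in an in-memory store."""
    return store.get(query, "")


def calculate(numbers):
    """Stand-in calculator tool: adds a list of numbers."""
    return sum(numbers)


def run_agent(plan, store):
    """Execute a fixed plan one step at a time and record each tool result."""
    trace = []
    for tool, argument in plan:
        if tool == "retrieve":
            result = retrieve(argument, store)
        elif tool == "calculate":
            result = calculate(argument)
        else:
            raise ValueError(f"unknown tool: {tool}")
        trace.append((tool, result))
    return trace


# Toy data and a toy plan; a real agent would generate the plan itself.
store = {"item a": "4", "item b": "3"}
plan = [
    ("retrieve", "item a"),
    ("retrieve", "item b"),
    ("calculate", [4, 3]),
]
trace = run_agent(plan, store)
print(trace)
```

Each step in the plan is a separate call, which is the point made in the talk: one user question fans out into many model and tool invocations, so inference compute per question rises.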

Speaker 01:03:24 - 01:03:47

Well, in order to respond to a customer in the future, in order for AI to respond, it used to be ask a question, answer starts spewing out. In the future, you ask a question, a whole bunch of models are gonna be working in the background. And so test time scaling, the amount of computation used for inferencing is gonna go through the roof. It's gonna go through the roof because we want better and better answers.

Speaker 01:03:48 - 01:04:09

To help the industry build agentic AI, our go to market is not direct to enterprise customers. Our go to market is we work with software developers in the IT ecosystem to integrate our technology to make possible new capabilities just like we did with CUDA libraries. We now want to do that with AI libraries.

Speaker 01:04:10 - 01:04:26

And just as the computing model of the past has APIs that are doing computer graphics or doing linear algebra or doing fluid dynamics, in the future, on top of those acceleration libraries, CUDA acceleration libraries, we'll have AI libraries.

Speaker 01:04:27 - 01:04:43

We've created three things for helping the ecosystem build agentic AI. NVIDIA NIMs, which are essentially AI microservices all packaged up. It takes all of this really complicated CUDA software, cuDNN, CUTLASS,

Speaker 01:04:43 - 01:05:08

And so we have models for vision, for understanding languages, for speech, for animation, for digital biology, and we have some new exciting models coming for physical AI.

Speaker 01:05:09 - 01:05:38

Run in every single cloud, because NVIDIA's GPUs are now available in every single cloud, it's available in every single OEM. So you could literally take these models, integrate it into your software packages, create AI agents that run on Cadence, or they might be ServiceNow agents, or they might be SAP agents. And they could deploy it to their customers and run it wherever the customers wanna run the software. The next layer is what we call NVIDIA NeMo. NeMo is essentially,

Speaker 01:05:39 - 01:06:09

A digital employee onboarding and training evaluation system. In the future, these AI agents are essentially digital workforce that are working alongside your employees, doing things for you on your behalf. And so the way that you would bring these specialized agents, these special agents, into your company is to onboard them, just like you onboard an employee.

Speaker 01:06:09 - 01:06:19

And so we have different libraries that helps these AI agents be trained for the type of language in your company, maybe the vocabularies.

Speaker 01:06:19 - 01:06:48

Unique to your company, the business process is different, the way you work is different. So you would give them examples of what the work product should look like. And they would try to generate it, and you would give a feedback. And then you would evaluate them, so on and so forth. And you would guardrail them. You say, these are the things that you're not allowed to do. These are the things you're not allowed to say. And we even give them access to certain information. So that entire pipeline, digital employee pipeline, is called NeMo.

Speaker 01:06:49 - 01:07:17

In a lot of ways, the IT department of every company is going to be the HR department of AI agents in the future. Today, they manage and maintain a bunch of software from the IT industry. In the future, they'll maintain, nurture, onboard, and improve a whole bunch of digital agents and provision them to the companies to use. And so your IT department is going to become kind of like AI agent HR.

Speaker 01:07:18 - 01:07:38

And on top of that, we provide a whole bunch of blueprints that our ecosystem could take advantage of. All of this is completely open source, and so you could take it and modify the blueprints. We have blueprints for all kinds of different types of agents. Well, today we're also announcing that we're doing something that's really cool and I think really clever.

講者01:07:39 - 01:08:02

We're announcing a whole family of models that are based off of Llama, the NVIDIA Llama Nemotron language foundation models. Llama 3.1 is a complete phenomenon. Llama 3.1 has been downloaded from Meta 650,000 times, something like that. It has been

講者01:08:04 - 01:08:33

Derived and turned into other models, about 60,000 other different models. It is singularly the reason why just about every single enterprise and every single industry has been activated to start working on AI. Well, the thing that we did was we realized that the Llama models really could be better fine-tuned for enterprise use. And so we fine-tuned them using our expertise and our capabilities, and we turned them into the Llama Nemotron suite of open models.

講者01:08:34 - 01:08:58

There are small ones that interact in very fast response time, extremely small. There are what we call Super, Llama Nemotron Supers. They're basically your mainstream versions of your models. Or your Ultra model: the Ultra model could be used to be a teacher model for a whole bunch of other models. It could be a reward model, an evaluator,

講者01:08:58 - 01:09:19

A judge for other models to create answers and decide whether it's a good answer or not. Basically give feedback to other models. It could be distilled in a lot of different ways. Basically a teacher model, a knowledge distillation model. Very large, very capable. And so all of this is now available online. Well, these models,

講者01:09:21 - 01:09:38

Are incredible. They're number one on leaderboards for chat, leaderboards for instruction, leaderboards for retrieval. So the different types of functionalities that are necessary in AI agents around the world. These are going to be incredible models for you.
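The teacher-model idea mentioned a moment ago (a large model whose outputs guide a smaller one) is commonly implemented as knowledge distillation. The sketch below is a generic textbook formulation in plain Python, not NVIDIA's training recipe; the function names and the temperature value are assumptions made for illustration.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits to probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 2.0,
) -> float:
    """KL divergence from the teacher's softened distribution to the student's."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(teacher, student))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a smaller model is trained to minimize.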

講者01:09:40 - 01:09:56

We're also working with the ecosystem. All of our NVIDIA AI technologies are integrated into the IT industry. We have great partners and really great work being done at ServiceNow, at SAP, at Siemens for industrial AI.

講者01:09:56 - 01:10:13

Cadence is doing great work, Synopsys is doing great work. I'm really proud of the work that we do with Perplexity. As you know, they revolutionized search. Yeah, really fantastic stuff. Codeium: every software engineer in the world, this is going to be the next giant

講者01:10:13 - 01:10:39

AI application, next giant AI service, period, is software coding. 30 million software engineers around the world, everybody is going to have a software assistant helping them code. If not, obviously you're gonna be way less productive and create lesser good code. And so this is 30 million. There's a billion knowledge workers in the world. It is very, very clear.

講者01:10:39 - 01:11:05

AI agents are probably the next robotics industry and likely to be a multi-trillion-dollar opportunity. Well, let me show you some of the blueprints that we've created and some of the work that we've done with our partners with these AI agents. AI agents are the new digital workforce working for and with us.

講者01:11:06 - 01:11:35

AI agents are a system of models that reason about a mission, break it down into tasks, and retrieve data or use tools to generate a quality response. NVIDIA's agentic AI building blocks, NIM pre-trained models, and NeMo framework let organizations easily develop AI agents and deploy them anywhere. We will onboard and train our agentic workforces on our company's methods, like we do for employees.
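The agent pattern described here (reason about a mission, break it into tasks, then retrieve data or call tools) can be sketched in a few lines of Python. This is a toy illustration only: `Task`, `plan`, `run_agent`, and the `TOOLS` table are hypothetical names, and the fixed plan stands in for what a reasoning model would produce.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    tool: str      # name of the tool to call
    argument: str  # input handed to that tool


def plan(mission: str) -> List[Task]:
    """Hypothetical planner; a real agent would ask a reasoning model for this breakdown."""
    return [Task("retrieve", mission), Task("summarize", mission)]


def run_agent(mission: str, tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Execute each planned task with its tool and collect the results in order."""
    return [tools[task.tool](task.argument) for task in plan(mission)]


# Stand-in tools; real ones would query a document store or call a model.
TOOLS: Dict[str, Callable[[str], str]] = {
    "retrieve": lambda query: f"documents about {query}",
    "summarize": lambda query: f"summary of {query}",
}
```

Calling `run_agent("quarterly results", TOOLS)` walks the two planned tasks in order and returns one result per task.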

講者01:11:36 - 01:12:05

AI agents are domain-specific task experts. Let me show you four examples. For the billions of knowledge workers and students, AI research assistant agents ingest complex documents like lectures, journals, financial results, and generate interactive podcasts for easy learning. By combining a U-Net regression model with a diffusion model, CorrDiff can downscale global weather forecasts from 25 kilometers to 2 kilometers.

講者01:12:06 - 01:12:28

Developers, like at NVIDIA, manage software security AI agents that continuously scan software for vulnerabilities, alerting developers to what action is needed. Virtual lab AI agents help researchers design and screen billions of compounds to find promising drug candidates faster than ever.

講者01:12:30 - 01:12:56

And video analytics AI agents built on an NVIDIA Metropolis blueprint, including NVIDIA Cosmos Nemotron vision language models, Llama Nemotron LLMs, and NeMo Retriever. Metropolis agents analyze content from the billions of cameras generating 100,000 petabytes of video per day. They enable interactive search, summarization, and automated reporting.

講者01:12:58 - 01:13:26

And help monitor traffic flows, flagging congestion or danger. In industrial facilities, they monitor processes and generate recommendations for improvement. Metropolis agents centralize data from hundreds of cameras and can reroute workers or robots when incidents occur. The age of agentic AI is here for every organization.

講者01:13:30 - 01:13:49

Okay. That was the first pitch at a baseball game. That was not generated. I just felt that none of you were impressed. Okay, so AI was created in the cloud and for the cloud.

講者01:13:49 - 01:14:08

AI is created in the cloud, for the cloud. And for enjoying AI on phones, of course, it's perfect. Very, very soon, we're going to have a continuous AI that's going to be with you. And when you use those Meta glasses, you could, of course, point at something, look at something, and ask it for whatever information you want.

講者01:14:09 - 01:14:38

And so AI is perfect in the cloud. What's created in the cloud is perfect in the cloud. However, we would love to be able to take that AI everywhere. I've mentioned already that you could take NVIDIA AI to any cloud, but you could also put it inside your company. But the thing that we want to do more than anything is put it on our PC as well. And so, as you know, Windows 95 revolutionized the computer industry. It made possible this new suite of multimedia services, and it changed the way that applications were created forever.

講者01:14:39 - 01:14:53

Windows 95, this model of computing, of course, is not perfect for AI. And so the thing that we would like to do is we would like to have, in the future, your AI basically become your AI assistant.

講者01:14:53 - 01:15:20

And instead of just the 3D APIs and the sound APIs and the video APIs, you would have generative APIs, generative APIs for 3D and generative APIs for language and generative AI for sound and so on and so forth. And we need a system that makes that possible while leveraging the massive investment that's in the cloud. There's no way that the world can create yet another way of programming AI models.

講者01:15:20 - 01:15:45

It's just not gonna happen. And so if we could figure out a way to make Windows PC a world class AI PC, it would be completely awesome. And it turns out the answer is Windows. It's Windows WSL2, Windows WSL2. Windows WSL2 basically is two operating systems within one. It works perfectly.

講者01:15:46 - 01:16:12

It's developed for developers and it's developed so that you can have access to bare metal. WSL2 has been optimized for cloud native applications. It is optimized for, and very importantly, it's been optimized for CUDA. And so WSL2 supports CUDA perfectly out of the box. As a result, everything that I showed you,

講者01:16:12 - 01:16:42

With NVIDIA NIMs, NVIDIA NeMo, the blueprints that we develop that are going to be up in ai.nvidia.com. So long as the computer fits it, so long as you can fit that model, and we're going to have many models that fit, whether it's vision models, or language models, or speech models, or these animation digital human models, all kinds of different types of models are going to be perfect for your PC.

講者01:16:42 - 01:17:06

You download it, and it should just run. And so our focus is to turn Windows WSL2, Windows PC, into a target first class platform that we will support and maintain for as long as we shall live. And so this is an incredible thing for engineers and developers everywhere. Let me show you something that we can do with that. This is one of the examples of a blueprint we just made for you.

講者01:17:09 - 01:17:32

Generative AI synthesizes amazing images from simple text prompts. Yet image composition can be challenging to control using only words. With NVIDIA NIM microservices, creators can use simple 3D objects to guide AI image generation. Let's see how a concept artist can use this technology to develop the look of a scene.

講者01:17:32 - 01:17:58

They start by laying out 3D assets, created by hand or generated with AI. Then use an image generation NIM, such as Flux, to create a visual that adheres to the 3D scene. Add or move objects to refine the composition. Change camera angles to frame the perfect shot. Or reimagine the whole scene with a new prompt.

講者01:18:02 - 01:18:30

Assisted by generative AI and NVIDIA NIM, an artist can quickly realize their vision. NVIDIA AI for your PCs. Hundreds of millions of PCs in the world with Windows, and so we could get them ready for AI. OEMs, all the PC OEMs we work with, just basically all of the world's leading PC OEMs are going to get their PCs ready for this stack. And so AI PCs are coming.

講者01:18:31 - 01:18:59

To a home near you. Linux is good. Okay, let's talk about physical AI. Speaking of Linux, let's talk about physical AI. So, physical AI. Imagine, imagine,

講者01:19:00 - 01:19:22

Whereas your large language model, you give it your context, your prompt on the left, and it generates tokens one at a time to produce the output. That's basically how it works. The amazing thing is this model in the middle is quite large, has billions of parameters.

講者01:19:23 - 01:19:48

The context length is incredibly large because you might decide to load in a PDF. In my case, I might load in several PDFs before I ask it a question. Those PDFs are turned into tokens. The basic attention characteristic of a transformer has every single token find its relationship and relevance against every other token. So you could have hundreds of thousands of tokens,

講者01:19:49 - 01:20:15

And the computational load increases quadratically. And it does this with all of the parameters: it takes the entire input sequence, processes it through every single layer of the transformer, and produces one token. That's the reason why we needed Blackwell. And then the next token is produced when the current token is done. It puts the current token into the input sequence and takes that whole thing and generates the next token. It does it one at a time.
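Two points from this passage can be made concrete with a small Python sketch: attention compares every token with every other token, so the work grows with the square of the sequence length, and generation appends one token at a time and feeds the whole sequence back in. `toy_next_token` is a hypothetical stand-in for a transformer forward pass, not a real model.

```python
from typing import Callable, List


def attention_pairs(num_tokens: int) -> int:
    """Token-to-token comparisons in one full self-attention pass."""
    return num_tokens * num_tokens


def toy_next_token(sequence: List[int], vocab_size: int = 10) -> int:
    """Hypothetical stand-in for a forward pass over the entire sequence."""
    return sum(sequence) % vocab_size


def generate(
    prompt: List[int],
    steps: int,
    next_token: Callable[[List[int]], int] = toy_next_token,
) -> List[int]:
    """Autoregressive loop: each new token is appended and becomes part of the next input."""
    sequence = list(prompt)
    for _ in range(steps):
        sequence.append(next_token(sequence))
    return sequence
```

Doubling the context from 1,000 to 2,000 tokens quadruples `attention_pairs`, which is the quadratic growth referred to above.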

講者01:20:16 - 01:20:44

This is the transformer model. It's the reason why it is so incredibly effective, and so computationally demanding. What if instead of PDFs, it's your surroundings? And what if instead of the prompt, a question, it's a request? Go over there and pick up that box and bring it back. And instead of what is produced in tokens as text, it produces action tokens.

講者01:20:44 - 01:21:10

What I just described is a very sensible thing for the future of robotics. And the technology is right around the corner. But what we need to do is we need to create effectively the world model as opposed to GPT, which is a language model. And this world model has to understand the language of the world. It has to understand physical dynamics, things like gravity and

講者01:21:11 - 01:21:34

Friction and inertia. It has to understand geometric and spatial relationships. It has to understand cause and effect. If you drop something, it falls to the ground; if you poke at it, it tips over. It has to understand object permanence. If you roll a ball over the kitchen counter, when it goes off the other side, the ball didn't leave into another quantum universe; it's still there.

講者01:21:34 - 01:22:03

And so all of these types of understanding are intuitive understanding that we know that most models today have a very hard time with. And so we would like to create a world, we need a world foundation model. Today we're announcing a very big thing. We're announcing NVIDIA Cosmos, a world foundation model that is designed, that was created to understand the physical world. And the only way for you to really understand this is to see it. Let's play it.

講者01:22:10 - 01:22:38

The next frontier of AI is physical AI. Model performance is directly related to data availability, but physical world data is costly to capture, curate, and label. NVIDIA Cosmos is a world foundation model development platform to advance physical AI. It includes auto-regressive world foundation models, diffusion-based world foundation models, advanced tokenizers,

講者01:22:39 - 01:23:03

And an NVIDIA CUDA- and AI-accelerated data pipeline. Cosmos models ingest text, image, or video prompts and generate virtual world states as videos. Cosmos generations prioritize the unique requirements of AV and robotics use cases, like real-world environments, lighting, and object permanence.

講者01:23:04 - 01:23:33

Developers use NVIDIA Omniverse to build physics-based, geospatially accurate scenarios, then output Omniverse renders into Cosmos, which generates photoreal, physically-based synthetic data. Whether diverse objects or environments,

講者01:23:35 - 01:23:57

Conditions like weather, or time of day, or edge case scenarios. Developers use Cosmos to generate worlds for reinforcement learning AI feedback to improve policy models. Or to test and validate model performance. Even across multi-sensor views.

講者01:23:59 - 01:24:28

Cosmos can generate tokens in real time, bringing the power of foresight and multiverse simulation to AI models, generating every possible future to help the model select the right path. Working with the world's developer ecosystem, NVIDIA is helping advance the next wave of physical AI. NVIDIA Cosmos.

講者01:24:29 - 01:24:41

NVIDIA Cosmos, NVIDIA Cosmos, the world's first world foundation model. It is trained on 20 million hours of video.

講者01:24:42 - 01:25:12

The 20 million hours of video focuses on physical dynamic things. So dynamic nature, nature themes, humans walking, hands moving, manipulating things, things that are fast camera movements. It's really about teaching the AI, not about generating creative content, but teaching the AI to understand the physical world. And with this physical AI,

講者01:25:12 - 01:25:29

There are many downstream things that we could do as a result. We could do synthetic data generation to train models. We could distill it and turn it into effectively the seed, the beginnings of a robotics model. You could have it generate multiple physically based

講者01:25:30 - 01:25:57

Physically plausible scenarios of the future, basically do a Doctor Strange. Because this model understands the physical world, of course you saw a whole bunch of images generated, this model understanding the physical world, it also could do, of course, captioning. And so it could take videos, caption it incredibly well, and that captioning and the video could be used to train large language models.

講者01:25:58 - 01:26:12

Multimodal large language models. And so you could use this technology to use this foundation model to train robots as well as large language models. And so this is the NVIDIA Cosmos. The platform has

講者01:26:12 - 01:26:41

An autoregressive model for real-time applications, a diffusion model for very high quality image generation, an incredible tokenizer, basically learning the vocabulary of the real world, and a data pipeline so that if you would like to take all of this and then train it on your own data, this data pipeline, because there's so much data involved, we've accelerated everything end to end for you. And so this is the world's first data processing pipeline that's CUDA accelerated as well as AI accelerated.

講者01:26:41 - 01:27:09

All of this is part of the Cosmos platform. And today, we're announcing that Cosmos is open licensed. It's openly available on GitHub. We hope that this moment, and there's a small, medium, large for very fast models, mainstream models, and also teacher models, basically

講者01:27:09 - 01:27:33

Knowledge transfer models. Cosmos world foundation model being open, we really hope, will do for the world of robotics and industrial AI what Llama 3 has done for enterprise AI. The magic happens when you connect Cosmos to Omniverse, and the reason fundamentally is this. Omniverse is a physics

講者01:27:34 - 01:27:47

Grounded, not physically grounded, but physics grounded. It's algorithmic physics, principled physics simulation grounded system. It's a simulator. When you connect that to Cosmos,

講者01:27:48 - 01:28:16

It provides the grounding, the ground truth that can control and condition the Cosmos generation. As a result, what comes out of Cosmos is grounded on truth. This is exactly the same idea as connecting a large language model to a RAG, to a retrieval augmented generation system. You want to ground the AI generation on ground truth. And so the combination of the two gives you a physically

講者01:28:16 - 01:28:41

Simulated, a physically grounded multiverse generator. And the application, the use cases are really quite exciting. And of course, for robotics, for industrial applications, it is very, very clear. This Omniverse plus Cosmos represents the third computer that's necessary for building robotic systems.
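The retrieval augmented generation analogy used a moment ago can be illustrated with a toy Python sketch: fetch the most relevant piece of trusted text, then condition the generation on it. The word-overlap scoring and the function names `retrieve` and `grounded_prompt` are simplifying assumptions; production systems typically use embedding search.

```python
from typing import List


def retrieve(query: str, documents: List[str]) -> str:
    """Return the document sharing the most words with the query (toy relevance score)."""
    query_words = set(query.lower().split())
    return max(documents, key=lambda doc: len(query_words & set(doc.lower().split())))


def grounded_prompt(query: str, documents: List[str]) -> str:
    """Prepend retrieved ground truth so the model answers from it rather than from memory."""
    context = retrieve(query, documents)
    return f"Context: {context}\nQuestion: {query}"
```

In the speaker's analogy, the simulator plays the role of the retrieved context: it supplies ground truth that conditions what the generative model produces.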

講者01:28:42 - 01:29:07

Every robotics company will ultimately have to build three computers. The robotics system could be a factory. The robotics system could be a car. It could be a robot. You need three fundamental computers. One computer, of course, to train the AI. We call it the DGX computer to train the AI. Another, of course, when you're done, to deploy the AI. We call that AGX. That's inside the car, in the robot, or in an AMR,

講者01:29:08 - 01:29:26

You know, in a stadium or whatever it is, these computers are at the edge and they're autonomous. But to connect the two, you need a digital twin. And this is all the simulations that you were seeing. The digital twin is where the AI that has been trained goes to practice.

講者01:29:26 - 01:29:55

To be refined, to do its synthetic data generation, reinforcement learning, AI feedback, such and such. And so it's the digital twin of the AI. These three computers are gonna be working interactively. NVIDIA's strategy for the industrial world, and we've been talking about this for some time, is this three computer system. Instead of a three body problem, we have a three computer solution. And so it's the NVIDIA robotics.

講者01:30:01 - 01:30:30

So let me give you three examples. All right, so the first example is how we apply all of this to industrial digitalization. There are millions of factories, hundreds of thousands of warehouses. That basically is the backbone of a $50 trillion manufacturing industry. All of that has to become software defined. All of it has to have automation in the future. And all of it will be infused with robotics.

講者01:30:30 - 01:30:48

Well, we're partnering with Kion, the world's leading warehouse automation solutions provider, and Accenture, the world's largest professional services provider, and they have a big focus in digital manufacturing. And we're working together.

講者01:30:48 - 01:31:17

To create something that's really special, and I'll show you that in a second. But our go to market is essentially the same as all of the other software platforms and all the technology platforms that we have. Through the developers and ecosystem partners, and we have just a growing number of ecosystem partners connecting to Omniverse. And the reason for that is very clear. Everybody wants to digitalize the future of industries. There's so much waste.

講者01:31:17 - 01:31:44

So much opportunity for automation in that $50 trillion of the world's GDP. So let's take a look at this one example that we're doing with Kion and Accenture. Kion, the supply chain solution company, Accenture, a global leader in professional services, and NVIDIA are bringing physical AI to the $1 trillion warehouse and distribution center market.

講者01:31:45 - 01:32:11

Managing high-performance warehouse logistics involves navigating a complex web of decisions influenced by constantly shifting variables. These include daily and seasonal demand changes, space constraints, workforce availability, and the integration of diverse robotic and automated systems. And predicting operational KPIs of a physical warehouse is nearly impossible today.

講者01:32:12 - 01:32:35

To tackle these challenges, Kion is adopting Mega, an NVIDIA Omniverse blueprint for building industrial digital twins to test and optimize robotic fleets. First, Kion's warehouse management solution assigns tasks to the industrial AI brains in the digital twin, such as moving a load from a buffer location to a shuttle storage solution.

講者01:32:36 - 01:33:02

The robots' brains are in a simulation of a physical warehouse, digitalized into Omniverse using OpenUSD connectors to aggregate CAD, video and image to 3D, LIDAR to point cloud, and AI-generated data. The fleet of robots executes tasks by perceiving and reasoning about their Omniverse digital twin environment, planning their next motion, and acting.

講者01:33:03 - 01:33:30

The robot brains can see the resulting state through sensor simulations and decide their next action. The loop continues while Mega precisely tracks the state of everything in the digital twin. Now, Kion can simulate infinite scenarios at scale while measuring operational KPIs, such as throughput, efficiency, and utilization, all before deploying changes to the physical warehouse.
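The idea of trying many scenarios in a digital twin and comparing KPIs before touching the real warehouse can be sketched as a simple search. The throughput formula below is invented purely for illustration (each robot adds capacity, while congestion between robots subtracts from it); it is not how Mega or Omniverse model a warehouse.

```python
from typing import Iterable


def simulated_throughput(num_robots: int) -> int:
    """Toy KPI in loads per hour: 25 per robot, minus a congestion cost per robot pair."""
    return 25 * num_robots - num_robots * (num_robots - 1)


def best_fleet_size(candidates: Iterable[int]) -> int:
    """Evaluate every candidate scenario in simulation and keep the best-scoring one."""
    return max(candidates, key=simulated_throughput)
```

With this made-up model, sweeping fleet sizes from 1 to 20 selects 13 robots, where the simulated throughput peaks at 169 loads per hour. A real digital twin would replace the formula with a physics and sensor simulation.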

講者01:33:31 - 01:33:54

Together with NVIDIA, Kion and Accenture are reinventing industrial autonomy. That's incredible. Everything is in simulation. In the future, every factory will have a digital twin. And that digital twin operates exactly like the real factory. And in fact,

講者01:33:54 - 01:34:17

You could use Omniverse with Cosmos to generate a whole bunch of future scenarios, and then an AI decides which one of the scenarios is the most optimal for whatever KPIs, and that becomes the programming constraints, the program, if you will, for the AIs that will be deployed into the real factories. The next example, autonomous vehicles. The AV revolution has arrived.

講者01:34:18 - 01:34:47

After so many years, with Waymo's success and Tesla's success, it is very, very clear autonomous vehicles have finally arrived. Well, our offering to this industry is the three computers, the training systems to train the AIs, the simulation systems, and the synthetic data generation systems, Omniverse and now Cosmos, and also the computer that's inside the car. Each car company might work with us in a different way, use one or two or three of the computers.

講者01:34:48 - 01:35:14

We're working with just about every major car company around the world. Waymo and Zoox and Tesla, of course, and their data center. BYD, the largest EV company in the world. JLR's got a really cool car coming. Mercedes has a fleet of cars coming with NVIDIA starting this year going to production. And I'm super, super pleased to announce that today Toyota and NVIDIA are going to partner together to create their next generation AVs.

講者01:35:21 - 01:35:49

Just so many, so many cool companies. Lucid, and Rivian, and Xiaomi, and of course, Volvo, just so many different companies. Waabi is building self-driving trucks. Aurora, we announced this week also that Aurora is gonna use NVIDIA to build self-driving trucks. Autonomous, 100 million cars built each year, a billion cars, vehicles on the road all over the world, a trillion miles that are driven around the world each year.

講者01:35:50 - 01:36:18

That's all going to be either highly autonomous or fully autonomous coming up. And so this is going to be a very large industry. I predict that this will likely be the first multi-trillion dollar robotics industry. This business for us, notice in just a few of these cars that are starting to ramp into the world, our business is already $4 billion and this year probably on a run rate of about $5 billion.

講者01:36:18 - 01:36:46

So really significant business already. This is going to be very large. Well, today we're announcing that our next generation processor for the car, our next generation computer for the car is called Thor. I have one right here. Hang on a second. Okay, this is Thor. This is Thor. This is a robotics computer. This is a robotics computer. It takes sensors.

講者01:36:47 - 01:37:14

Madness amount of sensor information. Process it, umpteen cameras, high resolution, radars, lidars, they're all coming into this chip. And this chip has to process all that sensor, turn them into tokens, put them into a transformer, and predict the next path. And this AV computer is now in full production. Thor is 20 times

講者01:37:15 - 01:37:41

The processing capability of our last generation Orin, which is really the standard of autonomous vehicles today. And so this is just really quite incredible. Thor is in full production. This robotics processor, by the way, also goes into a full robot. And so it could be an AMR, it could be a humanoid robot, it could be the brain, it could be the manipulator. This processor basically is a universal robotics computer.

講者01:37:42 - 01:37:50

The second part of our Drive system that I'm incredibly proud of is the dedication to safety. Drive OS.

講者01:37:51 - 01:38:18

I'm pleased to announce is now the first software-defined programmable AI computer that has been certified up to ASIL-D, which is the highest standard of functional safety for automobiles. The only and the highest. And so I'm really, really proud of this. ASIL-D, ISO 26262. It is the work of some 15,000 engineering years.

講者01:38:19 - 01:38:30

This is just extraordinary work. And as a result of that, CUDA is now a functionally safe computer. And so if you're building a robot, NVIDIA CUDA, yeah.

講者01:38:35 - 01:39:04

Okay, so now I told you I was gonna show you what we would use Omniverse and Cosmos to do in the context of self-driving cars. And today, instead of showing you a whole bunch of videos of cars driving on the road, I'll show you some of that too. But I wanna show you how we use the car to reconstruct digital twins automatically using AI and use that capability

講者01:39:04 - 01:39:31

to train future AI models. Okay, let's play it. The autonomous vehicle revolution is here. Building autonomous vehicles like all robots requires three computers, NVIDIA DGX to train AI models, Omniverse to test drive and generate synthetic data, and Drive AGX, a supercomputer in the car.

講者01:39:32 - 01:39:57

Building safe autonomous vehicles means addressing edge scenarios, but real-world data is limited, so synthetic data is essential for training. The Autonomous Vehicle Data Factory, powered by NVIDIA Omniverse, AI models, and Cosmos, generates synthetic driving scenarios that enhance training data by orders of magnitude.

講者01:39:58 - 01:40:27

First, Omnimap fuses map and geospatial data to construct drivable 3D environments. Driving scenario variations can be generated from replayed drive logs or AI traffic generators. Next, a neural reconstruction engine uses autonomous vehicle sensor logs to create high fidelity 4D simulation environments.

講者01:40:27 - 01:40:46

It replays previous drives in 3D and generates scenario variations to amplify training data. Finally, Edify 3DS automatically searches through existing asset libraries or generates new assets to create sim-ready scenes.

講者01:40:50 - 01:41:18

The Omniverse scenarios are used to condition Cosmos to generate massive amounts of photorealistic data, reducing the sim-to-real gap, and with text prompts, generate near-infinite variations of the driving scenario. With Cosmos Nemotron Video Search, the massively scaled synthetic dataset, combined with recorded drives, can be curated to train models.

講者01:41:21 - 01:41:42

NVIDIA's AI Data Factory scales hundreds of drives into billions of effective miles, setting the standard for safe and advanced autonomous driving. Isn't that incredible? We take...

講者01:41:42 - 01:42:09

Take thousands of drives and turn them into billions of miles. We are going to have mountains of training data for autonomous vehicles. Of course, we still need actual cars on the road. Of course, we will continuously collect data for as long as we shall live. However, synthetic data generation using this multiverse, physically based, physically grounded capability so that we generate

講者01:42:09 - 01:42:35

Data for training AIs that are physically grounded and accurate and plausible so that we could have an enormous amount of data to train with. The AV industry is here. This is an incredibly exciting time, super, super, super excited about the next several years. I think you're gonna see, just as computer graphics was revolutionized at such incredible pace, you're gonna see the pace of AV development increasing tremendously over the next several years.

講者01:42:46 - 01:43:09

I think the next part is robotics. Humanoid robots. My friends.

講者01:43:16 - 01:43:46

The ChatGPT moment for general robotics is just around the corner. And in fact, all of the enabling technologies that I've been talking about are going to make it possible for us in the next several years to see very rapid breakthroughs, surprising breakthroughs in general robotics. Now the reason why general robotics is so important is whereas robots with tracks and wheels require special environments to accommodate them, there are three robots

講者01:43:47 - 01:44:15

Three robots in the world that we can make that require no green fields. Brownfield adaptation is perfect. If we could possibly build these amazing robots, we could deploy them in exactly the world that we've built for ourselves. These three robots are, one, agentic robots and agentic AI, because they're information workers. So long as they could accommodate the computers that we have in our offices, it's going to be great.

講者01:44:16 - 01:44:45

Number two, self-driving cars. And the reason for that is we spent 100 plus years building roads and cities. And then number three, humanoid robots. If we have the technology to solve these three, this will be the largest technology industry the world's ever seen. And so we think that robotics era is just around the corner. The critical capability is how to train these robots. In the case of humanoid robots,

講者01:44:46 - 01:45:14

The imitation information is rather hard to collect. And the reason for that is, in the case of cars, you just drive it. We're driving cars all the time. In the case of these humanoid robots, the imitation information, the human demonstration is rather laborious to do. And so we need to come up with a clever way to take hundreds of demonstrations, thousands of human demonstrations, and somehow use artificial intelligence and Omniverse.

講者01:45:15 - 01:45:32

To synthetically generate millions of synthetically generated motions. And from those motions, the AI can learn how to perform a task. Let me show you how that's done.

講者01:45:43 - 01:46:13

Developers around the world are building the next wave of physical AI, embodied robots: humanoids. Developing general-purpose robot models requires massive amounts of real-world data, which is costly to capture and curate. NVIDIA Isaac Groot helps tackle these challenges, providing humanoid robot developers with four things: robot foundation models, data pipelines, simulation frameworks,

講者01:46:14 - 01:46:42

and a Thor robotics computer. The NVIDIA Isaac Groot Blueprint for synthetic motion generation is a simulation workflow for imitation learning, enabling developers to generate exponentially large datasets from a small number of human demonstrations. First, Groot Teleop enables skilled human workers to portal into a digital twin of their robot using the Apple Vision Pro.

講者01:46:43 - 01:47:09

This means operators can capture data even without a physical robot, and they can operate the robot in a risk-free environment, eliminating the chance of physical damage or wear and tear. To teach a robot a single task, operators capture motion trajectories through a handful of tele-operated demonstrations, then use Groot Mimic to multiply these trajectories into a much larger data set.

Speaker 01:47:11 - 01:47:34

Next, they use GR00T-Gen, built on Omniverse and Cosmos, for domain randomization and 3D-to-real upscaling, generating an exponentially larger dataset. The Omniverse and Cosmos multiverse simulation engine provides a massively scaled dataset to train the robot policy.
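
As an illustration of the workflow narrated above, the following is a minimal sketch of how a handful of demonstrations can be multiplied into a much larger dataset: each demonstration is first perturbed into many trajectory variants, and each variant is then paired with randomized scene parameters. Everything here is an assumption made for illustration. The function names (`multiply_trajectories`, `randomize_domain`, `build_dataset`), the 2D waypoint format, and the `lighting` and `friction` parameters are hypothetical and are not part of any NVIDIA API.

```python
import random


def multiply_trajectories(demo, copies, rng, noise=0.01):
    """Turn one demonstration into `copies` slightly perturbed trajectories.

    `demo` is a list of (x, y) waypoints; Gaussian noise stands in for the
    much richer motion synthesis a real pipeline would perform.
    """
    return [
        [(x + rng.gauss(0.0, noise), y + rng.gauss(0.0, noise)) for x, y in demo]
        for _ in range(copies)
    ]


def randomize_domain(trajectory, variants, rng):
    """Pair one trajectory with `variants` randomized scene settings.

    The scene parameters are hypothetical placeholders for domain randomization.
    """
    return [
        {
            "trajectory": trajectory,
            "lighting": rng.uniform(0.5, 1.5),
            "friction": rng.uniform(0.4, 1.0),
        }
        for _ in range(variants)
    ]


def build_dataset(demos, copies_per_demo, variants_per_trajectory, seed=0):
    """Expand a few demonstrations into a large synthetic training set."""
    rng = random.Random(seed)
    dataset = []
    for demo in demos:
        for trajectory in multiply_trajectories(demo, copies_per_demo, rng):
            dataset.extend(randomize_domain(trajectory, variants_per_trajectory, rng))
    return dataset


# Two hand-made "teleoperated" demonstrations of three waypoints each.
demos = [
    [(0.0, 0.0), (0.5, 0.2), (1.0, 0.4)],
    [(0.0, 0.1), (0.4, 0.3), (1.0, 0.5)],
]
dataset = build_dataset(demos, copies_per_demo=100, variants_per_trajectory=50)
print(len(dataset))  # 2 demos * 100 copies * 50 variants = 10000 samples
```

The two stages multiply: the dataset size is the number of demonstrations times the copies per demonstration times the scene variants per trajectory, which is how a small number of human demonstrations can yield a training set several orders of magnitude larger.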

Speaker 01:47:35 - 01:47:58

Once the policy is trained, developers can perform software-in-the-loop testing and validation in Isaac Sim before deploying to the real robot. The age of general robotics is arriving, powered by NVIDIA Isaac GR00T. We're going to have mounds of data to train robots with.

Speaker 01:48:01 - 01:48:30

NVIDIA Isaac GR00T. NVIDIA Isaac GR00T. This is our platform to provide technology elements to the robotics industry to accelerate the development of general robotics. And well, I have one more thing that I want to show you. None of this would be possible if not for this incredible project that we started about a decade ago. Inside the company, it was called Project DIGITS.

Speaker 01:48:31 - 01:49:00

Deep learning GPU intelligence training system: DIGITS. Well, before we launched it, I shrunk it to DGX, to harmonize it with RTX, AGX, OVX, and all of the other Xs that we have in the company. And it really revolutionized, DGX-1 really revolutionized

Speaker 01:49:00 - 01:49:24

Where's DGX-1? DGX-1 revolutionized artificial intelligence. The reason why we built it was because we wanted to make it possible for researchers and startups to have an out-of-the-box AI supercomputer. Imagine the way supercomputers were built in the past. You really have to build your own facility and you have to go build your own infrastructure and really engineer it into existence.

Speaker 01:49:24 - 01:49:52

And so we created a supercomputer for AI development for researchers and startups that comes literally one out of the box. I delivered the first one to a startup company in 2016 called OpenAI. And Elon was there, and Ilya Sutskever was there, and many of the engineers were there. And we celebrated the arrival of DGX-1, and obviously, it revolutionized artificial intelligence computing.

Speaker 01:49:52 - 01:50:22

But now artificial intelligence is everywhere. It's not just in researchers' and startups' labs. Artificial intelligence, as I mentioned in the beginning of our talk, this is now the new way of doing computing. This is the new way of doing software. Every software engineer, every engineer, every creative artist, everybody who uses computers today as a tool will need an AI supercomputer. And so I just wish that DGX-1 was smaller.

Speaker 01:50:22 - 01:50:46

And so imagine, ladies and gentlemen, this is NVIDIA's latest AI supercomputer.

Speaker 01:50:50 - 01:51:18

And it's finally called Project DIGITS right now. And if you have a good name for it, reach out to us. Here's the amazing thing. This is an AI supercomputer. It runs the entire NVIDIA AI stack. All of NVIDIA software runs on this. DGX Cloud runs on this. This sits, well, somewhere, and it's wireless or

Speaker 01:51:18 - 01:51:44

You know, connect it to your computer. It's even a workstation if you like it to be. And you could access it, you could reach it like a cloud supercomputer, and NVIDIA's AI works on it. And it's based on a super secret chip that we've been working on called GB10, the smallest Grace Blackwell that we make. And I have, well, you know what, let's show everybody inside.

Speaker 01:52:11 - 01:52:31

Isn't this just so cute? And this is the chip that's inside. It is in production. This top secret chip we did in collaboration. The CPU, the Grace CPU, is built for NVIDIA in collaboration with MediaTek.

Speaker 01:52:32 - 01:53:01

They're the world's leading SoC company, and they worked with us to build this CPU, this CPU SoC, and connect it with chip-to-chip NVLink to the Blackwell GPU. And this little thing here is in full production. We're expecting this computer to be available around the May timeframe, and so it's coming at you. It's just incredible what we could do, and it's just, I think it's, you really,

Speaker 01:53:04 - 01:53:33

I was trying to figure out, do I need more hands or more pockets? All right, so imagine this is what it looks like. Who doesn't want one of those? And if you use PC, Mac, anything. Because it's a cloud platform, it's a cloud computing platform that sits on your desk. You could also use it as a Linux workstation if you like. If you would like to have double digits,

Speaker 01:53:34 - 01:53:55

This is what it looks like. And you connect it together with ConnectX, and it has NCCL, GPUDirect, all of that out of the box. It's like a supercomputer. Our entire supercomputing stack is available. And so, NVIDIA Project DIGITS.

Speaker 01:54:04 - 01:54:25

Okay. Well, let me tell you what I told you. I told you that we are in production with three new Blackwells. Not only are the Grace Blackwell supercomputers and NVLink72s in production all over the world, we now have three new Blackwell systems in production.

Speaker 01:54:26 - 01:54:49

One amazing AI world foundation model, the world's first physical AI foundation model is open, available to activate the world's industries of robotics and such, and three robots working on agentic AI, humanoid robots, and self-driving cars.

Speaker 01:54:50 - 01:55:00

It's been an incredible year. I want to thank all of you for your partnership. Thank all of you for coming. I made you a short video to reflect on last year and look forward to the next year. Play, please.

Speaker 01:57:51 - 01:57:58

Have a great CES, everybody! Happy New Year! Thank you!

NVIDIA CEO Jensen Huang Keynote at CES 2025
