Jensen Huang's Latest CES Keynote: Three Mass-Produced Blackwell Chips Unveiled; AI Agents a Multi-Trillion-Dollar Opportunity
Original Article Title: "In-Depth | Jensen Huang's Latest CES Keynote: Three New Mass-Produced Blackwell Chips, World's First Physical AI Model, and Three Major Breakthroughs in Robotics"
Original Source: Newin
At CES 2025, which opened this morning, NVIDIA founder and CEO Jensen Huang delivered a landmark keynote on the future of AI and computing. From the token concept at the heart of generative AI, to the launch of the new Blackwell-architecture GPUs, to an AI-driven digital future, the keynote's cross-disciplinary sweep will reverberate across the entire industry.
1) From Generative AI to Agentic AI: The Dawn of a New Era
· The Birth of the Token: As the core driver of generative AI, the token transforms text into knowledge, breathes life into images, and opens up new forms of digital expression.
· Evolution Path of AI: From perceptual AI and generative AI to Agentic AI capable of reasoning, planning, and action, AI technology continues to reach new heights.
· Revolution of Transformers: Since its introduction in 2018, this technology has redefined the way computation is done, completely disrupting the traditional tech stack.
2) Blackwell GPU: Breaking Performance Limits
· Next-Gen GeForce RTX 50 Series: Based on the Blackwell architecture, featuring 92 billion transistors, 4,000 TOPS of AI performance, and 4 PetaFLOPS of AI compute, three times that of its predecessor.
· Fusion of AI and Graphics: For the first time, programmable shaders are combined with neural networks, introducing neural texture compression and neural material shading for stunning rendering effects.
· High Performance for Everyone: The RTX 5070 laptop delivers RTX 4090-class performance at $1,299, driving the democratization of high-performance computing.
3) Multi-Domain Expansion of AI Applications
· Enterprise AI Agents: NVIDIA provides tools such as NeMo and Llama Nemotron to help businesses build self-reasoning digital employees, enabling intelligent management and services.
· Physical AI: Through the Omniverse and Cosmos platforms, AI is being integrated into industry, autonomous driving, and robotics, redefining global manufacturing and logistics.
· Future Computing Scenarios: NVIDIA is bringing AI from the cloud to personal devices and enterprise environments, covering all computing needs from developers to end users.
The following are the key points of Jensen Huang's speech:
This is the birthplace of intelligence, a new kind of factory: a generator of tokens. Tokens are the building blocks of AI, opening a new frontier and marking the first step into an extraordinary world. Tokens transform text into knowledge and breathe life into images; they turn creativity into video and help us navigate any environment safely; they teach robots to move like masters and inspire us to celebrate victories in entirely new ways. When we need it most, tokens can even bring inner peace. They give the digital world meaning, help us better understand the world, predict potential dangers, and find ways to counter threats from within. They can make our visions reality and restore what we have lost.
NVIDIA's journey to AI began in 1993 with its first product, the NV1. We wanted to build a computer that could do things ordinary computers couldn't, and the NV1 made it possible to have a game console inside a PC. Then, in 1999, NVIDIA invented the programmable GPU, ushering in more than 20 years of technological advancement and making modern computer graphics possible. Six years later, we introduced CUDA, enriching the algorithmic expressiveness of the programmable GPU. The technology was initially hard to explain, but by 2012 the success of AlexNet had validated CUDA's potential and set off the breakthrough development of AI.
Since then, AI has developed at an astonishing pace. From perceptual AI to generative AI, to Agentic AI, which can perceive, reason, plan, and act, AI's capabilities continue to grow. In 2018, Google introduced Transformer, and the AI world truly took off. Transformer not only fundamentally changed the landscape of AI but also redefined the entire field of computing. We realized that machine learning is not just a new application or business opportunity but a fundamental innovation in the way we compute. From manually coding instructions to optimizing neural networks with machine learning, every layer of the tech stack has undergone tremendous change.
Today, AI applications are ubiquitous. Whether understanding text, images, and sound, or translating amino acid sequences and physical phenomena, AI can do it all. Almost every AI application can be reduced to three questions: What modality of information did it learn from? What modality does it translate to? What modality does it generate? This fundamental framing drives every AI-powered application.
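As a rough illustration of that framing, the sketch below maps a few familiar applications onto input and output modalities; the application list is our own, not from the keynote:

```python
# Illustrative only: a few well-known AI applications framed by the three
# questions above. The application list is our own, not from the keynote.
applications = {
    "chatbot":            ("text",        "text"),
    "image generation":   ("text",        "image"),
    "speech recognition": ("audio",       "text"),
    "video generation":   ("text",        "video"),
    "protein folding":    ("amino acids", "3D structure"),
}

for app, (learned_from, generates) in applications.items():
    print(f"{app:>18}: learns {learned_from} -> generates {generates}")
```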
All of these achievements would have been impossible without GeForce. GeForce brought AI to the masses, and now AI is coming home to GeForce. With real-time ray tracing, we can render graphics with stunning fidelity. Through DLSS, AI goes beyond upscaling to frame generation, predicting future frames outright. Of 33 million pixels, only 2 million are actually computed; the rest are predicted and generated by AI. This remarkable technology shows the power of AI to make computation more efficient, and it points to endless possibilities for the future.
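For a sense of scale, here is our own back-of-the-envelope arithmetic on those figures, assuming a 4K output, a roughly 1080p internal render, and multi-frame generation producing four displayed frames per rendered one:

```python
# Our back-of-the-envelope check of the "2 million of 33 million pixels"
# figure, assuming 4K output, a ~1080p internal render, and multi-frame
# generation producing four displayed frames per rendered frame.
rendered = 1920 * 1080            # ~2.07M pixels actually shaded
displayed = 3840 * 2160 * 4       # ~33.2M pixels across four output frames

print(f"computed:  {rendered / 1e6:.1f}M pixels")
print(f"displayed: {displayed / 1e6:.1f}M pixels")
print(f"AI-generated share: {1 - rendered / displayed:.1%}")   # ~93.8%
```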
This is why so many amazing things are happening now. We have used GeForce to drive the advancement of AI, and now, AI is completely revolutionizing GeForce. Today, we are announcing the next generation product—the RTX Blackwell family. Let's take a look together.
This is the brand-new GeForce RTX 50 series, based on the Blackwell architecture. This GPU is a performance beast: 92 billion transistors, 4,000 TOPS of AI performance, and 4 PetaFLOPS of AI compute, three times that of the previous Ada architecture. All of this serves to generate the amazing pixels I just showed. It also delivers 380 ray-tracing TFLOPS, to render the most beautiful possible image for the pixels that must be computed, along with 125 shader TFLOPS. The card uses Micron GDDR7 memory running at 1.8 TB/s, double the performance of the previous generation.
We can now mix AI workloads with computer graphics workloads, and an extraordinary feature of this generation is that the programmable shaders can also process neural networks. This led us to invent neural texture compression and neural material shading. These technologies use AI to learn textures and compression algorithms, producing stunning image quality that only AI can achieve.
Even in terms of mechanical design, this graphics card is a marvel. It adopts a dual-fan design, with the entire card resembling a giant fan, and the internal voltage regulation module is state-of-the-art. Such outstanding design is entirely attributed to the efforts of the engineering team.
Next, the performance comparison. The familiar RTX 4090, priced at $1,599, has been the core investment of a home PC entertainment center. Now the RTX 50 family starts at just $549: the RTX 5070 matches RTX 4090 performance, and the lineup scales up to the RTX 5090, which delivers twice the performance of the RTX 4090.
What's even more amazing is that we have put this high-performance GPU into a laptop. The RTX 5070 laptop is priced at $1299 but delivers the performance of an RTX 4090. This design combines AI and computer graphics technology, enabling both high energy efficiency and high performance.
The future of computer graphics is neural rendering, the fusion of AI and computer graphics. The Blackwell series even fits into laptops as thin as 14.9 millimeters, and the full lineup from RTX 5070 to RTX 5090 can be adapted to ultra-thin notebooks.
GeForce has been driving the popularization of AI, and now AI, in turn, has fundamentally transformed GeForce. This is a mutual promotion of technology and intelligence, and we are moving towards a higher realm.
Three Scaling Laws of AI
Next, let's talk about the development direction of AI.
1) Pre-training Scaling Law
The AI industry is expanding at an accelerating pace, driven by a powerful model known as the scaling law. This empirical rule, repeatedly validated by researchers and industry alike, holds that the larger the training dataset, the larger the model, and the more compute applied, the more capable the resulting model becomes.
Data is growing at an exponential rate. It is estimated that in the coming years, humanity will produce more data annually than it has in all of prior history, and that data is increasingly multimodal, spanning video, images, and sound. This vast corpus can be used to train AI's foundational knowledge, giving models a solid knowledge base.
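As a concrete illustration of the power-law form this rule usually takes, here is a minimal sketch using the functional shape from published scaling-law work; the constants come from the Chinchilla fit (Hoffmann et al., 2022), not from NVIDIA's keynote:

```python
# A minimal sketch of the pre-training scaling law: loss falls as a power law
# in parameters N and training tokens D. The functional form and constants
# follow the published Chinchilla fit (Hoffmann et al., 2022); they are
# illustrative here, not NVIDIA figures.
def loss(n_params: float, n_tokens: float,
         e: float = 1.69, a: float = 406.4, b: float = 410.7,
         alpha: float = 0.34, beta: float = 0.28) -> float:
    return e + a / n_params**alpha + b / n_tokens**beta

for n in (1e9, 1e10, 1e11):       # 1B -> 100B parameters
    d = 20 * n                    # roughly 20 tokens per parameter
    print(f"N={n:.0e}, D={d:.0e} -> predicted loss {loss(n, d):.3f}")
```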
2) Post-training Scaling Law
In addition, two other scaling laws are emerging.
The second is the post-training scaling law, which involves techniques such as reinforcement learning from human feedback. Here, AI generates answers to human queries and improves continuously based on human feedback. Such a reinforcement learning system, driven by high-quality prompts, helps AI sharpen skills in specific domains, such as becoming better at solving math problems or performing complex reasoning.
The future of AI is not just perception and generation but a process of continuous self-improvement and boundary-breaking. It is like having a mentor or coach who gives feedback after you complete a task: through testing, feedback, and self-correction, AI can progress via similar reinforcement learning and feedback mechanisms. This post-training stage, combining reinforcement learning with synthetic data generation, resembles self-practice: AI can tackle complex, verifiable challenges, such as proving theorems or solving geometry problems, and keep refining its answers through reinforcement learning. Although post-training demands massive compute, it can ultimately produce extraordinary models.
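The loop described above can be sketched in a few lines. The code below is a toy stand-in: a scripted "reward" plays the role of human feedback or an automatic verifier, and a real RLHF system would update model weights (e.g. with PPO or DPO) rather than merely selecting an answer:

```python
import random

# A toy rendering of the post-training feedback loop: the model proposes
# answers, a reward signal (standing in for human feedback or an automatic
# verifier) scores them, and the preferred answer is reinforced. A real RLHF
# system would update model weights (e.g. PPO or DPO); here we only select.
random.seed(0)

def generate_answers(prompt: str, n: int = 4) -> list[str]:
    # Stand-in for sampling n candidate answers from a language model.
    return [f"{prompt} -> draft {i} (score {random.random():.2f})" for i in range(n)]

def reward(answer: str) -> float:
    # Stand-in for a human-preference reward model or verifier.
    return float(answer.rsplit("score ", 1)[1].rstrip(")"))

prompt = "Prove that the sum of two even numbers is even"
for step in range(3):
    best = max(generate_answers(prompt), key=reward)
    print(f"step {step}: reinforce {best!r}")
```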
3) Test-Time Scaling Law
The test-time scaling law is also gradually emerging, and it has shown unique potential once AI is actually deployed. AI can dynamically allocate resources during inference, no longer confining improvement to parameter optimization but deciding how much compute to spend producing a high-quality answer.
This process is closer to deliberate reasoning than to direct, one-shot inference. AI can break a problem into multiple steps, generate several candidate solutions, evaluate them, and choose the best one. This kind of long-horizon reasoning has a significant impact on model capability.
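One simple embodiment of test-time scaling is best-of-N sampling: draw more candidates as the compute budget grows and keep whichever one a scorer prefers. The sketch below is illustrative, with random stand-ins for both the solver and the scorer; only the trend matters:

```python
import random

# A minimal best-of-N sketch of test-time scaling: spend more inference
# compute by sampling more candidate solutions and keeping the one a scorer
# prefers. The solver and scorer are random stand-ins; only the trend matters.
random.seed(0)

def solve_once() -> float:
    return random.gauss(0.5, 0.15)          # candidate answer quality

def best_of_n(n: int) -> float:
    return max(solve_once() for _ in range(n))

for n in (1, 4, 16, 64):                    # more compute per query...
    mean_quality = sum(best_of_n(n) for _ in range(500)) / 500
    print(f"N={n:>2} candidates -> mean best quality {mean_quality:.3f}")
```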
We have watched this technology evolve from ChatGPT to GPT-4 and now to the current Gemini Pro, all of which progressed through pre-training, post-training, and test-time scaling. Achieving these breakthroughs demands immense computational power, which is the core value of NVIDIA's Blackwell architecture.
The Latest on the Blackwell Architecture
The Blackwell systems are in full production, and their performance is astounding. Today, every cloud service provider is deploying these systems, manufactured in 45 global factories, supporting up to 200 configurations, including liquid-cooled, air-cooled, x86 architecture, and NVIDIA Grace CPU versions.
The core NVLink system itself weighs up to 1.5 tons, with 600,000 parts, equivalent to the complexity of 20 cars, connected by 2 miles of copper wire and 5000 cables. The entire manufacturing process is extremely complex, but the goal is to meet the growing demand for computing power.
Compared to the previous generation architecture, Blackwell has increased performance per watt by 4 times and performance per dollar by 3 times. This means that at the same cost, the scale of training models can be increased by 3 times, and the key behind these improvements is the generation of AI tokens. These tokens are widely used in ChatGPT, Gemini, and various AI services and are the foundation of future computing.
Built on this foundation, NVIDIA has pioneered a new computing paradigm, neural rendering, seamlessly fusing AI with computer graphics. Under the Blackwell architecture, 72 GPUs are linked to behave as one giant chip, the world's largest, providing up to 1.4 ExaFLOPS of AI floating-point performance with an astonishing 1.2 PB/s of memory bandwidth, comparable to the entirety of global internet traffic. This supercomputing capability lets AI tackle more complex reasoning tasks while significantly cutting costs, laying the groundwork for more efficient computing.
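Dividing those quoted aggregate figures evenly across the 72 GPUs gives a rough per-GPU picture; this is purely our arithmetic on the keynote's headline numbers:

```python
# Dividing the quoted NVL72 aggregates evenly across 72 GPUs; purely our
# arithmetic on the keynote's headline numbers.
gpus = 72
total_ai_flops = 1.4e18      # 1.4 ExaFLOPS of AI compute
total_bandwidth = 1.2e15     # 1.2 PB/s aggregate memory bandwidth

print(f"per-GPU AI compute: {total_ai_flops / gpus / 1e15:.1f} PFLOPS")
print(f"per-GPU bandwidth:  {total_bandwidth / gpus / 1e12:.1f} TB/s")
```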
AI Agent System and Ecosystem
Looking to the future, AI's reasoning process will no longer be a simple single-step response but will be closer to an "inner dialogue." Future AI will not only generate answers but will also engage in reflection, reasoning, and continuous optimization. With the increase in AI token generation speed and the reduction in costs, the quality of AI services will significantly improve, meeting a broader range of application needs.
To help businesses build AI systems with autonomous reasoning capabilities, NVIDIA offers key tools: NIM AI microservices, the NeMo framework, and accelerated libraries. By packaging complex CUDA software and deep learning models into containerized services, enterprises can deploy these AI models on any cloud platform and rapidly develop domain-specific AI Agents, such as tools supporting enterprise management or interactive digital employees.
These offerings open up new possibilities for enterprises, lowering the barrier to entry for AI applications while pushing the whole industry firmly toward Agentic AI. Future AI will act as digital employees that integrate easily into enterprise tools such as SAP and ServiceNow, serving customers intelligently across environments. This is the next milestone in AI's expansion and the core of NVIDIA's vision for its technology ecosystem.
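Because these containerized microservices expose an OpenAI-compatible HTTP API, calling one from an enterprise application is straightforward. The endpoint URL and model id below are placeholders for a hypothetical local deployment, not values from the keynote:

```python
# A hedged sketch of calling a NIM-style containerized microservice. NIM
# endpoints expose an OpenAI-compatible HTTP API, so the standard client
# works; the URL and model id below are placeholders for a hypothetical
# local deployment, not values from the keynote.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your NIM container's endpoint
    api_key="not-needed-locally",         # local deployments often ignore this
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",   # example model id; check your deployment
    messages=[{"role": "user", "content": "Summarize this week's support tickets."}],
)
print(response.choices[0].message.content)
```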
Then comes training and evaluation. In the future, these AI Agents will essentially work alongside your employees as a digital workforce, completing tasks on your behalf. Bringing these specialized Agents into your company is therefore like onboarding a new employee. We offer toolkits that help AI Agents learn a company's particular language, vocabulary, business processes, and workflows. You give them examples of the desired work output, they attempt to produce it, and you provide feedback and evaluation. You also set guardrails, specifying which actions they may not take, what they may not say, and what information they may access. This entire pipeline for onboarding digital employees is called NeMo. In a sense, every company's IT department will become the HR department for AI Agents.
Today, IT departments manage and maintain a large body of software; in the future, they will manage, train, onboard, and improve a large workforce of digital Agent employees serving the company.
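As one concrete example of the "set guardrails" step, NVIDIA's open-source NeMo Guardrails library (one piece of the broader NeMo family) lets you declare what an agent may not do or say. This is a minimal sketch under stated assumptions; the config directory and its rail definitions are assumed for illustration:

```python
# A minimal sketch of the guardrails step using NVIDIA's open-source
# NeMo Guardrails library. The ./guardrails_config directory is assumed to
# hold YAML/Colang files defining, e.g., forbidden topics and permitted data
# sources; see the library's docs for the rail definition format.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./guardrails_config")  # assumed config dir
rails = LLMRails(config)

reply = rails.generate(messages=[
    {"role": "user", "content": "List the salary of every employee."}
])
print(reply)  # a well-configured rail would refuse or deflect this request
```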
Furthermore, we provide many open-source blueprints for the ecosystem to use, and users are free to modify them. We offer blueprints for many types of Agents. Today we also announced something very cool and very smart: a brand-new model family based on Llama, the NVIDIA Llama Nemotron language foundation model series.
Llama 3.1 is a blockbuster. Meta's Llama 3.1 has been downloaded roughly 650,000 times and has spawned around 60,000 derivative models; it is one of the core reasons nearly every enterprise and industry has begun working on AI. We realized the Llama models could be fine-tuned even better for enterprise use cases, and with our expertise and compute we fine-tuned them into the Llama Nemotron open model suite.
These models come in different sizes: small models for fast response; the flagship Llama Nemotron Super, a general-purpose mainstream model; and the extra-large Ultra model, which can serve as a teacher model for evaluating other models, generating answers and judging their quality, or acting as a source for knowledge distillation. All of these models are live now.
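Knowledge distillation, the teacher-model role mentioned above, is a standard technique in which a student is trained to match a teacher's softened output distribution. Here is a generic sketch; the random tensors stand in for real model logits:

```python
import torch
import torch.nn.functional as F

# A generic sketch of knowledge distillation, the teacher role described
# above: the student learns to match the teacher's softened output
# distribution (Hinton et al., 2015). The random tensors below stand in for
# real model logits.
temperature = 2.0
teacher_logits = torch.randn(8, 32000)               # e.g. Ultra-model outputs
student_logits = torch.randn(8, 32000, requires_grad=True)

distill_loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature**2

distill_loss.backward()        # gradients flow into the student
print(f"distillation loss: {distill_loss.item():.3f}")
```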
These models perform outstandingly, topping leaderboards in chat, instruction following, and retrieval, making them well suited for AI Agent functionality worldwide.
Our collaboration with the ecosystem is also very close, including industrial AI partnerships with ServiceNow, SAP, and Siemens. Companies like Cadence and Perplexity are running excellent projects as well: Perplexity has disrupted search, and Codium serves software developers. With 30 million software engineers worldwide, AI assistants will greatly boost developer productivity, the next huge application area for AI services. And with 1 billion knowledge workers globally, AI Agents may well be the next robotics industry, an opportunity worth trillions of dollars.
AI Agent Blueprint
Next, we showcase some AI Agent blueprints completed with partners.
The AI Agent is a new kind of digital workforce, able to assist or replace humans on tasks. NVIDIA's Agentic AI building blocks, NIM pre-trained microservices, and the NeMo framework help organizations easily develop and deploy AI Agents, which can be trained as task-specific domain experts.
Here are four examples:
· Research Assistant Agent: Capable of reading complex documents such as lectures, journals, financial reports, and generating interactive podcasts for easier learning;
· Software Security AI Agent: Helps developers continuously scan for software vulnerabilities and provides alerts to take appropriate actions;
· Virtual Lab AI Agent: Accelerates compound design and screening to quickly identify potential drug candidates;
· Video Analysis AI Agent: Based on the NVIDIA Metropolis blueprint, analyzes data from billions of cameras to generate interactive search, summaries, and reports. For example, monitoring traffic flow, facility processes, providing improvement suggestions, etc.;
The Arrival of the Physical AI Era
We want to bring AI from the cloud to every corner, including inside companies and on personal PCs. NVIDIA is working to make Windows WSL 2 (Windows Subsystem for Linux 2) the preferred platform for AI, enabling developers and engineers to use NVIDIA's AI stack, including language, image, and animation models, far more conveniently.
In addition, NVIDIA has launched Cosmos, the world's first physical-AI foundation model development platform, focused on understanding the dynamics of the physical world: gravity, friction, inertia, spatial relationships, causality, and so on. It can generate videos and scenes that obey the laws of physics and is widely applicable to training and validating robots, industrial AI, and multimodal language models.
Cosmos leverages NVIDIA Omniverse to provide physical simulation, generating realistic and credible simulation results. This combination is a core technology for robot and industrial application development.
NVIDIA's industrial strategy is based on three computing systems:
· DGX systems for AI training;
· AGX systems for AI deployment;
· Digital twin systems for reinforcement learning and AI optimization;
Through the collaboration of these three systems, NVIDIA is driving the development of robots and industrial AI, building the future digital world. Rather than calling this a three-body problem, we have a "three-computer" solution.
Let me show you three examples of NVIDIA's vision for robotics.
1) Industrial Visualization
Currently there are millions of factories and hundreds of thousands of warehouses worldwide, forming the backbone of the $50 trillion manufacturing sector. In the future, all of it must become software-defined, automated, and integrated with robotics. We are partnering with KION, the world's leading warehouse automation solutions provider, and Accenture, the world's largest professional services firm, with a focus on digital manufacturing, to create some very special solutions together. Our go-to-market approach mirrors that of other software and technology platforms, working through developers and ecosystem partners, and ever more ecosystem partners are joining the Omniverse platform, because everyone wants to visualize the future of industry. In that $50 trillion slice of global GDP, there is enormous waste and enormous opportunity for automation.
Let's look at this example of KION and Accenture partnering with us:
KION (a supply chain solutions company), Accenture (a global leader in professional services), and NVIDIA are bringing physical AI into the trillion-dollar warehouse and distribution center market. Managing efficient warehouse logistics means navigating a complex web of decisions shaped by constantly shifting variables: daily and seasonal demand swings, space constraints, labor supply, and the integration of diverse robotic and automation systems. Today, predicting a physical warehouse's key performance indicators (KPIs) is nearly impossible.
To tackle these problems, KION is adopting Mega, an NVIDIA Omniverse blueprint, to build an industrial digital twin for testing and optimizing robot fleets. First, KION's warehouse management solution assigns tasks to the industrial AI brains in the digital twin, such as moving goods from buffer locations into shuttle storage. In Omniverse's physically simulated warehouse, the robot fleet executes tasks by perceiving, reasoning, planning next steps, and acting. The digital twin uses sensor simulation, so each robot brain can see the world state after a task executes and decide what to do next. Under Mega's precise tracking, this loop continues while operational KPIs such as throughput, efficiency, and utilization are measured, all before any change is made to the physical warehouse.
Together with NVIDIA, KION and Accenture are redefining the future of industrial autonomy.
In the future, every factory will have a digital twin that is fully synchronized with the actual factory. You can use Omniverse and Cosmos to generate numerous future scenarios, and AI will decide on the optimal KPI scenarios and set them as constraints and AI programming logic for actual factory deployment.
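This optimize-in-simulation-first loop can be sketched schematically. The toy below swaps a random number generator in for an Omniverse/Mega-style physics simulation, purely to show the shape of the search over candidate configurations:

```python
import random

# A schematic of the simulate-first loop: score many candidate warehouse
# configurations in a digital twin, keep the best KPI, and only then change
# the physical site. The "simulator" here is a random stand-in for an
# Omniverse/Mega-style physics simulation, purely for illustration.
def simulate_throughput(num_robots: int, layout_seed: int) -> float:
    # Placeholder for a full digital-twin run returning pallets per hour.
    base = 90 + 4 * num_robots - 0.08 * num_robots**2   # congestion penalty
    return base + random.Random(layout_seed).gauss(0, 3)

candidates = [(robots, seed) for robots in range(5, 40, 5) for seed in range(3)]
best = max(candidates, key=lambda c: simulate_throughput(*c))
print(f"best simulated config: robots={best[0]}, layout={best[1]}, "
      f"throughput={simulate_throughput(*best):.1f} pallets/hour")
```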
2) Autonomous Vehicles
The era of autonomous driving has arrived. After years of development, the success of companies like Waymo and Tesla has demonstrated the maturity of autonomous driving technology. Our solution provides three types of computer systems for this industry: systems for training AI (such as DGX systems), systems for simulation testing and generating synthetic data (such as Omniverse and Cosmos), and in-vehicle computer systems (such as AGX systems). Nearly all major global automotive companies are partnering with us, including Waymo, Zoox, Tesla, and the world's largest electric vehicle company BYD. Companies like Mercedes, Lucid, Rivian, Xiaomi, and Volvo, which are about to launch innovative models, are also part of this collaboration. Aurora is utilizing NVIDIA technology to develop autonomous driving trucks.
Every year, 100 million cars are manufactured, with 1 billion cars driving on global roads, accumulating trillions of miles in annual mileage. These vehicles will gradually become highly or fully automated, with this industry expected to become the first robotic industry worth trillions of dollars.
Today, we are excited to announce the launch of the next-generation in-vehicle computer, Thor. It is a general-purpose robotic computer capable of handling large amounts of data from sensors such as cameras, high-resolution radars, and LiDAR. Thor is an upgraded version of the current industry standard, Orin, with 20 times the computing power and is now in full production. Additionally, NVIDIA's Drive OS is the first AI computer operating system to be certified to the highest functional safety standard (ISO 26262 ASIL D).
Autonomous Driving Data Factory
NVIDIA has leveraged Omniverse AI models and the Cosmos platform to create an Autonomous Driving Data Factory, significantly expanding training data through synthetic driving scenarios, including:
· OmniMap: Fusion of map and geospatial data to build drivable 3D environments;
· Neural Reconstructor Engine: Utilization of sensor logs to generate high-fidelity 4D simulation environments and create scene variants for training data;
· Edify 3DS: Searching or generating new assets from an asset library to create scenes for simulation.
Through these technologies, we have expanded thousands of driving scenarios into billions of miles of data, used for the development of safer and more advanced autonomous driving systems.
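Rough arithmetic shows how variant generation multiplies a modest set of curated captures into that scale of mileage; the counts below are our illustrative assumptions, not NVIDIA's disclosed pipeline figures:

```python
# Rough arithmetic on how scene variants multiply curated captures into
# synthetic mileage. These counts are our illustrative assumptions, not
# NVIDIA's disclosed pipeline figures.
recorded_scenarios = 1_000          # curated real driving scenarios
variants_per_scenario = 100_000     # weather, lighting, traffic, asset swaps
miles_per_variant = 10              # simulated miles driven per variant

synthetic_miles = recorded_scenarios * variants_per_scenario * miles_per_variant
print(f"synthetic training mileage: {synthetic_miles / 1e9:.1f} billion miles")
```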
3) General-Purpose Robotics
The era of general-purpose robotics is approaching, and the key to a breakthrough lies in training. For humanoid robots, imitation data is relatively hard to collect, but NVIDIA's Isaac GR00T offers a solution: it generates massive datasets through simulation and, combined with Omniverse and Cosmos as a multiverse simulation engine, handles policy training, validation, and deployment.
For instance, developers can teleoperate robots through Apple Vision Pro to capture demonstrations without a physical robot present, teaching task motions in a risk-free environment. Omniverse's domain randomization and 3D-to-real scene expansion then multiply this into an exponentially growing dataset, supplying abundant resources for robot learning.
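Domain randomization itself is simple in outline: every training episode re-samples the simulator's physical parameters so the learned policy cannot overfit to one particular world. A minimal sketch, with parameter names and ranges chosen purely for illustration:

```python
import random

# A minimal sketch of domain randomization: each training episode re-samples
# the simulator's physical parameters so the policy cannot overfit to one
# world. Parameter names and ranges are illustrative assumptions.
random.seed(7)

def sample_sim_params() -> dict:
    return {
        "friction":       round(random.uniform(0.4, 1.2), 3),
        "object_mass_kg": round(random.uniform(0.1, 2.0), 3),
        "lighting_lux":   round(random.uniform(200, 2000), 1),
        "camera_jitter":  round(random.gauss(0.0, 0.02), 4),
    }

for episode in range(3):
    params = sample_sim_params()
    # A real pipeline would reset the simulator with these parameters
    # and roll out the robot policy to collect training data.
    print(f"episode {episode}: {params}")
```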
In conclusion, whether in industrial visualization, autonomous driving, or general-purpose robotics, NVIDIA's technology is leading the future transformation of physical AI and the robotics field.
Finally, I have one more important thing to share, and it all ties back to a project we started internally at the company a decade ago called Project DIGITS, short for Deep Learning GPU Intelligence Training System.
Before its official release, we condensed the name to DGX to harmonize with the company's RTX, AGX, OVX, and other product lines. The advent of the DGX-1 truly changed the course of AI development and marked a milestone in NVIDIA's AI journey.
The Revolution of the DGX-1
The original purpose of the DGX-1 was to give researchers and startups an out-of-the-box AI supercomputer. Imagine: in the past, supercomputers required users to construct dedicated facilities and design and build intricate infrastructure before one could even exist. The DGX-1 was a supercomputer tailored for AI development, requiring no complex setup and ready to use out of the box.
I still remember delivering the first DGX-1 in 2016 to a startup called OpenAI. Elon Musk, Ilya Sutskever, and many NVIDIA engineers were there, and we celebrated its arrival together. That machine significantly propelled AI computing forward.
Today, AI is everywhere, no longer confined to research institutions and startup labs. As I said earlier, AI has become the new paradigm of computing and software development. Every software engineer, every creative artist, indeed every ordinary computer user working with tools now needs an AI supercomputer. But I always wished the DGX-1 could be smaller.
The Latest AI Supercomputer
Here is NVIDIA's latest AI supercomputer. It is still part of Project Digits, and we are currently seeking a better name for it, welcoming suggestions. This is a truly remarkable device.
This supercomputer runs NVIDIA's entire AI software stack, including DGX Cloud. It can serve as a cloud-based supercomputer, a high-performance workstation, or even a desktop analytics machine. Most importantly, it is built on a new chip we developed in secret, codenamed GB10, the smallest Grace Blackwell we have ever produced.
I have the chip in my hand to show you its internal design. It was developed in collaboration with MediaTek, the world's leading SoC company: the CPU SoC was custom-built for NVIDIA and connects to the Blackwell GPU via NVLink chip-to-chip interconnect. This little chip is now in full production, and we expect the supercomputer to launch officially around May.
We even offer a "double computing power" configuration, where these devices can be connected together via ConnectX, supporting GPU Direct technology. It is a complete supercomputing solution that can meet various needs for AI development, analytical work, and industrial applications.
In sum, NVIDIA announced the mass production of three new Blackwell system chips, the world's first physical AI foundation model, and breakthroughs in three major robotics areas: agentic AI robots, humanoid robots, and autonomous vehicles.