NVIDIA presents innovations to increase data center performance and energy efficiency at Hot Chips

Hot Chips, an in-depth technology conference for processor and system architects from industry and academia, has become a major forum for the trillion-dollar data center computing market.

At Hot Chips 2024 next week, senior NVIDIA engineers will present the latest advances on the NVIDIA Blackwell platform, as well as research on liquid cooling for data centers and AI agents for chip design.

They will explain how:

  • NVIDIA Blackwell brings together multiple chips, systems, and NVIDIA CUDA software to power the next generation of AI across use cases, industries, and geographies.
  • NVIDIA GB200 NVL72 – a liquid-cooled, multi-node rack-scale solution combining 72 Blackwell GPUs and 36 Grace CPUs – raises the bar for AI system design.
  • NVLink interconnect technology provides all-to-all GPU communication, enabling record-breaking high throughput and low-latency inference for generative AI.
  • The NVIDIA Quasar quantization system pushes the limits of numerical precision to accelerate AI computation.
  • NVIDIA researchers create AI models that help build processors for AI.

An NVIDIA Blackwell talk on Monday, August 26, will also highlight new architectural details and examples of generative AI models running on Blackwell silicon.

Before that, three tutorials on Sunday, August 25, will explain how hybrid liquid cooling solutions can help data centers transition to more energy-efficient infrastructure and how AI models, including agents based on large language models (LLMs), can help engineers develop the next generation of processors.

Together, these presentations illustrate how NVIDIA engineers are innovating across all areas of computing and data center design to deliver unprecedented performance, efficiency, and optimization.

Be ready for Blackwell

NVIDIA Blackwell is the ultimate full-stack computing challenge. It includes several NVIDIA chips, including the Blackwell GPU, Grace CPU, BlueField data processing unit (DPU), ConnectX network interface card, NVLink switch, Spectrum Ethernet switch, and Quantum InfiniBand switch.

Ajay Tirumala and Raymond Wong, directors of architecture at NVIDIA, will provide a first look at the platform and explain how these technologies work together to deliver a new standard for AI and accelerated computing performance while increasing energy efficiency.

The NVIDIA GB200 NVL72 multi-node solution is a perfect example. LLM inference requires low-latency, high-throughput token generation. GB200 NVL72 acts as a unified system that delivers up to 30x faster inference for LLM workloads and unlocks the ability to run trillion-parameter models in real time.

Tirumala and Wong will also explain how the NVIDIA Quasar Quantization System—which combines algorithmic innovations, NVIDIA software libraries and tools, and Blackwell’s second-generation Transformer Engine—supports high accuracy in low-precision models, highlighting examples using LLMs and visual generative AI.
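The details of the Quasar system are NVIDIA's to present. As a generic illustration of the core idea behind low-precision inference, that floating-point values can be mapped to narrow integer formats with a bounded loss of accuracy if the scale factor is chosen carefully, here is a minimal sketch of symmetric per-tensor int8 quantization in NumPy. The function names and the per-tensor scheme are illustrative assumptions, not Quasar's actual method:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127].

    Illustrative sketch only; production systems like Quasar use far more
    sophisticated, often per-channel or per-block, calibration schemes.
    """
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float values."""
    return q.astype(np.float32) * scale

# Example: quantize a small random weight matrix and measure the error.
rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)

# With symmetric rounding, the per-element error is at most half a
# quantization step, i.e. scale / 2.
max_err = np.max(np.abs(weights - recovered))
```

Lower-precision formats such as the FP4 support highlighted for Blackwell follow the same principle with fewer bits, which is why the surrounding software, scaling, and calibration matter so much for preserving accuracy.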

Keeping data centers cool

The traditional hum of air-cooled data centers could become a thing of the past as researchers develop more efficient and sustainable solutions that rely on hybrid cooling, a combination of air and liquid cooling.

Liquid cooling techniques remove heat from systems more efficiently than air cooling, making it easier for computer systems to stay cool even when processing large workloads. Liquid cooling equipment also takes up less space and uses less power than air cooling systems, allowing data centers to fit more server racks and therefore more computing power into their facilities.

Ali Heydari, director of data center cooling and infrastructure at NVIDIA, will present several designs for hybrid cooled data centers.

Some designs retrofit existing air-cooled data centers with liquid cooling units, providing a quick and easy solution to add liquid cooling capabilities to existing racks. Other designs require installing liquid cooling piping directly to the chip using cooling distribution units or by fully submerging the servers in immersion cooling tanks. Although these options require a larger initial investment, they result in significant savings in both energy consumption and operating costs.

Heydari will also present his team’s work as part of COOLERCHIPS, a U.S. Department of Energy program to develop advanced cooling technologies for data centers. As part of the project, the team is using the NVIDIA Omniverse platform to create physics-based digital twins that will help them model energy consumption and cooling efficiency to optimize their data center designs.

AI agents contribute to processor design

Designing semiconductors on a microscopic scale is a mammoth task. Engineers developing cutting-edge processors try to pack as much computing power as possible into a piece of silicon just a few centimeters in size, pushing the boundaries of what is physically possible.

AI models support their work by improving design quality and productivity, increasing the efficiency of manual processes, and automating some time-consuming tasks. The models include prediction and optimization tools that help engineers quickly analyze and improve designs, as well as LLMs that can help engineers answer questions, generate code, debug design issues, and more.

Mark Ren, director of design automation research at NVIDIA, will provide an overview of these models and their application in a tutorial. In a second session, he will focus on agent-based AI systems for chip design.

AI agents based on LLMs can perform tasks autonomously, enabling cross-industry applications. In microprocessor design, NVIDIA researchers are developing agent-based systems that can reason and take action using customized circuit design tools, interact with experienced designers, and learn from a database of human and agent experiences.

NVIDIA experts are not only developing this technology, they are also using it. Ren will present examples of how engineers can use AI agents for timing report analysis, cell cluster optimization, and code generation. The work on cell cluster optimization was recently awarded best paper at the first IEEE International Workshop on LLM-Aided Design.

Register for Hot Chips, August 25-27 at Stanford University and online.

By Olivia
