Nvidia is pumping up the power in its line of artificial intelligence chips with the announcement Monday of its Blackwell GPU architecture at its first in-person GPU Technology Conference (GTC) in five years.
According to Nvidia, the chip, designed for use in large data centers of the kind that power the likes of AWS, Azure, and Google, offers 20 petaflops of AI performance. The company says it is 4x faster on AI-training workloads, 30x faster on AI-inferencing workloads, and up to 25x more power efficient than its predecessor.
Compared to its predecessor, the H100 “Hopper,” the B200 Blackwell is both more powerful and more energy efficient, Nvidia maintained. Training an AI model the size of GPT-4, for example, would take 8,000 H100 chips and 15 megawatts of power. The same task would take only 2,000 B200 chips and four megawatts of power.
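The arithmetic behind that comparison can be worked through in a few lines. The sketch below uses only the chip counts and megawatt figures quoted above. The per-chip wattage is a simple average of total power over chip count (an assumption made here for illustration, not a chip specification from Nvidia), and the function name is hypothetical.

```python
# Figures quoted in the article for training a GPT-4-sized model.
H100_CHIPS, H100_MEGAWATTS = 8_000, 15
B200_CHIPS, B200_MEGAWATTS = 2_000, 4


def watts_per_chip(megawatts, chips):
    """Average power per chip in watts (1 MW = 1,000,000 W).

    Illustrative only: this divides total quoted power by chip count,
    so it is not a rated power figure for either chip.
    """
    return megawatts * 1_000_000 / chips


chip_reduction = H100_CHIPS / B200_CHIPS            # how many times fewer chips
power_reduction = H100_MEGAWATTS / B200_MEGAWATTS   # how many times less power
h100_watts = watts_per_chip(H100_MEGAWATTS, H100_CHIPS)
b200_watts = watts_per_chip(B200_MEGAWATTS, B200_CHIPS)

print(f"Chips needed: {chip_reduction:.2f}x fewer")
print(f"Total power:  {power_reduction:.2f}x lower")
print(f"Average power per chip: H100 {h100_watts:.0f} W, B200 {b200_watts:.0f} W")
```

On these figures, the job needs a quarter of the chips and 3.75 times less power. The average power per chip is slightly higher for the B200 (2,000 watts versus 1,875 watts), so the saving in this example comes from needing far fewer chips rather than from each chip drawing less.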
“This is the company’s first big advance in chip design since the debut of the Hopper architecture two years ago,” Bob O’Donnell, founder and chief analyst of Technalysis Research, wrote in his weekly LinkedIn newsletter.
Repackaging Exercise
However, Sebastien Jean, CTO of Phison Electronics, a Taiwanese electronics company, called the chip “a repackaging exercise.”
“It’s good, but it’s not groundbreaking,” he told TechNewsWorld. “It will run faster, use less power, and allow more compute into a smaller area, but from a technologist perspective, they just squished it smaller without really changing anything fundamental.”
“That means that their results are easily replicated by their competitors,” he said. “Though there is value in being first because while your competition catches up, you move on to the next thing.”
“When you force your competition into a permanent catch-up game, unless they have very strong leadership, they will fall into a ‘fast follower’ mentality without realizing it,” he said.
“By being aggressive and being first,” he continued, “Nvidia can cement the idea that they are the only true innovators, which drives further demand for their products.”
Although Blackwell may be a repackaging exercise, he added, it has a real net benefit. “In practical terms, people using Blackwell will be able to do more compute faster for the same power and space budget,” he noted. “That will allow solutions based on Blackwell to outpace and outperform their competition.”
Plug-Compatible With Past
O’Donnell asserted that the Blackwell architecture’s second-generation transformer engine is a significant advancement because it reduces AI floating point calculations to four bits from eight bits. “Practically speaking, by reducing these calculations down from 8-bit on previous generations, they can double the compute performance and model sizes they can support on Blackwell with this single change,” he said.
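The storage side of that claim is straightforward arithmetic, sketched below. The parameter count and memory budget are hypothetical round numbers chosen for illustration, not Blackwell specifications, and the helper names are invented here. The sketch covers only the space needed to hold model values; the doubling of compute throughput depends on the hardware and is not modeled.

```python
def model_bytes(num_params, bits_per_value):
    """Bytes needed to store num_params values at the given bit width."""
    return num_params * bits_per_value // 8


def max_params(memory_bytes, bits_per_value):
    """Largest number of values that fit in memory_bytes at the given bit width."""
    return memory_bytes * 8 // bits_per_value


PARAMS = 70_000_000_000      # hypothetical 70-billion-parameter model
MEMORY = 100 * 10**9         # hypothetical 100 GB memory budget

fp8_bytes = model_bytes(PARAMS, 8)   # one byte per value
fp4_bytes = model_bytes(PARAMS, 4)   # half a byte per value

print(f"8-bit storage: {fp8_bytes / 10**9:.0f} GB")
print(f"4-bit storage: {fp4_bytes / 10**9:.0f} GB")
print(f"Values that fit in the budget at 8 bits: {max_params(MEMORY, 8):,}")
print(f"Values that fit in the budget at 4 bits: {max_params(MEMORY, 4):,}")

# The trade-off: fewer bits means fewer distinct bit patterns per value.
print(f"Bit patterns: 8-bit = {2**8}, 4-bit = {2**4}")
```

Halving the bit width halves the storage for the same model, or doubles the number of values that fit in the same memory. The cost is precision: a 4-bit value has only 16 possible bit patterns, compared with 256 for an 8-bit value.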
The new chips are also compatible with their predecessors. “If you already have Nvidia’s systems with the H100, Blackwell is plug-compatible,” observed Jack E. Gold, founder and principal analyst with J.Gold Associates, an IT advisory company in Northborough, Mass.
“In theory, you could just unplug the H100s and plug the Blackwells in,” he told TechNewsWorld. “Although you can do that theoretically, you might not be able to do that financially.” For example, Nvidia’s H100 chips cost $30,000 to $40,000 each. Although Nvidia didn’t reveal the price of its new AI chip line, pricing will probably be along those lines.
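To give a rough sense of the sums involved, the sketch below multiplies the H100 price range by the 2,000 B200 chips from the GPT-4-scale example earlier in the article. It assumes Blackwell chips are priced like the H100, which Nvidia has not confirmed, and it counts chips only, leaving out servers, networking, and power. The function name is hypothetical.

```python
# Per-chip H100 price range quoted in the article, in U.S. dollars.
H100_PRICE_RANGE = (30_000, 40_000)

# B200 count from the article's GPT-4-scale training example.
B200_CHIPS = 2_000


def fleet_cost(chips, price_range):
    """Return (low, high) total chip cost for a fleet at the given price range."""
    low_price, high_price = price_range
    return chips * low_price, chips * high_price


# Assumption: Blackwell pricing matches the H100 range (not confirmed by Nvidia).
low, high = fleet_cost(B200_CHIPS, H100_PRICE_RANGE)
print(f"Estimated chip cost: ${low:,} to ${high:,}")
```

Under that assumption, the chips alone for such a deployment would run from $60 million to $80 million.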
Gold added that the Blackwell chips could help developers produce better AI applications. “The more data points you can analyze, the better the AI gets,” he explained. “What Nvidia is talking about with Blackwell is instead of being able to analyze billions of data points, you can analyze trillions.”
Also announced at the GTC were Nvidia Inference Microservices (NIM). “NIM tools are built on top of Nvidia’s CUDA platform and will enable businesses to bring custom applications and pretrained AI models into production environments, which should aid these firms in bringing new AI products to market,” Brian Colello, an equity strategist with Morningstar Research Services, in Chicago, wrote in an analyst’s note Tuesday.
Helping Deploy AI
“Big companies with data centers can adopt new technologies quickly and deploy them faster, but most human beings are in small and medium businesses that don’t have the resources to buy, customize, and deploy new technologies. Anything like NIM that can help them adopt new technology and deploy it more easily will be a benefit to them,” explained Shane Rau, a semiconductor analyst with IDC, a global market research company.
“With NIM, you’ll find models specific to what you want to do,” he told TechNewsWorld. “Not everyone wants to do AI in general. They want to do AI that’s specifically relevant to their company or enterprise.”
While NIM is not as exciting as the latest hardware designs, O’Donnell noted that it is significantly more important in the long run for several reasons.
“First,” he wrote, “it’s supposed to make it faster and more efficient for companies to move from GenAI experiments and POCs (proof of concepts) into real-world production. There simply aren’t enough data scientists and GenAI programming experts to go around, so many companies who’ve been eager to deploy GenAI have been limited by technical challenges. As a result, it’s great to see Nvidia helping ease this process.”
“Second,” he continued, “these new microservices allow for the creation of an entire new revenue stream and business strategy for Nvidia because they can be licensed on a per GPU/per hour basis (as well as other variations). This could prove to be an important, long-lasting, and more diversified means of generating income for Nvidia, so even though it’s early days, this is going to be important to watch.”
Entrenched Leader
Rau predicted that Nvidia will remain entrenched as the AI processing platform of choice for the foreseeable future. “But competitors like AMD and Intel will be able to take modest portions of the GPU market,” he said. “And because there are different chips you can use for AI (microprocessors, FPGAs, and ASICs), those competing technologies will be competing for market share and growing.”
“There are very few threats to Nvidia’s dominance in this market,” added Abdullah Anwer Ahmed, founder of Serene Data Ops, a data management company in San Francisco.
“On top of their superior hardware, their software solution CUDA has been the foundation of the underlying AI segments for over a decade,” he told TechNewsWorld.
“The main threat is that Amazon, Google, and Microsoft/OpenAI are working on building their own chips optimized around these models,” he continued. “Google already has their ‘TPU’ chip in production. Amazon and OpenAI have hinted at similar projects.”
“In any case, building one’s own GPUs is an option only available to the absolute largest companies,” he added. “Most of the LLM industry will continue to buy Nvidia GPUs.”