{"id":8152,"date":"2025-12-31T01:41:03","date_gmt":"2025-12-31T01:41:03","guid":{"rendered":"https:\/\/microvibenews.com\/?p=8152"},"modified":"2025-12-31T01:41:03","modified_gmt":"2025-12-31T01:41:03","slug":"nvidias-groq-bet-shows-that-the-economics-of-ai-chip-building-are-still-unsettled","status":"publish","type":"post","link":"https:\/\/microvibenews.com\/?p=8152","title":{"rendered":"Nvidia\u2019s Groq bet shows that the economics of AI chip-building are still unsettled"},"content":{"rendered":"<p><img src=\"https:\/\/fortune.com\/img-assets\/wp-content\/uploads\/2025\/11\/GettyImages-2194622819-e1763588530115.jpg?w=2048\" \/><\/p>\n<p>Nvidia built its AI empire on GPUs. But its $20 billion bet on Groq suggests the company isn\u2019t convinced GPUs alone will dominate the most important phase of AI yet: running models at scale, known as inference.&nbsp;<\/p>\n<p>The battle to win on AI inference, of course, is over its economics. Once a model is trained, every useful thing it does\u2014answering a query, generating code, recommending a product, summarizing a document, powering a chatbot, or analyzing an image\u2014happens during inference. That\u2019s the moment AI goes from a sunk cost into a revenue-generating service, with all the accompanying pressure to reduce costs, shrink latency (how long you have to wait for an AI to answer), and improve efficiency.<\/p>\n<p>That pressure is exactly why inference has become the industry\u2019s next battleground for potential profits\u2014and why Nvidia, in a deal announced just before the Christmas holiday, licensed technology from Groq, a startup building chips designed specifically for fast, low-latency AI inference, and hired most of its team, including founder and CEO Jonathan Ross.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Inference is AI\u2019s \u2018industrial revolution\u2019<\/strong><\/h2>\n<p>Nvidia CEO Jensen Huang has been explicit about the challenge of inference. 
While he says Nvidia is \u201cexcellent at every phase of AI,\u201d he told analysts on the company\u2019s Q3 earnings call in November that inference is \u201creally, really hard.\u201d Far from a simple case of one prompt in and one answer out, modern inference must support ongoing reasoning, millions of concurrent users, guaranteed low latency, and relentless cost constraints. And AI agents, which have to handle multiple steps, will dramatically increase inference demand and complexity\u2014and raise the stakes of getting it wrong.&nbsp;<\/p>\n<p>\u201cPeople think that inference is one shot, and therefore it\u2019s easy. Anybody could approach the market that way,\u201d Huang said. \u201cBut it turns out to be the hardest of all, because thinking, as it turns out, is quite hard.\u201d<\/p>\n<p>Nvidia\u2019s support of Groq underscores that belief, and signals that even the company that dominates AI training is hedging on how inference economics will ultimately shake out.&nbsp;<\/p>\n<p>Huang has also been blunt about how central inference will become to AI\u2019s growth. In a recent conversation on the <em>BG2<\/em> podcast, Huang said inference already accounts for more than 40% of AI-related revenue\u2014and predicted that it is \u201cabout to go up by a billion times.\u201d<\/p>\n<p>\u201cThat\u2019s the part that most people haven\u2019t completely internalized,\u201d Huang said. \u201cThis is the industry we were talking about. This is the industrial revolution.\u201d<\/p>\n<p>The CEO&#8217;s confidence helps explain why Nvidia is willing to hedge aggressively on how inference will be delivered, even as the underlying economics remain unsettled.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Nvidia wants to corner the inference market<\/strong><\/h2>\n<p>Nvidia is hedging its bets to make sure it has a hand in every part of the market, said Karl Freund, founder and principal analyst at Cambrian AI Research. 
\u201cIt\u2019s a little bit like Meta acquiring Instagram,\u201d he explained. \u201cIt\u2019s not that they thought Facebook was bad, they just knew that there was an alternative that they wanted to make sure wasn\u2019t competing with them.\u201d\u00a0<\/p>\n<p>That is the case even though Huang has made strong claims about the economics of the existing Nvidia platform for inference. \u201cI suspect they found that it either wasn\u2019t resonating as well with clients as they\u2019d hoped, or perhaps they saw something in the chip-memory-based approach that Groq and another company called D-Matrix has,\u201d said Freund, referring to another fast, low-latency AI chip startup backed by Microsoft that recently raised $275 million at a $2 billion valuation.\u00a0<\/p>\n<p>Freund said Nvidia\u2019s move into Groq could lift the entire category. \u201cI\u2019m sure D-Matrix is a pretty happy startup right now, because I suspect their next round will go at a much higher valuation thanks to the [Nvidia-Groq deal],\u201d he said.\u00a0<\/p>\n<p>Other industry executives say the economics of AI inference are shifting as AI moves beyond chatbots into real-time systems like robots, drones, and security tools. Those systems can\u2019t afford the delays that come with sending data back and forth to the cloud, or the risk that computing power won\u2019t always be available. Instead, they favor specialized chips like Groq\u2019s over centralized clusters of GPUs.&nbsp;<\/p>\n<p>Behnam Bastani, founder and CEO of OpenInfer, which focuses on running AI inference close to where data is generated\u2014such as on devices, sensors, or local servers rather than distant cloud data centers\u2014said his startup is targeting these kinds of applications at the \u201cedge.\u201d\u00a0<\/p>\n<p>The inference market, he emphasized, is still nascent. And Nvidia is looking to corner that market with its Groq deal. 
With inference economics still unsettled, he said Nvidia is trying to position itself as the company that spans the entire inference hardware stack, rather than betting on a single architecture.<\/p>\n<p>\u201cIt positions Nvidia as a bigger umbrella,\u201d he said.&nbsp;<\/p>\n<p>This story was originally featured on Fortune.com<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Nvidia built its AI empire on &hellip; <\/p>\n","protected":false},"author":1,"featured_media":8153,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[157,6631,1202,5780,955,936,6632],"_links":{"self":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/posts\/8152"}],"collection":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8152"}],"version-history":[{"count":0,"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/posts\/8152\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=\/wp\/v2\/media\/8153"}],"wp:attachment":[{"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=8152"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8152"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/microvibenews.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8152"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}