Comments (517)
- stego-techThe Unified Memory pool is what will continue to be the “game changer” in systems architecture, especially outside of data centers. The reality is that even cutting-edge games and consumer workloads don’t make full use of the GPU’s PCIe bandwidth or the bandwidth of its GDDR memory. Even local AI use cases don’t meaningfully benefit from faster memory, at least for average consumers. A unified memory pool does two things: 1) lets systems optimize utilization based on need, rather than being confined to specific pools; 2) reduces overall memory cost, by letting system builders purchase a single type of memory in bulk instead of having to figure out GDDR vs DDR memory placement (important for SFF/portable machines). So at a time when memory is expensive, unified pools make more sense. Even when memory becomes cheap and plentiful again, it’s just practical at this point to allocate a larger overall pool instead of managing discrete sets. The one big drawback is security. A shared memory pool means side-channel attacks against memory from the GPU or CPU could potentially compromise the other as well, meaning memory-safe designs are going to be critical to security going forward (which is good for Rust adherents, I figure).
- infecto"I am not sure how many people will run AI models locally. It still seems like a niche application to me. However, it will make decent machines to play video games." I don't know who will be the winner, but with some of the recent Gemma releases it seems more probable that you may run some models locally, if only from a cost perspective, not even considering business security. I'm not sure how this type of architecture would make for good gaming though, which puts the whole statement into question. "Ranked in the top 2% of scientists globally (Stanford/Elsevier 2025) and among GitHub's top 1000 developers": side note, but this guy puts this everywhere, and it gives me probably the inverse of the impression he is marketing for.
- dagmxThis feels like fluff to me on the part of the author (whose work I don’t want to trivialize), but I don’t think they’ve actually looked deeper than a paper spec sheet on this. 1. Yes, it has the same number of cores as a 5070 mobile. It’s also running at a shared peak of 2/3 the bandwidth and a shared peak of 2/3 the TDP. The GPU by itself will likely perform at half the dedicated unit’s performance. 2. Apple may not have SVE2 but they do have AMX (private) and SME. I don’t see why he thinks SVE2 will give him more performance than SME. 3. He mentions a single core type but doesn’t mention the total makeup. We have already known for a year how the DGX Spark compares to Apple chips. For CPU it’s roughly equivalent to an M3 Pro and for GPU compute (not rasterization) it’s between an M4 Pro and M4 Max without considering bandwidth. The real advantage to these is that they run CUDA. That’s it. Otherwise when they launch they’ll be 2-3 generations behind where Apple is and 1 gen behind AMD. The other superpower of the DGX Spark was the NIC for pairing them together. But that’s been removed here too.
- modelessThe Qualcomm Snapdragon X2 Elite Extreme trounces Nvidia's chip in single core CPU performance. It beats Intel and AMD's best, too. It has unified memory. It's the only CPU in the same league as Apple's M-series in both CPU performance and power efficiency. And it's available in laptops today, not later this year. People are sleeping on Qualcomm.
- dofmHere is the press release for the actual machine: https://nvidianews.nvidia.com/news/nvidia-microsoft-windows-... I have been somewhat surprised at the lack of commentators observing that this is Microsoft and above all NVIDIA launching a device that is fundamentally at odds with the metered cloud model of AI. When you look at the other announcements and murmurings (better offline BYOK for Copilot, talk of an unmetered AI future) I think it’s clear that these two firms understand that cloud-only AI is not sustainable or inherently in their interests. But their willingness to undermine OpenAI with a product like this is notable.
- 1970-01-01Local models becoming thousands of dollars instead of millions to run is a story the public genuinely seems to be unaware of. If the order of magnitude falls again, the markets are cooked. The cheap chips barrier is even artificial and unsustainable. The next big story in local AI adoption will be big players doing chip hoarding-as-a-strategy.
- GodelNumberingI do not see how it is a 'beast of a' anything. It has 300GB/s memory bandwidth, barely above AMD Strix Halo (256GB/s) with the same 128GB RAM, and less than half the memory bandwidth of the M5 Max 128GB (614GB/s). I'm emphasizing memory bandwidth because I suppose most people interested in it are AI enthusiasts. Also, Windows.
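To illustrate why the bandwidth figures above matter for local inference, here is a rough sketch. It assumes token generation is memory-bandwidth-bound, so each generated token streams all model weights once; real throughput will be lower. The bandwidth numbers are the ones quoted in the comment, while the 40 GB model size and the function name are hypothetical, chosen only for illustration.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return bandwidth_gb_s / model_size_gb


# Hypothetical model footprint in memory (an assumption, not from the thread).
MODEL_SIZE_GB = 40.0

# Bandwidth figures exactly as quoted in the comment above.
quoted_bandwidth_gb_s = {
    "chip under discussion": 300.0,
    "AMD Strix Halo": 256.0,
    "M5 Max 128GB": 614.0,
}

for name, bandwidth in quoted_bandwidth_gb_s.items():
    bound = max_tokens_per_second(bandwidth, MODEL_SIZE_GB)
    print(f"{name}: at most ~{bound:.1f} tokens/s")
```

Under this simplified model the ranking follows bandwidth directly, which is why the comment treats bandwidth as the headline figure.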
- fg137I don't think this is going to get any traction in the general consumer world, even less relevant than Apple Vision Pro. (HN reaction to Vision Pro back in 2024 is almost hilarious if not ridiculous, looking at it today. I knew it would be a flop and I was so right.)
- GuestFAUniverseAnd who in 2026 is still anal-fixated on a "Windows" PC? It's just a personal computer. It normally runs multiple operating systems just fine. "Windows PC" sounds like people talking about tech who are either paid by M$, or embed pictures into Word documents to send them. Nobody has to kill the fun those OS-agnostic machines allow by artificially binding them to a shitty OS.
- SwtCyberThe interesting part to me isn't really the Cortex-X925 vs AVX-512 comparison, but Nvidia trying to make the GPU the center of a Windows PC rather than an add-in card
- gghhSorry for the lazy question, but some people here know this off the top of their head, so I'm asking: what is the memory bandwidth of this chip? The last time I checked on an NVidia part was the DGX Spark (the GB10 chip); it has regular LPDDR5X, which by JEDEC standard cannot go beyond ~270 GB/s, i.e. 8533 Mbit/s per pin on a 256-bit bus. So yeah, Lemire seems to go "OMG unified memory, they're following Apple's path...". OK, but Apple pulled off a much faster interconnect, in the 800 GB/s ballpark, and I'm trying to understand (not really, I'm asking you to try to understand, he he) how this laptop fares in that regard.
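The ~270 GB/s ceiling mentioned above can be checked with a little arithmetic. This is a minimal sketch, assuming peak bandwidth is simply the per-pin transfer rate times the bus width; the function name is made up for illustration, and the inputs are the figures from the comment.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s for a given memory bus."""
    bytes_per_transfer = bus_width_bits / 8  # 256 bits -> 32 bytes
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


# LPDDR5X at 8533 MT/s on a 256-bit bus, as quoted in the comment.
print(f"{peak_bandwidth_gb_s(8533, 256):.1f} GB/s")
```

This lands at roughly 273 GB/s, consistent with the "~270 GB/s" figure in the comment.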
- xpctI follow Daniel Lemire and like his contributions, I also understand that the HN thread was created for discussion purposes, but I'd really appreciate having a reference to the spec or a source to the claims made, either here on HN or on the tweet itself. I dislike the cycle of propagating news and assuming that someone else double-checked it.
- kcbThe same chip that's been available in the DGX Spark for like 8 months now... why are we pretending like it's the next big thing?
- siliconc0wI can't really see wide adoption of local LLMs unless prices really start to climb. It makes sense to use cheaper hosted smaller models like Sonnet or even Kimi but these won't run a Kimi-class model and that is really the floor for non-toy agentic tasks. Spending 5k to avoid a $20 subscription really only makes sense for niche security reasons.
- toshnb: poster is Daniel Lemire (https://lemire.me), who is very skilled in getting performance out of compute hardware (e.g. via simd, cache usage etc)
- comandillosI don't really get the hype around the whole N1X thing when in reality this is the same almost 1-year-old GB10 that was released with the DGX Spark and proved to be quite a disappointment.
- mohamedkoubaaThe commenters here seem to have forgotten that computers can do things other than inference
- seanalltogetherIs it really unified memory? AMD Strix Halo is "unified" but you still have to allocate memory separately for cpu vs gpu. Apple Silicon is true unified memory.
- SchnitzHow is this different from something like the AMD Ryzen AI Max that can already be purchased and supports 128GB unified memory? Seriously curious.
- embedding-shape> up to 6,144 state-of-the-art CUDA cores An RTX Pro 6000 has ~24K 5th generation tensor cores, I'm guessing this would then be 1/4 of the count but 6th generation? Wasn't clear from the images.
- dh2022A beast of a Windows PC to do what? Run Teams, Excel, Outlook, and a browser all at the same time? We could do that just fine in 2010…
- marioptI think most people are not understanding what this kind of laptop will provide. Before we get local AI, we'll be using hybrid AI. Running big models locally is unrealistic ($$$$$) but, if you imagine an agentic workflow where some bits run in the cloud and other smaller tasks run locally, it's an amazing deal. You don't need Opus/Code/DeepSeek/Kimi/etc to do basic stuff that models like Gemma4:12b/Qwen-27b can do locally with much less latency. Having a laptop where I can use a big remote model and combine it with 5 local domain-specific models is something I would love to do today. Imagine using OpenCode with a small model deciding which tasks run locally, and then deciding whether you have a good local model for task XYZ or should use a cloud model. My main concern is: is this hardware powerful enough to allow quick switching between local models? Unlikely, but I hope I'm wrong.
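The hybrid routing idea described above could be sketched roughly as follows. This is an illustrative toy, not how OpenCode or any real tool works: the model names, the domain table, the difficulty score, and the threshold are all hypothetical, and in practice the difficulty estimate would come from the small local "router" model the comment imagines.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    domain: str                  # e.g. "summarize", "code-format"
    estimated_difficulty: float  # 0.0 (trivial) .. 1.0 (hard), hypothetical score


# Hypothetical local domain-specific models, keyed by task domain.
LOCAL_MODELS = {
    "summarize": "local-small-summarizer",
    "code-format": "local-small-code",
}
CLOUD_MODEL = "cloud-large-model"  # hypothetical remote fallback
DIFFICULTY_THRESHOLD = 0.5         # assumed cut-off for "basic stuff"


def route(task: Task) -> str:
    """Pick a local model when one exists for the domain and the task is easy."""
    local_model = LOCAL_MODELS.get(task.domain)
    if local_model is not None and task.estimated_difficulty <= DIFFICULTY_THRESHOLD:
        return local_model
    return CLOUD_MODEL


print(route(Task("Summarize this changelog", "summarize", 0.2)))
print(route(Task("Refactor the whole auth module", "refactor", 0.9)))
```

The concern raised in the comment is about the cost of the step this sketch hides: swapping between several local models quickly enough for the routing to pay off.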
- WaterluvianIt’s an opportunity for them to start doing away with the whole ATX thing where owners had freedom to mix and match at their own pleasure.
- VortexLainI really hope this will have proper GNU/Linux support, otherwise it will end up the same way Qualcomm ARM PCs did.
- AmazingTurtleWhile unified memory may offer better performance than unsoldered DDR system memory, it still won't be as great as the 1.8TB/s bandwidth on high-end consumer GPUs right now. Nvidia's master plan may be making it the new normal to have "only" 400GB/s bandwidth, thus gatekeeping local model usage further behind "more memory but not as fast as the cloud can do it".
- PedroBatistaDon't want to be too harsh, maybe I'm missing something, but the CPU is at least 2 years old, internally it has been a complete shitshow and that's a minor hiccup when compared to the firmware and software situation. It's an interesting "newcomer" and the more the better, but calling this a "beast" and a "game changer" is ridiculous to say the least. Then there is the price..
- AnimatsHow much is this supposed to cost, fully populated with 128GB of RAM? How much would this laptop cost? It's not that the NVidia chip has that much RAM built in, after all. It's that it can address that much. RAM is sold separately.
- amacbrideIt's effectively the same as the GB10 in the DGX Spark (Blackwell architecture, 6,144 CUDA cores, perf-wise comparable to an RTX 5070). I've found it very useful for running big models, but it's not a screaming powerhouse in terms of raw compute.
- vegabookI ran two gens of Jetson board and I have zero confidence in this. NVDA is printing in the data centre and everything else has no staying power.
- proxysnaIt's just a DGX Spark with faster memory and a Windows boot?
- ozgrakkurtSays running local LLMs isn’t relevant. Then says it is decent for games, which is just correct if you compare any GPU remotely similarly priced. I don’t understand what point he is making.
- JBiserkovOfftopic, but "Twitter. Now that's a name I've not heard in a long time. A long time".
- noveltyaccount> “Our goal is to deliver unmetered intelligence to every home and every desk with Windows,” said Satya Nadella, chairman and CEO of Microsoft. “RTX Spark marks a real breakthrough towards that vision.” I expect computers with this chip will be about $4000. If Microsoft can deliver on local AI models that can orchestrate Windows and have solid real world intelligence, that will be an inexpensive business purchase compared to pay as you go tokens. I'm excited to see how this plays out.
- yoyohello13If it runs well with Linux, I’m sold. A Windows pc will never see the inside of my network.
- AperockyWhy is it only for Windows PC, can we not run Linux or at minimum SteamOS?
- alberthIs this essentially an Apple M-Series chip in concept?
- daft_pinkSounds good, but how much does it cost? Is this going to be an affordable laptop or $6000.
- YasuoTanaka128GB of unified memory is a dream come true for local LLMs. VRAM has been the ultimate bottleneck for developers.
- effnorwoodbeast is right
- cyberzikogood to know, hope the price will be affordable, having a pc becoming a luxury :)
- BoredPositronMediatek and Nvidia the horsemen of abandoning hardware after a year. The Jetson family still left a bad taste in my mouth.
- oldnetguySGI had unified memory back in 1996.
- htkThe M1 Max from 2021 has better memory bandwidth. The M3 Max can be specced to 128GB. Nothing new here, apart from being able to use CUDA on a less power hungry system.
- cryo32Yeah, when laptops are shipping with 8GB and Microsoft is suddenly interested in native apps, nope. Tech companies have strangled their own market.
- npnIs this somehow satire? This is just the DGX Spark with a keyboard and monitor in a convenient format. Since it has more stuff, I'm sure that the price markup will increase too. Up to $5000, because why not? With that money you can build a real PC with an RTX 5090!
- snvzzIt is not RISC-V. We aren't so naive as to move from a locked IP ISA like x86 to another locked IP ISA such as ARM. Right?
- derefr> The game changer is the unified 128 GB memory. That is the path Apple took years ago. Instead of separate memory for the CPU and GPU, everything shares a single pool. It is increasingly popular. > The memory is not as fast as dedicated GPU memory, but it is cheap enough while delivering enough bandwidth to run AI models locally. So, the reason "dedicated GPU memory" is fast isn't because it's "dedicated"; it's because the types of memory built into GPU cards (GDDR and HBM) are designed for throughput over latency. Which is to say, GDDR and HBM memory could be shared with the CPU in UMA while still being "fast" (for GPU use-cases). In fact, the PS4/5 and Xbox 360 / One X / Series consoles have UMA architectures that use GDDR memory as their main memory, with no regular DDR memory to be found. What I don't understand: why don't we see UMA architectures where there's both regular DDR and GDDR/HBM memory mapped into the address space of the CPU+GPU? That seems like the best of both worlds: you'd have some memory that's "tuned" for random-access CPU usage (regular DDR), and some memory that's "tuned" for streaming GPU usage (GDDR/HBM), but either type of memory can still be put to the use it wasn't "tuned" for, just with slightly-worse performance. I guess you'd need to do a bit of software work: 1. a bit of work in the OS kernel / malloc library to get CPU workloads to "prefer" allocating DDR memory over the GDDR/HBM memory until they've exhausted DDR memory (or maybe not, if you just tell the kernel the GDDR/HBM memory is something like a zswap thinpool); 2. and a bit of work in supported ML frameworks, to teach them about a hybrid strategy between UMA "allocate anywhere, it's all the same" and NUMA "keep assets in VRAM if possible; if you spill assets to RAM, then they must stream into VRAM on access" (i.e. "at allocation time, allocate as if the system were NUMA, VRAM first then spilling to RAM; but at execution time, use the UMA codepaths, no need to copy RAM into VRAM.") ...but once that's done, it's done.
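The "prefer one pool, spill into the other" allocation policy described above might look something like this toy model. It is only a sketch of the idea under stated assumptions: the class names, pool sizes, and the `for_gpu` flag are hypothetical, and no real kernel, driver, or ML framework API is being modeled.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    capacity_gb: float
    used_gb: float = 0.0

    def try_alloc(self, size_gb: float) -> bool:
        """Reserve space if it fits; report whether the reservation succeeded."""
        if self.used_gb + size_gb <= self.capacity_gb:
            self.used_gb += size_gb
            return True
        return False


class HybridAllocator:
    """Both pools share one address space; only the placement preference differs."""

    def __init__(self, ddr_gb: float, gddr_gb: float) -> None:
        self.ddr = Pool("DDR", ddr_gb)
        self.gddr = Pool("GDDR/HBM", gddr_gb)

    def alloc(self, size_gb: float, for_gpu: bool) -> str:
        # GPU assets prefer the throughput-tuned pool, CPU data the latency-tuned
        # one; either spills into the other pool instead of failing or copying.
        order = [self.gddr, self.ddr] if for_gpu else [self.ddr, self.gddr]
        for pool in order:
            if pool.try_alloc(size_gb):
                return pool.name
        raise MemoryError(f"no pool can fit {size_gb} GB")


allocator = HybridAllocator(ddr_gb=64, gddr_gb=32)
print(allocator.alloc(24, for_gpu=True))   # fits in the GPU-preferred pool
print(allocator.alloc(16, for_gpu=True))   # no longer fits there, spills over
```

Because placement is only a preference, the execution side never needs a copy step: whichever pool an allocation lands in, it stays addressable by both CPU and GPU, which is the UMA half of the hybrid strategy.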
- sherazp995Wait a minute! Nvidia going from GPU to CPU now?
- buffer_overlordCan it run Ubuntu?
- ChrisArchitectRelated: A powerful new chapter for Windows PCs, accelerated by Nvidia RTX Spark https://news.ycombinator.com/item?id=48352693 Nvidia RTX Spark https://news.ycombinator.com/item?id=48352939
- neuroelectronIt's going to be amazing. Almost twice as fast for only 10 times the heat. Consumers aren't concerned with efficiency they only care about performance.
- danielovichdkA hardware company that proposes you buy more hardware from them. Must be a new business model..... "Step into my office." "Why?" "Because you are fucking fired."
- dcreaterThey announced RTX spark days ago. Why is this post linking to a "leak" tweet on the frontpage now?
- epolanskiNot gonna lie, I'm buying one of the 128GB ram ones for local inference if price is human.
- jmyeetThis is the RTX Spark [1]. The obvious comparison here is the M5 Max, where you can buy a MacBook Pro with 128GB of unified memory as well. Obviously CUDA cores are specific to NVidia so it's hard to directly compare, but I've seen claims that the M5 Max is roughly equivalent to ~4000 CUDA cores. This obviously depends on workload and whether the CPU supports the precision you want to use (e.g. FP4). The M5 Max has memory bandwidth of 819GB/s. The RTX Spark I believe is ~600. So it might be slightly better than the current generation of Macs but likely worse than the expected M5 Ultras of the new Mac Studios (likely Q3 2026). For comparison, a 5090 has >20k CUDA cores and 1800GB/s memory bandwidth with 32GB of VRAM. The RTX 6000 Pro (at ~$10k) has 96GB of VRAM, the same bandwidth and ~24k CUDA cores. We have to see what RTX Spark systems sell for, but the DGX Spark is in the Mac Studio price range (~$4k). I do think Apple has a real opportunity here but their offerings aren't quite there yet. The M5 Ultras might be a really attractive option for local LLMs. I expect them to be in high demand. [1]: https://news.ycombinator.com/item?id=48352939
- thranceWill it support Linux?
- 2OEH8eoCRo0Are their enterprise orders slowing down? Why use precious maxed out fab capacity on consumer stuff when it could be an enterprise chip?
- jqpabc123"I am not sure how many people will run AI models locally. It still seems like a niche application to me." I'd say this relates directly to the cost of running AI models remotely. And we won't know what the actual cost will be until AI vendors recover the huge pile of cash they've dumped into development (plus interest).
- sometimelurkercant wait til someone figures how to run Linux on one of these
- einpoklumIntel's basic architecture keeps accelerators away from main system memory, unlike, for example, IBM's POWER architecture where the CPU and GPU are equal 'users' of memory. It's not a great breakthrough to suggest something different. The problem is - it's different, and not compatible with a lot, or most, or all, existing hardware. Also, there are some security concerns, as @stego-tech noted.
- shevy-javaAnd it will be expensive, right? Nvidia is milking the market now. We need more competition again. Currently we have a mafia controlling the prices, not just Nvidia but all the AI companies. The price increases should be paid by them, not by us. The "free market" is being manipulated by them here.
- emsignThey are useless if RAM prices are this high. $800 laptops with maximum 8GB are currently the norm, Windows 11 can't run on them decently. No matter how fast the SoC is with overpriced RAM they are slow. Systems that can make good use of them with 64-128GB are not affordable anymore thanks to Nvidia and co. This is a smokescreen. They'll probably sell them packaged as compute modules anyway.
- llm_nerdDoes this person know that this is the same GB chip in the DGX Spark? It isn't some proposed thing, it's a chip loads of people have on their desk right now, and there are endless benchmarks of it. Decent single core (a long way from Apple level, but decent), but it makes up for it in cores to provide M5 level performance, CPU-wise. Memory-bandwidth-wise it is kind of starved, at 1/6th that of many GPUs. They got Microsoft to customize Windows for the RTX Spark, and will likely have to brutally throttle it when running as a laptop (it's literally a 140W TDP chip), and that's neat. It's going to be a very expensive laptop.
- throwaway5752"Major banana producer suggests shifting more ice cream store menus to banana splits, and increasing the amount of bananas per serving"
- sisve> I am not sure how many people will run AI models locally. It still seems like a niche application to me. Bill Gates had a quote some years ago... People have still not learned how fast we improve our tech and how much cheaper things get, I guess :)
- PeterStuer"I am not sure how many people will run AI models locally. It still seems like a niche application to me." Clip me :). You are currently living through the final stages of unrestricted computing in the hands of the 'public'. Our regimes are going to pull up the drawbridge in the name of 'safety'. Download the open models asap and prepare for an airgapped computing environment. That will be your frontier in not extremely neutered AI in the near future. I am so hoping I'm completely wrong on this btw.