The AI PC era has a benchmarking problem

In summary:
- PCWorld highlights how AI-focused hardware like Nvidia’s RTX Spark creates challenges for traditional PC benchmarking methods that may no longer adequately assess performance.
- Current benchmarks struggle to evaluate devices designed for hybrid computing, where workloads split between local hardware and cloud services.
- The industry needs new benchmarking approaches that answer whether AI PCs are right for individual users’ specific needs.
I like numbers. Data feels sure. What could be better than measurable progress, a way of quantifying the world to stop arguments before they start?
But as we all know, people still find plenty of reasons to fight about performance, in spite of PC benchmarks. (Half the time, it’s because of the benchmarks.) Neither the humans referring to the results nor the companies producing the hardware in question have much interest in tidy interpretations. And now we have Nvidia’s RTX Spark in the mix.
Welcome to The Full Nerd newsletter—your weekly dose of hardware talk from the enthusiasts at PCWorld. Missed the surprising topics on our YouTube show or latest news from across the web? You’re in the right place.
I’ve heard more than one person express frustration about Nvidia, Microsoft, and other companies pushing AI-focused hardware on consumers during Computex. This YouTube comment from user @vr0k3n sums up the vibe pretty well: “Referring to anything consumer related when presenting these clearly AI B2B products is complete deception….the only reason they did that was to frame [these] as a consumer product so people would still be interested in it when it launches.”
But I’m not so certain this take is quite on the mark. At Microsoft Build, which ran concurrently with Computex this year, one Surface Laptop Ultra demo showed off a split workload: the generation of a 3D art asset using local AI and cloud AI tools, each handling different tasks. When I asked about this hybrid work style in an interview shortly thereafter with Andrew Hill, corporate vice president of Surface, he became notably animated and spoke at greater length, telling me that such an approach is “exactly what we’re trying to give people options for.” I genuinely believe that Nvidia and Microsoft see a future where “people evolve how they think about what work happens where,” as Hill put it.
Consumers have already begun stepping in this direction, splitting workloads between their local system and the cloud. (For example, I game off my local hardware, but I write using an online document editor.) It’s in part the reason why Chromebooks and aged hardware have become not just viable, but common solutions for everyday computing. So if that’s the case, what’s the performance we should be measuring?

When Nvidia’s RTX Spark CPU launches, people will put it through its paces in all manner of ways. AI workloads, gaming, common productivity tasks, content creation, the whole nine yards and then some. But for me, the point will be less Nvidia’s chip and its specific audience. Instead I’ll be looking at it and wondering what precedent it will set for benchmarking such hardware, meant for tasks split between online and offline tools.
We can’t stop the vision the RTX Spark represents. The companies will push such chips on us. What we can do is thoughtfully respond with how we evaluate them, especially if more and more of consumer computing shifts to the cloud—because new chip production also ends up centering AI more and more.
We may have to let go of certain benchmarks we’re accustomed to, or demand new ones. We may need to adjust the way we form opinions based on the numbers, and what we focus on. Ultimately, testing can answer a million granular queries, yet also fail to answer the broadest, most important question anyone can ask about performance: Is this right for me?
Adam asked more than once recently if we’ve reached a point where PC computing has become good enough for most people, where a need for more performance doesn’t truly exist. I don’t think so, personally. For enthusiasts, we’re bottomless pits when it comes to seeing tech evolve. But we may be in danger of losing attention if we assume that we’ve gotten everything we can get, or if we react as we always have, with the same approach we’ve always used.
As I said, I love numbers. But I’m reminded of my dad, ever the practical person. (If I have any claim to sense, it’s the little that rubbed off on me from him.) He’s the one who always asks, “What are you going to do with that?” So as much as I find joy in poring over charts and how else a piece of hardware will respond, I know it’s most important to an…