The Future Of Benchmarks

That's a lot of data to parse. Too much for most people.
I ran a lot of benchmarks back in the day.

Early in my technology writing career, I found myself standing in the graphics card aisle of a major computer superstore, contemplating the raft of 3D accelerators available. I noticed a guy walk in carrying a copy of Computer Gaming World, back in that publication’s heyday. He had the magazine turned to my most recent roundup of 3D graphics cards. So I waited and watched.

He walked up and down the row of cards, stacked on eight-foot-high shelves. He would look at the magazine, flip through the pages, grab a package off the shelf, then put it back. The cycle of reading a review, looking at a card, then returning it continued for a good fifteen minutes. Finally, he snapped the magazine shut and looked at the rows of cards one more time. Then he picked up the cheapest card, one I hadn’t even reviewed, and headed towards the cash register.

I felt a little deflated, needless to say. All that work benchmarking a dozen or more graphics cards. All that seemed to do was confuse the poor guy looking for an upgrade for his PC.

Fast forward to today, and benchmarking is practically a big business. Companies like Futuremark, Basemark, and Kishonti built businesses on creating benchmarks. Tech sites of all stripes, PC, mobile, and mainstream, run benchmarks and produce endless charts of results. I’m not criticizing their work; I’ve certainly run thousands of benchmarks over the years and learned a lot about system performance. I read a lot of what the modern enthusiast and tech sites put out, and check out the benchmark charts pretty frequently.

I think, however, most users really don’t care about performance. They may care about responsiveness (how quickly the system responds to something they do) but not about raw performance. On top of not caring, most users find their performance perfectly adequate. Unless you’re creating high-end content, running performance-intensive games, or compiling code, most modern PCs and mobile devices have all the performance people need. For mobile in particular, users tend to value battery life above performance by a wide margin.

So while I find benchmarks useful, I’m part of that tiny fraction of users who care about performance.

Threshold Performance Testing

It’s also possible performance testing may become obsolete, not because systems will ever have enough performance for high-end software, but because it may be impossible to actually test performance in the future. Brad Chacos from PC World steered me to an article by Ryan Shrout, who runs PC Perspective, on the difficulties of benchmarking DirectX 12 games. Shrout’s done a lot of good work in developing performance tests that attempt to replicate what people might actually see in-game. But he, and other reviewers catering to performance enthusiasts, may be running into problems generated by Microsoft’s attempts to create better game development environments and better user experiences. To quote Ryan:

“Down the road, it appears that Microsoft thinks that running all games through the compositing engine will allow for unique features and additions to PC games, including multi-plane overlays. Multi-plane overlays allow two different render screens, one with the 3D game and another with the UI, for example, to be rendered at different resolutions or even updated at different rates, merging together through the Windows engine. Pushing games through the MS Windows engine will also help to improve on power efficiency, a trait that is more important as PCs move into the realm of mobile devices.”

However, these new features create problems with traditional benchmark tools, particularly those that use in-game overlays to display and capture data. If you’re really into PC performance testing, go check out the full article — it’s a deep, but interesting read.

Valve’s SteamVR test. All you find out is if your system is “VR Ready”, instead of reams of data.

What may come to pass in the future is something I’m calling threshold benchmarking. Threshold benchmarks check to see if your system and components can handle specific workloads within predetermined parameters. One example of this is Valve’s new SteamVR test. If your system doesn’t drop below 90fps, you’re supposedly ready for virtual reality gaming.
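To make the idea concrete, here is a minimal sketch of what a threshold benchmark boils down to, assuming you already have a list of per-frame render times in milliseconds from some capture tool. The function names and the "ready" / "not ready" labels are hypothetical, invented for illustration; this is not how Valve's test is actually implemented. Only the 90fps floor comes from the example above.

```python
# A rough sketch of a threshold benchmark. All names here are hypothetical;
# only the 90 fps floor comes from the SteamVR example in the text.

VR_TARGET_FPS = 90.0


def passes_threshold(frame_times_ms, min_fps=VR_TARGET_FPS):
    """Return True if no frame took longer than the budget implied by min_fps."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    budget_ms = 1000.0 / min_fps  # about 11.1 ms per frame at 90 fps
    return all(t <= budget_ms for t in frame_times_ms)


def verdict(frame_times_ms, min_fps=VR_TARGET_FPS):
    """Collapse the whole run into a single pass/fail label."""
    return "ready" if passes_threshold(frame_times_ms, min_fps) else "not ready"


if __name__ == "__main__":
    smooth_run = [9.8, 10.4, 10.9, 10.1]   # every frame under the budget
    choppy_run = [9.8, 10.4, 12.5, 10.1]   # one frame dips below 90 fps
    print(verdict(smooth_run))
    print(verdict(choppy_run))
```

Notice that the result is a single label. The reams of data, such as how far under the frame budget each frame landed, get thrown away on purpose.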

Microsoft’s emerging approach towards unifying its gaming platform turns the PC into a glorified gaming console, albeit one that’s much more flexible and general purpose. I’ll consider how Microsoft’s new initiatives may affect users in a different post. But when it comes to benchmarking, the idea of applications which tell you the maximum performance of a system may be on the way out. Instead, you’ll simply know if your system is capable of running a particular application or class of application.

What you lose with this approach is any idea of future headroom. It’s great that my current rig is now VR ready, at least with today’s VR systems and games. But I’m unable to plan or account for future feature growth. In turn, that may affect how hardware manufacturers sell their components. Selling high-end components with the idea that you’ll be able to run future games well, not just today’s games, will become a tougher sell.

Then again, the idea that we somehow achieve future-proofing by buying high-end hardware has always been an illusion. As new API features emerge, or new technology such as 4K displays arrives on the scene, high-end hardware often becomes obsolete anyway, no matter how well it performs with yesterday’s games.



