06-16-10, 06:04 PM   #2
ragejg
Re: GPU Benchmarking Methods Investigated: Fact vs. Fiction

Full blurb (for the lazy):

So here we are, after more pages than this article had any right to fill, and believe it or not our conclusion is actually pretty straightforward: there is no “perfect” way to benchmark a game. There are just too many variables that can affect the outcome, and it is impossible to take all of them into account. However, I think we’ve proven that there are methods which should be avoided at all costs if results are to be accurate.

Let’s begin with the most popular way of benchmarking games: in-game or stand-alone rolling benchmarks that spit out a result with minimal user involvement. In general, they don’t mean jack all when it comes to determining a GPU’s performance within the game itself. Stand-alone benchmarks are heavy offenders for a number of reasons, including a lack of patches and sequences which aren’t at all representative of in-game conditions. In-game rolling benchmarks receive all of the necessary game engine patches but, more often than not, still fall short when it comes to reflecting actual gameplay. However, there are currently a small number of games, like DiRT 2 and HawX, which incorporate benchmark sequences that accurately recreate in-game scenarios. In this category we believe stand-alone benchmarks should be avoided altogether, while in-game benchmarks should only be used if they represent actual gameplay and don’t include a “flythrough”.

Next up we have timedemos. For the most part we have seen accurate results when timedemos are compared to in-game sequences, and it is a shame that fewer and fewer games offer the option of recording and playing back these sequences. One of the most important aspects of timedemos is their ability to repeat the exact same sequence over and over again. This is invaluable for benchmarking purposes, since even a manual run-through can’t be exactly repeated every time, regardless of what some would have you think.
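
To put a number on that repeatability point, here is a minimal Python sketch (not from the article) that measures run-to-run spread as the coefficient of variation of the average FPS across repeated runs. The function name and every FPS figure below are made up purely for illustration.

```python
from statistics import mean, stdev


def run_to_run_variation(avg_fps_per_run):
    """Return the spread between runs as a percentage of the mean
    (sample standard deviation divided by mean, times 100)."""
    if len(avg_fps_per_run) < 2:
        raise ValueError("need at least two runs to measure variation")
    return 100.0 * stdev(avg_fps_per_run) / mean(avg_fps_per_run)


# Hypothetical average-FPS results, invented for this example only.
timedemo_runs = [60.0, 60.0, 60.0]  # identical playback each time
manual_runs = [55.0, 60.0, 65.0]    # hand-played runs drift a little

print(f"timedemo spread: {run_to_run_variation(timedemo_runs):.1f}%")
print(f"manual spread:   {run_to_run_variation(manual_runs):.1f}%")
```

A lower percentage means the sequence is being reproduced more faithfully from run to run; what counts as "low enough" is a judgment call for whoever is doing the testing.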

The most important thing about accurately benchmarking a game comes down to one word: research. It isn’t often that the first level or the in-game benchmark will give an accurate representation of GPU performance. Relying on either of these could lead a writer to the wrong conclusion if he takes the easy way out. Knowing the game one is testing through actual playing time is essential. No one should be basing a conclusion on games they aren’t totally familiar with because, as we have seen, the risk of projecting the wrong information through incorrect benchmarks is very high indeed. Many wrongly think it is fine to load up a few stand-alone benchmarks, hope the results line up with gameplay and be done with it.

This all boils down to one thing: transparency. Too often, publications throw up a GPU review while making no mention of the levels being used, or even of whether their results come from a built-in or stand-alone benchmark. A conclusion should never be based upon results gleaned from stand-alone benchmarks, while in-game benchmarks should only be used if their results line up with those from an actual gameplay sequence. This is why we believe it is imperative that publications state exactly what tools they are using for their benchmarks.
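
One way to read "line up" is that the built-in benchmark should show roughly the same relative gaps between cards as a real gameplay sequence does. The Python sketch below is my own illustration of that reading, not the article's procedure; the function name, the card labels, the FPS figures and the 10% tolerance are all assumptions.

```python
def lines_up(benchmark_fps, gameplay_fps, tolerance=0.10):
    """Check whether a built-in benchmark shows the same relative
    performance between cards as a gameplay sequence does.

    Both arguments map a card label to its average FPS. Each card is
    expressed relative to the first card (alphabetically) in the same
    test, and the two ratios must agree within `tolerance`.
    """
    cards = sorted(benchmark_fps)
    if sorted(gameplay_fps) != cards:
        raise ValueError("both result sets must cover the same cards")
    baseline = cards[0]
    for card in cards:
        bench_ratio = benchmark_fps[card] / benchmark_fps[baseline]
        play_ratio = gameplay_fps[card] / gameplay_fps[baseline]
        if abs(bench_ratio - play_ratio) / play_ratio > tolerance:
            return False
    return True


# Hypothetical results for two placeholder cards, invented for this example.
flythrough = {"card_a": 100.0, "card_b": 150.0}  # benchmark says B is 50% faster
gameplay = {"card_a": 50.0, "card_b": 55.0}      # gameplay says B is 10% faster

print(lines_up(flythrough, gameplay))  # the two disagree, so this prints False
```

Absolute FPS is deliberately ignored here, since a benchmark can run faster or slower than gameplay overall and still rank the cards the same way.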

To counteract the air of “secrecy” that we ourselves used to exemplify, we have launched our Guide to the Hardware Canucks GPU Benchmarking Process. We suggest you check it out, since it takes the lessons learned throughout the course of this article and opens up our benchmarking process to the public.

If anything, we hope this article allows you to look at reviews and benchmarks in general with a more critical eye. Because they can influence the buying decisions of consumers, publications need to invest the time necessary to ensure their readers are getting the best possible information. Some of these methods may take a lot of time, but in the highly controversial world of graphics card reviewing, things need to be done right and discussed openly.

