
View Full Version : Fair testing practices


Venturi
01-25-06, 07:19 AM
Well,
I took the time to read the 1900 and the crossfire reviews. Went to many sites and evaluated the results. Most comparisons were with the 7800gtx 512.


Many sites said that the 1900 beats the 7800gtx 512 and topples it in the SLI vs. CrossFire comparisons... yet, when I look at the results, such as those posted by www.hardocp.com, it seems that the SLI config beats the CrossFire more often than not. Specifically, it seems the margin widens as the processing and memory requirements grow.

Case in point: test the cards at the extreme level so as to really weed out the hardware and driver builds. Do this at 2560x1600 with 8xAA and 8xAF, then run some benchies.

Don't look at max frame rate, but look at AVERAGE frame rate.

Then look at the MEAN DEVIATION from that frame rate. What? Quite simply, see how the performance dives under heavy loads.
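To illustrate what I mean (made-up numbers, not real benchmark results), the math fits in a few lines of Python:

```python
# Hypothetical per-second frame-rate samples from one benchmark run.
samples = [62, 58, 60, 31, 29, 57, 61, 33, 59, 60]

avg = sum(samples) / len(samples)  # average frame rate
# Mean deviation: how far, on average, the frame rate strays from that average.
mean_dev = sum(abs(s - avg) for s in samples) / len(samples)

print(f"max: {max(samples)} fps")            # the flattering number
print(f"average: {avg:.1f} fps")             # 51.0
print(f"mean deviation: {mean_dev:.1f} fps") # 12.0 -- big swings under load
```

Two cards can post the same average; the one with the smaller mean deviation is the one that holds up under heavy loads.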

Just a rant - I see myself as unbiased. I think the cards should be allowed the best hardware and the most demanding conditions for the test, especially if the tester is going to make statements like 'beats', 'bests', 'trumps' and 'smokes'.

Put both cards in an SLI or CrossFire setup, give them the best processors (AMD dual cores), equip the rigs with 8 or 16 GB of RAM, run them on Win Srv 64-bit, run some labor-intensive OpenGL (DX is not consistent enough for a litmus test), and run them at their maximum resolutions with as much eye candy as possible turned on. Hence, take away all the excuses and have a real showdown.

You really want a fair test? Test the cards in 32 and 64 bit and do it in Windows and Linux. Then talk to me about fair hardware tests. Can't complete a test? Then the card gets a score of zero, and that zero gets averaged into the final grade.
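To make the zero-score idea concrete, a quick sketch (hypothetical scores; None means the card couldn't complete that test):

```python
# Hypothetical average-fps scores per test platform; None = test could not be completed.
results = {"winxp_32": 55.0, "winxp_64": 48.0, "linux_32": 40.0, "linux_64": None}

# A test the card can't complete counts as zero, and that zero is averaged
# into the final grade rather than quietly dropped.
scores = [s if s is not None else 0.0 for s in results.values()]
final_grade = sum(scores) / len(scores)

print(f"final grade: {final_grade}")  # 35.75
```

That way a card with broken drivers on one platform can't look like a winner just because the failed run was left out.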

What card gives the highest performance in Windows and Linux 32 and 64 bit?
What card gives the highest and consistent performance at 2560x1600 with 8xAA and 8xAF?
What drivers are available for Linux, Windows, and Mac?
What cards have Vista/Vienna drivers?

Don't let a single 1280x1024, Win XP SP2, DX test pass for enough data to make a decision.

Yes I know, there will be those that say that those are not the normal testing environments or that those are not the average configurations of most users. Then again we are talking about users that sli / crossfire and have generous monitor real estate as well as a passion for big and fast.

If you are comparing and testing two sports cars, are you going to pit them against each other in Beltway traffic at 22 mph, or would you put them through their paces on different types of tracks? How can you find the extreme limits and the weakest links with any less of a scenario?

Hence testing should provide average scenarios as well as extreme conditions, that way consumers can really make informed decisions.


Then let the results speak for themselves.


I challenge the 'evaluation committee' to be unbiased and more scientific.

saturnotaku
01-25-06, 07:49 AM
One thing you have to remember is that the vast majority of sites that do these tests are run by people who have no background in media or journalism and no testing experience other than what they do at their own site. The online computer hardware Web site industry is as ruthlessly competitive as the automotive industry, if not more so.

While I agree with some of your comments about changing the way people do testing, I think your views are a bit extreme. For instance, when testing high-end hardware, I absolutely agree that the minimum resolution for testing should be 1280x1024 with 4x AA and 16x AF. Going below that is pointless.

You have to understand that the enthusiast market is narrow enough already that if you start testing above resolutions like 2048x1536 on systems that are Cray supercomputer-like, all that's going to do is alienate your audience. The idea is to give people an idea of how a video card is going to perform on systems that a lot of people own or are thinking about owning. A pair of FX-60 processors, 16 GB of RAM and a 2560x1920 display will tell me less about how a higher-end GPU will run vs. an X2 4200+, 2 GB of RAM and a 1600x1200 display when my system is an A64 3500+, 2 GB of RAM and so forth.

I like your idea of testing 64-bit Windows, but it should only be done with games that offer full native support, without needing the WOW64 layer.

Testing Linux is of little to no value for most people who buy top-end hardware. Take a look through the Linux forum here; you don't see too many users with 7800 series cards. Further, ask people at Rage3D about ATI's Linux support...or lack thereof.

You make some very valid points, and you are certainly entitled to your opinion. However, I think you need to be a bit more realistic when talking about the types of hardware/software configurations being used to test these cards. :)

Venturi
01-25-06, 01:08 PM
I think that the concept was missed.

If a tester is going to say that a given card is better than another card, then they should base their opinion on fairly obtained empirical data, not just a very narrow point of view.

If you want to say one card is better than another, then go into a testing zone that really taxes the hardware. Find the CPU and memory bottlenecks. See when one card gives up the ghost and the other keeps going.

Either way, this was not a bashing session, that would be too easy.

No, this was more about making a common grid. Have the several OS variants on there, have various resolutions, have multiple and single CPU configurations, and have both 32-bit and 64-bit tests. Then compare them on applications that take advantage of the various levels. That seems a far fairer evaluation than what I usually read from these sites.


While I agree with your comments, I would say that those apply to cards in the $350-400-or-below price range. But when one is spending almost $1,400 on an SLI or CrossFire config, then I would like to see testing at a higher level. The higher the card cost, the more informed I would like to be when making choices.


THX

OWA
01-25-06, 02:29 PM
I think a large part of it is, not enough time.

saturnotaku
01-25-06, 02:46 PM
I think a large part of it is, not enough time.

This is indeed true as well. With the exception of the megahuge sites like Anandtech, doing something like this is not a full-time occupation. When you have to get your information out on a specific day in order not to get beaten to the punch, you have to work with what you've got within your available means.

Medion
01-25-06, 04:54 PM
Do this at 2560x1600 with 8xAA and 8xAF, then run some benchies.

I'd rather see them use benchmarks that we're actually going to game at. For SLI/Crossfire, 1600x1200 seems to be a favorite, but there are a few higher (and widescreen) resolutions used too. 25x16 is higher than most game at, even with that configuration.

run some labor intensive OpenGL (DX is not consistent enough for a litmus test)

DX is what most games are run on. I want to see how a card compares in the games that I play, not some OGL non-gaming app.

Most people buy these graphics cards for gaming. They'll want to see how they perform in the games they play, at the settings they'll play at.

Venturi
01-25-06, 05:03 PM
Well, the OpenGL game list is quite substantial.

This includes all the Id software stuff such as Quake 4 and Doom 3.

Some of the higher flight and space sims such as sturmovik are OpenGL.

Unreal is native to Glide and OpenGL, including 2004 and the new version. (To enable this, use the config file.)

and many more.

not to mention that OpenGL can scale really well with FPU capabilities.

Also many OpenGL games work well in the linux environment

OWA
01-25-06, 07:41 PM
but there are a few higher (and widescreen) resolutions used too. 25x16 is higher than most game at, even with that configuration.

Yeah, I'd like to start seeing widescreen resolutions included. Widescreens only seem to be gaining in popularity but one issue there is that we need all the games to natively support it also.

saturnotaku
01-25-06, 09:18 PM
Unreal is native to glide and OpenGL including 2004 and the new version. (to enable this use the config file)

The original Unreal was native to Glide, but 2003-04 were written as DirectX 8 titles. OpenGL was thrown in as a token gesture and it's never worked completely right.

Medion
01-26-06, 03:42 AM
Well, the OpenGL game list is quite substantial.


You then proceed to list nearly every modern OpenGL game...and it takes up two lines. (Forgot CoH/CoV)

And like I said, I'm OK with OpenGL game benches, just not general apps. I care about a gaming card's performance in non-gaming 3D rendering apps about as much as I care about my PS2's DVD playback.

All I care about, due to my monitor, is how well a card performs at 1280x1024 4xAA (or better) 8xAF (or better). However, I'd expect a comprehensive review to include benches at 16x12, and modern widescreen resolutions as well, since many games can and do use that setup.

Do this at 2560x1600 with 8xAA and 8xAF

Pointless, unless a card can make that playable.

run some labor intensive OpenGL (DX is not consistent enough for a litmus test)

And consistency or not, DX is a great litmus test. For every modern OpenGL title (2003 or later release) you can name, I could probably list at least 50 DX games.

Venturi
01-26-06, 06:44 AM
Well gentlemen,

the testers and their methodologies are seemingly correct and quite on the money. They do indeed cater to a target audience.

Thank you

jAkUp
01-28-06, 02:26 AM
Yeah, I'd like to start seeing widescreen resolutions included. Widescreens only seem to be gaining in popularity but one issue there is that we need all the games to natively support it also.

Yes, this is still mind-boggling to me. It is very easy to allow widescreen resolutions from within games, but even today, many games do not support them. *cough* Fear *cough*

BTW Venturi, nice chatting with you the other day :D

Venturi
01-28-06, 08:19 PM
Nice chatting with you jAkUp.

When I first started this thread I figured I was right and the methodologies of the testing sites were wrong. I am persuaded that they cater to the appropriate audience; however, I still think that claims of the 'better' or 'best' card should be reserved for when thorough testing takes place and other parameters are taken into consideration.

a12ctic
01-28-06, 08:28 PM
The original Unreal was native to Glide, but 2003-04 were written as DirectX 8 titles. OpenGL was thrown in as a token gesture and it's never worked completely right.
Really? It always works exceptionally well with OpenGL for me...

Sazar
01-28-06, 08:31 PM
Yeah, I'd like to start seeing widescreen resolutions included. Widescreens only seem to be gaining in popularity but one issue there is that we need all the games to natively support it also.

Completely agree.

Heck if a buggy engine like hl2's can support it so well, why can't others?

:)

Medion
01-28-06, 08:34 PM
When I first started this thread I figured I was right and the methodologies of the testing sites were wrong.

I can agree with you there to an extent. There is no perfect way of testing. It's also not feasible for them to test every product (game, application) at every resolution, on every possible hardware configuration.

So, while their testing doesn't suit your needs (or heck, mine for that matter), they do cater to a general audience.

I still think that claims of 'better' or 'best' card should be reserved for when thorough testing takes place and other parameters are taken into consideration.


I don't think there is a "best" so much as there is a "best for this or that situation."

As you stated, just because Card A is faster at 16x12 does not mean Card B won't be faster at 25x16. That's why I key in on the 1280x1024 AA+AF benches, and ignore the pure speed benches. They don't apply to me.