Inside ATI's R420
May 4, 2004 / by aths / Page 4 of 4
Brute force vs features?
As we already stated, R420 (like the R300) is still much easier to optimize for than any GeForce, including the Series 6. This is an advantage for ATI. But effective raw performance is not everything. Even though NV40 is harder to optimize for, this GPU can do a lot that R420 still has to "emulate." Take so-called HDR (which in our book is really medium dynamic range rendering, MDR). R420 has to fetch at least four FP texture samples, filter them in the pixel shader, and render to an FP texture. NV40 has built-in FP filtering up to 16x AF (done by the TMUs, offloading the pixel shaders) and can render to an FP frame buffer (which, we should mention, disables multisampling AA).
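To illustrate the difference, here is a rough sketch of the per-pixel work involved when FP filtering has to be emulated: the shader fetches the four nearest texels and blends them itself. We write it as plain C++ rather than actual shader code; the structures and names are our own invention and only serve to show the arithmetic involved.

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Sketch only: the four-sample fetch-and-blend an R420 pixel shader has
    // to do itself for FP textures, written as plain C++ (structures and
    // names are invented for illustration, this is not actual shader code).
    struct Texel { float r, g, b, a; };

    struct FpTexture {
        int width, height;
        std::vector<Texel> data;                  // width * height FP texels
        Texel fetch(int x, int y) const {         // unfiltered point sample
            x = std::min(std::max(x, 0), width - 1);
            y = std::min(std::max(y, 0), height - 1);
            return data[y * width + x];
        }
    };

    static float lerp(float a, float b, float t) { return a + (b - a) * t; }

    // Bilinear filtering "by hand": fetch the four nearest texels and blend
    // them with shader arithmetic - the work NV40's TMUs do in fixed hardware.
    Texel filter_fp(const FpTexture& tex, float u, float v) {
        float x = u * tex.width - 0.5f;
        float y = v * tex.height - 0.5f;
        int x0 = (int)std::floor(x), y0 = (int)std::floor(y);
        float fx = x - x0, fy = y - y0;

        Texel t00 = tex.fetch(x0, y0),     t10 = tex.fetch(x0 + 1, y0);
        Texel t01 = tex.fetch(x0, y0 + 1), t11 = tex.fetch(x0 + 1, y0 + 1);

        Texel out;
        out.r = lerp(lerp(t00.r, t10.r, fx), lerp(t01.r, t11.r, fx), fy);
        out.g = lerp(lerp(t00.g, t10.g, fx), lerp(t01.g, t11.g, fx), fy);
        out.b = lerp(lerp(t00.b, t10.b, fx), lerp(t01.b, t11.b, fx), fy);
        out.a = lerp(lerp(t00.a, t10.a, fx), lerp(t01.a, t11.a, fx), fy);
        return out;
    }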
The "back-end" in HDR/MDR, tone mapping, can be done in NV40's RAMDAC-Logic while R420 has to utilize its pixelshaders. NV40 offers partly OpenEXR-support built-in with the chip. Real cinematic effects relying on such technologies (whatever names they may have). While R420 can render such content, it is not really designed for it. So we have to consider the R420 as an intermediate step.
Because the feature set is good enough for today's games (like Far Cry) and tomorrow's alike (say, Half-Life 2), R420 is an interesting product indeed.
Technical conclusion
ATI delivers what gamers want. 3Dc (obviously based on DXT5) looks like a rather modest improvement next to its competitor's floating point TMUs, FP frame buffers, RAMDACs capable of applying tone mapping, full Shader Model 3.0 support including texture access from the vertex shader, and so on. But normal map compression can be used by developers right away and can boost performance in games that use large amounts of bump mapping.
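Part of why adoption is so easy: 3Dc stores only the x and y components of a tangent-space normal, and the pixel shader recomputes z from the unit-length constraint. A minimal sketch of that reconstruction, again written as plain C++ with invented names:

    #include <algorithm>
    #include <cmath>

    struct Normal { float x, y, z; };

    // Sketch of the reconstruction a 3Dc-aware shader performs: only x and y
    // are stored in the compressed texture, z is recomputed from the
    // unit-length constraint x^2 + y^2 + z^2 = 1.
    Normal reconstruct_normal(float stored_x, float stored_y) {
        float x = stored_x * 2.0f - 1.0f;   // expand stored [0,1] to [-1,1]
        float y = stored_y * 2.0f - 1.0f;
        float z = std::sqrt(std::max(0.0f, 1.0f - x * x - y * y));
        return { x, y, z };
    }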
Naturally, ATI is proud of having nearly doubled the performance of the former king of the hill, their own 9800 XT. And it really is quite a feat. With 160 million transistors (about 30% fewer than another recently launched GPU), R420 includes everything you need to play games with good image quality at the highest frame rates possible today.
Editor's personal conclusion
Regarding features and performance, R100 was prodigious, R200 even better, and R300 finally turned the performance tide for good. With very small improvements to this chip (leading to R360), ATI managed to embarrass Nvidia again and again. Even NV38, a higher-clocked version of NV35, which itself was quite an improvement over NV30, could not really compete with the 9800 XT. Now Nvidia brings a completely new architecture with massive performance improvements, yet they are still going to lose the battle once more, though this time by a smaller margin.
However, performance should not be the one and only benchmark. Any FX 5200 with a 64-bit memory interface, for about 70 bucks, offers far more shader features than an X800 XT. While the 5200 is extremely slow, it is still roughly a factor of 100 faster than software rendering, whereas the X800 can only render comparatively simple shaders. I will not go into this topic in depth here, and won't try to settle whether you will need NV40's feature set. But it is a fact: Nvidia delivers both performance and features, while ATI provides more performance only. That the X800 seems faster than Nvidia's latest part does not automatically make it the better card. On the other hand, since we will probably not see actual advantages of SM 3.0 or other NV40-exclusive features for now, the more "feature-complete" product is not automatically the better part either.
It is nice to see the two leading GPU developers aiming for different goals. We can now not only discuss the differences between the various technological strategies, but also experience them.
So, how should one rate a GPU? Personally, while performance is good and features are nice to have, I am satisfied with "enough performance" and "usable features" and focus on image quality. R420 offers rather old antialiasing technology, but it is still the best available. Because I am very sensitive to aliasing artifacts, I am fully pleased with the outstanding quality of its 6x sparse grid edge antialiasing.
I also care about texture quality. Spoiled by the GeForce4 Ti's excellent 8x AF, I don't accept either R420's or NV40's heavily angle-dependent AF patterns. With the X800, we are talking about a card with 16 (or 12) pipes, good FP24 precision (FP32 for some tasks), and many nifty features still waiting to be exploited in future games. But we are also talking about a card with a number of precision problems in its texture filtering. While each single issue (apart from the strong angle dependency) is hardly noticeable in games, together they result in texture quality that is still good, but not the brilliant quality I am used to.
After the previous Radeons, I had expected more. Not necessarily Shader Model 3.0, but, for example, substantial improvements in the pipelines. This is all ATI has to show after nearly two years: more performance, but no real feature or architecture upgrade over the Radeon 9700, and no improvements in image quality either. While this is still enough to stay in the game, personally I am not convinced that focusing on what gamers "need" today is a good long-term strategy. But for the here and now, our benchmarks prove that even the X800 Pro is an excellent choice.
We would like to thank the 3DCenter-Forum, especially Demirug and ram, for their great input.
Also, we would like to thank ATI, especially Rene Fröleke and Friederike Weiss, for their kind support.