ATI's filtering tricks

November 21, 2003 / by aths / page 3 of 3 / translated by 3DCenter Translation Team


   Save to the max

ATI hardware is very carefully designed. The cut corners in texture filtering that we've been criticizing are hardly noticeable in practice. It seems to have been a basic design philosophy for R300 to offer only the precision that's needed. This is how we'd like to demonstrate it:

ATI's pixel shading hardware runs at 24 bits of floating point (FP) precision, formally "s16e7" (ie sign, 16 bits for the normalized mantissa, 7 bits for a base-2 exponent; we defer a discussion of what exactly that means to a later date).

Textures usually yield 8 bits of integer precision per color channel. That's exactly frame buffer precision. It's still desirable to use higher levels of precision for shader calculations, because they can involve many operations.

Nvidia's 16 bit floating point format is sufficient for a lot of cases but not for all. Floating point is not per se better than fixed point: there are cases where a 16 bit fixed point format (FX16) is better than FP16. Thus Nvidia will presumably offer FX16 on NV40. (Update: Now that the NV40 is out, we have learned that this chip still uses FX12 for both DirectX multitexturing and pixel shader 1.x.) FP24's precision, however, is always as good as or higher than that of the CineFX proprietary FX12, FX16 and of course FP16.
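To illustrate why fixed point can beat floating point at the same width, here is a rough Python sketch that compares the step size between neighbouring values. It rests on assumptions of ours: FP16 is modeled with 10 stored mantissa bits and its limited exponent range is ignored, and "FX16" is modeled as a hypothetical pure-fraction layout with 16 fractional bits covering [0...1). The article does not define the FX16 layout, and the helper names are ours.

```python
import math

def float_step(x: float, mantissa_bits: int) -> float:
    """Spacing between neighbouring floating point values near x.

    Models a normalized format with `mantissa_bits` stored mantissa bits;
    exponent range limits and denormals are ignored (assumption).
    """
    _, exponent = math.frexp(x)  # x = m * 2**exponent with 0.5 <= m < 1
    return 2.0 ** (exponent - 1 - mantissa_bits)

def fixed_step(fraction_bits: int) -> float:
    """Spacing of a fixed point format: constant over its whole range."""
    return 2.0 ** -fraction_bits

FP16_MANTISSA = 10   # stored mantissa bits of a 16 bit float
FX16_FRACTION = 16   # hypothetical layout: all 16 bits are fraction bits

for x in (0.75, 0.1, 0.001):
    print(f"x={x}: FP16 step={float_step(x, FP16_MANTISSA):.3e}, "
          f"FX16 step={fixed_step(FX16_FRACTION):.3e}")
```

Under these assumptions the fixed point format is finer near 1.0, while the floating point format only becomes finer for very small values. That is the sense in which neither format is better per se.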

So far, so good. Apart from color operations, there are also operations that modify sampling points, eg matrix multiplications (2D transformations) or dependent reads (eg for environment mapped bump mapping). This is the realm of texture coordinates that are modified inside the shader. Texture coords usually lie in the [0...1] range, but values outside of that range are just as valid.

With a 16 bit mantissa we get 2^16=65536 distinct values in the [0.5...1] range. Sounds like a lot of precision. With large textures, eg 2048x2048, that leaves us with 64 steps between two texels, ie 6 bits of "fraction". Isn't that enough?
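The arithmetic above can be retraced with a few lines of Python. This is only a sketch of the reasoning in the text; the function names are ours, and the model simply divides the representable values in the upper half of the coordinate range by the texels that half covers.

```python
import math

def steps_per_texel(mantissa_bits: int, texture_size: int) -> float:
    """Representable coordinate values per texel in the [0.5...1] range."""
    # One distinct value per mantissa pattern within [0.5...1].
    values_in_upper_half = 2 ** mantissa_bits
    # The upper half of the coordinate range covers half of the texels.
    texels_in_upper_half = texture_size / 2
    return values_in_upper_half / texels_in_upper_half

def subtexel_bits(mantissa_bits: int, texture_size: int) -> float:
    """Bits of "fraction" left between two neighbouring texels."""
    return math.log2(steps_per_texel(mantissa_bits, texture_size))

# FP24 (16 bit mantissa) on a 2048x2048 texture, as in the text.
print(steps_per_texel(16, 2048), subtexel_bits(16, 2048))
# Same texture with a 23 bit mantissa (FP32) for comparison.
print(steps_per_texel(23, 2048), subtexel_bits(23, 2048))
```

Note that the result only holds for the upper half of the range. Closer to zero the floating point steps get finer, so [0.5...1] is the worst case within [0...1].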

It is - for isotropic filters. When using AF, this coordinate is a starting point for calculating more sampling positions. These calculations are most likely performed with FP32. We already learned that the bilinear filter (the fundamental building block for both trilinear and anisotropic filters) has been heavily "optimized": it's not as precise as it should be.

Because of AF's adaptive nature, many samples are generated from smaller resolution mip map levels, so precision issues are somewhat reduced. FP24 is adequate for coordinate calculations with the level of effects we expect from pixel shader 2.0, even though some pixel shader 1.1 texture operations are already performed with FP32 on the GeForce3.

Unless there's heavy magnification involved, the block artifacts in ATI's bilinear filtering only show up as a slight color skew and are hardly visible. AF still requires exact positioning of AF samples. As far as we know, the FP32 precision that's used for texture lookups doesn't quite produce the same quality as competing products.

It all fits together: there's only FP24 for operations on texture coordinates, yet there's no apparent disadvantage because of the simplifications in BF, TF and AF. This is how R300 offers unmatched performance, but doesn't deliver the best image quality. From an "ethics" point of view (whatever that means to ATI and Nvidia) the competition can easily reduce image quality through drivers, to squeeze a bit of extra performance out of the chips and keep up.

Of course, hardware solutions must work in practice, and optimal texture filtering implementations come at a premium in transistors. It's not per se wrong to investigate this aspect for possible savings, as long as side effects are negligible. Even though ATI's isotropic filtering is mostly sufficient for gaming, we criticize their going below textbook quality (ie SGI's 8 bit). Isotropic filtering close to perfection is something we expect from any gaming hardware.

ATI, however, are aggressive when it comes to savings: only 5 bits of LOD fraction, only 6 bits of resolution for filter weights. On top of that comes a rather peculiar LOD determination scheme that seems to lack a lot of precision in some situations.
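What such reduced bit counts mean can be sketched in Python. The following is a simplified model and not ATI's actual hardware logic: we assume that fractions are truncated to the given number of bits (the real rounding behaviour is unknown to us), and we interpolate a one-dimensional ramp from black (0) to white (255) across a single heavily magnified texel.

```python
import math

def quantize(fraction: float, bits: int) -> float:
    """Truncate a fraction in [0...1) to `bits` bits (truncation is assumed)."""
    steps = 1 << bits
    return math.floor(fraction * steps) / steps

def lerp(a: float, b: float, weight: float) -> float:
    """Linear interpolation, the building block of the bilinear filter."""
    return a + (b - a) * weight

def distinct_ramp_values(weight_bits: int, samples: int = 1024) -> int:
    """Count the distinct 8 bit results between a black and a white texel."""
    values = {
        round(lerp(0.0, 255.0, quantize(i / samples, weight_bits)))
        for i in range(samples)
    }
    return len(values)

print("6 bit weights:", distinct_ramp_values(6), "distinct values")
print("8 bit weights:", distinct_ramp_values(8), "distinct values")
# The same truncation applied to a 5 bit LOD fraction between two mip levels.
print("LOD fraction 0.99 becomes", quantize(0.99, 5))
```

With 6 bit weights the ramp can take no more than 64 different values, which is where the blocks under heavy magnification come from. A 5 bit LOD fraction likewise allows only 32 blend steps between two mip map levels.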

When it comes to AF, we're also skeptical of the extreme angle dependency. It's something we're used to with ATI, but that doesn't make it a valid solution going forward: it makes little sense to have great pixel shading on the one hand and, on the other hand, to be restricted to performance optimized filtering schemes with definite drawbacks in quality.

R300's texture filtering logic is an improvement upon R200's, but only to a degree: the BF block artifacts are still there, and AF has improved over R200 but still won't match the competition's quality. The presumed LOD bug in R200's AF makes a comeback in R300, albeit in a slightly different form. With LOD biases greater than zero there's LOD clamping instead of biasing (ie the highest resolution mip map level is never used alone, which reduces maximum detail).
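The difference between biasing and clamping can be made concrete with a small Python sketch. This is our own simplified model of the behaviour described above, not a verified description of the hardware: `lod_biased` shifts the computed LOD by the bias, whereas `lod_clamped` treats a positive bias as a lower limit. Both function names and the 0...`max_level` clamp are assumptions for illustration.

```python
def lod_biased(computed_lod: float, bias: float, max_level: float) -> float:
    """Conventional biasing: shift the computed LOD, then clamp to valid levels."""
    return min(max(computed_lod + bias, 0.0), max_level)

def lod_clamped(computed_lod: float, bias: float, max_level: float) -> float:
    """Assumed model of the observed behaviour: the bias acts as a lower limit."""
    return min(max(computed_lod, bias, 0.0), max_level)

MAX_LEVEL = 10.0  # hypothetical mip chain with levels 0...10
BIAS = 0.5

for computed in (-1.0, 0.0, 2.0):
    print(f"computed={computed}: "
          f"biased={lod_biased(computed, BIAS, MAX_LEVEL)}, "
          f"clamped={lod_clamped(computed, BIAS, MAX_LEVEL)}")
```

In this model, a magnified texture (computed LOD below zero) still reaches level 0 with proper biasing, but never gets below the bias value with clamping. Further away the two variants differ as well, because clamping leaves the computed LOD untouched once it exceeds the bias.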

We believe it's a hardware issue, even though it can likely be worked around with drivers. Granted, the LOD bias should always be exactly zero anyway. Still, R300's texture filtering does leave a lot of room for improvement, both for isotropic and anisotropic modes.

This is somewhat disappointing: whatever area of R300 we were poking at, we've always found something that could have been done better, as demonstrated by the competition's textbook quality. It's likely that there are even more filtering simplifications on the Radeon that we simply haven't found yet.

We'll make filtering quality an important part of our testing methods for future hardware, regardless of vendor. Frame rate is only one part of the equation; a fair benchmark should always consider image quality as well. Texture filtering is the foundation of all 3D realtime rendering, regardless of whether you use a multitexture pipeline or pixel shaders. Optimum filtering quality isn't an added bonus, it's a requirement.



Thanks go out to Demirug, whose ideas were invaluable to our investigation on filter quality, to Xmas for hints, corrections and screen shots, to Argo Zero and Axel for screen shots, and also to Ailuros, who unintentionally got the ball rolling :-)





