ATI's filtering tricks
For several reasons, the development of graphics chips is constrained by transistor budgets. Features are subject to priorities: some are important, some less so. Chip complexity has always been limited and always will be, so developers need to strike a balance between performance and functionality, which includes image quality.
ATI have traditionally cut corners in the texture filtering department. Take, for instance, R200 (Radeon 8500): on the one hand it offers pixel shading that was quite sophisticated for its time and support for overbright lighting (apparently with increased internal color precision); on the other hand it has disadvantages when it comes to texture filtering, in particular when using anisotropic filtering ("AF").
We refer not only to the AF being bilinear only, or to its extreme dependence on surface angle, but also to obvious flaws in the determination of the mip map LOD: as far as we know, ATI's LOD calculations can sometimes lead to texture aliasing with AF enabled. As we have never researched this in depth, we don't want to press the point. Instead, we'd like to focus on filtering issues with more current hardware: the R3x0 family.
With R3x0, ATI finally realized that trilinear AF does make sense, and they also reduced the dependency on surface angle. Yet, although this is old news with ATI, it bears repeating: with 16x AF selected, certain angles still get only 2x AF treatment. Does this truly deserve the name "anisotropic filtering"?
Yes, it does. Anisotropic means just that: not isotropic. The R300's AF isn't isotropic (at any surface angle). When looking for a more precise, binding definition of AF we came up empty, but the quality to be expected from "textbook AF" is no secret. We will deal with the inner workings of anisotropic filtering in a future article. At this point, we just want to state how a "perfect AF" needs to look. It must look exactly like trilinear filtering (TF) but with the mip maps pushed "away" by one level for each doubling of the degree of anisotropy. But enough rambling: "perfect" AF is available on virtually no current hardware.
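The "one level per doubling" rule can be written down as a minimal sketch. The function name `textbook_af_lod` and the clamp at the base mip level are our own assumptions for illustration, and the sketch only covers a surface distorted enough for the full AF degree to apply; it is not taken from any vendor's hardware.

```python
import math


def textbook_af_lod(trilinear_lod: float, af_degree: int) -> float:
    """Mip level that idealized AF would sample, given the level plain
    trilinear filtering would pick for the same pixel (hypothetical helper)."""
    # The AF degree is expected to be a power of two: 1, 2, 4, 8, 16.
    if af_degree < 1 or af_degree & (af_degree - 1):
        raise ValueError("af_degree must be a power of two >= 1")
    # Each doubling of the degree pushes the mip transition away by one level.
    shift = math.log2(af_degree)
    # Assumption: level 0 is the full-resolution base texture, so clamp there.
    return max(0.0, trilinear_lod - shift)
```

For example, a pixel that trilinear filtering would take from level 5 would, under these assumptions, be taken from level 4 at 2x AF and from level 1 at 16x AF.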
The one laudable exception is 2x AF on the GeForce series, but 4x AF is already slightly "optimized" (at 45° angles), which becomes more apparent at 8x AF (also at 45° angles). Kyro graphics cards are limited to 2x AF, but handle this mode with perfection. The formula for optimal AF is so complex that it is common to tweak it; deviations from "pure AF" are accepted for the savings in transistor budget.
Saving transistors has apparently been ATI's goal, too. Implementation details are not available to the public, of course, but it is likely that ATI shied away from implementing the transistor-consuming square root function required for AF. If you leave that out, the resulting circuit can only apply anisotropy at angles that are multiples of 90° (which handily explains the angle dependency of R100 and R200). With a rather simple extension, still without the square root circuitry, multiples of 45° become doable.
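To give a feel for why leaving out the square root ties accuracy to particular angles, here is a loose analogy in code. It is not ATI's circuit, which is not public; `axis_estimate` and `diagonal_estimate` are hypothetical square-root-free estimates of a vector's length that we made up for illustration. The first is exact only at multiples of 90°, and the second, with one extra term, is also exact at multiples of 45°.

```python
import math

INV_SQRT2 = 0.7071067811865476  # constant 1/sqrt(2); no runtime square root needed


def exact_length(a, b):
    """True Euclidean length; this is the part that needs a square root."""
    return math.sqrt(a * a + b * b)


def axis_estimate(a, b):
    """Hypothetical sqrt-free estimate: exact only along the axes."""
    return max(abs(a), abs(b))


def diagonal_estimate(a, b):
    """Hypothetical extension with one extra term: also exact on the diagonals."""
    return max(abs(a), abs(b), (abs(a) + abs(b)) * INV_SQRT2)


def shortfall(estimate, angle_deg):
    """Fraction by which an estimate falls short of the true length
    for a unit vector pointing in the given direction."""
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))
    return 1.0 - estimate(a, b) / exact_length(a, b)


if __name__ == "__main__":
    for angle in (0.0, 22.5, 45.0, 67.5, 90.0):
        print(angle,
              round(shortfall(axis_estimate, angle), 3),
              round(shortfall(diagonal_estimate, angle), 3))
```

In this sketch the axis-only estimate falls short by roughly 29% at 45°, while the extended estimate is exact there and is worst halfway between, at 22.5°, where it falls short by roughly 8%.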
We arrived at this theory based on some of Demirug's ideas. Leaving theory aside for a moment, two things are for sure:
- ATI reduced the transistor count of the AF sampling point calculation.
- Fillrate is saved because the full degree of anisotropy is applied over only part of the full circle of surface angles.
This makes it easy to assert that ATI has the faster AF. For a long time, AF was more or less just a checklist feature with questionable implementation. GeForce series graphics cards were about the first to deliver "reasonable" AF but were limited to 2x. The GeForce 3 could do up to 8x AF which, however, wasn't exposed by marketing. Presumably it wasn't deemed beneficial to recommend this "make everything slow" feature. Properly investigating this feature early on, and judging it by quality instead of by frame rate as the majority did at the time, was an achievement of 3D Concept founder Raphael auf der Maur.
ATI's original Radeon (R100) hit the market earlier than the GeForce3 and could do 16x AF – albeit with limitations so severe that we don't think it's a useful "AF solution". The GeForce4 Ti's (NV25/NV28) bilinear AF performance is hampered, apparently because of transistor count considerations. And the GeForce FX (NV3x) can exchange AF quality for performance.
That isn't exactly our idea of progress. When balancing fill rate against quality, "textbook AF" is simply the optimum. Since NV20, Nvidia has implemented a pretty nice calculation of AF sampling points anyway, so it's a real pity that "8x AF" is offered at performance levels of 4x AF but with overall quality that's worse than that. It's fair enough to call this mode "8x AF" because some pixels are treated this way, but, of course, we'd rather utilize a given mode to its fullest before activating a higher one. Maybe Nvidia merely reacted to the pressure ATI put them under and wanted to show off high frame rates rather than making any given AF level work as well as possible (as should actually be expected of a quality feature).
Before we jump in, we'd like to clear up a possible source of confusion: we don't take issue with ATI's 16x AF not filtering everything at 16x. 1x, 2x, 4x, 8x, and finally 16x filtering is selected dynamically, as required (Nvidia's hardware performs the same dynamic selection). The selection process depends upon the angle of incidence of the surface in question. If the texture is distorted at a 1:2 ratio, 2x AF is enough. There aren't any tangible quality gains to be had beyond that. What's different with ATI is the extreme dependency on surface angle. A weaker form of this dependency can be observed on Nvidia hardware, too, starting at 4x AF.
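The dynamic selection described above can be sketched in a few lines. This is an idealized illustration under our own assumptions, not either vendor's logic: the name `select_af_degree` is hypothetical, the distortion ratio is taken as given, and only power-of-two degrees are considered.

```python
def select_af_degree(distortion_ratio: float, max_degree: int = 16) -> int:
    """Smallest power-of-two AF degree that covers the texture distortion,
    capped at the degree selected in the driver (hypothetical helper)."""
    if distortion_ratio < 1.0:
        raise ValueError("distortion_ratio is expressed as 1:n with n >= 1")
    degree = 1
    # Double the degree until it covers the distortion or hits the cap.
    # Assumption: max_degree itself is a power of two (1, 2, 4, 8, 16).
    while degree < distortion_ratio and degree < max_degree:
        degree *= 2
    return degree
```

With 16x AF selected, a surface distorted at 1:2 would get 2x AF in this sketch and one distorted at 1:20 would get the full 16x; an undistorted surface stays at 1x, which is plain isotropic filtering.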
Unfortunately ATI went a bit further than the competition did with respect to texture filtering logic simplifications and, as a consequence, deviations from textbook quality.