What's going on at S3?
November 22, 2005 / by robbitop / page 2 of 4 / translated by the 3DCenter Translation Team
The Chrome 2x series can even top this: it features two pixel shader ALUs per pixel pipeline. According to information we received, these ALUs are the same as their counterparts in the GammaChrome series. With GammaChrome and the Chrome 2x series, S3 follows a recent trend in pipeline development, where arithmetic power is preferred over pure texturing capacity.
A closer inspection of (DeltaChrome's) anisotropic filter revealed two serious issues. The first is that trilinear filtering seems not to work in combination with the anisotropic filter. As a consequence there is only bilinear anisotropic filtering which may lead to annoying MIP banding. According to information we received from Beyond3D, this issue has not been fixed with GammaChrome.
The second unpleasant issue is the anisotropic filter's tendency to produce texture shimmering when applied to high-frequency textures. The trilinear filter on its own does not produce this kind of artifact. As soon as anisotropic filtering is applied to high-frequency textures, however, they tend to shimmer and show Moiré patterns.
The floor in the back shows a visible Moiré pattern. Animation further aggravates the issue.
A positive LOD (level of detail) bias improves the situation, but this also nullifies the anisotropic filter's texture sharpening effect. As we do not have a GammaChrome test sample, we are not in a position to confirm whether the shimmering issue has been resolved.
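The trade-off between MIP banding, trilinear blending and a positive LOD bias can be illustrated with a minimal sketch. It uses the textbook relation between texel density and MIP level and does not model S3's hardware; the function names `mip_lod`, `bilinear_level` and `trilinear_levels` are our own and purely illustrative.

```python
import math

def mip_lod(texels_per_pixel, lod_bias=0.0):
    """Textbook LOD: log2 of the texel-to-pixel ratio plus a bias, clamped at 0."""
    return max(0.0, math.log2(texels_per_pixel) + lod_bias)

def bilinear_level(lod):
    """Bilinear filtering samples only the nearest MIP level.
    The hard switch between levels is what shows up as MIP banding."""
    return int(lod + 0.5)

def trilinear_levels(lod):
    """Trilinear filtering blends the two neighbouring MIP levels,
    so the transition between them is gradual."""
    lower = int(lod)
    return lower, lower + 1, lod - lower

# A positive bias selects a smaller, blurrier MIP level:
# less shimmering, but also less sharpness.
sharp = mip_lod(4.0)          # 2.0
blurry = mip_lod(4.0, 1.0)    # 3.0
```

With bilinear filtering the level jumps from 2 to 3 somewhere between a LOD of 2.4 and 2.6, whereas trilinear filtering at a LOD of 2.5 mixes both levels in equal parts.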
As mentioned before, DeltaChrome uses sparse grid supersampling anti-aliasing (SGSSAA) with 2, 5, or 9 samples. SSAA, which is used in all Columbia products, renders the scene at a higher internal resolution. For example, 1024x768 with 2xSSAA means rendering the scene internally at 1024x1536 or 2048x768 pixels and scaling the image back down to 1024x768 for output. As the maximum render target size is restricted to 2048x2048, the number of usable AA modes at higher resolutions is obviously capped.
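The render target limit can be checked with a little arithmetic. The sketch below assumes that an SSAA mode simply multiplies the width and height by integer factors, as in the 2x example above; how the 5x and 9x modes map to such factors is not known to us, so only the 2x case is shown. The helper names are hypothetical.

```python
def internal_size(width, height, scale_x, scale_y):
    """Internal render resolution for a given output size and per-axis scale."""
    return width * scale_x, height * scale_y

def fits(width, height, scale_x, scale_y, max_side):
    """True if the internal resolution fits into a max_side x max_side render target."""
    w, h = internal_size(width, height, scale_x, scale_y)
    return w <= max_side and h <= max_side

# 2xSSAA at 1024x768: both variants fit into 2048x2048.
print(internal_size(1024, 768, 2, 1))   # (2048, 768)
print(internal_size(1024, 768, 1, 2))   # (1024, 1536)

# 2xSSAA at 1600x1200: neither variant fits into 2048x2048.
print(fits(1600, 1200, 2, 1, 2048))     # False (3200 pixels wide)
print(fits(1600, 1200, 1, 2, 2048))     # False (2400 pixels high)
```

Under this assumption even the smallest SSAA mode is no longer available at 1600x1200 with a 2048x2048 limit.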
In practice, DeltaChrome's performance is more often than not insufficient to produce playable frame rates with the higher (5x or 9x) SSAA modes anyway. However, a look at the subpixel pattern, the uneven AA modes and the maximum render target size got us thinking. Implementing SGSSAA on a graphics chip is difficult and transistor-consuming. It would be illogical to waste this investment by limiting the maximum render target size so severely. Besides, the subpixels are distributed in an atypically regular way.
This all implies that DeltaChrome actually uses only an ordered grid. We believe DeltaChrome simply rotates the scene by the desired angle around the Z axis and then renders it. The subpixel sampling of course remains unchanged. After the scene has been rendered, the pixel shader rotates it back into its original position, which in practice results in a sparse grid. The drawback of this procedure is a substantial overhead, which in turn could result in a performance loss. However, most of this overhead is culled away by the HSR units before rendering (the driver marks those pixels for culling).
The driver rotates the scene geometrically before rendering.
The part indicated by the frame in the above picture is cropped out and rotated back. Bam - sparse grid anti-aliasing.
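Why rotating an ordered grid yields a sparse grid can be shown with a small sketch. It rotates the sample positions instead of the scene, which is geometrically equivalent, and uses a 4-sample (2x2) grid with a rotation angle of atan(1/2) purely for illustration; these are not S3's actual sample counts or angles, which we do not know.

```python
import math

def ordered_grid(n):
    """Sample offsets of an n x n ordered grid in a unit pixel centred at (0, 0)."""
    step = 1.0 / n
    start = -0.5 + step / 2
    return [(start + i * step, start + j * step)
            for j in range(n) for i in range(n)]

def rotate(points, angle):
    """Rotate all points around the origin (the Z axis) by angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def distinct(values, digits=6):
    """Number of distinct coordinate values, ignoring float rounding noise."""
    return len({round(v, digits) for v in values})

grid = ordered_grid(2)
rotated = rotate(grid, math.atan(0.5))  # illustrative angle, about 26.6 degrees

# Ordered grid: samples share columns and rows.
print(distinct(x for x, _ in grid), distinct(y for _, y in grid))        # 2 2
# Rotated grid: every sample has its own column and its own row.
print(distinct(x for x, _ in rotated), distinct(y for _, y in rotated))  # 4 4
```

Because every sample ends up in its own row and column, a nearly vertical or nearly horizontal edge can produce four intermediate coverage steps instead of two, which is the whole point of a sparse grid.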
Our investigations revealed yet another peculiarity: since the release of the G1 driver, there seems to be an error in the above procedure used to rotate the grid.
How it should be ... and how it really is.
The orientation of the resulting subpixel pattern is now less than optimal. Both samples now lie exactly on the pixel's border and are shared between the two adjacent pixels. At first glance the pattern looks like a 4-sample one. Only a closer look reveals that this is not the case and that the samples overlap between different pixels. This is far from optimal, as polygon edges are now smoothed only poorly. In our view this is a big faux pas which has not yet been remedied, despite the fact that we have been reporting it to S3 for six months.
As with DeltaChrome, GammaChrome's 2xFSAA is rather useless, as it uses an identical subpixel mask. S3, however, has increased the render target size to 4096x4096, so that 5xAA is now generally available in all resolutions. Unfortunately, none of the S3 products has sufficient power to allow reasonable frame rates with 5xAA activated.
Further investigation of DeltaChrome revealed another issue: alpha blending is very expensive, much more so than usual. As soon as alpha blending is applied, games which previously ran at fine frame rates turn into a slide show. For example, smoke caused by wheel spin severely degrades performance in Need for Speed: Underground. The same problem can be seen in Max Payne 2. While the game generally runs extraordinarily well on DeltaChrome, just a soupçon of smoke causes an extreme performance loss.
Alpha blending is a method used to enable transparency effects with 256 levels of intensity (8-bit integer). Examples include smoke, fumes, or transparent fog that occludes the landscape in the back. In order to deliver the desired transparency, alpha blending requires a strictly sorted rendering order for objects. This drains CPU cycles and bandwidth (texture changes cause texture cache flushes). Alpha blending is far from free.
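The need for a sorted rendering order follows directly from the blend equation. The sketch below uses the common "source over" blend for a single 8-bit colour channel with an 8-bit alpha value; the rounding and the depth-sorted `composite` helper are our own assumptions for illustration and say nothing about how DeltaChrome or any particular game implements blending.

```python
def blend(src, dst, alpha):
    """Blend one 8-bit channel: alpha=255 is opaque, alpha=0 is fully transparent."""
    return (src * alpha + dst * (255 - alpha) + 127) // 255

def composite(background, layers):
    """Blend layers over the background from back to front.

    layers is a list of (depth, colour, alpha); a larger depth is farther away.
    The sort is the step that costs CPU time in a real application.
    """
    result = background
    for _depth, colour, alpha in sorted(layers, key=lambda l: l[0], reverse=True):
        result = blend(colour, result, alpha)
    return result

# Two half-transparent layers over a black background.
layers = [(1.0, 50, 128), (5.0, 200, 128)]   # near layer listed first
print(composite(0, layers))                  # 75: correct, far layer drawn first

# Drawing them in the wrong order gives a visibly different result.
print(blend(200, blend(50, 0, 128), 128))    # 113
```

Since the result depends on the order, transparent objects cannot simply be drawn in whatever order is cheapest for the hardware.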
S3 Graphics have confirmed to us that this is caused by a bug in the DeltaChrome design. Addressing it requires a driver fix specially adapted for each title, which is an extremely time-consuming procedure. With GammaChrome, the bug has been fixed at the hardware level.