I think the problem is that DirectX offers no way to expose the FX's FX12 functionality (12-bit fixed-point).

Since the FX can execute FX12 and floating-point ops in serial, this is a major performance problem for the FX.
