I've read that after 3123, the final 3xxx-series driver, NVIDIA switched from the standard XAA to a new 2D acceleration architecture. My log entries seem to confirm this. It appears that the new method isn't yet as fast as the older XAA one.
Personally, I've downgraded back to 3123 after trying the latest drivers because its 2D is so much faster. For example, scrolling www.penny-arcade.com in Konqueror is jerky under 4363 but smooth under 3123.
Note: With kernel 2.4.20 from Gentoo, I had to edit nv.c in the kernel module source to get 3123 to compile. I replaced a call to pte_offset with pte_offset_kernel, since the function seems to have been renamed in that kernel.
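If you'd rather not hard-code the new name, a small compatibility macro is one way to handle the rename. What follows is only a rough sketch, not the actual nv.c code: NV_PTE_OFFSET and lookup_pte are names I made up, and the pte_t, pmd_t and pte_offset_kernel definitions are user-space stand-ins so the snippet compiles outside the kernel. It also assumes the kernel headers provide pte_offset_kernel as a macro, since otherwise the #if test can't see it.

```c
#include <stddef.h>

/*
 * User-space stand-ins (hypothetical). In the real module these types and
 * helpers come from the kernel headers, not from this file.
 */
typedef struct { unsigned long val; } pte_t;
typedef struct { pte_t *table; } pmd_t;

/* Pretend we are building against a kernel that only has the newer name. */
#define pte_offset_kernel(pmd, addr) \
    (&(pmd)->table[((addr) >> 12) & 1023])

/*
 * Compatibility shim (NV_PTE_OFFSET is a made-up name): use
 * pte_offset_kernel when the headers define it as a macro, otherwise
 * fall back to the older pte_offset.
 */
#if defined(pte_offset_kernel)
#define NV_PTE_OFFSET(pmd, addr) pte_offset_kernel(pmd, addr)
#else
#define NV_PTE_OFFSET(pmd, addr) pte_offset(pmd, addr)
#endif

/* Hypothetical caller: look up the page-table entry for an address. */
static pte_t *lookup_pte(pmd_t *pmd, unsigned long addr)
{
    if (pmd == NULL || pmd->table == NULL)
        return NULL;
    return NV_PTE_OFFSET(pmd, addr);
}
```

With something like this, the call site in the module stays the same whichever name the kernel uses.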