It's a waste of bandwidth to be moving things from video RAM to system RAM when there's no acceleration being done in VRAM anyway. I wouldn't call it an accident; it's more like a blessing, since subpixel font AA is pretty complicated to accelerate anyway. Maybe the GTK developers realized this and implemented it as such.
1.) GTK creates short-lived pixmaps which are never moved to VRAM, and this SLOWS DOWN GTK BY 25% overall. The GTK devs have their reasons for doing so (although I think it has some major drawbacks), but it seems the nvidia drivers cope really badly with this case.
I don't see a reason to do all rendering on the CPU just because one frequently used operation is not accelerated in hardware (but could be). That's why GPUs and VRAM exist: to do the rendering _there_.

2.) No one says subpixel AA has to be accelerated, but it should be "fast enough".
By "fast enough" I mean that I can work with "normal" applications without noticing ugly slowness; who cares about benchmarks.

3.) Well then, I wish you a lot of fun with XFCE. Keep in mind your GPU does practically nothing more than display the stuff calculated by your CPU *lol
It's a bit like changing your car just because, with the tires you put on it, you can't drive faster than 50 km/h.

Best wishes, lg Clemens
