Hrm, well, I'm about to go away, so I'll tell you how I'm doing it on Friday when I get home.
But anyway, I'm using a gf2 (since my gf3 died). So I got a friend to try it out on a gf4: he sees about a 4-fold increase in performance, while I see about a 4-fold decrease.
So maybe the gf2 has a buggy vertex buffer implementation?
I'll email NVIDIA when I get home, I guess.