I haven't been looking into this issue recently, BUT
maybe some concurrency is broken, i.e.
this kind of operation breaks parallelism somewhere,
somehow... just a random thought...
NV engineers, please give us the hw layout :-)
(back to the boring report now...)