Dynamic parallelism is a great idea. It should work well for workloads with well-defined call chains (i.e. one function calls the next, which calls the next, and so on). It will certainly help with slower CPUs, since the GPU can launch follow-up work itself instead of waiting on the CPU, but the feature will be very dependent on driver optimization.
If the Southern Islands and Kepler GPUs have taught us anything, it's that driver advantages will become more and more pronounced now that the two companies have similar architectures and a comparable number of "stream cores"/"CUDA cores".
This makes me slightly nervous about the future of GPUs, as both companies have demonstrated deficiencies in their driver sets, AMD more so than Nvidia.
I am looking forward to GK110 for GPGPU and games.