The application I used on Windows is the video player from the samples directory of CUDA for Windows, which demonstrates the decoding capability of the CUDA Video API. I think the sample simply uses the HW acceleration; since it is only sample code, its design does not necessarily consider multiple players running on a single GPU under Windows XP. But I will check that anyway.
I wrote emails to firstname.lastname@example.org, asking for the same video decoding API on the Linux platform. However, CUDA on Linux does not offer one.
I think it is possible to decode multiple streams on a single GPU, because the decoding context is maintained in the user's memory space, in the VDPH264PictureInfo structure. Different streams use different decoding contexts but share the same HW pipeline. It is just like a printer shared by many people: everyone can use the hardware, but each person still has to decide what to send to it.
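To make the printer analogy concrete, here is a minimal sketch of the idea in Python. It is not the VDPAU or CUDA API: `StreamContext`, `SharedDecoder` and `decode_interleaved` are hypothetical names I made up for illustration, and the "decoding" is just string formatting. The only point is that the shared decoder keeps no per-stream state, so slices from different streams can be interleaved as long as each call brings its own context.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class StreamContext:
    """Per-stream decoding state kept in the caller's memory (hypothetical)."""
    stream_id: str
    frame_num: int = 0
    reference_frames: list = field(default_factory=list)


class SharedDecoder:
    """Stand-in for the single HW pipeline; it holds no per-stream state."""

    def __init__(self, max_refs=2):
        self._lock = threading.Lock()  # only one job on the pipeline at a time
        self.max_refs = max_refs

    def decode(self, ctx, slice_data):
        with self._lock:
            # Everything the "hardware" needs arrives with the call.
            picture = f"{ctx.stream_id}:frame{ctx.frame_num}:{slice_data}"
            # Update the caller's context: keep only the newest reference frames.
            ctx.reference_frames = (ctx.reference_frames + [picture])[-self.max_refs:]
            ctx.frame_num += 1
            return picture


def decode_interleaved(decoder, streams):
    """Feed slices from several streams through one decoder, round-robin."""
    contexts = {name: StreamContext(name) for name in streams}
    output = []
    longest = max(len(slices) for slices in streams.values())
    for i in range(longest):
        for name, slices in streams.items():
            if i < len(slices):
                output.append(decoder.decode(contexts[name], slices[i]))
    return output, contexts
```

Calling `decode_interleaved(SharedDecoder(), {"a": ["x", "y"], "b": ["p"]})` alternates between the two streams, and each context ends up with only its own frame counter and reference frames. Whether the real hardware and driver allow this kind of interleaving is exactly the open question, so treat this as an illustration of the argument, not as evidence for it.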
I am not sure whether this is just a software implementation limit or a hardware limit.