Old 07-17-08, 05:59 PM   #1
Linuxhippy
Registered User
 
Join Date: Feb 2004
Posts: 580
Developer's XRender performance report - 173.14.09 / GF6600

Hello,

I am currently developing a Java2D pipeline which uses XRender a lot, and here are my findings about XRender performance. Maybe this helps driver developers a bit to tune their drivers.

I am using:
- openSUSE 10.3 Alpha 3
- NVIDIA 173.14.09
- GeForce 6600, 256 MB VRAM

1.) Repeating a small picture (RepeatNormal) over large areas is horribly slow. My Intel 945GM can fill a ~1000x700 area with a 10x10 picture at about 100 fps; on my GF6600 it takes 1 s.

2.) Transformations (all or just a few) are not accelerated, which is a real problem for the software I am developing.

3.) Gradients are not accelerated.
They are not accelerated with EXA either, but it would be really helpful.
2-stop gradients can be implemented even without shaders; linear and radial gradients could be done with shaders easily (Java's OpenGL backend does this already).

4.) Per-composition overhead is extremely high, which hurts the performance of my pipeline on NVIDIA hardware quite a lot. It's not uncommon for my pipeline to composite only a 10x10 area.

5.) X11 lines are accelerated and really fast. I need them, so here NVIDIA definitely has a performance advantage over EXA-based drivers.
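
For scale, the figures in point 1 can be turned into rough fill rates. The following is only a back-of-the-envelope sketch in C: the function names are made up for illustration, and it assumes the "1 s" figure means one second per full 1000x700 fill, which the post does not state explicitly.

```c
#include <assert.h>

/* Pixels written per second when a width x height area is filled
 * `fills` times within `seconds` seconds. */
double fill_rate_pixels_per_sec(int width, int height,
                                double fills, double seconds)
{
    assert(width > 0 && height > 0 && fills > 0.0 && seconds > 0.0);
    return (double)width * (double)height * fills / seconds;
}

/* Number of tile copies needed to cover the area with a repeating
 * tile_w x tile_h source; partially covered tiles are counted too. */
long tiles_needed(int width, int height, int tile_w, int tile_h)
{
    assert(width > 0 && height > 0 && tile_w > 0 && tile_h > 0);
    long cols = (width + tile_w - 1) / tile_w;   /* round up */
    long rows = (height + tile_h - 1) / tile_h;  /* round up */
    return cols * rows;
}
```

With the approximate numbers above, `fill_rate_pixels_per_sec(1000, 700, 100.0, 1.0)` gives about 70 million pixels per second for the 945GM, `fill_rate_pixels_per_sec(1000, 700, 1.0, 1.0)` gives about 0.7 million for the GF6600 under the stated assumption, and `tiles_needed(1000, 700, 10, 10)` shows that one fill covers 7000 copies of the 10x10 picture.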

Regards, Clemens
Old 07-17-08, 08:44 PM   #2
AaronP
NVIDIA Corporation
 
Join Date: Mar 2005
Posts: 1,925
Re: Developer's XRender performance report - 173.14.09 / GF6600

Please attach your test case. We're working on a number of improvements for repeating and transformed source images for a future driver release. I filed an enhancement request a while ago for gradients, but it's unlikely that they'll be implemented very soon.

Zero-width lines are just about the fastest thing the GPU can do, aside from drawing the root weave.
Old 07-17-08, 08:50 PM   #3
Plagman
NVIDIA Corporation
 
Join Date: Sep 2007
Posts: 79
Re: Developer's XRender performance report - 173.14.09 / GF6600

Accelerated support for all repeating modes and all transformations for Composite operations is implemented and will be available in a future driver release.

Per-composition overhead should be negligible for accelerated operations on accelerated Pictures. Make sure you're using InitialPixmapPlacement=2 and avoid creating Pictures in your inner loop. If you don't feel like you're doing anything wrong and are still bottlenecked, please write a standalone X test case and post it here.
Old 07-18-08, 06:48 AM   #4
Linuxhippy
Registered User
 
Join Date: Feb 2004
Posts: 580
Re: Developer's XRender performance report - 173.14.09 / GF6600

Quote:
Originally Posted by AaronP
Please attach your test case.
The pipeline is quite a lot of code in itself, integrated into even more code, and I am quite busy right now. Once my deadline has passed I will try to write test cases for the other problems mentioned, too.
Once it's in a useful state, I could send you a link along with a description of how to try it out?

Quote:
We're working on a number of improvements for repeating and transformed source images for a future driver release.
Good to know, thank you for working on this.

Quote:
I filed an enhancement request a while ago for gradients, but it's unlikely that they'll be implemented very soon.
Well, I know it's probably a lot of work, but I can only guess gradients will be used more and more.
Although it's just a simple UI, gradients kill its performance completely: http://bp1.blogger.com/_Y_-jaz-4d00/...bus_better.png
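
To make concrete what a 2-stop linear gradient has to compute per pixel, here is a minimal CPU sketch in C. It is only an illustration of the interpolation, not how the NVIDIA driver, EXA, or Java2D implement it; the `Color` type and both function names are hypothetical, and positions outside the two stops are simply clamped to the nearest stop.

```c
#include <assert.h>
#include <stdint.h>

/* One colour stop with 8 bits per channel (hypothetical helper type). */
typedef struct { uint8_t a, r, g, b; } Color;

/* Linear interpolation of a single channel, rounded to nearest. */
static uint8_t lerp_channel(uint8_t c0, uint8_t c1, double t)
{
    return (uint8_t)(c0 + (c1 - c0) * t + 0.5);
}

/* Colour at horizontal position x for a gradient that runs from
 * stop0 at x0 to stop1 at x1. Positions before x0 or after x1 are
 * clamped, so they take the colour of the nearest stop. */
Color gradient_color_at(int x, int x0, int x1, Color stop0, Color stop1)
{
    assert(x1 > x0);
    double t = (double)(x - x0) / (double)(x1 - x0);
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;

    Color out = {
        lerp_channel(stop0.a, stop1.a, t),
        lerp_channel(stop0.r, stop1.r, t),
        lerp_channel(stop0.g, stop1.g, t),
        lerp_channel(stop0.b, stop1.b, t)
    };
    return out;
}
```

Since this is a plain linear blend between two colours, it is the same kind of interpolation that fixed-function hardware performs on per-vertex colours across a quad, which is presumably why two stops would not need shaders.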

Quote:
Zero-width lines are just about the fastest thing the GPU can do, aside from drawing the root weave.
Sad that this is not implemented in EXA. Although I agree that with current GPUs it's not worth worrying about complex geometry, the trapezoid approach is a bit too minimalistic for my taste (and also too slow ^^).
This is one of the reasons the pipeline performs very well on NVIDIA GPUs.

Quote:
Per-composition overhead should be negligible for accelerated operations on accelerated Pictures. Make sure you're using InitialPixmapPlacement=2 and avoid creating Pictures in your inner loop. If you don't feel like you're doing anything wrong and are still bottlenecked, please write a standalone X test case and post it here.
Well, I ran the test and it was really *fast* ... so I played a bit with the mask size and found that if the mask is smaller than around 64x64, composition overhead is really high (maybe such small pixmaps are not migrated to VRAM?).
I attached the simple benchmark.

Thanks for your helpful comments. Regards, Clemens
Attached Files
File Type: gz compbench.tar.gz (5.8 KB, 106 views)
All times are GMT -4.

