#97 | |
|
Registered User
Join Date: Nov 2009
Posts: 44
|
To mrn and cjg:
The laptop is a rebranded Clevo W860CU. My nvidia-bug-report logs have more specific information. Summarizing:
- CPU: i7-820QM
- RAM: 4 GB DDR3 1333 MHz
- GPU: NVIDIA GTX 260M
- Gentoo Linux, kernel 2.6.34
- Xorg-server 1.8.1-r1
- Awesome WM
OK, glxgears is not a benchmark, but it is quite indicative of overall performance. In the results I posted, xcompmgr (the compositing manager) was disabled. I have the CUDA SDK installed, so tell me which test you prefer or which other benchmark program you want me to run.
#98 | |
|
Registered User
Join Date: Apr 2010
Posts: 9
|
Quote:
./FDTD3d --dimx=144 --dimy=144 --dimz=144
for which I get:
gtx260m - 284 MPoints/s
gt240 - 754 MPoints/s
c1060 - 1909 MPoints/s
The gtx260m should really be close to the gt240. This one's not a sterling benchmark either, but it's fairly close...
#99 | |
|
Registered User
Join Date: Nov 2009
Posts: 44
|
Quote:
./FDTD3d Starting...
Set-up, based upon target device GMEM size...
getTargetDeviceGlobalMemSize
cudaGetDeviceCount
cudaGetDeviceProperties
calloc host_output
malloc input
malloc coeff
generateRandomData
FDTD on 144 x 144 x 144 volume with symmetric filter radius 4 for 5 timesteps...
fdtdReference...
calloc intermediate
Host FDTD loop
t = 0
t = 1
t = 2
t = 3
t = 4
fdtdReference complete
calloc device_output
fdtdGPU...
cudaGetDeviceCount
cudaSetDevice (device 0)
cudaMalloc bufferOut
cudaMalloc bufferIn
set block size to 16x16
cudaMemcpy (HostToDevice) bufferIn
cudaMemcpy (HostToDevice) bufferOut
cudaMemcpyToSymbol (HostToDevice) stencil
cudaEventCreate
cudaEventCreate
GPU FDTD loop
t = 0 launch kernel
t = 1 launch kernel
t = 2 launch kernel
t = 3 launch kernel
t = 4 launch kernel
cudaThreadSynchronize
cudaMemcpy (DeviceToHost)
cudaEventElapsedTime
FDTD3d, Throughput = 179030.9718 MPoints/s, Time = 0.00002 s, Size = 2820096 Points, NumDevsUsed = 1, Blocksize = 256
#100 | |
|
Registered User
Join Date: Apr 2010
Posts: 9
|
...and it reported PASSED at the end? This number would imply a minimum memory bandwidth of 716 GB/s (the Tesla C1060 has a peak of 106, and the gtx260m has 61; good if we can sustain 10 or so) and the FDTD algo is not quite that efficient. Something's not quite right...
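For anyone following along, here is a rough sketch of the arithmetic behind the 716 GB/s figure. It assumes single-precision data (4 bytes per point) and counts only one 4-byte access per computed point, which is why it is a lower bound; the function name is made up for illustration and is not part of the SDK sample.

```python
# Lower bound on the memory bandwidth implied by a reported FDTD3d throughput.
# Assumptions (not taken from the SDK source): each point is a 4-byte float,
# and every computed point costs at least one 4-byte memory access.

BYTES_PER_POINT = 4  # assumed: single-precision float


def implied_min_bandwidth_gb_s(mpoints_per_s, bytes_per_point=BYTES_PER_POINT):
    """Convert MPoints/s into the minimum GB/s the memory system must deliver."""
    points_per_s = mpoints_per_s * 1e6
    return points_per_s * bytes_per_point / 1e9


reported = 179030.9718   # MPoints/s, from the run posted above
gtx260m_peak = 61.0      # GB/s, peak figure quoted in this post
c1060_peak = 106.0       # GB/s, peak figure quoted in this post

implied = implied_min_bandwidth_gb_s(reported)
print(f"implied minimum bandwidth: {implied:.0f} GB/s")
print(f"times the quoted gtx260m peak: {implied / gtx260m_peak:.1f}x")
print(f"times the quoted C1060 peak: {implied / c1060_peak:.1f}x")
```

A result several times above the card's peak bandwidth points at a bad timing or a kernel that did not really run, not at a fast GPU.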
#101 | |
|
Registered User
Join Date: Nov 2009
Posts: 44
|
Quote:
CompareData (tolerance 0.000100)...
Data error at point (4,4,4) 0.047833 instead of 10.829188
FAILED
!!! Error # 0 at line 48 , in file src/FDTD3d.cpp !!!
Exiting...
-----------------------------------------------------------
#102 | |
|
Registered User
Join Date: Apr 2010
Posts: 9
|
Well, believe me, I'd be one rich little sucker tomorrow if I could get that number for real. Meantime, try changing 144 to 128 (the smallest setting it allows) for all three numbers, and see if you get a good result; thanks
#103 | |
|
Registered User
Join Date: Nov 2009
Posts: 44
|
Quote:
./FDTD3d Starting...
Set-up, based upon target device GMEM size...
getTargetDeviceGlobalMemSize
cudaGetDeviceCount
cudaGetDeviceProperties
calloc host_output
malloc input
malloc coeff
generateRandomData
FDTD on 128 x 128 x 128 volume with symmetric filter radius 4 for 5 timesteps...
fdtdReference...
calloc intermediate
Host FDTD loop
t = 0
t = 1
t = 2
t = 3
t = 4
fdtdReference complete
calloc device_output
fdtdGPU...
cudaGetDeviceCount
cudaSetDevice (device 0)
cudaMalloc bufferOut
cudaMalloc bufferIn
set block size to 16x16
cudaMemcpy (HostToDevice) bufferIn
cudaMemcpy (HostToDevice) bufferOut
cudaMemcpyToSymbol (HostToDevice) stencil
cudaEventCreate
cudaEventCreate
GPU FDTD loop
t = 0 launch kernel
t = 1 launch kernel
t = 2 launch kernel
t = 3 launch kernel
t = 4 launch kernel
cudaThreadSynchronize
cudaMemcpy (DeviceToHost)
cudaEventElapsedTime
FDTD3d, Throughput = 90619.4714 MPoints/s, Time = 0.00002 s, Size = 1966080 Points, NumDevsUsed = 1, Blocksize = 256
cudaThreadExit
fdtdGPU complete
CompareData (tolerance 0.000100)...
Data error at point (4,4,4) 0.073752 instead of 10.781720
FAILED
!!! Error # 0 at line 48 , in file src/FDTD3d.cpp !!!
Exiting...
-----------------------------------------------------------
#104 | |
|
Registered User
Join Date: Apr 2010
Posts: 9
|
Quote:
#105 | |
|
Registered User
Join Date: Nov 2009
Posts: 44
|
Quote:
./matrixMul Starting...
Device 0: "GeForce GTX 260M" with Compute 1.1 capability
Using Matrix Sizes: A(80 x 160), B(80 x 80), C(80 x 160)
Run Kernels...
matrixMul, Throughput = 47.7019 GFlop/s, Time = 0.00004 s, Size = 2048000 Ops, NumDevsUsed = 1, Workgroup = 256
Check against Host computation...
PASSED
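As a quick consistency check on these figures, the sketch below assumes the sample counts 2 floating-point operations (one multiply, one add) per inner-product term, so an 80 x 160 result built from 80-term dot products costs 2 * 80 * 160 * 80 operations. The helper names are hypothetical; only the numbers come from the output above.

```python
# Consistency check for the matrixMul output above.
# Assumption: 2 flops (multiply + add) per inner-product term.


def matmul_ops(rows_c, cols_c, inner):
    """Flop count for a (rows_c x cols_c) result with dot products of length `inner`."""
    return 2 * rows_c * cols_c * inner


def implied_time_s(ops, gflops):
    """Kernel time implied by an operation count and a GFlop/s throughput."""
    return ops / (gflops * 1e9)


ops = matmul_ops(80, 160, 80)          # C is 80 x 160, inner dimension 80
reported_ops = 2048000                 # "Size" from the output above
reported_gflops = 47.7019              # "Throughput" from the output above

t = implied_time_s(reported_ops, reported_gflops)
print("op count matches reported Size:", ops == reported_ops)
print(f"implied kernel time: {t * 1e6:.1f} microseconds")
```

The implied time of about 43 microseconds agrees with the rounded "Time = 0.00004 s" in the output, so unlike the FDTD3d run these numbers are at least self-consistent.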
#106 |
|
Registered User
Join Date: Apr 2010
Posts: 9
|
Cool. As far as I can tell, that's a good number (better than I get from a desktop gt 240, and about half what I get from a c1070). My Alienware is reporting about 19.8. Maybe I'll have to try gentoo.... thanks for the effort!
#107 | |
|
Registered User
Join Date: Nov 2009
Posts: 44
|
Quote:
I use only the CUDA volume rendering stuff, and I think there was a marked improvement in performance from the 256.29 drivers with the PowerMizer/ACPI fix. In any case, in my experience, compiz and compositing in general significantly reduce video card performance... so I can live without some transparency if this leads to better video performance. If you need some configuration files, such as my kernel .config for Gentoo, just ask... Gio
#108 |
|
Registered User
Join Date: Nov 2009
Posts: 44
|
When I boot the laptop on AC power and then continue using it on battery, Xorg crashes, and this is the error reported in dmesg:
NVRM: Xid (0002:00): 6, PE0001
NOTE: ACPI correctly changes its status from AC to battery, but nvidia-settings doesn't switch. It's WM independent: I get the same crash with Gnome or Awesome WM.