nV News Forums

 
 


xps8700 09-07-11 07:15 PM

Understanding & getting started with CUDA for Linux
 
Hello,

I would like to use my laptop's two GeForce 8700M GT GPUs for "background" or foreground processing, teaming up with the main CPU for more intensive tasks.

However, I have a problem when I try to run deviceQuery from the ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/ folder. I get:

Code:

bash-4.1$ ./deviceQuery
[deviceQuery] starting...
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 10
-> invalid device ordinal
[deviceQuery] test results...
FAILED

Press ENTER to exit...
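
A device count check that does not depend on the SDK sample can help tell a broken SDK build from a driver that cannot see the GPUs. The following is a minimal sketch, not what deviceQuery itself does: deviceQuery uses the runtime API (cudaGetDeviceCount), while this calls the driver API functions cuInit and cuDeviceGetCount through ctypes. The library name "libcuda.so.1" and the helper name cuda_device_count are assumptions for illustration.

```python
import ctypes


def cuda_device_count(library="libcuda.so.1"):
    """Return (status, count) from the CUDA driver API.

    Returns None when the driver library cannot be loaded at all.
    The default library name is an assumption for a typical Linux install.
    """
    try:
        cuda = ctypes.CDLL(library)
    except OSError:
        # Driver library is not installed or not on the loader path.
        return None

    status = cuda.cuInit(0)  # 0 means CUDA_SUCCESS
    if status != 0:
        return (status, 0)

    count = ctypes.c_int(0)
    status = cuda.cuDeviceGetCount(ctypes.byref(count))
    return (status, count.value)
```

A result of None points at a missing driver library, a nonzero status at a driver that loaded but failed to initialize, and (0, n) at n visible devices.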

Now to get to this point, here's what I did:

1-I tried to install the devdriver "devdriver_4.0_linux_32_270.41.19.run" taken from the CUDA download page, but I am running kernel 3.0.4, so the driver wouldn't compile and install. I followed the instructions at http://forums.opensuse.org/english/g...st2381467.html to patch the driver and tried to install it. It seemed to work, but when I started X it crashed, and I had to revert to my standard "NVIDIA-Linux-x86-280.13.run" driver...

2-Using the standard NVIDIA driver, I continued with the installation of the CUDA toolkit by installing "cudatoolkit_4.0.17_linux_32_ubuntu10.10.run". Note: there is no Slackware package available, and according to multiple sites such as http://lmf-ramblings.blogspot.com/20...kware-131.html installing the Ubuntu or Fedora package straight in Slackware works. I tried it and it seemed to work well.

3-I then installed the GPU computing SDK ("gpucomputingsdk_4.0.17_linux.run") and all went fine.

4-I am trying to run the deviceQuery script, but it fails...

Can somebody help me??

Thanks!!!

Dizzle7677 09-07-11 07:19 PM

Re: Understanding & getting started with CUDA for Linux
 
nVidia's CUDA forums might give you some insight.

http://forums.nvidia.com/index.php?showforum=62

xps8700 09-08-11 08:47 PM

Re: Understanding & getting started with CUDA for Linux
 
Dizzle7677, it helped, yes and no... It's a good reference and starting point, but regarding my problem with deviceQuery, I searched the NVIDIA CUDA forums and found not much at all... Googling the problem turned up a few sites where people reported the same problem, but no apparent solution...

I don't get it, because I meet the requirements for CUDA and everything seems to be in place.

Anybody?

xps8700 09-20-11 11:21 AM

Re: Understanding & getting started with CUDA for Linux
 
OK, I am back on this topic, and I believe the problem might be related to the driver I am using. Has anybody used the 280 series of the NVIDIA driver? (the general-use one, not the CUDA-specific driver)...

Like I previously said, installing the 270.XX driver for CUDA works, but I get a black screen at login time (when X loads) and the X server crashes... Tonight I'll retry installing the CUDA driver and try to boot with it, and if it fails, I'll post the details of the X log.

In the meantime, does anybody think the driver is NOT the problem?

xps8700 09-20-11 07:44 PM

Re: Understanding & getting started with CUDA for Linux
 
/var/log/Xorg.0.log does not say much...

Code:

[  144.338] (EE) NVIDIA(GPU-1): Failed to initialize the NVIDIA GPU at PCI:4:0:0.  Please
[  144.338] (EE) NVIDIA(GPU-1):    check your system's kernel log for additional error
[  144.338] (EE) NVIDIA(GPU-1):    messages and refer to Chapter 8: Common Problems in the
[  144.338] (EE) NVIDIA(GPU-1):    README for additional information.
[  144.338] (EE) NVIDIA(GPU-1): Failed to initialize the NVIDIA graphics device!

Any ideas?

johnc 09-21-11 03:16 PM

Re: Understanding & getting started with CUDA for Linux
 
Quote:

Originally Posted by xps8700 (Post 2481981)
Any ideas?

Is there anything in the kernel log?

xps8700 09-21-11 08:21 PM

Re: Understanding & getting started with CUDA for Linux
 
dmesg:

Code:

[    6.392343] nvidia: module license 'NVIDIA' taints kernel.
[    6.393170] nvidia: module license 'NVIDIA' taints kernel.
[    6.393173] Disabling lock debugging due to kernel taint
[    7.403407] nvidia 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    7.403417] nvidia 0000:03:00.0: setting latency timer to 64
[    7.403423] vgaarb: device changed decodes: PCI:0000:03:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
[    7.403609] nvidia 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[    7.403620] nvidia 0000:04:00.0: setting latency timer to 64
[    7.403772] NVRM: loading NVIDIA UNIX x86 Kernel Module  275.09.07  Wed Jun  8 15:42:20 PDT 2011
[  155.942018] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size.
[  155.943943] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1076)
[  155.944040] NVRM: rm_init_adapter(1) failed
[ 3616.768294] nvidia 0000:03:00.0: PCI INT A disabled
[ 3616.768321] nvidia 0000:04:00.0: PCI INT A disabled
[ 3633.390432] nvidia 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 3633.390444] nvidia 0000:03:00.0: setting latency timer to 64
[ 3633.390448] vgaarb: device changed decodes: PCI:0000:03:00.0,olddecodes=none,decodes=none:owns=io+mem
[ 3633.390566] nvidia 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 3633.390573] nvidia 0000:04:00.0: setting latency timer to 64
[ 3633.390690] NVRM: loading NVIDIA UNIX x86 Kernel Module  270.41.19  Mon May 16 23:31:36 PDT 2011
[ 3633.397432] nvidia 0000:03:00.0: PCI INT A disabled
[ 3633.397458] nvidia 0000:04:00.0: PCI INT A disabled
[ 3665.329440] nvidia 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 3665.329453] nvidia 0000:03:00.0: setting latency timer to 64
[ 3665.329457] vgaarb: device changed decodes: PCI:0000:03:00.0,olddecodes=none,decodes=none:owns=io+mem
[ 3665.329591] nvidia 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 3665.329599] nvidia 0000:04:00.0: setting latency timer to 64
[ 3665.329727] NVRM: loading NVIDIA UNIX x86 Kernel Module  270.41.19  Mon May 16 23:31:36 PDT 2011
[ 3668.397206] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size.
[ 3668.399156] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1050)
[ 3668.399231] NVRM: rm_init_adapter(1) failed
[ 3697.176799] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size.
[ 3697.178727] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1050)
[ 3697.178746] NVRM: rm_init_adapter(1) failed
[ 3755.322306] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size.
[ 3755.324229] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1050)
[ 3755.324248] NVRM: rm_init_adapter(1) failed
[ 3777.096214] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size.
[ 3777.098146] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1050)
[ 3777.098164] NVRM: rm_init_adapter(1) failed
[ 3839.604563] vmap allocation for size 16781312 failed: use vmalloc=<size> to increase size.
[ 3839.606488] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1050)
[ 3839.606506] NVRM: rm_init_adapter(1) failed

/var/log/Xorg.0.log
Code:

[  3839.605] (EE) NVIDIA(GPU-1): Failed to initialize the NVIDIA GPU at PCI:4:0:0.  Please
[  3839.605] (EE) NVIDIA(GPU-1):    check your system's kernel log for additional error
[  3839.605] (EE) NVIDIA(GPU-1):    messages and refer to Chapter 8: Common Problems in the
[  3839.605] (EE) NVIDIA(GPU-1):    README for additional information.
[  3839.605] (EE) NVIDIA(GPU-1): Failed to initialize the NVIDIA graphics device!
[  3839.605]
Backtrace:
[  3839.605] 0: /usr/bin/X (xorg_backtrace+0x3b) [0x80e72fb]
[  3839.605] 1: /usr/bin/X (0x8048000+0x5dbf5) [0x80a5bf5]
[  3839.605] 2: (vdso) (__kernel_rt_sigreturn+0x0) [0xffffe40c]
[  3839.605] Segmentation fault at address (nil)
[  3839.605]
Fatal server error:
[  3839.605] Caught signal 11 (Segmentation fault). Server aborting
[  3839.605]
[  3839.606]
Please consult the The X.Org Foundation support
        at http://wiki.x.org
 for help.
[  3839.606] Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[  3839.606]

What caught my attention is this: NVRM: RmInitAdapter failed! (0x26:0xffffffff:1050)

What do you think?
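
The two messages that keep repeating in the kernel log above are the vmap allocation failure and the NVRM RmInitAdapter failure that follows it. A small sketch like the one below, assuming the log text has already been captured (for example from dmesg), counts both and reports the largest failed request. The function name summarize_failures and the dictionary keys are hypothetical.

```python
import re

# Patterns taken from the message formats shown in the log above.
VMAP_FAIL = re.compile(r"vmap allocation for size (\d+) failed")
NVRM_FAIL = re.compile(r"NVRM: RmInitAdapter failed!")


def summarize_failures(log_text):
    """Count vmap and NVRM init failures in kernel log text."""
    sizes = [int(m.group(1)) for m in VMAP_FAIL.finditer(log_text)]
    return {
        "vmap_failures": len(sizes),
        # Largest failed request in bytes, 0 when none were found.
        "largest_request_bytes": max(sizes, default=0),
        "nvrm_init_failures": len(NVRM_FAIL.findall(log_text)),
    }


sample = (
    "[  155.942018] vmap allocation for size 16781312 failed: "
    "use vmalloc=<size> to increase size.\n"
    "[  155.943943] NVRM: RmInitAdapter failed! (0x26:0xffffffff:1076)\n"
)
print(summarize_failures(sample))
```

The failed request of 16781312 bytes is 16 MiB plus one 4096-byte page, which is the amount of contiguous kernel virtual address space the driver could not obtain.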

johnc 09-21-11 09:12 PM

Re: Understanding & getting started with CUDA for Linux
 
Regarding those errors, and since you appear to be using a 32-bit kernel, consider the following:

http://www.warp1337.com/content/ubun...failure-solved

http://us.download.nvidia.com/XFree8...ownissues.html (Scroll down to "kernel virtual address space exhaustion" section.)

xps8700 09-22-11 11:29 AM

Re: Understanding & getting started with CUDA for Linux
 
Solved it! Thanks!!!

It was kernel virtual address space exhaustion rather than a shortage of physical RAM. You see, I am using a 32-bit kernel (3.0.4) with 4 GB of RAM. The vmalloc area was only about 120 MB (as reported by cat /proc/meminfo ==> VmallocTotal: 122560 kB).
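
The VmallocTotal check can also be done programmatically. This is a minimal sketch that parses text in the /proc/meminfo format; the function name vmalloc_total_kb is hypothetical, and the sample line is the value quoted in this post.

```python
def vmalloc_total_kb(meminfo_text):
    """Return the VmallocTotal value in kB, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("VmallocTotal:"):
            # Line format: "VmallocTotal:     122560 kB"
            return int(line.split()[1])
    return None


total_kb = vmalloc_total_kb("MemTotal:  4000000 kB\nVmallocTotal:     122560 kB\n")
print(total_kb, "kB =", total_kb / 1024, "MiB")
```

On a live system the input would come from reading /proc/meminfo. With only about 120 MiB in total, there is little room left for 16 MiB contiguous requests once other kernel users have taken their share.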

Now it's set to 256 MB, and the setting is passed automatically to the kernel by LILO.
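
To confirm that the boot loader really passed the setting, the kernel command line (available in /proc/cmdline) can be checked for a vmalloc= parameter. The sketch below is an assumption-laden helper, not part of any driver or kernel tool: the name vmalloc_param_bytes is hypothetical, and it only handles decimal sizes with an optional K, M, or G suffix.

```python
import re

# Binary multipliers for the size suffixes handled by this sketch.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def vmalloc_param_bytes(cmdline):
    """Return the vmalloc= size from a kernel command line in bytes, or None."""
    match = re.search(r"(?:^|\s)vmalloc=(\d+)([KMG]?)", cmdline, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


# Hypothetical command line for illustration only.
print(vmalloc_param_bytes("ro root=/dev/sda1 vmalloc=256M"))
```

A return value of None means the parameter is not on the command line, in which case the boot loader configuration was probably not reinstalled after editing.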

The driver now successfully initializes both GPUs and X starts. deviceQuery now works.

I conclude that the standard NVIDIA driver is not meant for CUDA computations, and that's probably why the devices were not detected by the deviceQuery script.

Unless I am wrong, I think that I am ready to start using CUDA.

Thanks for the help everybody!

