Old 10-18-10, 01:03 PM   #1
asenjo
Registered User
 
Join Date: Oct 2010
Posts: 3
Unable to install Nvidia driver in SLES 11

Hi all,

I'm sorry if this is not the appropriate forum for the following issue. I'm trying to use CUDA on an HP ProLiant DL580 server (32 cores, 128 GB main memory) connected to a brand-new Tesla S2050 with 4 Fermi GPUs. The server is running SLES 11 SP1 x86_64 in runlevel 3 with this kernel:

Code:
yuca:~ # uname -a
Linux yuca 2.6.32.23-0.3-default #1 SMP 2010-10-07 14:57:45 +0200 x86_64 x86_64 x86_64 GNU/Linux
The lspci command reports the following:

Code:
yuca:~ # lspci
00:00.0 Host bridge: Intel Corporation 5520/5500/X58 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
00:02.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 2 (rev 22)
....
....
81:00.0 PCI bridge: NEC Corporation uPD720400 PCI Express - PCI/PCI-X Bridge (rev 06)
81:00.1 PCI bridge: NEC Corporation uPD720400 PCI Express - PCI/PCI-X Bridge (rev 06)
86:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
87:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
87:01.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
87:02.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
87:03.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
8a:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
8f:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
90:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
90:01.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
90:02.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
90:03.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
93:00.0 PCI bridge: nVidia Corporation NF200 PCIe 2.0 switch for Quadro Plex S4 / Tesla S870 / Tesla S1070 / Tesla S2050 (rev a3)
So it seems that the server can see the Tesla S2050 through the PCI bus.

However, when I try to install the latest NVIDIA driver (NVIDIA-Linux-x86_64-260.19.12.run.sh), I first get the warning "You do not appear to have an NVIDIA GPU supported by the 260.19.12 NVIDIA Linux graphics driver installed in this system". I don't understand why, because this page: http://www.nvidia.com/object/product...-S2050-us.html lists SLES 11 as one of the operating systems supported for the S2050, and this one: http://www.nvidia.com/object/linux-d...12-driver.html lists the S2050 as supported.

Then, although the module compiles successfully, it cannot be loaded: "ERROR: Unable to load the kernel module 'nvidia.ko'.". In the /var/log/nvidia-installer.log file, I found these two important messages:

Code:
-> Kernel module load error: insmod: error inserting './kernel/nvidia.ko': -1
   No such device
and, in the kernel messages section:

Code:
   [276290.049770] NVRM: No NVIDIA graphics adapter found!
I would be really grateful if you could help me or point out what I'm doing wrong. Thank you very much in advance.
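
A note for readers triaging a similar failure: the two messages quoted above can be pulled out of a long installer log automatically. The following is only a minimal sketch, assuming a POSIX shell with awk and a log laid out like the one in this post. The function name `summarize_installer_log` is hypothetical and is not part of the NVIDIA installer.

```shell
# Sketch: print only the lines of an nvidia-installer log that explain a
# failed module load. The function name is hypothetical.
summarize_installer_log() {
    log="${1:-/var/log/nvidia-installer.log}"
    awk '
        # The installer prefixes the actual load error with "-> ".
        /^-> Kernel module load error/ { print; grab = 1; next }
        # The line right after it carries the errno text, e.g. "No such device".
        grab { print; grab = 0; next }
        # NVRM lines are messages logged by the NVIDIA kernel module itself.
        /NVRM:/ { print }
    ' "$log"
}

# Usage:
#   summarize_installer_log                 # default log location
#   summarize_installer_log /path/to/log    # a saved copy
```

Anchoring the first pattern on "-> " keeps the installer's explanatory paragraph, which also mentions 'Kernel module load error', out of the summary.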

The whole nvidia-installer.log file follows:
Code:
yuca:~ # cat /var/log/nvidia-installer.log 
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Mon Oct 18 17:44:07 2010
installer version: 260.19.12

PATH:
/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin
/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:/usr/lib/mit/sbin

option status:
  license pre-accepted               : false
  update                             : false
  force update                       : false
  expert                             : false
  uninstall                          : false
  driver info                        : false
  precompiled interfaces             : true
  no ncurses color                   : false
  query latest version               : false
  no questions                       : false
  silent                             : false
  no recursion                       : false
  no backup                          : false
  kernel module only                 : false
  sanity                             : false
  add this kernel                    : false
  no runlevel check                  : false
  no network                         : false
  no ABI note                        : false
  no RPMs                            : false
  no kernel module                   : false
  force SELinux                      : default
  no X server check                  : false
  no cc version check                : false
  run distro scripts                 : true
  no nouveau check                   : false
  run nvidia-xconfig                 : false
  sigwinch work around               : true
  force tls                          : (not specified)
  force compat32 tls                 : (not specified)
  X install prefix                   : (not specified)
  X library install path             : (not specified)
  X module install path              : (not specified)
  OpenGL install prefix              : (not specified)
  OpenGL install libdir              : (not specified)
  compat32 install chroot            : (not specified)
  compat32 install prefix            : (not specified)
  compat32 install libdir            : (not specified)
  utility install prefix             : (not specified)
  utility install libdir             : (not specified)
  installer prefix                   : (not specified)
  doc install prefix                 : (not specified)
  kernel name                        : (not specified)
  kernel include path                : (not specified)
  kernel source path                 : (not specified)
  kernel output path                 : (not specified)
  kernel install path                : (not specified)
  precompiled kernel interfaces path : (not specified)
  precompiled kernel interfaces url  : (not specified)
  proc mount point                   : /proc
  ui                                 : (not specified)
  tmpdir                             : /tmp
  ftp mirror                         : ftp://download.nvidia.com
  RPM file list                      : (not specified)
  selinux chcon type                 : (not specified)

Using: nvidia-installer ncurses user interface
WARNING: You do not appear to have an NVIDIA GPU supported by the 260.19.12
         NVIDIA Linux graphics driver installed in this system.  For further
         details, please see the appendix SUPPORTED NVIDIA GRAPHICS CHIPS in
         the README available on the Linux driver download page at
         www.nvidia.com.
-> License accepted.
-> Installing NVIDIA driver version 260.19.12.
-> Performing CC sanity check with CC="cc".
-> Performing CC version check with CC="cc".
-> Kernel source path: '/lib/modules/2.6.32.23-0.3-default/source'
-> Kernel output path: '/lib/modules/2.6.32.23-0.3-default/build'
-> Performing rivafb check.
-> Performing nvidiafb check.
-> Performing Xen check.
-> Cleaning kernel module build directory.
   executing: 'cd ./kernel; make clean'...
-> Building kernel module:
   executing: 'cd ./kernel; make module SYSSRC=/lib/modules/2.6.32.23-0.3-defau
   lt/source SYSOUT=/lib/modules/2.6.32.23-0.3-default/build'...
   NVIDIA: calling KBUILD...
   make -C /lib/modules/2.6.32.23-0.3-default/build \
   	KBUILD_SRC=/usr/src/linux-2.6.32.23-0.3 \
   	KBUILD_EXTMOD="/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel" -f /u
   sr/src/linux-2.6.32.23-0.3/Makefile \
   	modules
   test -e include/linux/autoconf.h -a -e include/config/auto.conf || (		\
   	echo;								\
   	echo "  ERROR: Kernel configuration is invalid.";		\
   	echo "         include/linux/autoconf.h or include/config/auto.conf are mis
   sing.";	\
   	echo "         Run 'make oldconfig && make prepare' on kernel src to fix it
   .";	\
   	echo;								\
   	/bin/false)
   mkdir -p /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.tmp_versions
   ; rm -f /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.tmp_versions/
   *
   make -f /usr/src/linux-2.6.32.23-0.3/scripts/Makefile.build obj=/tmp/selfgz3
   5869/NVIDIA-Linux-x86_64-260.19.12/kernel
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.nv.o.d  
   -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Iinclude -I
   include2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.6.32.23-0
   .3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/selfgz35869/N
   VIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstrict-prot
   otypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-func
   tion-declaration -Wno-format-security -fno-delete-null-pointer-checks -O2 -m
   64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulat
   e-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-
   sign-compare -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-protector -fo
   mit-frame-pointer -fasynchronous-unwind-tables -g -Wdeclaration-after-statem
   ent -Wno-pointer-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVIDIA-Linux
   -x86_64-260.19.12/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-error 
   -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=kern
   el -mno-red-zone -UDEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s)=#s" -D
   "KBUILD_BASENAME=KBUILD_STR(nv)"  -D"KBUILD_MODNAME=
   KBUILD_STR(nvidia)" -D"DEBUG_HASH=10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35
   869/NVIDIA-Linux-x86_64-260.19.12/kernel/.tmp_nv.o /tmp/selfgz35869/NVIDIA-L
   inux-x86_64-260.19.12/kernel/nv.c
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.nv_gvi.o
   .d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Iinclud
   e -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.6.32.
   23-0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/selfgz358
   69/NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstrict-
   prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-
   function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O
   2 -m64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccum
   ulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -
   Wno-sign-compare -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-protector
   -fomit-frame-pointer -fasynchronous-unwind-tables -g -Wdeclaration-after-sta
   t
   ement -Wno-pointer-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVIDIA-Lin
   ux-x86_64-260.19.12/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-erro
   r -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=ke
   rnel -mno-red-zone -UDEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s)=#s" 
   -D"KBUILD_BASENAME=KBUILD_STR(nv_gvi)"  -D"KBUILD_MODNAME=KBUILD_STR(nvidia)
   " -D"DEBUG_HASH=10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35869/NVIDIA-Linux-x
   86_64-260.19.12/kernel/.tmp_nv_gvi.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-26
   0.19.12/kernel/nv_gvi.c
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.nv-vm.o.
   d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Iinclude
   -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.6.32.23
   -0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/selfgz35869
   /NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstrict-pr
   ototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-fu
   nction-decl
   aration -Wno-format-security -fno-delete-null-pointer-checks -O2 -m64 -mtune
   =generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoin
   g-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-comp
   are -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-protector -fomit-frame
   -pointer -fasynchronous-unwind-tables -g -Wdeclaration-after-statement -Wno-
   pointer-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVIDIA-Linux-x86_64-2
   60.19.12/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-error -D__KERNE
   L__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=kernel -mno-r
   ed-zone -UDEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_B
   ASENAME=KBUILD_STR(nv_vm)"  -D"KBUILD_MODNAME=KBUILD_STR(nvidia)" -D"DEBUG_H
   ASH=10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19
   .12/kernel/.tmp_nv-vm.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kerne
   l/nv-vm.c
   /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv-vm.c: In function
   ‘nv_sg_map_buffer’:
   /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv-vm.c:151: warning: 
   assignment makes integer from pointer without a cast
   /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv-vm.c:236: warning: 
   label ‘done’ defined but not used
   /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv-vm.c:144: warning: 
   unused variable ‘count’
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.os-agp.o
   .d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Iinclud
   e -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.6.32.
   23-0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/selfgz358
   69/NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstrict-
   prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-
   function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O
   2 -m64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccum
   ulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pi
   pe -Wno-sign-compare -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-prote
   ctor -fomit-frame-pointer -fasynchronous-unwind-tables -g -Wdeclaration-afte
   r-statement -Wno-pointer-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVID
   IA-Linux-x86_64-260.19.12/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wn
   o-error -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmo
   del=kernel -mno-red-zone -UDEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s
   )=#s" -D"KBUILD_BASENAME=KBUILD_STR(os_agp)"  -D"KBUILD_MODNAME=KBUILD_STR(n
   vidia)" -D"DEBUG_HASH=10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35869/NVIDIA-L
   inux-x86_64-260.19.12/kernel/.tmp_os-agp.o /tmp/selfgz35869/NVIDIA-Linux-x86
   _64-260.19.12/kernel/os-agp.c
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.os-inter
   face.o.d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -I
   include -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2
   .6.32.23-0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/sel
   fgz358
   69/NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstrict-
   prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-
   function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O
   2 -m64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccum
   ulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -
   Wno-sign-compare -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-protector
   -fomit-frame-pointer -fasynchronous-unwind-tables -g -Wdeclaration-after-sta
   tement -Wno-pointer-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVIDIA-Li
   nux-x86_64-260.19.12/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-err
   or -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=k
   ernel -mno-red-zone -UDEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s)=#s"
   -D"KBUILD_BASENAME=KBUILD_STR(os_interface)"  -D"KBUILD_MODNAME=KBUILD_STR(n
   vidia)" -D"DEBUG_HASH=10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35869/NVIDIA-L
   inux-x86_64-260.19.12/kernel/.tmp_
   os-interface.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/os-inte
   rface.c
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.os-regis
   try.o.d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Ii
   nclude -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.
   6.32.23-0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/self
   gz35869/NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wst
   rict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-impl
   icit-function-declaration -Wno-format-security -fno-delete-null-pointer-chec
   ks -O2 -m64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -m
   accumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -p
   ipe -Wno-sign-compare -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-prot
   ector -fomit-frame-pointer -fasynchronous-unwind-tables -g -Wdeclaration-aft
   er-statement -Wno-pointer-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVI
   DIA-Linux-x86_64-260.19.12/k
   ernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-error -D__KERNEL__ -DMODU
   LE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=kernel -mno-red-zone -U
   DEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KB
   UILD_STR(os_registry)"  -D"KBUILD_MODNAME=KBUILD_STR(nvidia)" -D"DEBUG_HASH=
   10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/
   kernel/.tmp_os-registry.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/ker
   nel/os-registry.c
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.nv-i2c.o
   .d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Iinclud
   e -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.6.32.
   23-0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/selfgz358
   69/NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstrict-
   prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-
   function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O
   2 -m64 -mtune=gene
   ric -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-arg
   s -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare -
   mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-protector -fomit-frame-poin
   ter -fasynchronous-unwind-tables -g -Wdeclaration-after-statement -Wno-point
   er-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19
   .12/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-error -D__KERNEL__ -
   DMODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=kernel -mno-red-zo
   ne -UDEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENA
   ME=KBUILD_STR(nv_i2c)"  -D"KBUILD_MODNAME=KBUILD_STR(nvidia)" -D"DEBUG_HASH=
   10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/
   kernel/.tmp_nv-i2c.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/n
   v-i2c.c
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.nvacpi.o
   .d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Iinclud
   e -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.6.32.
   23-0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/selfgz358
   69/NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstrict-
   prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-
   function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O
   2 -m64 -mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccum
   ulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -
   Wno-sign-compare -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-protector
   -fomit-frame-pointer -fasynchronous-unwind-tables -g -Wdeclaration-after-sta
   tement -Wno-pointer-sign -fno-strict-overflow   -I/tmp/selfgz35869/NVIDIA-Li
   nux-x86_64-260.19.12/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-err
   or -D__KERNEL__ -DMODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=k
   ernel -mno-red-zone -UDEBUG -U_DEBUG -DNDEBUG  -DMODULE -D"KBUILD_STR(s)=#s"
   -D"KBUILD_BASENAME=KBUILD_STR(nvacpi)"  -D"KBUILD_MODNAME=KBUI
   LD_STR(nvidia)" -D"DEBUG_HASH=10" -D"DEBUG_HASH2=49" -c -o /tmp/selfgz35869/
   NVIDIA-Linux-x86_64-260.19.12/kernel/.tmp_nvacpi.o /tmp/selfgz35869/NVIDIA-L
   inux-x86_64-260.19.12/kernel/nvacpi.c
     ld -m elf_x86_64   -r -o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/ke
   rnel/nvidia.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv-kerne
   l.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv.o /tmp/selfgz35
   869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv_gvi.o /tmp/selfgz35869/NVIDIA-Li
   nux-x86_64-260.19.12/kernel/nv-vm.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260
   .19.12/kernel/os-agp.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel
   /os-interface.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/os-reg
   istry.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nv-i2c.o /tmp/
   selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nvacpi.o 
   (cat /dev/null;   echo kernel//tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12
   /kernel/nvidia.ko;) > /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/
   modules.order
   make -f /usr/src/linux-2.6.32.23-0.3/scripts/Makefile.modpost
     scripts/mod/modpost -m -a -i /usr/src/linux-2.6.32.23-0.3-obj/x86_64/defau
   lt/Module.symvers -I /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/M
   odule.symvers  -o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/Modu
   le.symvers -S -w   -N /usr/src/linux-2.6.32.23-0.3-obj/x86_64/default/Module
   .supported -s
   WARNING: could not find /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kerne
   l/.nv-kernel.o.cmd for /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel
   /nv-kernel.o
     cc -Wp,-MD,/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/.nvidia.m
   od.o.d  -nostdinc -isystem /usr/lib64/gcc/x86_64-suse-linux/4.3/include -Iin
   clude -Iinclude2 -I/usr/src/linux-2.6.32.23-0.3/include -I/usr/src/linux-2.6
   .32.23-0.3/arch/x86/include -include include/linux/autoconf.h   -I/tmp/selfg
   z35869/NVIDIA-Linux-x86_64-260.19.12/kernel -D__KERNEL__ -Wall -Wundef -Wstr
   ict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-impli
   cit-function-declaration
    -Wno-format-security -fno-delete-null-pointer-checks -O2 -m64 -mtune=generi
   c -mno-red-zone -mcmodel=kernel -funit-at-a-time -maccumulate-outgoing-args 
   -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare -mn
   o-sse -mno-mmx -mno-sse2 -mno-3dnow -fno-stack-protector -fomit-frame-pointe
   r -fasynchronous-unwind-tables -g -Wdeclaration-after-statement -Wno-pointer
   -sign -fno-strict-overflow   -I/tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.1
   2/kernel -Wall -MD -Wsign-compare -Wno-cast-qual -Wno-error -D__KERNEL__ -DM
   ODULE -DNVRM -DNV_VERSION_STRING=\"260.19.12\" -mcmodel=kernel -mno-red-zone
   -UDEBUG -U_DEBUG -DNDEBUG  -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_ST
   R(nvidia.mod)"  -D"KBUILD_MODNAME=KBUILD_STR(nvidia)" -D"DEBUG_HASH=10" -D"D
   EBUG_HASH2=49" -DMODULE -c -o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12
   /kernel/nvidia.mod.o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/n
   vidia.mod.c
     ld -r -m elf_x86_64 -T /usr/src/linux-2.6.32.23-0.3/scripts/module-common.
   lds --build-id -o /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nvid
   ia.ko /tmp/selfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nvidia.o /tmp/se
   lfgz35869/NVIDIA-Linux-x86_64-260.19.12/kernel/nvidia.mod.o
   NVIDIA: left KBUILD.
-> done.
-> Kernel module compilation complete.
ERROR: Unable to load the kernel module 'nvidia.ko'.  This happens most
       frequently when this kernel module was built against the wrong or
       improperly configured kernel sources, with a version of gcc that differs
       from the one used to build the target kernel, or if a driver such as
       rivafb, nvidiafb, or nouveau is present and prevents the NVIDIA kernel
       module from obtaining ownership of the NVIDIA graphics device(s), or
       NVIDIA GPU installed in this system is not supported by this NVIDIA
       Linux graphics driver release.
       
       Please see the log entries 'Kernel module load error' and 'Kernel
       messages' at the end of the file '/var/log/nvidia-installer.log' for
       more information.
-> Kernel module load error: insmod: error inserting './kernel/nvidia.ko': -1
   No such device
-> Kernel messages:
   [   36.101352] eth0: no IPv6 routers present
   [  546.591477] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:24:36:a3:02:9c:08:00 SRC=150.214.109.13
   DST=150.214.109.84 LEN=64 TOS=0x00 PREC=0x00 TTL=63 ID=23295 DF PROTO=TCP
   SPT=35095 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0 OPT
   (020405B4010303030101080A1E58ADF30000000004020000) 
   [  556.864993] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:24:36:a3:02:9c:08:00 SRC=150.214.109.13
   DST=150.214.109.84 LEN=64 TOS=0x00 PREC=0x00 TTL=63 ID=21435 DF PROTO=TCP
   SPT=41638 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0 OPT
   (020405B4010303030101080A1E58AE590000000004020000) 
   [  577.509559] JBD: barrier-based sync failed on cciss/c0d0p4 - disabling
   barriers
   [  645.751341] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:24:36:a3:02:9c:08:00 SRC=150.214.109.13
   DST=150.214.109.84 LEN=64 TOS=0x00 PREC=0x00 TTL=63 ID=18974 DF PROTO=TCP
   SPT=47962 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0 OPT
   (020405B4010303030101080A1E58B1D20000000004020000) 
   [  700.135815] nvidia: module license 'NVIDIA' taints kernel.
   [  700.135819] Disabling lock debugging due to kernel taint
   [  700.542134] NVRM: No NVIDIA graphics adapter found!
   [44240.339713] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:15:e8:af:22:02:08:00 SRC=95.79.26.146
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=45 ID=2151 DF PROTO=TCP
   SPT=35377 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A00FCD5260000000001030307) 
   [45980.056651] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:15:e8:af:22:02:08:00 SRC=66.240.52.5
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=45 ID=41866 DF PROTO=TCP
   SPT=37427 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A08CA97270000000001030307) 
   [172620.812792] SFW2-INext-DROP-DEFLT IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:0c:29:fc:f0:d0:08:00 SRC=150.214.109.7
   DST=150.214.109.84 LEN=353 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP
   SPT=67 DPT=68 LEN=333 
   [172620.815096] SFW2-INext-DROP-DEFLT IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:0c:29:22:e7:aa:08:00 SRC=150.214.109.1
   DST=150.214.109.84 LEN=353 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP
   SPT=67 DPT=68 LEN=333 
   [204601.250607] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:15:e8:af:22:02:08:00 SRC=87.233.170.130
   DST=150.214.109.84 LEN=48 TOS=0x00 PREC=0x00 TTL=108 ID=62661 PROTO=TCP
   SPT=17905 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0 OPT (0204056401010402) 
   [208778.459994] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:15:e8:af:22:02:08:00 SRC=87.233.170.130
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=44 ID=4033 DF PROTO=TCP
   SPT=58973 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A0707A3150000000001030306) 
   [238402.525866] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:1c:eb:a8:a2:2c:08:00 SRC=119.226.71.89
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=36 ID=42300 DF PROTO=TCP
   SPT=59420 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A0DC82F230000000001030307) 
   [238860.247562] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:1c:eb:a8:a2:2c:08:00 SRC=119.226.71.89
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=36 ID=53885 DF PROTO=TCP
   SPT=53465 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A0DCF33610000000001030307) 
   [238865.168467] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:1c:eb:a8:a2:2c:08:00 SRC=119.226.71.89
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=36 ID=41865 DF PROTO=TCP
   SPT=53572 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A0DCF46C00000000001030307) 
   [238869.130913] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:1c:eb:a8:a2:2c:08:00 SRC=119.226.71.89
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=36 ID=1872 DF PROTO=TCP
   SPT=53687 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A0DCF563A0000000001030307) 
   [238876.887722] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:1c:eb:a8:a2:2c:08:00 SRC=119.226.71.89
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=36 ID=7247 DF PROTO=TCP
   SPT=53887 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A0DCF74930000000001030307) 
   [238882.609385] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:1c:eb:a8:a2:2c:08:00 SRC=119.226.71.89
   DST=150.214.109.84 LEN=60 TOS=0x00 PREC=0x00 TTL=36 ID=48747 DF PROTO=TCP
   SPT=54021 DPT=22 WINDOW=5840 RES=0x00 SYN URGP=0 OPT
   (020405640402080A0DCF8AF20000000001030307) 
   [275011.347300] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:24:36:a3:02:9c:08:00 SRC=150.214.109.13
   DST=150.214.109.84 LEN=64 TOS=0x00 PREC=0x00 TTL=63 ID=16543 DF PROTO=TCP
   SPT=45715 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0 OPT
   (020405B4010303030101080A1E5DAAC60000000004020000) 
   [275799.921757] NVRM: No NVIDIA graphics adapter found!
   [275840.621579] SFW2-INext-ACC-TCP IN=eth0 OUT=
   MAC=d8:d3:85:62:aa:84:00:24:36:a3:02:9c:08:00 SRC=150.214.109.13
   DST=150.214.109.84 LEN=64 TOS=0x00 PREC=0x00 TTL=63 ID=36444 DF PROTO=TCP
   SPT=35639 DPT=22 WINDOW=65535 RES=0x00 SYN URGP=0 OPT
   (020405B4010303030101080A1E5DCB2C0000000004020000) 
   [275917.899033] NVRM: No NVIDIA graphics adapter found!
   [276290.049770] NVRM: No NVIDIA graphics adapter found!
ERROR: Installation has failed.  Please see the file
       '/var/log/nvidia-installer.log' for details.  You may find suggestions
       on fixing installation problems in the README available on the Linux
       driver download page at www.nvidia.com.
Old 10-19-10, 07:44 AM   #2
AaronP
NVIDIA Corporation
 
Join Date: Mar 2005
Posts: 2,487

You abridged your lspci log and didn't attach an nvidia-bug-report.log.gz so it's hard to say for sure, but it looks like only the Tesla bridges are visible, and that the GPUs themselves are not. This could be caused by a system BIOS problem, or a power problem on the Tesla boxes, or loose cables, or a variety of other problems. I would recommend contacting HP support.
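
The distinction drawn above (bridges visible, GPUs not) can be checked against the complete `lspci` output. What follows is a minimal sketch, assuming plain `lspci` text on stdin. `count_nvidia_devices` is a hypothetical helper, and "non_bridge" simply means any NVIDIA function that lspci does not label as a "PCI bridge".

```shell
# Sketch: count NVIDIA PCI bridges versus other NVIDIA functions in plain
# `lspci` output read from stdin. The function name is hypothetical.
count_nvidia_devices() {
    awk '
        BEGIN { bridges = 0; others = 0 }
        # Match the vendor string case-insensitively ("nVidia", "NVIDIA").
        tolower($0) ~ /nvidia/ {
            # Switch ports such as the NF200 are listed with class "PCI bridge".
            if ($0 ~ /PCI bridge:/) bridges++
            else others++
        }
        END { printf "bridges=%d non_bridge=%d\n", bridges, others }
    '
}

# Usage on a live system:
#   lspci | count_nvidia_devices
```

A result with `non_bridge=0` would match the reading that only the switch ports were enumerated. It does not say whether the cause is the BIOS, power, or cabling.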
Old 10-19-10, 07:48 AM   #3
AaronP
NVIDIA Corporation
 
Join Date: Mar 2005
Posts: 2,487

Oh wait, maybe I misread -- if you bought the Tesla S2050 directly from NVIDIA instead of through HP, then I'd suggest you contact NVIDIA support instead.

You should have a developer relations / support contact person that should be able to help you through any setup issues.
Old 10-19-10, 09:14 AM   #4
asenjo
Registered User
 
Join Date: Oct 2010
Posts: 3
Default Re: Unable to install Nvidia driver in SLES 11

Thank you, AaronP. I've contacted both HP and NVIDIA support, but I'm still waiting for help. It's true that there are many possible causes for this problem. I will double-check the PCIe cables, but they should be well connected. Regarding a BIOS or power problem, are you aware of any tool or Linux command I can use to check whether or not the GPUs are running? Is there any particular BIOS configuration that should be set to enable communication between the server and the Tesla box? To isolate the problem, I will first try connecting the Tesla to another server to determine whether the problem is on the HP or the NVIDIA side. Thank you very much.
Old 10-21-10, 09:54 AM   #5
asenjo
Registered User
 
Join Date: Oct 2010
Posts: 3
Default Re: Unable to install Nvidia driver in SLES 11

Update. I've received news from nVidia support:

Quote:
"We now have a clear response from HP. There is a problem with the BIOS of the DL580 which results in the GPUs in the S2050 chassis not being recognised. Also, the bandwidth between the DL580 and the GPUs is not as high as it should be; the bandwidth problem has been observed when testing an earlier NVIDIA product (S1070) with the DL580. HP is working on the BIOS problem, and also HP and NVIDIA are cooperating to resolve the bandwidth problem."
So, I'm waiting... Thanks.
Old 10-21-10, 05:27 PM   #6
ltpvu
Registered User
 
Join Date: Oct 2010
Posts: 3
Default Re: Unable to install Nvidia driver in SLES 11

Hi AaronP. I have the same problem. We bought two S2050 servers from NVIDIA to install in our cluster, which runs CentOS 5.3. I could not get the driver installed because of errors that are exactly the same as asenjo's.

Here is the PCI list of one cluster node, which is connected to one PCI port of the Tesla S2050:

Quote:
03:00.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
04:00.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
04:01.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
04:02.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
04:03.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
07:00.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
08:00.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
08:02.0 PCI bridge: nVidia Corporation Tesla S870 (rev a3) (prog-if 00 [Normal decode])
Our cluster includes Altus 1702 servers from Penguin. Could you tell me whether this is also a BIOS problem and how I can fix it?

One odd thing is that the drivers on the NVIDIA CD (e.g. NVIDIA-Linux-x86_64-195.36.15-pkg2.run) may not support the S2050. I could not find the Tesla S2050 in the supported list in the README file in the doc directory of the installation package. I want to use NVIDIA-Linux-x86_64-260.19.12.run, since the S2050 is in its supported list, but it sounds like CUDA is supported only on the Red Hat 2.6.18-194 kernel.

I have four more questions and would appreciate your answers:

1. Will the CUDA 3.2 package work on CentOS 5.3 (kernel 2.6.18-128.1.1.el5.530g0000, gcc 4.1.2)?

2. If not, will the driver package (e.g. NVIDIA-Linux-x86_64-195.36.15-pkg2.run) work for the Tesla S2050?

3. If I install NVIDIA-Linux-x86_64-260.19.12.run (which is said to support the Tesla S2050), can it work with an older CUDA toolkit like 3.1, 3.0 …?

4. Could you show me how to install these S2050 drivers manually? I tried copying 'nvidia.ko' to "/lib/modules/$(uname -r)/kernel/drivers/video/" but it does not work. When I ran "modprobe nvidia", I got the error message "FATAL: Module nvidia not found".
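On question 4, one possible explanation (an assumption, not something confirmed in this thread) is that modprobe looks modules up in the modules.dep index, which depmod regenerates, so copying the file alone may not be enough. A rough sketch, where the helper name `module_indexed` is made up for illustration and nvidia.ko is assumed to have been built for the running kernel:

```shell
# Return success if a modules.dep file lists a module called <name>.ko.
# The second argument defaults to the index of the running kernel.
module_indexed() {
    name=$1
    dep_file=${2:-/lib/modules/$(uname -r)/modules.dep}
    # Entries look like "path/to/<name>.ko: dependencies", possibly with a
    # compression suffix such as .ko.gz on some distributions.
    grep -Eq "(^|/)${name}\.ko(\.[a-z0-9]+)?:" "$dep_file"
}

# Hypothetical manual steps, to be run as root (left commented out on purpose):
#   install -D -m 644 nvidia.ko "/lib/modules/$(uname -r)/kernel/drivers/video/nvidia.ko"
#   depmod -a                                 # rebuild modules.dep
#   module_indexed nvidia && modprobe nvidia
```

Even if the module is then found, it would still fail to bind to anything while lspci shows only the bridges and no GPUs.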

Thank you
All times are GMT -5.