02-25-12, 07:09 AM   #1
eudoxos
Registered User
 
Join Date: Feb 2012
Posts: 5
CUDA ok, OpenCL crashes in libcuda.so

Running Linux, I have nVidia driver 290.10, CUDA and the nVidia GPU Computing SDK installed (latest downloadable versions). The card is a GTX 560 Ti; it is no longer used as the graphics card in the computer, an ATI card is. The nVidia card is detected just fine, and I can run CUDA programs on it.

The OpenCL runtime reports the platform / device correctly as NVIDIA CUDA / GeForce GTX 560 Ti, but whenever I run some code on it, it reports an error compiling the program (which both the Intel and AMD SDKs compile just fine), i.e. clBuildProgram returns CL_BUILD_PROGRAM_FAILURE and clGetProgramBuildInfo(...,CL_PROGRAM_BUILD_LOG...) crashes:

Code:
#0 __strlen_sse42 () at ../sysdeps/x86_64/multiarch/strlen-sse4.S:32
#1 0x00007ffff22c0e67 in ?? () from /usr/lib/nvidia-current/libcuda.so
#2 0x0000000000402e85 in main (argc=2, argv=0x7fffffffdfc8) at test-chain.cc:107 // this is where clGetProgramBuildInfo is called
Any hint?

(I posted this first on the nvidia forum without any response. Comparing Intel, nVidia and ATI on OpenCL support: both Intel and AMD were very responsive when I reported bugs in their compilers. nVidia seems to be happy that it sold the card, pushing CUDA everywhere and not giving a damn about OpenCL. Am I wrong? The card was not the cheapest one, and if I ever finish the code I am working on, I will definitely recommend that customers go for ATI.)

Last edited by eudoxos; 02-25-12 at 07:11 AM. Reason: (better title)
03-15-12, 06:49 AM   #2
eudoxos
Registered User
 
Join Date: Feb 2012
Posts: 5
Re: CUDA ok, OpenCL crashes in libcuda.so

It turns out this was due to libnvidia-compiler.so not being on LD_LIBRARY_PATH. The stupid people at nvidia don't check for errors from dlopen; for the money they get, they could at least hire competent programmers.
03-15-12, 02:08 PM   #3
AaronP
NVIDIA Corporation
 
Join Date: Mar 2005
Posts: 2,487
Re: CUDA ok, OpenCL crashes in libcuda.so

Hi eudoxos,

Thanks for reporting this. The only place I can find in the code that loads libnvidia-compiler.so does properly check for dlopen failures. Can you please post the test case you're using to reproduce this problem and an nvidia-bug-report.log.gz file?
03-19-12, 07:52 AM   #4
eudoxos
Registered User
 
Join Date: Feb 2012
Posts: 5
Re: CUDA ok, OpenCL crashes in libcuda.so

Hi AaronP,

good to hear that you check for errors when opening libnvidia-compiler.so, but apparently you don't react accordingly anyway.

I added two attachments: one is the output of nvidia-bug-report, the other is the source file plus a (trivial) Makefile. The source basically only opens the platform and device given by command-line options and compiles a trivial piece of source code (you need to have cl.hpp somewhere around). It takes two args, platform number and device number, but you will see that when you run it without args.

My configuration is such that the nvidia directory with libcuda.so, libnvidia-compiler.so and others is NOT in LD_LIBRARY_PATH or in /etc/ld.so.conf. The reason is that it also contains nvidia's libGL.so, whereas I need the libGL.so from fglrx (the X server runs on the ATI card). For that reason, /etc/OpenCL/vendors/nvidiaocl64.icd contains the absolute path to libcuda.so, i.e. /usr/lib/nvidia-current/libcuda.so; therefore the nvidia platform is discovered by the OpenCL runtime, unlike libnvidia-compiler.so, which is not found by dlopen.

I think you should be able to reproduce the bug trivially on a machine with nvidia installed as per normal by moving libnvidia-compiler.so somewhere out of reach of dlopen (I can check with strace that the lib is being searched for). Then, after cleaning ~/.nv/ComputeCache and running the attached program, you should get the crash.
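
To check up front whether a library is within reach of dlopen on a given machine, a small probe along these lines might help. It is only a sketch: try_dlopen is a hypothetical helper name, not anything from the NVIDIA driver, and it says nothing about how libcuda.so itself loads the compiler library. A bare name such as libnvidia-compiler.so is resolved through the usual dynamic-linker search (LD_LIBRARY_PATH, the ld.so cache, the default directories), while a name containing a slash is opened as a path.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Hypothetical helper (an assumption for illustration, not driver code):
// try to dlopen a library and report why it failed.
// Returns an empty string on success, or the dlerror() text on failure.
std::string try_dlopen(const char* name) {
    assert(name != nullptr);
    dlerror();  // clear any stale error left by an earlier dl* call
    void* handle = dlopen(name, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        // dlerror() describes the most recent failure; guard against NULL anyway.
        const char* msg = dlerror();
        return msg ? std::string(msg) : std::string("unknown dlopen error");
    }
    dlclose(handle);  // we only wanted to know whether it can be loaded
    return std::string();
}
```

Calling try_dlopen("libnvidia-compiler.so") in the configuration described above would be expected to return a non-empty message, since the library's directory is not on any search path. On older glibc versions the program has to be linked with -ldl.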

Cheers, Vaclav

---

Additional backtrace for this program, when compiled with -g and run with "gdb --args ./main 1 0".

Code:
** OpenCL ready: platform "NVIDIA CUDA", device "GeForce GTX 560 Ti".
Error building source. Build log follows.

Program received signal SIGSEGV, Segmentation fault.
__strlen_sse2_pminub () at ../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S:39
39	../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S: No such file or directory.
(gdb) bt
#0  __strlen_sse2_pminub () at ../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S:39
#1  0x00007ffff22353c7 in ?? () from /usr/lib/nvidia-current/libcuda.so
#2  0x0000000000405521 in cl::detail::GetInfoFunctor1<int (*)(_cl_program*, _cl_device_id*, unsigned int, unsigned long, void*, unsigned long*), _cl_program*, _cl_device_id*>::operator() (this=0x7fffffffdbd0, param=4483, size=0, 
    value=0x0, size_ret=0x7fffffffdba8) at ./cl.hpp:1009
#3  0x0000000000405053 in cl::detail::GetInfoHelper<cl::detail::GetInfoFunctor1<int (*)(_cl_program*, _cl_device_id*, unsigned int, unsigned long, void*, unsigned long*), _cl_program*, _cl_device_id*>, std::string>::get (f=..., 
    name=4483, param=0x7fffffffde60) at ./cl.hpp:745
#4  0x0000000000404868 in cl::detail::getInfo<int (*)(_cl_program*, _cl_device_id*, unsigned int, unsigned long, void*, unsigned long*), _cl_program*, _cl_device_id*, std::string> (f=0x4014b0 <clGetProgramBuildInfo@plt>, 
    arg0=@0x7fffffffddb0: 0xf93170, arg1=@0x7fffffffdc78: 0x710430, name=4483, param=0x7fffffffde60) at ./cl.hpp:1027
#5  0x0000000000403f3c in cl::Program::getBuildInfo<std::string> (this=0x7fffffffddb0, device=..., name=4483, 
    param=0x7fffffffde60) at ./cl.hpp:2916
#6  0x0000000000403499 in cl::Program::getBuildInfo<4483> (this=0x7fffffffddb0, device=..., err=0x0) at ./cl.hpp:2925
#7  0x0000000000401ef5 in main (argc=3, argv=0x7fffffffdf88) at main.cpp:54
(gdb)
Attached Files
File Type: zip nvidia-crash.zip (1.2 KB, 44 views)
File Type: gz nvidia-bug-report.log.gz (67.4 KB, 26 views)
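
For readers following the backtrace: name=4483 is 0x1183, i.e. CL_PROGRAM_BUILD_LOG, and frame #2 shows the size-only query (size=0, value=0x0) that normally precedes fetching a string. A minimal sketch of that two-call pattern is below. The names InfoQuery and read_info_string are hypothetical, chosen for illustration; the query is passed in as a callable so the sketch does not depend on OpenCL headers, and it is not the cl.hpp implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Shape of an OpenCL-style info query with the object and parameter name
// already bound: (buffer size, buffer, required size out) -> status code.
using InfoQuery = std::function<int(std::size_t, void*, std::size_t*)>;

// Hypothetical helper: run the two-call sequence and store the text in `out`.
// Returns the first non-zero status code, or 0 on success.
int read_info_string(const InfoQuery& query, std::string& out) {
    assert(query);  // an empty std::function would throw when called

    std::size_t required = 0;
    int err = query(0, nullptr, &required);  // call 1: ask only for the size
    if (err != 0) return err;
    if (required == 0) { out.clear(); return 0; }

    std::vector<char> buf(required);
    err = query(buf.size(), buf.data(), nullptr);  // call 2: fetch the bytes
    if (err != 0) return err;

    // Stop at the first NUL, but never read past the buffer if there is none.
    std::size_t len = 0;
    while (len < buf.size() && buf[len] != '\0') ++len;
    out.assign(buf.data(), len);
    return 0;
}
```

With OpenCL, the query would be a lambda forwarding to clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, size, value, size_ret). Note that in the backtrace above the segfault happens inside libcuda.so during the first, size-only call, so no amount of care on the caller's side avoids it.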
03-20-12, 09:42 AM   #5
AaronP
NVIDIA Corporation
 
Join Date: Mar 2005
Posts: 2,487
Re: CUDA ok, OpenCL crashes in libcuda.so

Thanks for the detailed report. I identified the problem and filed internal bug number 957326.
All times are GMT -5.