03-19-12, 08:52 AM   #4
eudoxos
Re: CUDA ok, OpenCL crashes in libcuda.so

Hi AaronP,

good to hear that you check for errors when opening libnvidia-compiler.so, but apparently you don't react to them accordingly anyway.

I added two attachments: one is the output of nvidia-bug-report, the other is the source file plus a (trivial) Makefile. The source only opens the platform and device selected by command-line options and compiles a trivial piece of source code (you need to have cl.hpp somewhere around). It takes two args, platform number and device number, but you will see that when you run it without args.

My configuration is such that the nvidia directory with libcuda.so, libnvidia-compiler.so and others is NOT in LD_LIBRARY_PATH or in /etc/ld.so.conf. The reason is that it also contains nvidia's libGL.so, whereas I need the libGL.so from fglrx (the X server runs on the ATI card). For that reason, /etc/OpenCL/vendors/nvidiaocl64.icd contains the absolute path to libcuda.so, i.e. /usr/lib/nvidia-current/libcuda.so; therefore the nvidia platform is discovered by the OpenCL runtime, unlike libnvidia-compiler.so, which is not found by dlopen.
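
To make the difference between the two lookups concrete, here is a minimal sketch of a dlopen probe. The helper name canDlopen is hypothetical, and it is only an assumption that the driver looks up libnvidia-compiler.so by bare name in a comparable way; the sketch merely shows that an absolute path bypasses the search path while a bare library name does not.

```cpp
#include <dlfcn.h>
#include <iostream>
#include <string>

// Hypothetical helper: returns true if dlopen can load `name`.
// A name containing a slash is treated as a path and opened directly;
// a bare name is searched via LD_LIBRARY_PATH, the ld.so cache and
// the default library directories.
bool canDlopen(const std::string& name) {
    void* handle = dlopen(name.c_str(), RTLD_LAZY | RTLD_LOCAL);
    if (!handle) {
        // dlerror() describes the most recent failure (may be null).
        const char* msg = dlerror();
        std::cerr << "dlopen(" << name << ") failed: "
                  << (msg ? msg : "unknown error") << "\n";
        return false;
    }
    dlclose(handle);
    return true;
}
```

With the configuration described above, one would expect canDlopen("/usr/lib/nvidia-current/libcuda.so") to succeed and canDlopen("libnvidia-compiler.so") to fail. On older glibc versions the program has to be linked with -ldl.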

I think you should be able to reproduce the bug trivially on a machine with nvidia installed as per normal, by moving libnvidia-compiler.so somewhere out of reach of dlopen (I can check with strace that the lib is being searched for). Then, after cleaning ~/.nv/ComputeCache and running the attached program, you should get the crash.

Cheers, Vaclav

---

Additional backtrace for this program, when compiled with -g and run with "gdb --args ./main 1 0".

Code:
** OpenCL ready: platform "NVIDIA CUDA", device "GeForce GTX 560 Ti".
Error building source. Build log follows.

Program received signal SIGSEGV, Segmentation fault.
__strlen_sse2_pminub () at ../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S:39
39	../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S: No such file or directory.
(gdb) bt
#0  __strlen_sse2_pminub () at ../sysdeps/x86_64/multiarch/strlen-sse2-pminub.S:39
#1  0x00007ffff22353c7 in ?? () from /usr/lib/nvidia-current/libcuda.so
#2  0x0000000000405521 in cl::detail::GetInfoFunctor1<int (*)(_cl_program*, _cl_device_id*, unsigned int, unsigned long, void*, unsigned long*), _cl_program*, _cl_device_id*>::operator() (this=0x7fffffffdbd0, param=4483, size=0, 
    value=0x0, size_ret=0x7fffffffdba8) at ./cl.hpp:1009
#3  0x0000000000405053 in cl::detail::GetInfoHelper<cl::detail::GetInfoFunctor1<int (*)(_cl_program*, _cl_device_id*, unsigned int, unsigned long, void*, unsigned long*), _cl_program*, _cl_device_id*>, std::string>::get (f=..., 
    name=4483, param=0x7fffffffde60) at ./cl.hpp:745
#4  0x0000000000404868 in cl::detail::getInfo<int (*)(_cl_program*, _cl_device_id*, unsigned int, unsigned long, void*, unsigned long*), _cl_program*, _cl_device_id*, std::string> (f=0x4014b0 <clGetProgramBuildInfo@plt>, 
    arg0=@0x7fffffffddb0: 0xf93170, arg1=@0x7fffffffdc78: 0x710430, name=4483, param=0x7fffffffde60) at ./cl.hpp:1027
#5  0x0000000000403f3c in cl::Program::getBuildInfo<std::string> (this=0x7fffffffddb0, device=..., name=4483, 
    param=0x7fffffffde60) at ./cl.hpp:2916
#6  0x0000000000403499 in cl::Program::getBuildInfo<4483> (this=0x7fffffffddb0, device=..., err=0x0) at ./cl.hpp:2925
#7  0x0000000000401ef5 in main (argc=3, argv=0x7fffffffdf88) at main.cpp:54
(gdb)
Attached Files
File Type: zip nvidia-crash.zip (1.2 KB, 50 views)
File Type: gz nvidia-bug-report.log.gz (67.4 KB, 31 views)