Old 10-14-11, 01:07 PM   #1
apriori
Registered User
 
Join Date: Nov 2008
Posts: 22
GPU at $BUSID$ has fallen off the bus

Hi guys,

Let's gather all issues related to this bug. My impression is that every distro is seeing a growing number of reports of it. So far I only experience it on my notebook, which has an 8800M GTX. It occurs with every driver from 275.09 up to 285.05.09, the last one I tested (the starting point is questionable, because I think I used the 275.x.x series without problems). It doesn't seem to be a kernel bug either; at least, all kernels from 2.6.32 through 3.0.6 appear to be affected.
More likely it's Xorg related, I think. Unfortunately I can't easily revert that one (using 1.11.1 right now).

Distro is Arch Linux 2010.05, upgraded to latest stable.
Funny thing is, another machine with the same distro (not sure whether it has exactly the same packages) and the latest NVIDIA drivers, with a 560 Ti, works just fine.

So please, let's try to track this issue down by providing as much data as possible.
Old 10-15-11, 12:49 AM   #2
luudee
Registered User
 
Join Date: Jan 2005
Posts: 3
Re: GPU at $BUSID$ has fallen off the bus

I am having this problem as well, running Fedora 14 (x86_64) with all the latest updates. I started on NVIDIA driver 270.41.19, tried 280.13, and am now on 285.05.09. The card is a GTX 580. Two 30" monitors, KDE ...

The "Sticky: Stability Issues ..." post is 6 years old; perhaps somebody from NVIDIA could update it?


Thanks,
rudi



Oct 15 00:06:54 cpu11 kernel: NVRM: Xid (0000:04:00): 13, 0006 00000000 00009297 000023ac 00000000 00000000
Oct 15 00:44:38 cpu11 kernel: NVRM: Xid (0000:04:00): 13, 0001 00000000 00009297 00001158 3f800000 00000000
Oct 15 00:44:38 cpu11 kernel: NVRM: Xid (0000:04:00): 32, Channel ID 00000001 intr 00040000
Oct 15 00:44:38 cpu11 kernel: NVRM: Xid (0000:04:00): 32, Channel ID 00000001 intr 00040000
Oct 15 00:44:38 cpu11 kernel: NVRM: Xid (0000:04:00): 32, Channel ID 00000001 intr 00040000
Oct 15 00:44:38 cpu11 kernel: NVRM: Xid (0000:04:00): 32, Channel ID 00000001 intr 00040000
Oct 15 00:44:38 cpu11 kernel: NVRM: Xid (0000:04:00): 32, Channel ID 00000001 intr 00040000
Oct 15 00:44:38 cpu11 kernel: NVRM: Xid (0000:04:00): 32, Channel ID 00000001 intr 00040000
Oct 15 00:45:10 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:12 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:14 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:16 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:18 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:21 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:24 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:27 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:29 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:32 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:35 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:37 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:39 cpu11 kernel: NVRM: os_schedule: Attempted to yield the CPU while in atomic or interrupt context
Oct 15 00:45:40 cpu11 kernel: NVRM: GPU at 0000:04:00.0 has fallen off the bus.
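
For anyone collecting data from logs like the one above, here is a minimal Python sketch (not part of the original post) that counts the Xid codes per bus ID and lists the "fallen off the bus" events. The function name `summarize_nvrm` and both regular expressions are illustrative assumptions; they are modeled only on the line formats quoted in this thread and may need adjusting for other driver versions.

```python
import re
from collections import Counter

# Patterns modeled on the NVRM lines quoted in this thread, e.g.
#   "... kernel: NVRM: Xid (0000:04:00): 13, 0006 00000000 ..."
#   "... kernel: NVRM: GPU at 0000:04:00.0 has fallen off the bus."
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+),")
FALLEN_RE = re.compile(r"NVRM: GPU at ([0-9a-fA-F:.]+) has fallen off the bus")


def summarize_nvrm(lines):
    """Return (Counter keyed by (bus_id, xid_code), list of bus IDs that fell off)."""
    xids = Counter()
    fallen = []
    for line in lines:
        m = XID_RE.search(line)
        if m:
            xids[(m.group(1), int(m.group(2)))] += 1
            continue
        m = FALLEN_RE.search(line)
        if m:
            fallen.append(m.group(1))
    return xids, fallen
```

Any iterable of lines works, including an open file such as /var/log/messages. Run over the excerpt above, it should report two Xid 13 events and six Xid 32 events on 0000:04:00, followed by one "fallen off the bus" event for 0000:04:00.0. The os_schedule lines are ignored by this sketch.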
Old 10-15-11, 12:53 AM   #3
luudee
Registered User
 
Join Date: Jan 2005
Posts: 3
Re: GPU at $BUSID$ has fallen off the bus

One more thing, noticed this in my Xorg.0.log:


[ 1445.558] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
[ 1445.558]
Backtrace:
[ 1445.629] 0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x4a0908]
[ 1445.629] 1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x49fe04]
[ 1445.629] 2: /usr/bin/Xorg (xf86PostMotionEventP+0xc4) [0x47c904]
[ 1445.629] 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f755b045000+0x453f) [0x7f755b04953f]
[ 1445.629] 4: /usr/bin/Xorg (0x400000+0x6a5f7) [0x46a5f7]
[ 1445.629] 5: /usr/bin/Xorg (0x400000+0x119103) [0x519103]
[ 1445.629] 6: /lib64/libc.so.6 (0x31da400000+0x33140) [0x31da433140]
[ 1445.629] 7: /lib64/libc.so.6 (__sched_yield+0x7) [0x31da4c8607]
[ 1445.629] 8: /usr/lib64/libnvidia-glcore.so.285.05.09 (0x322c800000+0x12e9fbb) [0x322dae9fbb]
[ 1445.629] 9: /usr/lib64/libnvidia-glcore.so.285.05.09 (0x322c800000+0x12ea11b) [0x322daea11b]
[ 1445.629] 10: /usr/lib64/libnvidia-glcore.so.285.05.09 (0x322c800000+0x128b4ad) [0x322da8b4ad]
[ 1445.629] 11: /usr/lib64/libnvidia-glcore.so.285.05.09 (0x322c800000+0x100cf75) [0x322d80cf75]
[ 1445.629] 12: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f755cc3a000+0x4793e1) [0x7f755d0b33e1]



and



[ 1009.459] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
[ 1009.459]
Backtrace:
[ 1009.460] 0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x4a0908]
[ 1009.460] 1: /usr/bin/Xorg (mieqEnqueue+0x1f4) [0x49fe04]
[ 1009.460] 2: /usr/bin/Xorg (xf86PostMotionEventP+0xc4) [0x47c904]
[ 1009.460] 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f36a7025000+0x453f) [0x7f36a702953f]
[ 1009.460] 4: /usr/bin/Xorg (0x400000+0x6a5f7) [0x46a5f7]
[ 1009.460] 5: /usr/bin/Xorg (0x400000+0x119103) [0x519103]
[ 1009.460] 6: /lib64/libc.so.6 (0x31da400000+0x33140) [0x31da433140]
[ 1009.460] 7: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f36a8c1a000+0x337c29) [0x7f36a8f51c29]
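
To compare backtraces like the two above across reports, a small Python sketch such as the following can pull the frames out of an Xorg.0.log and show which modules appear in the stuck call chain. This is not from the original post; `Frame`, `parse_backtrace`, and `modules_in_order` are hypothetical names, and the pattern is based only on the frame format visible in this thread.

```python
import re
from typing import List, NamedTuple, Optional

# One backtrace frame as logged above, e.g.
#   "[ 1445.629] 0: /usr/bin/Xorg (xorg_backtrace+0x28) [0x4a0908]"
FRAME_RE = re.compile(
    r"^\[\s*[\d.]+\]\s+(\d+):\s+(\S+)\s+\(([^)]*)\)\s+\[(0x[0-9a-fA-F]+)\]"
)


class Frame(NamedTuple):
    index: int
    module: str
    symbol: Optional[str]  # None when only base+offset was logged
    address: int


def parse_backtrace(lines) -> List[Frame]:
    """Collect every backtrace frame found in the given log lines."""
    frames = []
    for line in lines:
        m = FRAME_RE.match(line.strip())
        if not m:
            continue  # timestamps, "Backtrace:" headers, other messages
        index, module, location, address = m.groups()
        name = location.split("+", 1)[0]
        symbol = None if name.startswith("0x") else name
        frames.append(Frame(int(index), module, symbol, int(address, 16)))
    return frames


def modules_in_order(frames: List[Frame]) -> List[str]:
    """Unique module paths in the order they first appear in the frames."""
    seen = []
    for frame in frames:
        if frame.module not in seen:
            seen.append(frame.module)
    return seen
```

For the first backtrace above, `modules_in_order` would list Xorg, evdev_drv.so, libc, libnvidia-glcore, and libglx, which makes it easy to see at a glance that the server was inside the NVIDIA GL core when the event queue overflowed. Note that frames from several backtraces in one file are simply concatenated by this sketch.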
Old 10-18-11, 04:59 AM   #4
apriori
Registered User
 
Join Date: Nov 2008
Posts: 22
Re: GPU at $BUSID$ has fallen off the bus

@luudee:

Please tell me your xorg* versions (especially the server's) and attach the complete Xorg.0.log. In your case it looks more like an incompatible ABI version. Funnily enough, I don't get any of these os_schedule or Xid messages.
Old 10-23-11, 01:02 PM   #5
ColdFeetBob
Registered User
 
Join Date: Oct 2011
Posts: 1
Re: GPU at $BUSID$ has fallen off the bus

Apart from the "NVRM: GPU at 0000:04:00.0 has fallen off the bus." message, I have the exact same symptoms as luudee.

In my case, it is usually mplayer (both windowed and fullscreen) that triggers the freeze.
I'm running a fully up-to-date Arch Linux, x86_64, with the latest official NVIDIA drivers and a GT240.
Old 10-24-11, 06:12 AM   #6
monty.clift
Registered User
 
Join Date: Oct 2011
Posts: 2
Re: GPU at $BUSID$ has fallen off the bus

I am having the same problem running x86_64 Fedora 15 and Fedora 16. I have a Dell Precision 6500 laptop with a Quadro FX 2800M. Using Gnome 3 with an additional monitor connected to the DisplayPort, I immediately get a kernel panic with the following lines in /var/log/messages:
Oct 24 14:23:36 kernel: [ 1833.815841] dell_wmi: Received unknown WMI event (0x11)
Oct 24 14:23:36 kernel: [ 1833.874781] NVRM: GPU at 0000:01:00.0 has fallen off the bus.

The bug also happens with no external monitor; it just takes a bit longer.

The driver I am using is 285.05.09

Thanks in advance,
Monty
Old 11-04-11, 04:25 AM   #7
apriori
Registered User
 
Join Date: Nov 2008
Posts: 22
Re: GPU at $BUSID$ has fallen off the bus

@monty.clift:

You might want to try disabling WMI completely. I'm currently trying to find out how to do that myself. In my case (a Clevo570RU notebook), the problem persists even with ACPI completely deactivated, so it's not even related to that (although this machine has a hell of a lot of ACPI-related issues I need to get fixed).

The only useful workaround I have found so far is to revert Xorg to version 1.10 and use a 260.x driver, which is more of a pain the more recent your distro is. Funnily enough, it's even possible to use OpenCL with such old drivers if the OpenCL libraries of the newer drivers are still around.
Old 11-06-11, 04:49 AM   #8
apriori
Registered User
 
Join Date: Nov 2008
Posts: 22
Re: GPU at $BUSID$ has fallen off the bus

Here is the bug report log from my latest attempt with 290.06.
So far nothing has changed.
Attached Files
File Type: gz nvidia-bug-report.log.gz (52.6 KB, 45 views)

Old 11-07-11, 05:45 AM   #9
cehoyos
FFmpeg developer
 
Join Date: Jan 2009
Location: Vienna, Austria
Posts: 467
Re: GPU at $BUSID$ has fallen off the bus

Quote:
Originally Posted by luudee
Two 30" monitors, KDE ...

I saw similar symptoms when using two screens, because the GPU was overheating. You could monitor the GPU temperature to find out whether that is the problem.
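
One way to watch the temperature over time is sketched below in Python. This is an illustration, not something from the original post. It assumes `nvidia-settings` is installed, an X session is running, and the driver exposes the `GPUCoreTemp` attribute for your card; `read_gpu_temp_nvidia_settings`, `watch_temperature`, and `peak` are hypothetical helper names. The reader is passed in as a parameter so a different source (for example a sensors tool) can be swapped in.

```python
import subprocess
import time
from typing import Callable, Iterable, Iterator, Tuple


def read_gpu_temp_nvidia_settings() -> int:
    """Read the core temperature via nvidia-settings.

    Assumption: terse mode (-t) prints one bare value per matching target;
    only the first value is used here.
    """
    out = subprocess.check_output(
        ["nvidia-settings", "-q", "GPUCoreTemp", "-t"], text=True
    )
    return int(out.split()[0])


def watch_temperature(
    read_temp: Callable[[], int],
    samples: int,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[float, int]]:
    """Yield (unix_timestamp, temperature) pairs, pausing between samples."""
    for i in range(samples):
        yield time.time(), read_temp()
        if i + 1 < samples:
            sleep(interval_s)


def peak(readings: Iterable[Tuple[float, int]]) -> int:
    """Highest temperature among the collected readings."""
    return max(temp for _, temp in readings)
```

Running `list(watch_temperature(read_gpu_temp_nvidia_settings, samples=60))` in a terminal while reproducing the freeze and then checking `peak(...)` would give a rough idea of whether the card runs hot just before it drops off the bus. If the X server is already hung, the query itself may hang or fail, so writing each reading to a file as it arrives is probably the safer variant.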
Old 11-07-11, 11:04 AM   #10
vojta
Registered User
 
Join Date: Nov 2011
Posts: 19
Re: GPU at $BUSID$ has fallen off the bus

I have similar problems. X server freezes while using OpenGL or VDPAU. I have described my problem here.

CPU: Intel Core i5 520M
Memory: 8 GB (2x 4GB)
Graphics card: NVIDIA Quadro NVS 5100M

Using Gentoo Linux
nvidia-drivers version: 290.06 (also tried 285.05.09, nothing has changed)
X.org server version: 1.11.1 (also tried 1.10.4, nothing has changed)
Linux version: 3.1.0 (using -ck patches)

I will update xorg-server to 1.11.2 soon and report if anything changed.
Attached Files
File Type: gz nvidia-bug-report.log.gz (74.6 KB, 41 views)
File Type: gz Xorg.0.log.gz (22.6 KB, 38 views)
Old 11-08-11, 04:28 AM   #11
apriori
Registered User
 
Join Date: Nov 2008
Posts: 22
Re: GPU at $BUSID$ has fallen off the bus

I'd like to add that my issue starts about 15 seconds after starting X. I hardly ever manage to log in to KDE 4 completely (using kdm as login manager).

Currently the only semi-stable combination I have found (lockups are quite rare) is:

Kernel 2.6.32 (yeah, I know it's ancient)
Xorg 1.10.4
NVIDIA Drivers 270.41.19

My issue is definitely not related to hardware failure or temperature. The machine runs non-stop for days using this driver or Windows.

@vojta: The only reason I said something about "reverting to Xorg 1.10.4" is that this lets you revert to older NVIDIA drivers too, which only support that ABI, e.g. 270.41.19 in my case.
All times are GMT -5.


Copyright 1998 - 2014, nV News.