09-20-10, 03:27 PM   #1
nicolasavru
Nvidia kernel module causes Oops on bootup

I am a new Arch user (I have more experience with Ubuntu) and am having a problem with the Nvidia kernel module on a new install. I originally posted this question on the Arch forums (https://bbs.archlinux.org/viewtopic.php?id=105104), where it was suggested that I ask here.

I set up Arch on a flash drive, with the root partition and a ramdisk in a RAID 1 configuration so that the entire OS runs from RAM (based on this thread: https://bbs.archlinux.org/viewtopic.php?id=64281&p=2). It works perfectly on my machines at home, but when booting on the Dell Precision T3500 workstations in my university's computer lab, I get the following error on bootup:

Code:
NVRM: loading NVIDIA UNIX x86_64 Kernel Module  256.53  Fri Aug 27 20:27:48 PDT 2010
ahci 0000:00:1f.2: controller reset failed (0xffffffff)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
ahci 0000:00:1f.2: failed to stop engine (-5)
BUG: unable to handle kernel paging request at ffffc90000073018
IP: [<ffffffffa04a29c4>] ahci_stop_engine+0x24/0x70 [libahci]
PGD 1dfc11067 PUD 1dfc12067 PMD 1dfc13067 PTE 0
Oops: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/0000:02:00.0/uevent
CPU 1 
Modules linked in: nvidia(P) snd_hda_intel(+) lp ahci(+) snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq ppdev de

Pid: 928, comm: modprobe Tainted: P            2.6.35-ARCH #1 0K095G/Precision WorkStation T3500  
RIP: 0010:[<ffffffffa04a29c4>]  [<ffffffffa04a29c4>] ahci_stop_engine+0x24/0x70 [libahci]
RSP: 0018:ffff8801d7cbfbc8  EFLAGS: 00010286
RAX: ffffc90000073000 RBX: 000000000000001e RCX: 000000000000001d
RDX: ffff8801d8a09c18 RSI: ffff8801d7cbfc30 RDI: ffffc90000073018
RBP: ffff8801d7cbfbc8 R08: 000000000000fffb R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8801d7cbfc30
R13: ffff8801d7fc0000 R14: 000000000000001e R15: ffffc90000072000
FS:  00007f864e190700(0000) GS:ffff880001840000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffc90000073018 CR3: 00000001d8151000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 928, threadinfo ffff8801d7cbe000, task ffff8801d8278000)
Stack:
 ffff8801d7cbfbf8 ffffffffa04a2a2f ffff8801d9e32e30 000000000000001e
<0> ffff8801d77aee18 ffffc90000072000 ffff8801d7cbfc68 ffffffffa04a32ce
<0> ffff8801d9e32e30 ffffffffa04a5045 ffff8801fffffffb ffff8801d9e92088
Call Trace:
 [<ffffffffa04a2a2f>] ahci_deinit_port+0x1f/0xb0 [libahci]
 [<ffffffffa04a32ce>] ahci_init_controller+0x6e/0x120 [libahci]
 [<ffffffffa050712d>] ahci_pci_init_controller+0x3d/0x40 [ahci]
 [<ffffffffa0507ab2>] ahci_init_one+0x5a2/0xbd0 [ahci]
 [<ffffffff81371739>] ? mutex_unlock+0x9/0x10
 [<ffffffff8118b45e>] ? sysfs_addrm_finish+0x2e/0xd0
 [<ffffffff811cf82a>] ? kobject_get+0x1a/0x30
 [<ffffffff811ee395>] pci_device_probe+0x75/0xa0
 [<ffffffff812877ea>] ? driver_sysfs_add+0x5a/0x90
 [<ffffffff81287ac6>] driver_probe_device+0x96/0x1c0
 [<ffffffff81287c8b>] __driver_attach+0x9b/0xa0
 [<ffffffff81287bf0>] ? __driver_attach+0x0/0xa0
 [<ffffffff812869fe>] bus_for_each_dev+0x5e/0x90
 [<ffffffff81287789>] driver_attach+0x19/0x20
 [<ffffffff81287297>] bus_add_driver+0xc7/0x2e0
 [<ffffffff81287f01>] driver_register+0x71/0x140
 [<ffffffff81076b6d>] ? notifier_call_chain+0x4d/0x70
 [<ffffffff811ee621>] __pci_register_driver+0x51/0xd0
 [<ffffffff81076ecc>] ? __blocking_notifier_call_chain+0x5c/0x80
 [<ffffffffa050f000>] ? ahci_init+0x0/0x20 [ahci]
 [<ffffffffa050f01e>] ahci_init+0x1e/0x20 [ahci]
 [<ffffffff81002149>] do_one_initcall+0x39/0x1a0
 [<ffffffff8108cefb>] sys_init_module+0xbb/0x200
 [<ffffffff81009e82>] system_call_fastpath+0x16/0x1b
Code: 1f 84 00 00 00 00 00 55 48 8b 87 50 28 00 00 48 89 e5 48 8b 50 20 8b 47 28 c1 e0 07 89 c0 48 05 00 01 00 00 48 0
RIP  [<ffffffffa04a29c4>] ahci_stop_engine+0x24/0x70 [libahci]
 RSP <ffff8801d7cbfbc8>
CR2: ffffc90000073018
---[ end trace 8b525310bc523bb3 ]---
The bootup process hangs for approximately 5-7 minutes and then continues as normal. The X server, video, etc. appear to work fine. As such, it is a non-critical issue, but one I would like to fix, as waiting up to 10 minutes for the machine to boot is extremely annoying.

I tried blacklisting the nvidia kernel module in rc.conf and the error did not occur, so the nvidia kernel module (256.53 drivers) is definitely involved. I have also tried the beta nvidia drivers (260.19), but the same error occurred with them. The Dell Precision T3500s have Nvidia Quadro FX 580 GPUs.

Any assistance would be much appreciated. Some log files are attached below. If any other logs are necessary, I will provide them.

Update: I blacklisted the ahci module as suggested on the Arch forums, and that does indeed make the error disappear. It would still be nice to figure out why the nvidia module was conflicting with the ahci module.
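
For anyone triaging a trace like the one above, here is a minimal Python sketch that pulls out the faulting function, the modules that were still loading (marked "(+)" in "Modules linked in"), and the reliable call-trace frames. The function name summarize_oops and the regular expressions are my own assumptions, tuned only to the 2.6.35-style format shown in this post; other kernel versions may print the Oops differently. The SAMPLE text is a shortened excerpt of the log above.

```python
import re

# Faulting instruction pointer, e.g.
# "IP: [<ffffffffa04a29c4>] ahci_stop_engine+0x24/0x70 [libahci]"
IP_RE = re.compile(
    r"^IP: \[<[0-9a-f]+>\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)
# Call-trace frame; a leading "? " marks a frame the unwinder is unsure about.
FRAME_RE = re.compile(
    r"^\s*\[<[0-9a-f]+>\] (\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def summarize_oops(text):
    """Return the faulting symbol, modules still loading, and reliable frames."""
    fault = None
    loading = []
    frames = []
    in_trace = False
    for line in text.splitlines():
        m = IP_RE.match(line)
        if m and fault is None:
            fault = (m.group(1), m.group(2) or "kernel")
        if line.startswith("Modules linked in:"):
            # "(+)" after a module name means it was being loaded at the time.
            loading = re.findall(r"(\w+)\(\+\)", line)
        if line.startswith("Call Trace:"):
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.match(line)
            if not m:
                in_trace = False  # first non-frame line ends the trace
                continue
            if m.group(1) is None:  # skip frames marked with "?"
                frames.append((m.group(2), m.group(3) or "kernel"))
    return {"fault": fault, "loading": loading, "frames": frames}


# Shortened excerpt of the Oops quoted in this post.
SAMPLE = """\
IP: [<ffffffffa04a29c4>] ahci_stop_engine+0x24/0x70 [libahci]
Modules linked in: nvidia(P) snd_hda_intel(+) lp ahci(+) snd_seq_dummy
Call Trace:
 [<ffffffffa04a2a2f>] ahci_deinit_port+0x1f/0xb0 [libahci]
 [<ffffffffa04a32ce>] ahci_init_controller+0x6e/0x120 [libahci]
 [<ffffffff81371739>] ? mutex_unlock+0x9/0x10
 [<ffffffff811ee395>] pci_device_probe+0x75/0xa0
Code: 1f 84 00 00
"""

if __name__ == "__main__":
    summary = summarize_oops(SAMPLE)
    print("fault:", summary["fault"])
    print("still loading:", summary["loading"])
    for symbol, module in summary["frames"]:
        print("  frame:", symbol, "in", module)
```

Run against the full dmesg.log, a summary like this makes it easier to see that the fault itself is in libahci while ahci was still being probed, with nvidia already loaded.
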
Attached Files
File Type: log dmesg.log (41.1 KB, 35 views)
File Type: log kernel.log (69.0 KB, 38 views)