Old 11-29-05, 05:15 PM   #1
shutdown
Registered User
 
Join Date: Nov 2005
Posts: 45
Default 8168 - Bugreport

From my dmesg:
Code:
------------[ cut here ]------------
kernel BUG at mm/rmap.c:486!
invalid operand: 0000 [#1]
PREEMPT 
Modules linked in: nvidia smbfs udf button thermal processor w83627hf i2c_nforce2 i2c_isa atxp1 hwmon_vid cpufreq_nforce2
CPU:    0
EIP:    0060:[<c014feb0>]    Tainted: P      VLI
EFLAGS: 00010286   (2.6.15-rc3-shutdown) 
EIP is at page_remove_rmap+0x30/0x40
eax: ffffffff   ebx: c6e4d840   ecx: c134bae0   edx: c134bae0
esi: b6a10000   edi: c134bae0   ebp: c3005f2c   esp: c3005e88
ds: 007b   es: 007b   ss: 0068
Process xawtv (pid: 15261, threadinfo=c3004000 task=c30975c0)
Stack: b6a10000 c6e4d840 c0149136 c134bae0 b6a10000 1a5d7067 1a5d7067 00000000 
       ffffffff df18e100 b6a14000 c67d6b6c b6a14000 c3005f2c c01492e5 c057cc70 
       c7b1ba14 c67d6b68 b6a10000 b6a14000 c3005f2c 00000000 b6a13fff c67d6b68 
Call Trace:
 [<c0149136>] zap_pte_range+0x156/0x250
 [<c01492e5>] unmap_page_range+0xb5/0x140
 [<c014945f>] unmap_vmas+0xef/0x1f0
 [<c014d935>] unmap_region+0x95/0x130
 [<c014dcb3>] do_munmap+0x113/0x180
 [<c014dd64>] sys_munmap+0x44/0x70
 [<c0102f45>] syscall_call+0x7/0xb
Code: 24 0c 83 42 08 ff 0f 98 c0 84 c0 74 1a 8b 42 08 40 78 18 c7 44 24 04 ff ff ff ff c7 04 24 10 00 00 00 e8 04 f0 fe ff 83 c4 08 c3 <0f> 0b e6 01 3b 09 45 c0 eb de 8d b6 00 00 00 00 83 ec 2c 89 5c 
 <6>note: xawtv[15261] exited with preempt_count 2
scheduling while atomic: xawtv/0x00000002/15261
 [<c0432747>] schedule+0x587/0x660
 [<c0119496>] vprintk+0x186/0x2d0
 [<c043353a>] rwsem_down_read_failed+0x8a/0x180
 [<c011cac2>] .text.lock.exit+0x27/0x85
 [<c011b5e3>] do_exit+0xf3/0x450
 [<c01043b0>] do_invalid_op+0x0/0xb0
 [<c0104145>] die+0x185/0x190
 [<c0104452>] do_invalid_op+0xa2/0xb0
 [<c014feb0>] page_remove_rmap+0x30/0x40
 [<e155e06a>] _nv004895rm+0x8a/0x94 [nvidia]
 [<e141f9f1>] rm_set_interrupts+0x129/0x144 [nvidia]
 [<c01039db>] error_code+0x4f/0x54
 [<c014feb0>] page_remove_rmap+0x30/0x40
 [<c0149136>] zap_pte_range+0x156/0x250
 [<c01492e5>] unmap_page_range+0xb5/0x140
 [<c014945f>] unmap_vmas+0xef/0x1f0
 [<c014d935>] unmap_region+0x95/0x130
 [<c014dcb3>] do_munmap+0x113/0x180
 [<c014dd64>] sys_munmap+0x44/0x70
 [<c0102f45>] syscall_call+0x7/0xb
It seems the nVidia driver module failed while xawtv (a TV viewer application using OpenGL) was running. The odd thing is that I did not notice anything going wrong at all: xawtv worked perfectly until I stopped it manually. On top of that, I found this log only by coincidence, because I was looking for some boot messages in my dmesg.
So this seems to be an error without consequences...

My system:
AMD Athlon XP-M 2600+
nForce2 Chipset
GeForce4 Ti4600 Graphics Card (Not OC'ed)
Some PCI cards, which are a Promise S150 TX4 SATA Controller, a Creative Audigy2 ZS, a Hauppauge Nexus-S (TV) and an Intel PRO/1000 NIC.

I am running slackware-current with kernel 2.6.15-rc3; the nVidia driver is version 8168.
Because the rel80 driver series is under development right now, I think the information above may be useful for the developers. (I was just too curious about the new driver series, so I installed the leaked driver...)

Peter
Old 11-29-05, 07:54 PM   #2
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740
Default Re: 8168 - Bugreport

While there's no reference to the NVIDIA driver in the initial trace, it's possible that it is involved. If the problem reproduces/persists with an officially released driver and Linux kernel, please file a bug report.
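The split between the initial trace and the follow-up trace can be checked mechanically. The following is a minimal Python sketch, not a tool used by anyone in this thread: `FRAME_RE`, `parse_frames` and `module_frames` are hypothetical helper names invented for this illustration, and `SAMPLE` is a shortened excerpt of the dmesg output quoted above.

```python
import re

# Matches frames such as " [<e155e06a>] _nv004895rm+0x8a/0x94 [nvidia]".
# The trailing "[module]" tag is optional; built-in kernel symbols have none.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<symbol>\S+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)

def parse_frames(log_text):
    """Return (symbol, module) pairs for every stack frame line found."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module")))
    return frames

def module_frames(log_text, module):
    """Return the symbols of all frames attributed to the given module."""
    return [sym for sym, mod in parse_frames(log_text) if mod == module]

# Shortened excerpt of the log from the first post.
SAMPLE = """\
Call Trace:
 [<c0149136>] zap_pte_range+0x156/0x250
 [<c014dd64>] sys_munmap+0x44/0x70
scheduling while atomic: xawtv/0x00000002/15261
 [<c014feb0>] page_remove_rmap+0x30/0x40
 [<e155e06a>] _nv004895rm+0x8a/0x94 [nvidia]
 [<e141f9f1>] rm_set_interrupts+0x129/0x144 [nvidia]
"""

# Split at the second message so the two traces can be compared.
first, _, second = SAMPLE.partition("scheduling while atomic")
print(module_frames(first, "nvidia"))   # []
print(module_frames(second, "nvidia"))  # ['_nv004895rm', 'rm_set_interrupts']
```

Note that a module symbol showing up in a trace does not by itself prove the module is at fault. Traces of this kind can include stale addresses left on the stack, so the result is a hint about where to look, not a verdict.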
Old 11-30-05, 01:53 PM   #3
shutdown
Registered User
 
Join Date: Nov 2005
Posts: 45
Default Re: 8168 - Bugreport

I'd like to add some information to my last post:
When I tried to shut down my computer, it seemed to hang while trying to kill X. I logged in via ssh from another computer and found X and xawtv still running and not responding to any kill signals sent to them. The computer was not able to stop the two programs, so in the end I had to switch it off the hard way...

It looks like the nvidia drivers 1.0-8168 cause problems with xawtv and make it impossible for it to be closed again.

So if you nVidia guys are going to test the new driver before its official release, you may want to have a look at this problem... I'm pretty sure it is the nvidia driver causing these problems.
(I do not want to blame you for this, as this driver was never officially released, but maybe you could fix it before an official release...)

Peter
Old 11-30-05, 02:05 PM   #4
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740
Default Re: 8168 - Bugreport

Does this problem reproduce with e.g. Linux 2.6.14.3 and/or the nv driver?
Old 11-30-05, 02:39 PM   #5
shutdown
Registered User
 
Join Date: Nov 2005
Posts: 45
Default Re: 8168 - Bugreport

It seems I forgot to mention that the problem did not occur with 2.6.14.X. I have 2.6.14.3 installed and tried it with the nvidia driver: xawtv runs perfectly and there is no problem when I close it. No dmesg errors, no zombie process...
So you nvidia guys may want to have a look at the changes from 2.6.14 to 2.6.15-rcX; maybe the kernel developers changed something, as they have done a few times before. It would be very sad if your rel80 series drivers did not work with 2.6.15 kernels.

I'm glad you're interested in the problem!

Peter
Old 11-30-05, 02:44 PM   #6
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: 8168 - Bugreport

At this point, it seems just as likely (if not more so) that this is a kernel bug, and not an nvidia driver bug.

Does this reproduce with the 'nv' X driver and the 2.6.15-rcX kernel?

Thanks,
Lonni
Old 11-30-05, 02:57 PM   #7
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740
Default Re: 8168 - Bugreport

@shutdown: PageReserved went away in Linux 2.6.15-rc1, it's possible that what you're seeing is fallout related to this. The NVIDIA Linux graphics driver you're using should be prepared for this change, but it's possible that the final changes that went into Linux 2.6.15-rc* differ from those posted earlier. It's also possible that the problem is caused by e.g. v4l. Could you repeat the experiment with the nv driver?
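For readers unfamiliar with the check that fires in these logs, here is a toy Python model. It rests on an assumption the thread does not confirm: that the BUG at mm/rmap.c:486 is the map-count underflow check in `page_remove_rmap`, which is where the EIP line points. The `Page` class and `KernelBug` exception are hypothetical stand-ins invented for this sketch, not kernel code.

```python
class KernelBug(Exception):
    """Hypothetical stand-in for the kernel's BUG() in this toy model."""

class Page:
    """Toy stand-in for struct page; it only tracks the map count."""
    def __init__(self):
        # The kernel keeps _mapcount biased by -1, so -1 means "not mapped".
        self._mapcount = -1

def page_mapcount(page):
    """Number of page table entries currently mapping the page."""
    return page._mapcount + 1

def page_add_rmap(page):
    """Account for one new mapping of the page."""
    page._mapcount += 1

def page_remove_rmap(page):
    """Account for one removed mapping; complain if none was recorded."""
    page._mapcount -= 1
    if page._mapcount < 0 and page_mapcount(page) < 0:
        raise KernelBug("map count underflow in page_remove_rmap")

# Balanced add/remove: the page ends up unmapped and nothing fires.
tracked = Page()
page_add_rmap(tracked)
page_remove_rmap(tracked)

# A page that was mapped without being accounted for trips the check
# when the mapping is torn down (for example on munmap).
untracked = Page()
try:
    page_remove_rmap(untracked)
except KernelBug as err:
    print("BUG:", err)
```

If the assumption holds, the unmap path in 2.6.15-rc* is removing a mapping for a page that was never accounted for when it was mapped. That would fit the suspicion that the PageReserved removal changed how such pages are treated, but the sketch only illustrates the check and says nothing about which component is at fault.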
Old 11-30-05, 03:00 PM   #8
shutdown
Registered User
 
Join Date: Nov 2005
Posts: 45
Default Re: 8168 - Bugreport

I tried 2.6.15-rc3 with the out-of-the-box "nv" driver of X.org 6.8.2 and the problem was not reproducible. The nvidia module is not loaded, and xawtv does not leave any dmesg errors or zombie processes on exit.
So it looks like a problem with the nvidia driver, though I am pretty sure it is caused by a change in kernel code. The question is whether the kernel developers changed that code knowingly or not... I don't think those guys think too much about what may happen to your driver if they change pieces of code :/

So if you have any idea what may be wrong, tell me. If you create a patch (for the open source part of your driver), I volunteer for testing.
But thank you for having a closer look at this again!

Quote:
Originally Posted by zander
Could you repeat the experiment with the nv driver?
I just did while you wrote your posting - if you need testing with other configurations, simply tell me.

Peter

Old 11-30-05, 03:22 PM   #9
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740
Default Re: 8168 - Bugreport

Does this problem only reproduce with xawtv or also with OpenGL applications?
Old 11-30-05, 04:28 PM   #10
shutdown
Registered User
 
Join Date: Nov 2005
Posts: 45
Default Re: 8168 - Bugreport

The problem also occurs with other OpenGL applications.

I found another interesting thing - while doing some tests with versions, I got this:

Code:
------------[ cut here ]------------
kernel BUG at mm/rmap.c:486!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: nvidia smbfs udf button thermal processor w83627hf i2c_nforce2 i2c_isa atxp1 hwmon_vid cpufreq_nforce2
CPU:    0
EIP:    0060:[<c014feb0>]    Tainted: P      VLI
EFLAGS: 00013286   (2.6.15-rc3-shutdown)
EIP is at page_remove_rmap+0x30/0x40
eax: ffffffff   ebx: dca33ff8   ecx: c1385e80   edx: c1385e80
esi: af3fe000   edi: c1385e80   ebp: df699f2c   esp: df699e88
ds: 007b   es: 007b   ss: 0068
Process X (pid: 3981, threadinfo=df698000 task=dfab4030)
Stack: af3fe000 dca33ff8 c0149136 c1385e80 af3fe000 1c2f4067 1c2f4067 00000000
       ffffffff dfb7f880 af400000 cde6baf4 af400000 df699f2c c01492e5 c057cc70
       de9f9ac4 cde6baf0 af3fe000 af400000 df699f2c 00000000 af40dfff cde6baf0
Call Trace:
 [<c0149136>] zap_pte_range+0x156/0x250
 [<c01492e5>] unmap_page_range+0xb5/0x140
 [<c014945f>] unmap_vmas+0xef/0x1f0
 [<c014d935>] unmap_region+0x95/0x130
 [<c014dcb3>] do_munmap+0x113/0x180
 [<c014dd64>] sys_munmap+0x44/0x70
 [<c0102f45>] syscall_call+0x7/0xb
Code: 24 0c 83 42 08 ff 0f 98 c0 84 c0 74 1a 8b 42 08 40 78 18 c7 44 24 04 ff ff ff ff c7 04 24 10 00 00 00 e8 04 f0 fe ff 83 c4 08 c3 <0f> 0b e6 01 3b 09 45 c0 eb de 8d b6 00 00 00 00 83 ec 2c 89 5c
 <6>note: X[3981] exited with preempt_count 2
scheduling while atomic: X/0x00000002/3981
 [<c0432747>] schedule+0x587/0x660
 [<c0119496>] vprintk+0x186/0x2d0
 [<c043353a>] rwsem_down_read_failed+0x8a/0x180
 [<c011cac2>] .text.lock.exit+0x27/0x85
 [<c011b5e3>] do_exit+0xf3/0x450
 [<c01043b0>] do_invalid_op+0x0/0xb0
 [<c0104145>] die+0x185/0x190
 [<c0104452>] do_invalid_op+0xa2/0xb0
 [<e14b8472>] _nv004911rm+0x372/0x3a4 [nvidia]
 [<c014feb0>] page_remove_rmap+0x30/0x40
 [<e14d98ab>] _nv004517rm+0x23/0x28 [nvidia]
 [<e13b1abc>] rm_enable_interrupts+0x44/0x58 [nvidia]
 [<e13b1aa5>] rm_enable_interrupts+0x2d/0x58 [nvidia]
 [<e139732a>] _nv001462rm+0x7a/0x154 [nvidia]
 [<c01039db>] error_code+0x4f/0x54
 [<c014feb0>] page_remove_rmap+0x30/0x40
 [<c0149136>] zap_pte_range+0x156/0x250
 [<c01492e5>] unmap_page_range+0xb5/0x140
 [<c014945f>] unmap_vmas+0xef/0x1f0
 [<c014d935>] unmap_region+0x95/0x130
 [<c014dcb3>] do_munmap+0x113/0x180
 [<c014dd64>] sys_munmap+0x44/0x70
 [<c0102f45>] syscall_call+0x7/0xb
It's 2.6.15-rc3 again, but this time the nvidia driver version 7676, which seemed to work before, causes the trouble, even without any OpenGL applications running now or earlier in this session. It happened when I tried to kill the X server...
So it seems the problem is a general incompatibility of 2.6.15-rcX with the nvidia driver... I am confused... :/

Peter
Old 11-30-05, 04:34 PM   #11
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740
Default Re: 8168 - Bugreport

I wouldn't expect 1.0-7676 to work with Linux >= 2.6.15-rc1. I'll give the latest -rc* kernel a try here.
Old 11-30-05, 07:54 PM   #12
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: 8168 - Bugreport

I've just tested with 2.6.15-rc3 (x86) and was able to reproduce a crash when shutting down glxgears. I've opened bug 200716 to have this investigated further to determine whether this is a kernel bug, an nvidia driver bug, or something else altogether.

Thanks,
Lonni