nV News Forums

 
 


TGL 08-27-02 05:43 AM

Crash when logout from X
 
Hi,

My Gentoo box often, but not always, crashes when I log out of XFree86 (ugly green and purple scrambled screen and no more keyboard response). It happens both with and without gdm (or xdm).

Here is my hardware config:
- GeForce2MX
- Asus A7Pro (VIA KT133)
- AMD Duron 700

And software is:
- Gentoo linux 1.2
- kernel 2.4.19 (with agpgart, vesa fb, but no nvidia/riva fb)
- XFree86 4.2.0
- NVdriver 2960

I know there are some well-known troubles with AMD/VIA hardware, but I can't solve the problem with the usual tricks (mem=nopentium, disabling AGP support, removing AGP fast-write...)

What is strange is that the same machine works like a charm under Slackware 8.0 with XFree86 4.1, without any tricks.

Any ideas? Thanks,

Neutro 09-05-02 10:05 AM

"Me too"
 
Hey, I have a bug that seems very similar. I really don't know if the nvidia drivers cause it -- do you know for sure? Here's the description of the bug.

- It doesn't happen under Windows.
- I can make X crash and restart by following these steps:
1. Open a movie clip in Xine
2. Logout
3. Login
4. Open a movie clip in Xine
- After such a crash of the X server, Quake 3 often crashes in the way you described (this can also happen in XMMS or Xine, so at first I thought it was sound-related). My mouse sometimes still works and the sound buffer often repeats in a loop, but the video output is messed up (purple and green garbage), and I must reboot and wait for fsck to do its job. Sometimes Alt-SysRq-S/U/B doesn't even work and I have to hit the reset button.

Hardware:
- Asus A7V266
- Athlon XP 1800+
- GeForce 2 MX 400 64 MB

Software:
- Mandrake 8.1
- kernel 2.4.8
- XFree86 4.1.0
- KDE 2.2.1 + 3.0.3 (crashes in both)
- NVdriver 2960

Zymurgeek 09-05-02 10:30 AM

Me too, intermittently
 
Here's some relevant stuff from /var/log/messages:

Sep 4 21:44:11 localhost kde(pam_unix)[8901]: session closed for user dave
Sep 4 21:44:11 localhost kernel: NVRM: AGPGART: freed 16 pages
Sep 4 21:44:12 localhost kernel: NVRM: AGPGART: backend released
Sep 4 21:44:12 localhost kernel: NVRM: AGPGART: unknown chipset
Sep 4 21:44:12 localhost kernel: NVRM: AGPGART: aperture: 128M @ 0xf0000000
Sep 4 21:44:12 localhost kernel: NVRM: AGPGART: aperture mapped from 0xf0000000 to 0xd1afb000
Sep 4 21:44:12 localhost kernel: NVRM: AGPGART: mode 4x
Sep 4 21:44:12 localhost kernel: NVRM: AGPGART: allocated 16 pages

Sep 5 04:03:51 localhost kernel: Page has mapping still set. This is a serious situation. However if you
Sep 5 04:03:51 localhost kernel: are using the NVidia binary only module please report this bug to
Sep 5 04:03:51 localhost kernel: NVidia and not to Red Hat Bugzilla or the linux kernel mailinglist.
Sep 5 04:03:51 localhost kernel: invalid operand: 0000
Sep 5 04:03:51 localhost kernel: CPU: 0
Sep 5 04:03:51 localhost kernel: EIP: 0010:[<c012d5ca>] Tainted: P
Sep 5 04:03:51 localhost kernel: EFLAGS: 00010282
Sep 5 04:03:51 localhost kernel:
Sep 5 04:03:51 localhost kernel: EIP is at (2.4.18-10custom)
Sep 5 04:03:51 localhost kernel: eax: 00000047 ebx: c11b5a94 ecx: ced68000 edx: ced69f64
Sep 5 04:03:51 localhost kernel: esi: 00000000 edi: 00000000 ebp: c023c344 esp: c136ff5c
Sep 5 04:03:51 localhost kernel: ds: 0018 es: 0018 ss: 0018
Sep 5 04:03:51 localhost kernel: Process kswapd (pid: 4, stackpage=c136f000)
Sep 5 04:03:51 localhost kernel: Stack: c020b180 c020b120 c020b0c0 ca17d440 c11b5a94 c0137ca2 cff14600 c12e57f8
Sep 5 04:03:51 localhost kernel: 00000030 c01360e9 c11b5a94 c11b5ab0 00000010 c023c344 c012b2b3 c11b5a94
Sep 5 04:03:51 localhost kernel: 00000030 c11b5a94 c11b5ab0 00000010 c012c666 c023c36c 00000000 000023fe
Sep 5 04:03:51 localhost kernel: Call Trace: [<c0137ca2>]
Sep 5 04:03:51 localhost kernel: [<c01360e9>]
Sep 5 04:03:51 localhost kernel: [<c012b2b3>]
Sep 5 04:03:51 localhost kernel: [<c012c666>]
Sep 5 04:03:51 localhost kernel: [<c012cfe0>]
Sep 5 04:03:51 localhost kernel: [<c0105000>]
Sep 5 04:03:51 localhost kernel: [<c0107006>]
Sep 5 04:03:51 localhost kernel: [<c012cd60>]
Sep 5 04:03:51 localhost kernel:
Sep 5 04:03:51 localhost kernel:
Sep 5 04:03:51 localhost kernel: Code: 0f 0b 8b 53 08 83 c4 0c 8b 0d 50 09 2a c0 89 d8 29 c8 69 c0
Sep 5 07:41:56 localhost syslogd 1.4.1: restart.
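
For anyone who wants to pull the same kind of lines out of their own log, here is a rough Python sketch. It keeps only kernel lines containing one of a few markers seen in the excerpt above. The marker list and the function names are my own assumptions for illustration, not anything the driver or syslog defines, and reading /var/log/messages usually needs root.

```python
import re

# Substrings that flag the kernel lines of interest in the excerpt above.
# This list is an illustrative assumption, not an exhaustive set.
MARKERS = ("NVRM:", "invalid operand", "Tainted:", "Page has mapping still set")

# Matches "Mon D HH:MM:SS host kernel: message" as written by syslogd.
KERNEL_LINE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s*(.*)$")


def interesting_kernel_lines(lines):
    """Return (timestamp, message) pairs for kernel lines matching a marker."""
    hits = []
    for line in lines:
        match = KERNEL_LINE.match(line.rstrip("\n"))
        if match and any(marker in match.group(2) for marker in MARKERS):
            hits.append((match.group(1), match.group(2)))
    return hits


def scan_log(path="/var/log/messages"):
    """Scan a syslog file; the default path is the one quoted in this post."""
    with open(path, errors="replace") as log:
        return interesting_kernel_lines(log)
```

Calling scan_log() and printing the pairs gives a compact timeline of the AGP setup messages and the oops, which is easier to paste into a bug report than the whole file.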

Klaus-P 09-05-02 10:48 AM

Here is a very nebulous idea: does this effect also appear when you
disable all power-saving and suspend modes, both in your BIOS and
in Linux? I've seen this effect after pushing the suspend/sleep mode
button, but not always. I haven't configured it yet.

Thunderbird 09-05-02 11:10 AM

That crash really isn't normal. It is not just an ordinary application crashing but the Linux kernel itself, which is far more serious. Are you sure your kernel and the NVIDIA kernel module were compiled with the same gcc version (2.9x and not 3.2)?
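
One way to check the kernel side of that question: the running kernel reports the compiler it was built with in /proc/version. Below is a minimal Python sketch that extracts the gcc version from that banner and compares its major.minor against a version you supply for the module build. The function names are hypothetical, and the comparison rule (major.minor must match) is an assumption on my part, not an official requirement.

```python
import re


def gcc_version(banner):
    """Return the version that follows 'gcc version' in a banner, or None."""
    match = re.search(r"gcc version (\d+(?:\.\d+)*)", banner)
    return match.group(1) if match else None


def major_minor(version):
    """Reduce a dotted version such as '2.95.3' to a (major, minor) tuple."""
    return tuple(int(part) for part in version.split(".")[:2])


def compilers_match(proc_version_text, module_gcc_version):
    """True if the kernel's gcc and the module's gcc share major.minor."""
    kernel_gcc = gcc_version(proc_version_text)
    if kernel_gcc is None:
        return False
    return major_minor(kernel_gcc) == major_minor(module_gcc_version)


def kernel_banner(path="/proc/version"):
    """Read the running kernel's banner line."""
    with open(path) as handle:
        return handle.read()
```

For example, compilers_match(kernel_banner(), "3.2") would come back False on a kernel built with gcc 2.96, which is the mismatch being asked about here.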

Zymurgeek 09-05-02 11:17 AM

Re: kernel compiler
 
Re: TBird,

I downloaded the source rpms and used the --rebuild flag. This was after the most recent recompile of my kernel. (I have RedHat 7.3 with gcc 2.96.)

TheOneKEA 09-05-02 05:32 PM

Right kernel source?
 
Since you're using the RPMs, you may be missing an error which states that you are compiling the module against a kernel tree which is different from the kernel that you are running at the time. Try using the tarballs and see if the error pops up.
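
If you want to check this by hand before rebuilding, here is a small Python sketch. It assumes a 2.4-series source tree, where include/linux/version.h carries a UTS_RELEASE define, and compares that string with the release of the running kernel (what `uname -r` prints). The path /usr/src/linux and the function names are assumptions for illustration; your tree may live elsewhere.

```python
import platform
import re


def uts_release(version_h_text):
    """Return the UTS_RELEASE string defined in a version.h, or None."""
    match = re.search(r'#define\s+UTS_RELEASE\s+"([^"]+)"', version_h_text)
    return match.group(1) if match else None


def tree_matches_running(version_h_text, running=None):
    """True if the source tree's release equals the running kernel's release."""
    if running is None:
        running = platform.release()  # same value as `uname -r`
    return uts_release(version_h_text) == running


def check_tree(tree="/usr/src/linux"):
    """Check an on-disk kernel tree; the default location is an assumption."""
    with open(tree + "/include/linux/version.h") as handle:
        return tree_matches_running(handle.read())
```

If check_tree() returns False, the module was most likely built against headers for a different kernel than the one that is booted.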

Neutro 09-05-02 08:56 PM

For my part, I have MDK 8.1's original kernel and used nVidia's MDK8.1 RPMs directly. No power saving except for the screen.

As for /var/log/messages, on one occasion all the information before the crash is garbled (it appears as a newline in some text editors and as a series of ^@^@^@ in some pagers). On another occasion, there is a bunch of errors between the X server crash and the final crash. They would be too long to list here, but here are the interesting lines (the first line is the X server crash):

Sep 4 22:21:47 localhost kdm[2311]: Server for display :0 terminated unexpectedly
Sep 4 22:21:47 localhost kde(pam_unix)[5049]: session closed for user fbouffar
Sep 4 22:21:58 localhost kde(pam_unix)[5375]: session opened for user fbouffar by (uid=0)
Sep 4 22:23:19 localhost su(pam_unix)[5682]: session opened for user root by fbouffar(uid=501)
Sep 4 22:29:45 localhost kernel: EXT2-fs error (device ide0(3,3)): ext2_check_page: bad entry in directory #1126801: rec_len is smaller than minimal - offset=0, inode=0, rec_len=0, name_len=0
Sep 4 22:29:50 localhost kernel: EXT2-fs error (device ide0(3,3)): ext2_check_page: bad entry in directory #1534852: unaligned directory entry - offset=432, inode=0, rec_len=65530, name_len=4

(...)

Sep 4 22:32:22 localhost kernel: attempt to access beyond end of device
Sep 4 22:32:22 localhost kernel: 03:03: rw=0, want=67108864 x(=0x), limit=67108864

(...)

Sep 4 22:39:41 localhost kernel: auditIN=ppp0 OUT= MAC= SRC=203.122.14.117 DST=216.239.70.22 LEN=60 TOS=0x00 PREC=0x00 TTL=47 ID=46947 DF PROTO=TCP SPT=44719 DPT=21 WINDOW=5840 RES=0x00 SYN URGP=0
Sep 4 22:39:44 localhost kernel: auditIN=ppp0 OUT= MAC= SRC=203.122.14.117 DST=216.239.70.22 LEN=60 TOS=0x00 PREC=0x00 TTL=47 ID=46948 DF PROTO=TCP SPT=44719 DPT=21 WINDOW=5840 RES=0x00 SYN URGP=0
Sep 4 22:39:54 localhost kdm[2311]: Server for display :0 terminated unexpectedly
Sep 4 22:39:54 localhost kde(pam_unix)[5942]: session closed for user fbouffar
Sep 4 22:40:09 localhost kde(pam_unix)[6262]: session opened for user fbouffar by (uid=0)
Sep 4 22:40:53 localhost kernel: SysRq : Emergency Sync
Sep 4 22:40:53 localhost kernel: Syncing device 03:03 ... OK
Sep 4 22:40:53 localhost kernel: Done.
Sep 4 22:48:55 localhost syslogd 1.4-0: restart.

However I'm no expert and can't tell much from this.
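
In case it helps someone read a log like this, here is a rough Python sketch that walks the lines in order, counts X server crashes, and collects the directory numbers from any EXT2-fs "bad entry" errors logged after the first crash. The function name and the idea of only counting errors after a crash are my own assumptions. It shows ordering only and proves nothing about the cause.

```python
import re

# Directory inode number as printed by ext2_check_page in the lines above.
BAD_ENTRY = re.compile(r"EXT2-fs error .*bad entry in directory #(\d+)")


def crashes_and_bad_dirs(lines):
    """Return (x_crash_count, [directory numbers with errors after a crash])."""
    crashes = 0
    bad_dirs = []
    for line in lines:
        if "terminated unexpectedly" in line:
            crashes += 1
            continue
        match = BAD_ENTRY.search(line)
        if match and crashes > 0:
            bad_dirs.append(match.group(1))
    return crashes, bad_dirs
```

Fed the excerpt above, it reports two X server crashes and the two damaged directories logged between them.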

TheOneKEA 09-06-02 07:20 AM

AHA!
 
Aha. Corrupted filesystem. Run e2fsck on the fs that has your X installation and see what happens.

Neutro 09-06-02 11:39 AM

He he.

It does that every time, and I run fsck (e2fsck) every time the crash occurs. It takes about 10 minutes to fix all the errors.

As I said in my first post, this kind of crash occurs after an X server crash has already happened, so to solve the problem I guess I have to find out why the X server crashes in the first place.

I just hope it's not the hard drive. But since it's reproducible, i.e. not random, I guess it isn't. Somebody suggested I reinstall X...

Neutro 09-13-02 09:22 PM

Problem solved?
 
Just for the record, for now I'm unable to reproduce the crash with the 3123 drivers installed...

EDIT: Huh, never mind. The behavior is slightly different, but I'm still experiencing the same problem. The difference is that the X server doesn't crash first.


All times are GMT -5.
