reproducible Xserver Freeze with v4191
Hi! I've got a "Nvidia TNT2 M64 rev 21" and installed the latest 4191 drivers with Debian's xserver-xfree86 (v4.1.0-16). Works like a charm. Thanks, guys!
Just one annoying problem: every once in a while the XServer FREEZES completely (mouse still moving!) and its process uses up 100% of the CPU. I have to log in from another machine and kill the process. Afterwards everything is fine except for a messy text console.
But the annoying part is that of course all my programs are terminated so that I have to start all over. So I live in constant fear of a sudden crash. :-(
Yesterday I found a _reproducible_ way to cause the crash. Whenever I surf to http://uschi.spiegl.de/ (sorry, this is with NO intention to promote this site. :-) It contains hardly anything anyway) with mozilla or konqueror -> FREEZE. With Opera however, everything is fine.
Anyone have any idea what's going on?
Or how I can avoid this?
Thanks so much!
Andy.
PS: I am currently using "NvAGP", but had tried AGPGART before.
My output of lspci:
00:00.0 Host bridge: Intel Corp. 82845 845 (Brookdale) Chipset Host Bridge (rev 11)
00:01.0 PCI bridge: Intel Corp. 82845 845 (Brookdale) Chipset AGP Bridge (rev 11)
00:1d.0 USB Controller: Intel Corp.: Unknown device 24c2 (rev 01)
00:1d.1 USB Controller: Intel Corp.: Unknown device 24c4 (rev 01)
00:1d.2 USB Controller: Intel Corp.: Unknown device 24c7 (rev 01)
00:1d.7 USB Controller: Intel Corp.: Unknown device 24cd (rev 01)
00:1e.0 PCI bridge: Intel Corp. 82820 820 (Camino 2) Chipset PCI (rev 81)
00:1f.0 ISA bridge: Intel Corp.: Unknown device 24c0 (rev 01)
00:1f.1 IDE interface: Intel Corp.: Unknown device 24cb (rev 01)
01:00.0 VGA compatible controller: nVidia Corporation Vanta [NV6] (rev 15)
02:03.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
02:04.0 FireWire (IEEE 1394): NEC Corporation: Unknown device 00f2 (rev 01)
02:08.0 Ethernet controller: Intel Corp.: Unknown device 1039 (rev 81)
02:0e.0 Multimedia video controller: Brooktree Corporation Bt878 (rev 02)
02:0e.1 Multimedia controller: Brooktree Corporation Bt878 (rev 02)
LordMorgul
02-06-03, 05:34 AM
The fact that your mouse is still able to move indicates to me that the video hardware is not to blame, especially if you have hardware mouse rendering enabled. Typically when you have video drivers/hardware issues you see odd things on the screen rather than simply 'locking' all drawing excluding the mouse.
This is likely a software problem, potentially a conflict with other running software. Websurfing with mozilla or konqueror causes a crash, but NOT opera? That is interesting. I have no clue what the problem is.. but I'd start with upgrading my software one piece at a time, paying special attention to check the results after any major upgrade.. i.e. XFree86 up to 4.2.0, mozilla up to 1.2.x, etc. Chasing down conflicts such as that with older versions of these apps is going to be painful. :eek: It's either that or backtraces and debuggers...
Remove ALL plugins that your browsers might be fetching! Plugins while neat are evil as well. Put them back when it works.
Since that particular site includes quicktime video I wonder if you are opening/viewing them or simply navigating the opening page? Those evil plugins... :afro:
I have been having a problem like this for quite some time, although I have not found a way to reproduce it reliably. I will try the webpage above later when I get back to the machine.
There does not seem to be any pattern, it could freeze when doing a cut & paste, using the wheelmouse, moving a window, or even sometimes just when the screensaver is running.
Logging in from another machine and running 'top', shows that X is using over 98% CPU, and attempting to kill X results in the machine completely locking up and the ssh connection is broken. Eventually, the machine will lock hard without killing X, as this has happened overnight on a couple of occasions and I find the machine locked solid in the morning.
The mouse still moves, so I can accept that it is not a hardware issue (my card is a GeForce2 MX 400) but the lockups only occur when I have installed the NVDriver module (currently 3123, but it has done this with every version I have tried up to now) and use the 'nvidia' driver. The 'nv' driver with XFree86 does not give any problems.
I am running Gentoo Linux on a 1.8GHz P4 with 512MB DDR RAM, and a VIA P4X266 chipset, but have seen this problem with both Mandrake 8.x and RedHat 7.x (can't remember the exact versions)
Nothing seems to get written to any log files indicating the error, and I am not sure where to start trying to trace the actual cause.
I've been scouring the 'net to see if I can find someone who has figured this out, and I have found a few instances of others describing a similar problem, but no answers... :(
It's got me baffled... :confused:
Whoohoo, I updated Mozilla to v1.2 and now X doesn't hang anymore on websites with very wide pictures. What a relief!
BUT this doesn't mean that there is no problem with the Nvidia Xserver anymore. I am sure there is! Just that the new mozilla doesn't trigger it anymore.
Does anyone know if Nvidia is investigating these kinds of bugs?
If you have similar problems, please post them here with many details so that Nvidia sees that there is need for a fix.
When accessing the site www.sem-muenchen.de (http://www.sem-muenchen.de) with mozilla the system hangs. All details can be found at http://bugzilla.mozilla.org/show_bug.cgi?id=193780
When turning the nvidia driver off, everything works fine.
Works for me, build ID 2003020921 (which was a CVS pull of Mozilla on Feb. 9, 2003, at around 9 pm local time -- I'm in the Eastern timezone).
But then, I'm not using 4191. Is the bug reproducible with 3123 on your system?
No problems with driver 3123.
Thanx.
Andy Mecham
02-22-03, 03:15 PM
Yes, we're interested in bugs like this. Please send machine details and repro steps to linux-bugs@nvidia.com. If you've managed to regress this across Mozilla/NVIDIA driver/X/Window manager versions, that would be really helpful to know.
--andy
Both sites (www.sem-muenchen.de and http://uschi.spiegl.de/) include very large pictures. That should be an important piece of evidence!
I suppose that it has something to do with memory allocation routines.
Andy, I already posted my repro steps in this thread. If you need any other info please don't hesitate to ask.
Andy.
Oh no, and it happened again!!! Reproducible again!
This time I was using qiv (http://www.klografx.net/qiv) to look at a JPG image
(I just put it there for you to try: http://spiegl.de/andy/large_image.jpg but it works with every large picture!)
I switched to full screen mode ("f") and scrolled around a bit with the cursor keys. And soon everything froze. :-(
I had to log in from my laptop and kill the Xserver which was using up 99% of the cpu.
I am still using "NVIDIA XFree86 Driver 1.0-4191", NvAGP, and the latest Xserver from Debian (stable branch) which is 4.1.0-16
What other information do you need?
I've uploaded my XFree86.0.log, too:
http://spiegl.de/andy/XFree86.0.log
Hope you can now track down this nasty little bug because it's really annoying having to kill and restart everything :-((
Thanks,
Andy.
Works for me
with NVIDIA 4349 and Mozilla 1.4a
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4a) Gecko/20030402
meldroc
04-18-03, 07:47 PM
spiegl, I have the exact same problem that you have. It started when I upgraded my NVidia Linux drivers to 4191, and the problem still occurs in 4349.
In a nutshell, when I visit a web page with a very large image, XFree86 locks up, with applications unresponsive, keyboard unresponsive, but mouse cursor still working. I have to find another machine and use it to ssh into mine and kill X, which brings the system out of catatonia, but mangles the text console. If I restart X, the screen comes back fine. I tried the same web pages using the nv driver instead of the cool nvidia driver, and this problem did not occur.
Here's some info:
System: 650MHz Athlon, Asus motherboard, 384MB RAM, Nvidia TNT2 M64 rev 21 video card (identical to spiegl's)
Currently running Debian Unstable (Sid) using its XFree86 (version 4.2.1) and KDE for the most part, though I use Mozilla and have encountered these lockups in Mozilla and Konqueror.
The last time I had this problem was at this Ars OpenForum thread (Linux desktop screenshots.) (http://arstechnica.infopop.net/OpenTopic/page?q=Y&a=tpc&s=50009562&f=96509133&m=7670937654&p=15) This entire thread is full of very large images, which have triggered this problem multiple times for me.
I'm emailing the NVidia bugs people details.
Andy Mecham
04-18-03, 08:18 PM
Thanks for your report! I've reproduced this here.
--andy
I have been having a very similar problem with the binary-only nvidia drivers for as long as I can remember. And it doesn't appear to be a hardware-specific issue, as I've had it happen on Intel chipsets with Intel CPUs as well as Via chipsets with Athlon CPUs.
Normally, starting up aviplay or mplayer (fullscreen or window mode) will immediately kill my X server. It's only when X goes to draw the application window, seemingly with media content, that it actually dies. The behavior at this point goes one of several ways. The least bad scenario is that the X server simply gets killed and I'm back at a login screen once the server starts back up (under both kdm and xdm). If you log back in immediately, you can start the video application with no problems.
More extreme forms of the behavior result in a scrambled text console with random junk ASCII on the screen or a locked up picture of what was last on your X server's screen. Remotely logging in is usually a possibility, but any attempts to kill X fail and a kill -9 almost always results in a hard lockup of the machine.
I've been using agpgart exclusively, so I guess I could try nvidia's AGP driver instead. It might help. This happens on at least ten different machines (I'm running a workstation/NFS server environment at work and almost all my users have had this happen to them). But it also happens on my Athlon at home, so it's not the hardware configuration.
On a side note, I've also noticed something odd when booting from Linux at home back into Windows XP. After warm booting from Linux, XP will start to boot, giving the initial Microsoft screen, and right at the point when the screen blanks as it tries to go to the Welcome screen, the monitor stays dark and has no video signal. I assumed this was just my machine at home (since it is the only one I dual boot), but I'm beginning to think it has something to do with the nvidia kernel driver because I've noticed in Windows a few times where a screen refresh will occur as I go into a Direct X game and I notice my Linux desktop background shows up momentarily before the game clears out video memory and actually starts writing to the screen. Very weird...
Anyway, if someone at nvidia wants me to dump a lot of information about my machine configurations and try some things on my end, I'd love to help. I hate the fact that these drivers are NOT open source so that the community can help you with these problems, but c'est la vie I guess. I still like your products and if you insist on developing your drivers in this manner, I would at least like to see them stabilize a bit more.
Thanks for fixing the problems with running multiple X servers from your previous 4xxx release by the way.
Also, please note that all of these problems have been seen across the entire XFree86 4.x series up to 4.2.1 which I'm running right now. And all of the systems have been Debian testing or unstable.
Andy Mecham
04-18-03, 10:00 PM
Sure - check out the "if you have a problem" thread and start a new one with your info.
--andy
Just for the record, I was desperately trying to find the cause of this problem, and it may very well be the nVidia drivers. This is what people believe over at Mozilla's Bugzilla site.
This happened after installing the 4349 drivers: using Mozilla (v 1.1, 1.3 or 1.4a), it seems that X randomly gets stuck at 100% CPU, generally when I'm typing text in a textbox or in the mail client, or when I'm scrolling a page.
When that happens, I have the same symptoms as above: machine is locked up, only the mouse pointer moves; I can't restart X nor change VT and the only solution is to reboot. I know that X is the problem since I ran "top" in batch mode with its output redirected to a file -- top continues working after X locks up.
OS: Mandrake 9.0 (XFree86 v.4.2.1-3mdk)
Video Hardware: GF 2 MX400 64 MB
Other Hardware: Asus A7V266, Athlon 1800+, 512 MB DDR.
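For anyone who wants to collect the same kind of evidence from an ssh session, here is a minimal sketch of how the runaway process could be picked out of a process listing. It assumes text in the format produced by `ps -eo pid,pcpu,comm`; the function name `find_cpu_hogs`, the threshold and the sample listing are made up for illustration and are not taken from anyone's actual logs. Keep in mind that `ps` reports CPU usage averaged over the lifetime of the process, so `top` in batch mode (as described above) gives a more immediate picture.

```python
def find_cpu_hogs(ps_output, threshold=90.0):
    """Return (pid, cpu_percent, command) tuples at or above threshold.

    Expects text shaped like the output of `ps -eo pid,pcpu,comm`:
    one header line, then one process per line.
    """
    hogs = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)  # pid, %cpu, rest of line = command
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            cpu = float(pcpu)
        except ValueError:
            continue  # not a process line, ignore it
        if cpu >= threshold:
            hogs.append((int(pid), cpu, comm.strip()))
    return hogs


# Hypothetical sample listing, invented for this example.
SAMPLE = """\
  PID %CPU COMMAND
    1  0.0 init
  412 99.1 XFree86
  530  0.3 sshd
"""

if __name__ == "__main__":
    print(find_cpu_hogs(SAMPLE))  # [(412, 99.1, 'XFree86')]
```

Run periodically from cron or a loop with the output appended to a file, something like this would leave a record of which process was spinning even if the machine later locks hard.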
I've been experiencing this problem too (Dell OEM TNT2 M64), always related to displaying a large picture. It is almost 100% reproducible with Mozilla 1.0.0, and I have also encountered the problem when zooming in on PDFs in gv and gs. It might not be the highest priority since I guess it doesn't affect the newer graphics cards, but it would be nice to get this fixed, as it means that I never know if a website or email might freeze my system, and I like having the 3d accel for games.
I say almost 100% reproducible because when I tried to determine how big an image needed to be in order to cause the problem, I opened increasingly large images in succession and it worked until I did something else and came back to it.
One note: the bug happens with images in Mozilla even if the image is scaled down before display on screen. An example of a huge image that is scaled down and causes the problem is at
http://www.ieee-infocom.org/2003
Please let me know whether you folks are planning on solving this problem, or if there are any workarounds. (I've tried various options in XF86Config but they didn't seem to make any difference).
At least this problem should be mentioned in the errata; maybe then it will get fixed eventually.
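To narrow down how large an image has to be before the freeze is triggered, a series of test images of increasing size could be generated instead of hunting for suitable pictures on the web. The following is only a sketch under stated assumptions: the helper names `write_ppm` and `make_series`, the starting size and the doubling step are all hypothetical. PPM is used only because it can be written without any imaging library; whether a given browser or viewer displays PPM is not guaranteed, so converting the files to JPEG or PNG may be necessary.

```python
import os
import tempfile


def write_ppm(path, width, height):
    """Write a uniform grey binary PPM (P6) image; return its size in bytes."""
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    row = bytes([128, 128, 128]) * width  # one row of grey RGB pixels
    with open(path, "wb") as f:
        f.write(header)
        for _ in range(height):
            f.write(row)
    return len(header) + 3 * width * height


def make_series(directory, start=256, steps=4):
    """Create square test images whose side length doubles at every step."""
    paths = []
    side = start
    for _ in range(steps):
        path = os.path.join(directory, f"test_{side}x{side}.ppm")
        write_ppm(path, side, side)
        paths.append(path)
        side *= 2
    return paths


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp(prefix="bigimg_")
    for p in make_series(out_dir):
        print(p, os.path.getsize(p), "bytes")
```

Opening the files one at a time, smallest first, and noting the last size that still works would give a rough threshold to include in a bug report. As noted above, the result may depend on what else was done in the session beforehand.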
Nuitari
06-16-03, 07:26 PM
I use Slackware 9.0 and 4363 drivers (installed from the website by hand) with a Ti4200
I had this bug mainly happen with Netscape 7.02
It happened a few times over the last few months since I bought the card, however it happens much more often since I purchased a bigger screen for twinview. I run both screens at 1280x1024
I have slackware's Xfree 4.3 installed.
I am currently downloading the latest develsnapshot of Xfree and will report back soon.
Nuitari
06-21-03, 12:31 PM
I returned the card 3 days ago and the problem stopped happening.
Hi Nuitari,
this seems to be the ONLY true solution to this problem as Nvidia doesn't seem to be interested in our bug reports and complaints. :-(
Sorry guys, less income for you.
Nuitari, what card are you using now?
I tried a Radeon 9500 from ATI, but can't even get the ATI-drivers to work at all.
Bye, Andy.
Nuitari
06-22-03, 12:56 PM
Ti 4200
I exchanged the card, sorry I didn't mean to say I got a different model.
Probably I had a defective card or something.
The ATI drivers do work, but you really have to know how XF86Config works
Oh, misunderstanding then.
OFF TOPIC:
What do you mean exactly when you say I'd have to know how XF86Config works? I've been using Linux and XFree86 for 10 years now and *think* I understand it pretty well. But the ATI driver drives me nuts. If you have any tips for me, maybe you can write me a private message at nvidia.andy(at)spiegl.de ?
Thanks,
Andy.
Nuitari
06-23-03, 01:57 PM
Basically special stuff, esp. about dual head, is all configured using a bit string.
The biggest problem with the newer ATI drivers and dual head is getting both heads at a decent resolution and making it reproducible.
The best tip I can offer if you really want to go with ATI is to buy a card that is supported by XFree's radeon driver (I think it's any card < 9000).
I'm sticking with Nvidia.
jonjonsson
06-28-03, 07:01 AM
Hi
my system sometimes hangs too. My specs:
AMD Athlon 600 Slot A on an Asus K7V (Via KT133), Geforce 2 MX 400, Red Hat Linux 9 with the latest updates, Ximian Desktop 2 with the latest updates, Nvidia driver V4363. How to reproduce this:
Start OpenOffice Writer => system hangs, but the mouse is still movable. I noticed the same bug some weeks ago on Gentoo Linux (same hardware), when using KDE. As far as I know, this only happened when antialiasing was turned on, so after turning off antialiasing in the KDE control center my system never hung. But without antialiasing the fonts look so awful that I can't work with my machine. I will also send this bug report to nvidia's email. If someone has a hint how to stop this bug from happening, please contact me.
Thank you
Jan Kreuzer
richie123
07-23-04, 09:19 PM
I have been experiencing the same type of lockup with my geforce 2 mx.
If you want a guaranteed reproducible lockup, run the "deluxe" screensaver from the xscreensaver package.
My system is an AMD duron 700, via 686a chipset geforce 2 mx pci, sblive with alsa, fedora core 2, custom kernel 2.6.7 - no acpi, no smp, i686 compiled. also happens with default kernels
If this is a software problem, it's one that's been around a long time, as I have seen it since I bought the card in 2002, on mandrake 8.2. This has to be an nvidia problem since the "nv" driver has always been rock solid (but no 3d).
Nvidia needs to seriously work on fixing their drivers and stop blaming everyone else for their poor QA
My ancient TNT 1 is the only nvidia based card that I have ever seen that runs stable with the nvidia linux drivers. I am seriously considering dumping nvidia for ATI on my next vid board purchase