09-16-02, 08:32 PM   #1
idangazit
APM (nonsuspend) hard-lockups - I READ THE FAQ

Hey all,

This isn't the first time this has come up, but the old forums were wiped and we have a new driver, so I'm calling this to the attention of those nvidia developers trolling around.

I (and several others that I've spoken to in the past) have experienced random hard-freezes with APM *enabled* in the kernel but without doing any sort of apm action (e.g. suspend, etc).

This has been going on for as long as I've had my geforce2go (dell inspiron 4100), with all of the driver versions and all of the kernel versions I've used. It stops completely when passing apm=off to the kernel at boot time. With apm enabled, I get annoying random hard (read: must hold power switch) lockups. I haven't tried compiling magic SysRq key support into the kernel; perhaps that would somehow manage to kill X or restart the box.

AGP is not the culprit. AGP enabled, disabled, no difference. Internal or AGPGart no difference. Nothing.

So, basically, the above has happened with the following versions:

XFree86 4.2.0 (binary release from xfree86.org using xinstall.sh)
Kernel 2.4.17 -> 2.4.19
Nvidia drivers 2313 and up

I had a protracted email conversation with Andy Ritger (nvidia employee) about this some time ago and even pointed out a recurring, reproducible bug. It was under gnome 1.4: when running the xscreensaver capplet and browsing the previews of different savers, several of them caused the hard-lockup phenomenon. Since then I've compiled gnome2 and am running it, and xscreensaver-demo doesn't freeze anymore. I still get hard locks at inopportune moments, though, mostly when working with mozilla, although it happens generally when doing some significant amount of scrolling in any app. This includes (but isn't limited to) scrolling in pull-down menus that are too large for the screen.

A recent side-effect I've been noticing (which may or may not be connected to the driver) is that apm (when enabled) occasionally fails to generate battery status, listing 0x0's in the entries of /proc/apm. This comes and goes, too -- OK one minute, 0x0 the next. When status does come back, it's in the right place, i.e. the battery isn't thinking it's empty or something. Seeing as I have no trouble with battery reporting under windows, it is confined to the linux side. Said battery reporting issues do not happen when not in X, and I'm not relying on an intermediary to check this; I've been looking at /proc/apm directly. My guess is that it has something to do with the nvidia driver.
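
For anyone who wants to catch the 0x0 readings as they happen, here is a minimal sketch in Python of a /proc/apm watcher. It assumes the nine-field, whitespace-separated layout that the 2.4 apm driver prints (driver version, BIOS version, BIOS flags, AC line status, battery status, battery flag, percentage, time left, time units). The field names, the function names, the sample lines in the usage note and the definition of a "blank" reading are all made up for illustration, not taken from any tool.

```python
import time

# Names for the nine whitespace-separated values in /proc/apm.
# The layout is assumed from the 2.4 apm driver; the names are illustrative.
FIELDS = ("driver_version", "bios_version", "bios_flags", "ac_line_status",
          "battery_status", "battery_flag", "percentage", "time_left",
          "time_units")

def parse_apm(line):
    """Split one /proc/apm line into a dict keyed by FIELDS."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError("unexpected /proc/apm format: %r" % line)
    info = dict(zip(FIELDS, parts))
    # The flag and status fields are hex bytes such as "0x03".
    for key in ("bios_flags", "ac_line_status", "battery_status", "battery_flag"):
        info[key] = int(info[key], 16)
    # Percentage is printed with a trailing "%", e.g. "98%" or "-1%".
    info["percentage"] = int(info["percentage"].rstrip("%"))
    info["time_left"] = int(info["time_left"])
    return info

def looks_blank(info):
    """Assumed heuristic: AC status, battery status and battery flag all zero."""
    return (info["ac_line_status"] == 0
            and info["battery_status"] == 0
            and info["battery_flag"] == 0)

def watch(path="/proc/apm", interval=5.0):
    """Poll the file forever and print a timestamp for each blank reading."""
    while True:
        with open(path) as f:
            info = parse_apm(f.readline())
        if looks_blank(info):
            print(time.strftime("%H:%M:%S"), "blank reading:", info)
        time.sleep(interval)
```

Calling parse_apm("1.16 1.2 0x03 0x01 0x03 0x09 98% -1 ?") on a made-up healthy line gives a dict that looks_blank() rejects, while a line whose three status fields are all 0x00 is flagged. Running watch() once in X and once from a console would show whether the blanks really only turn up under X.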

The reason this bug is relevant and important for laptop users is that without APM there's no way to read battery status at all, and none of the basic power-conservation features inherent in APM work, e.g. cpu clock speed management for speedstep pIII's or amd what-have-you. There's no way to know how much battery life is remaining short of rebooting into The Other OS. This is a serious usability problem, even without considering that battery life under non-APM linux is significantly shorter because the good powersaving features are disabled.

I appreciate that there are all manner of goals for the development of the driver, and that it isn't going to be open-sourced anytime soon. So basically I'm asking to kick this bug up a notch in terms of visibility and priority -- I'm not asking for suspend-resume support. All I want is working apm compatibility.

If there's some tool you'd like me to run, I'd be happy to. I don't quite know how to go about triaging hard-freezes; I don't think it even has time to dump core when it dies.
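
One low-tech option for narrowing down when a hard freeze hits is a heartbeat log that is forced to disk on every write, so the last line that survives the power-off gives the last moment the box was known to be alive. A rough sketch in Python follows. The function names, the log file name and the one-second interval are arbitrary choices for illustration, and this only timestamps the freeze; it does not explain it.

```python
import os
import time

def write_heartbeat(path, note=""):
    """Append one timestamped line and force it to disk before returning."""
    line = "%s %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), note)
    with open(path, "a") as f:
        f.write(line)
        f.flush()             # push Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to push it to the disk
    return line

def run(path="heartbeat.log", interval=1.0):
    """Write a heartbeat line every `interval` seconds until killed or frozen."""
    while True:
        write_heartbeat(path, "alive")
        time.sleep(interval)
```

After a lockup and reboot, the last line of heartbeat.log can be compared against whatever was going on at the time (scrolling in mozilla, a screensaver preview, and so on).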

Thanks,

Idan