
View Full Version : Memory leaks - unstable, crashes etc. pp.


Thilo
12-01-02, 07:23 PM
This is really a shame: the DRI project was a lot more stable. :( But right now it's also pretty inactive. That leaves us with nVidia's closed-source but buggy drivers.

System specs:
I have a GeForce4 Ti4600 with a Via Apollo AGP chipset and an AMD Athlon 1.2 GHz. While I know this combination can be buggy, I have eliminated many sources of problems: kernel 2.4.19, etc.
I tried nvidia's internal AGP support as well as the agpgart included in the kernel; nvidia's built-in AGP support works better, though.
It behaves this way on both a Debian and a Gentoo system, both tested under kernel 2.4.19.

1. Unstable.
I tried AGP 2x as well as AGP 4x. It makes little difference: it crashes regardless of the AGP drive strength.

2.
After running Unreal Tournament 2003 for a while, it gets more and more unplayable with every level being loaded. If I check the available memory with free, there is no memory left anymore. And that is even taking the CACHE into account! The memory is actually used and there is almost nothing cached. Even if I kill X, unload the nvdriver kernel module, and kill almost all of my processes, I still have all of my 256 megabytes of RAM used up. AND NO! Don't tell me it's all cached, because it is not! I am looking at this line of the output that "free" generates:

-/+ buffers/cache:

I guess that the nvdriver kernel module is leaking A LOT of memory, and because it is part of the kernel this memory does not get freed.

Weird... my Voodoo3 ran just perfectly. Maybe I should have gotten a Radeon instead, but on the other hand the always-incomplete status of that driver is also a turn-off.

Oh people .. :/

bwkaz
12-02-02, 06:32 AM
Well, you seem to know that Linux caches filesystem data. That's what the "cached" stuff is.

How about, instead of reporting what free tells you, reporting what /proc/meminfo tells you (esp. whether the memory is active or inactive)? cat /proc/meminfo to find out.
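For anyone following along, here is a minimal Python sketch of pulling those numbers out of /proc/meminfo-style text. The sample text is an assumption, not a real dump: it only reuses the total/free/buffers/cached figures from the free output Thilo posts later in this thread. `parse_meminfo` and `used_excluding_cache_kb` are hypothetical helper names, and the Active/Inactive fields are looked up only if the kernel reports them.

```python
# Illustrative sample only: the four figures come from the `free` output
# posted later in this thread, laid out the way /proc/meminfo prints them.
SAMPLE_MEMINFO = """\
MemTotal:       256404 kB
MemFree:          4448 kB
Buffers:          7524 kB
Cached:         126840 kB
"""


def parse_meminfo(text):
    """Return {field: kB} for every 'Name:  12345 kB' line in the text."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only lines of the form "<number> kB"; skip everything else.
        if sep and len(fields) == 2 and fields[0].isdigit() and fields[1] == "kB":
            values[key.strip()] = int(fields[0])
    return values


def used_excluding_cache_kb(info):
    """Memory that is neither free nor buffers nor page cache, in kB."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]


info = parse_meminfo(SAMPLE_MEMINFO)
print("used, excluding buffers/cache:", used_excluding_cache_kb(info), "kB")
for key in ("Active", "Inactive"):
    print(key, "=", info.get(key, "not present in this sample"))
```

On a live system the text would presumably come from reading /proc/meminfo instead of the sample string.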

Then do a ps aux and look at the VSZ (virtual size) column to see which process is using this memory. If nothing is, then your claim that the kernel is might be valid, but I'd want to see the output of all this stuff first.
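A rough Python sketch of that check, assuming the usual `ps aux` column layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND). The sample rows are made up purely for illustration and are not from anyone's machine in this thread; `parse_ps_aux` and `top_by` are hypothetical helper names.

```python
# Hypothetical sample rows, only here to show the column layout.
SAMPLE_PS = """\
USER       PID %CPU %MEM   VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  1368   476 ?        S    09:00   0:04 init
root       412  2.1 10.5 61240 27012 ?        S    09:01   1:12 /usr/X11R6/bin/X :0
thilo      655  0.3  1.2  4120  3100 pts/0    S    09:05   0:00 bash
"""


def parse_ps_aux(text):
    """Parse `ps aux` text into a list of dicts with pid, vsz, rss (kB), command."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    pid_i = header.index("PID")
    vsz_i = header.index("VSZ")
    rss_i = header.index("RSS")
    cmd_i = header.index("COMMAND")
    rows = []
    for ln in lines[1:]:
        # Split at most cmd_i times so the command keeps its embedded spaces.
        parts = ln.split(None, cmd_i)
        if len(parts) <= cmd_i:
            continue
        rows.append({
            "pid": int(parts[pid_i]),
            "vsz": int(parts[vsz_i]),
            "rss": int(parts[rss_i]),
            "command": parts[cmd_i],
        })
    return rows


def top_by(rows, column, n=5):
    """Return the n rows with the largest value in the given column."""
    return sorted(rows, key=lambda r: r[column], reverse=True)[:n]


rows = parse_ps_aux(SAMPLE_PS)
for row in top_by(rows, "vsz", n=3):
    print(row["pid"], row["vsz"], row["rss"], row["command"])
print("sum of RSS (kB):", sum(r["rss"] for r in rows))
```

Keep in mind that summing RSS counts shared pages more than once, so the total is only a rough upper bound on what user processes hold. If it is far below the "used" figure minus buffers/cache, the remainder is presumably held by the kernel.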

If you see it when playing UT2k3 for an extended period of time, could it be possible that UT2k3 has a memory leak?

Thilo
12-02-02, 09:32 AM
Originally posted by bwkaz
Well, you seem to know that Linux caches filesystem data. That's what the "cached" stuff is.

indeed

How about, instead of reporting what free tells you, reporting what /proc/meminfo tells you (esp. whether the memory is active or inactive)? cat /proc/meminfo to find out.

No. The -/+ buffers/cache line means that this is the amount of used and free memory without counting the cache/buffers. Example output:

thilo@Thilo thilo $ free
             total       used       free     shared    buffers     cached
Mem:        256404     251956       4448          0       7524     126840
-/+ buffers/cache:     117592     138812
Swap:       530104     124352     405752
thilo@Thilo thilo $

thilo@Thilo thilo $ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
251956 - 126840 - 7524
117592
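The same arithmetic as a small Python sketch, in case someone wants to script it. It assumes the older `free` layout shown in this thread (total, used, free, shared, buffers, cached on the Mem: line); newer versions of free print different columns, so this would need adjusting there. `parse_free_mem` is a hypothetical helper name.

```python
# The Mem: line from the output quoted above, in kB.
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:        256404     251956       4448          0       7524     126840
-/+ buffers/cache:     117592     138812
Swap:       530104     124352     405752
"""


def parse_free_mem(text):
    """Extract the six numeric columns of the 'Mem:' line (old free layout)."""
    names = ("total", "used", "free", "shared", "buffers", "cached")
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            return dict(zip(names, map(int, fields[1:7])))
    raise ValueError("no 'Mem:' line found")


mem = parse_free_mem(FREE_OUTPUT)
# 'used' minus buffers and cache: what processes and the kernel really hold.
really_used = mem["used"] - mem["buffers"] - mem["cached"]
# 'free' plus buffers and cache: what could be handed out on demand.
really_free = mem["free"] + mem["buffers"] + mem["cached"]
print(really_used, really_free)  # 117592 138812
```

Both results match the -/+ buffers/cache line that free prints itself.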


Then do a ps aux and look at the VSZ (virtual size) column to see which process is using this memory. If nothing is, then your claim that the kernel is might be valid, but I'd want to see the output of all this stuff first.

Forgot to tell you about this, but this is what I have seen. The memory is not used by any process.


If you see it when playing UT2k3 for an extended period of time, could it be possible that UT2k3 has a memory leak?


No! By default, all memory a process has allocated gets freed as soon as the process is killed! After quitting Unreal Tournament I should have plenty of memory again, but that's just not the case!

Chalnoth
12-02-02, 11:39 AM
Looks like about half of the memory in use there is cached. I thought I was seeing the same problem, and it turned out that updating the kernel and enabling DMA on my hard drive fixed the problem.

Thilo
12-03-02, 09:55 AM
No. That output was not taken when all of the memory was locked up; it was only meant as an example.

With nvidia's buggy driver my system's memory is usually getting locked until it is unusable.

david_reno
12-03-02, 03:26 PM
I posted something similar in the thread for "official" feedback on the latest nvidia driver. I'm new to these issues, so let me verify something that I think I'm seeing.

Should I immediately upgrade from stock redhat kernel 2.4.18-18.7.x to standard linux kernel 2.4.19 or later?

I'm trying to attack this problem on many fronts, including flashing the GA-7VRXP mainboard's BIOS to the latest version. The mainboard uses the Via Apollo Pro KT266 chipset. Is this problematic?

Thanks for any help,
David

Chalnoth
12-03-02, 04:18 PM
Well, I just updated my kernel using Redhat's up2date software (I think that's what it's called). I haven't touched one of the "stock" linux kernels to date.

Thilo
12-04-02, 06:55 AM
Actually I have compiled my own kernels since the 2.2.10 series... trust me when I tell you I know what I'm talking about. I like it that nVidia at least makes drivers for Linux, but if they decide to make them closed source, PLEASE MAKE THEM STABLE.

bwkaz
12-04-02, 11:52 AM
It is stable for some people... :p

For example,

[bilbo@beta bilbo]$ free
             total       used       free     shared    buffers     cached
Mem:        257104     233216      23888          0      14632      56940
-/+ buffers/cache:     161644      95460
Swap:       393552       3448     390104

That 161MB in use is partly from X itself (which has a 27MB RSS according to ps), and mostly from the Distributed Folding client, which has an RSS of around 112MB. Uptime isn't great (only ~22 hrs), but this same setup has run before for almost a month with no *noticeable* (I wasn't specifically looking) memory leakage into the kernel.

This is kernel 2.4.19 off kernel.org, I can attach my .config if you want, X 4.2 from source ("distro" is LFS), and the 3123 drivers from tarballs. With NVreg_EnableVia4x=1, NVreg_EnableAGPSBA=1, and NVreg_EnableAGPFW=1 in the os-registry.c file.

I don't think it'd be hardware, but it's possible -- I've got a KT333 chipset (crappy Biostar motherboard though), Athlon XP1800, 256MB of DDR333 RAM, anything else? hmm.... DMA is on on all my drives. Using agpgart.

Anything else that might be different between your setup and mine?

Thilo
12-04-02, 03:25 PM
The memory leakage only occurs when playing 3D games.

Gentoo with kernel 2.4.19
A few facts:

thilo@Thilo agp $ cat status
Status: Enabled
Driver: NVIDIA
AGP Rate: 4x
Fast Writes: Enabled
SBA: Disabled
thilo@Thilo agp $ cat card
Fast Writes: Supported
SBA: Supported
AGP Rates: 4x 2x 1x
Registers: 0x1f000217:0x1f000114
thilo@Thilo agp $ cat host-bridge
Host Bridge: Via Apollo Pro KT133
Fast Writes: Supported
SBA: Supported
AGP Rates: 4x 2x 1x
Registers: 0x1f000217:0x00000114
thilo@Thilo agp $

bwkaz
12-04-02, 03:38 PM
Hmm, OK. If I had more free time, I'd play more 3D games, but as it is, all I've been able to do on this run is an hour or so of WineX (accelerated Half-Life), and about the same amount of time playing Rune (which, if you're not aware, uses the UT engine).

However, when I had the ~1 month of uptime, I had been playing Descent 3 for probably 20-30 hours, Rune for however long it takes to beat it the second time through (estimate is ~10-20 hrs). There was also ~10 hrs. of Quake 2 (not very taxing, but still), and maybe 5 hrs. or so of the UT2k3 demo. Also a bit of accelerated OpFor (again under Wine)

$ cat status
Status: Enabled
Driver: AGPGART
AGP Rate: 4x
Fast Writes: Enabled
SBA: Enabled
[bilbo@beta agp]$ cat card
Fast Writes: Supported
SBA: Supported
AGP Rates: 4x 2x 1x
Registers: 0x1f000217:0x1f000314
[bilbo@beta agp]$ cat host-bridge
Host Bridge: Via Apollo Pro KT266 (<--- this is wrong, but the kernel doesn't recognize it as a KT333, so...)
Fast Writes: Supported
SBA: Supported
AGP Rates: 4x 2x 1x
Registers: 0x1f000217:0x00000314

Don't know if that helps or not...
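To answer the "what is different between the two setups" question mechanically, here is a small Python sketch that diffs the two `cat status` dumps quoted in this thread. The text is copied from the posts above; `parse_kv` and `diff_settings` are hypothetical helper names, and the sketch simply assumes one "Key: Value" pair per line.

```python
# `cat status` output as posted by Thilo and by bwkaz in this thread.
THILO_STATUS = """\
Status: Enabled
Driver: NVIDIA
AGP Rate: 4x
Fast Writes: Enabled
SBA: Disabled
"""

BWKAZ_STATUS = """\
Status: Enabled
Driver: AGPGART
AGP Rate: 4x
Fast Writes: Enabled
SBA: Enabled
"""


def parse_kv(text):
    """Parse 'Key: Value' lines into a dict, ignoring lines without a colon."""
    pairs = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            pairs[key.strip()] = value.strip()
    return pairs


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


for key, (mine, yours) in diff_settings(parse_kv(THILO_STATUS),
                                        parse_kv(BWKAZ_STATUS)).items():
    print(f"{key}: Thilo={mine}  bwkaz={yours}")
```

On these two dumps the only differences are the AGP driver in use (NVIDIA vs. AGPGART) and whether SBA is enabled. That does not prove either one is the cause, but those would be the first things to try switching.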

Original_PQ
12-09-02, 09:59 AM
I was about to confirm the problem, but when I tried to make some test results, I noticed there was no problem!
I don't have the memory leak anymore!

I can't figure out what I did other than adding 'options agpgart agp_try_unsupported=1' to modules.conf.
I got a new mobo (Asus A7V8X) a while ago and finally got tired of the agpgart error message.

Some time ago I was able to lock all memory just by starting and stopping an OpenGL app. Eventually all memory was consumed and swapping started like hell.

Now it seems like all allocated memory gets freed.
Even after UT2003 demo.

Perhaps the nVidia drivers have been updated... I haven't noticed, though.
It's so easy to update packages through Portage :-)

I currently have 3123 drivers and 2.4.19-gentoo-r9 kernel.