Old 02-25-04, 06:55 AM   #25
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740

The pci_find_subsys warning comes from a "bottom half" interrupt service routine; the reiserfs stuff is most likely unrelated (i.e. it just happened to be executing at that point in time).
Old 02-25-04, 07:17 AM   #26
tamran
Registered User
 
Join Date: Feb 2004
Location: Ft. Myers, FL
Posts: 67

I did some snooping around in the kernel code in drivers/pci/search.c and found the following, starting on line 141:

Code:
/**
 * pci_find_subsys - begin or continue searching for a PCI device by vendor/subvendor/device/subdevice id
 * @vendor: PCI vendor id to match, or %PCI_ANY_ID to match all vendor ids
 * @device: PCI device id to match, or %PCI_ANY_ID to match all device ids
 * @ss_vendor: PCI subsystem vendor id to match, or %PCI_ANY_ID to match all vendor ids
 * @ss_device: PCI subsystem device id to match, or %PCI_ANY_ID to match all device ids
 * @from: Previous PCI device found in search, or %NULL for new search.
 *
 * Iterates through the list of known PCI devices.  If a PCI device is
 * found with a matching @vendor, @device, @ss_vendor and @ss_device, a pointer to its
 * device structure is returned.  Otherwise, %NULL is returned.
 * A new search is initiated by passing %NULL to the @from argument.
 * Otherwise if @from is not %NULL, searches continue from next device on the global list.
 *
 * NOTE: Do not use this function anymore, use pci_get_subsys() instead, as
 * the pci device returned by this function can disappear at any moment in
 * time.
 */
struct pci_dev *
pci_find_subsys(unsigned int vendor, unsigned int device,
		unsigned int ss_vendor, unsigned int ss_device,
		const struct pci_dev *from)
{
	struct list_head *n;
	struct pci_dev *dev;

	WARN_ON(in_interrupt());
	spin_lock(&pci_bus_lock);
	n = from ? from->global_list.next : pci_devices.next;

	while (n && (n != &pci_devices)) {
		dev = pci_dev_g(n);
		if ((vendor == PCI_ANY_ID || dev->vendor == vendor) &&
		    (device == PCI_ANY_ID || dev->device == device) &&
		    (ss_vendor == PCI_ANY_ID || dev->subsystem_vendor == ss_vendor) &&
		    (ss_device == PCI_ANY_ID || dev->subsystem_device == ss_device))
			goto exit;
		n = n->next;
	}
	dev = NULL;
exit:
	spin_unlock(&pci_bus_lock);
	return dev;
}
Perhaps it's merely a matter of the nvidia driver (or something else, agpgart perhaps?) calling the deprecated function pci_find_subsys()? This might explain why it's not happening on 2.4 kernels. I personally cannot use a 2.4 kernel, as I am on an AMD64 processor and the kernel functions I need are only available in 2.6 kernels.

I'm not really a programmer, so I couldn't say for sure what any of this does.
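
For readers following along, here is a rough user-space sketch of what the quoted function does, under some loud assumptions: the struct, the list and every name starting with model_ or MODEL_ are hypothetical stand-ins, the list is a plain NULL-terminated chain instead of the kernel's global list, and the spinlock and the WARN_ON(in_interrupt()) check are left out. It only illustrates the wildcard matching and the "pass the previous hit back in to continue" convention described in the comment block.

```c
#include <stddef.h>

/* User-space model only: these types and names are hypothetical stand-ins,
 * not the kernel's struct pci_dev or its global device list. */
#define MODEL_ANY_ID (~0u)   /* plays the role of PCI_ANY_ID */

struct model_dev {
    unsigned int vendor, device, ss_vendor, ss_device;
    struct model_dev *next;  /* singly linked, NULL-terminated */
};

/* A field matches when the caller passed the wildcard or the exact value. */
static int id_matches(unsigned int wanted, unsigned int actual)
{
    return wanted == MODEL_ANY_ID || wanted == actual;
}

/* Start a new search when from is NULL; otherwise resume at the entry
 * after from. Returns the first matching entry, or NULL if none is left. */
static struct model_dev *
model_find_subsys(struct model_dev *list,
                  unsigned int vendor, unsigned int device,
                  unsigned int ss_vendor, unsigned int ss_device,
                  const struct model_dev *from)
{
    struct model_dev *d = from ? from->next : list;

    for (; d != NULL; d = d->next) {
        if (id_matches(vendor, d->vendor) &&
            id_matches(device, d->device) &&
            id_matches(ss_vendor, d->ss_vendor) &&
            id_matches(ss_device, d->ss_device))
            return d;
    }
    return NULL;
}

/* Count every entry from one vendor by repeatedly resuming the search. */
static int model_count_vendor(struct model_dev *list, unsigned int vendor)
{
    int count = 0;
    struct model_dev *d = NULL;

    while ((d = model_find_subsys(list, vendor, MODEL_ANY_ID,
                                  MODEL_ANY_ID, MODEL_ANY_ID, d)) != NULL)
        count++;
    return count;
}
```

A caller walks all matches the way model_count_vendor does: start with NULL, then feed each result back in as the last argument until NULL comes back. The NOTE in the kernel comment is about something this model cannot show: the real device can be removed while the caller still holds the pointer, which is why pci_get_subsys() is recommended instead.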
Old 02-25-04, 07:48 AM   #27
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740

The context from which pci_find_subsys is called is no longer deemed valid, which is why the warning message is issued; less clear is why the call is being made in the first place (i.e. which error condition was encountered).
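
To illustrate the point about calling context, here is a small user-space sketch. It is not kernel code: model_in_interrupt(), model_warn_on() and the other model_ names are hypothetical stand-ins for in_interrupt() and WARN_ON(), and the "interrupt" is just a counter. It shows why the same lookup is silent from ordinary context but triggers a warning when reached from a bottom-half handler, and that the warning reports the problem without stopping the call.

```c
#include <stdio.h>

/* Hypothetical user-space stand-ins; the real in_interrupt() and WARN_ON()
 * are kernel macros and are not reproduced here. */
static int model_irq_depth = 0;   /* > 0 while "inside" an interrupt handler */
static int model_warn_count = 0;  /* number of warnings issued so far */

static int model_in_interrupt(void)
{
    return model_irq_depth > 0;
}

/* Like the kernel check, this only reports; it does not stop the caller. */
static void model_warn_on(int condition, const char *what)
{
    if (condition) {
        model_warn_count++;
        fprintf(stderr, "warning: %s\n", what);
    }
}

/* A lookup that is only meant to be called from process context. */
static int model_lookup(void)
{
    model_warn_on(model_in_interrupt(),
                  "model_lookup called in interrupt context");
    return 42;  /* the lookup still proceeds after the warning */
}

/* Simulates a bottom-half handler that performs the lookup. */
static int model_bottom_half(void)
{
    int result;

    model_irq_depth++;
    result = model_lookup();
    model_irq_depth--;
    return result;
}
```

Calling model_lookup() directly leaves model_warn_count at zero; calling it through model_bottom_half() bumps the counter once and still returns the same result. That matches what people see in this thread: the warning lands in the log, but it does not by itself explain a hang.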
Old 02-25-04, 08:08 AM   #28
robinr
Registered User
 
Join Date: Jun 2003
Posts: 5

I changed the option "NvAGP" to "1". So far so good, i.e. several days of unbroken uptime. That's statistics of course.

The abovementioned errors crop up in the logs though.

Update: 2004-03-04; still working.
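
For anyone wanting to try the same setting, the option goes in the Device section of the X server configuration file. The fragment below is only a sketch: the Identifier string is a made-up placeholder, and the meaning of the values (0 = no AGP, 1 = NVIDIA's internal AGP support, 2 = kernel AGPGART, 3 = try AGPGART first, then NVIDIA's) is as described in the NVIDIA driver README, so check the README for your driver version.

```
Section "Device"
    # Identifier is a hypothetical placeholder; keep whatever your file already uses.
    Identifier "NVIDIA Card"
    Driver     "nvidia"
    # 1 = use the NVIDIA driver's internal AGP support instead of kernel AGPGART
    Option     "NvAGP" "1"
EndSection
```

The kernel agpgart module should not be loaded when NVIDIA's internal AGP support is selected, since only one AGP driver can manage the chipset at a time.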

Last edited by robinr; 03-03-04 at 06:05 PM.
Old 03-04-04, 06:49 PM   #29
tamran
Registered User
 
Join Date: Feb 2004
Location: Ft. Myers, FL
Posts: 67
Any News?

I still get the crashes about every two days on average. Has anyone else got any news? Has anyone managed to get things stable?

How about the following:

Quote:
I changed the option "NvAGP" to "1". So far so good, i.e. several days of unbroken uptime. That's statistics of course.

The abovementioned errors crop up in the logs though.

Update: 2004-03-04; still working.
I also found that worked for me on another system; however, NVIDIA does not yet provide that support with the AMD64 driver. When will this be an option from NVIDIA on AMD64 systems?

Tamran
Old 03-06-04, 03:32 AM   #30
ByteEnable
Registered User
 
Join Date: Jun 2003
Posts: 24

Quote:
Originally posted by zander
The context from which pci_find_subsys is called is no longer deemed valid, which is why the warning message is issued; less clear is why the call is being made in the first place (i.e. which error condition was encountered).
Hi Zander,

I've read some of your replies in this thread. Although you have very good suggestions, you come off as a pro-Windows dweeb. Just because it *works* on Windows doesn't mean that the problem is strictly a Linux problem. You hint at signal integrity issues that are motherboard and/or AGP related. I believe this is above you, and unless you cite some specific examples with scope shots, you are spreading FUD. Also, if you have signal integrity problems, it doesn't matter what software you run; it's a hardware problem that is always present, waiting for the right cycle to lock your system up. IMHO, Linux stresses a system more than Windows.

I run a custom 2.4 kernel that I modified to add AGPGART support for my Via Apollo Pro266T chipset (Adding kernel AGP (GART) support for a chipset). The NVidia driver does not support my AGP interface, so I have to use the kernel AGPGART to get good performance. I've experienced the X lockup (X hangs at 99% CPU, telnet in and reboot) for over a year now. Up until a few days ago, I was using a GeForce 2 TI. I now have a GeForce FX 5700 Ultra that behaves the same way. It's random. I can go for days, then bam! X hangs at 99%. This is not a hard lockup; this is software locking up. Other people have experienced the same issues I have across multiple hardware platforms and video hardware (NVidia). The common denominator is the NVidia driver.

I would like to run the 2.6 kernel which has support for my AGP interface. I cannot because as soon as the NVidia driver loads, it hard locks the system, no telnet, just the power button. I've looked at the AGP code in the 2.6 kernel and it looks correct (comparing to 2.4). Others have the same problem with different hardware. The Debian guys are pointing the finger at the Nvidia driver and have patches. I run Fedora, so I'm at the mercy of NVidia.

Byte
Old 03-06-04, 05:05 AM   #31
maro
Registered User
 
Join Date: Feb 2004
Location: Holy Roman Empire
Posts: 64

Byte, I am using the same chipset, having a similar problem (not as severe as yours, cf. my post above). [Interestingly but OT, I devised a patch similar to yours to get the chipset recognized in 2.4.]

Can you point me towards the Debian patch you mentioned? Using Slackware myself, but it might give me a clue.

My problem with Nvidia atm is to find any suitable contact to raise this issue. Currently the only way I see is to make the issue known in forums like this...
Old 03-06-04, 05:42 AM   #32
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740

@ByteEnable: the point I was trying to make is that a personal computer is too complex and modular a piece of machinery to make finger pointing a viable problem solving strategy, as convenient as it may seem. The problems discussed here and in similar threads have no single root cause; there is no magic one-for-all solution. The suggestions I made were intended to help solve common problems or at least narrow down the number of conceivable culprits. I tried to be careful not to cast blame on anyone or anything specifically, though some of my remarks may be interpreted as such if taken out of context. There are Linux-specific problems (with a variety of root causes), there is broken hardware in use, there are software bugs (in the NVIDIA driver, no doubt, but elsewhere, as well) and some are self-made (e.g. every other installation HOWTO concludes with instructions on how to forcibly enable AGP fast writes and AGP v2.x side band addressing); this forum alone provides numerous examples for each one of these categories. The bottom line is that a given problem needs to be looked into before a root cause can be determined.

If none of the suggestions above prove effectual, if the problem doesn't seem to be related to specific hardware/software configurations (e.g. the system proves equally unreliable with and without AGP support) and if you are convinced you're suffering from an NVIDIA driver bug, do report it to NVIDIA if you have not done so already.
Old 03-06-04, 10:09 AM   #33
ByteEnable
Registered User
 
Join Date: Jun 2003
Posts: 24

Quote:
Originally posted by zander
@ByteEnable: the point I was trying to make is that a personal computer is too complex and modular a piece of machinery to make finger pointing a viable problem solving strategy, as convenient as it may seem. The problems discussed here and in similar threads have no single root cause; there is no magic one-for-all solution.
You hit the nail on the head: root cause. The Linux community and others cannot get to root cause because the drivers are:

1) binary only
2) encrypted to prevent patches (turning on features reserved for high end cards)

As per your suggestion:

1) Is there an NVidia bugzilla?
2) If not, what is the email address for reporting bugs?

You offer no help, you just take up electrons. You remind me of some NVidia Field Application Engineers I have worked with in the past: always in denial, always somebody else's problem.

maro:

http://minion.de/

Byte
Old 03-06-04, 10:30 AM   #34
zander
NVIDIA Corporation
 
Join Date: Aug 2002
Posts: 3,740

Whatever.
Old 03-06-04, 10:40 AM   #35
SuLinUX
 
Join Date: Sep 2003
Location: UK
Posts: 847

ByteEnable

Well, Andy Mecham (NVIDIA) browses and moderates this forum, so I have no doubt he has investigated this thread and the issues. It cannot be any worse than the nvidia driver in Windoze turning my fan off and making playing games useless.

I don't think you should be talking to Zander like that.
__________________
AthlonXP 2600+ / nForce2 Asus A7N8X-X / PNY GeForce FX5900 Ultra / 1024Mb Samsung Ram /nForce Sound / Hansol 920D Plus 19" monitor / Lite-On 32x12x40 / 2x Maxtor HD 40Gb/80Gb / nVidia 7174 driver / Gnome 2.10.1 / Kernel 2.6.11.9 / Slackware 10.0
Old 03-06-04, 12:11 PM   #36
tamran
Registered User
 
Join Date: Feb 2004
Location: Ft. Myers, FL
Posts: 67
Why flame?

ByteEnable,

Zander has provided nothing but insight as far as I'm concerned. I have read his posts in other topics in this forum and whether or not he knows what he's talking about (I think the former), his suggestions were an excellent basis for helping a few of us troubleshoot and isolate the problem.

It's easy to jump to conclusions and stamp your feet that things don't work. I wanted to kick my computer a few times ... but with rational thinking (and actions) I have been able to isolate the problem and make it clear (in my mind). Also, in the process I've learned a lot about Linux and hardware.

This forum is for rational discussions and troubleshooting, not flaming.

On another note, I'm curious to see robinr's results to date. Has it still been stable with NvAGP "1"? If so, with everything I've read in this thread and seen myself, I would be convinced that there is "some" conflict between the kernel agpgart and the nvidia driver, bottom line.

NVIDIA devs, please include the AGP driver in the next AMD64 driver release. I have no issues turning IOMMU off in my kernel. I don't have 4 GB of RAM anyways.

Tamran