Old 04-03-06, 09:19 PM   #37
JaXXoN
Registered User
 
Join Date: Jul 2005
Location: Munich
Posts: 910
Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

@Zander

Thanks for patiently answering my ignorant posts! However, I'm now completely
dazed and confused, but trying to continue:

From what I see in the kernel sources, the only files within the drivers
sub-directory that use change_page_attr() are "char/agp/intel-agp.c"
and "char/agp/generic.c" (meaning: no hard disk or network driver changes
the caching behaviour of memory allocated for DMA). Also, change_page_attr() is
not used in dma_alloc_consistent(), nor are there other hints concerning caching
attributes as far as I can see (though they could certainly be hidden somewhere
deep in the header files). Also, I haven't experienced any other high
latencies when only the hard drive and network are in operation, so how can these
drivers keep cached memory consistent without wbinvd? On one MIPS
platform I worked with some time ago, the trick was that the whole physical
memory (RAM) was mapped twice, once cached at 0x80000000 and once
uncached at 0xa0000000 (or was it vice versa? doesn't matter), so you could
easily decide whether to access physical memory cached or uncached by
adding a certain offset. However, this special mechanism needed to
be taken care of by the platform-specific drivers.

Anyway, in linux/asm-i386/dma.h, I changed MAX_DMA_ADDRESS from
PAGE_OFFSET+0x1000000 (16 MByte) to PAGE_OFFSET+0x10000000 (256 MByte).

The kernel now tells (as expected):
Code:
  DMA zone: 65536 pages, LIFO batch:15
  DMA32 zone: 0 pages, LIFO batch:0
  Normal zone: 163840 pages, LIFO batch:31
  HighMem zone: 32752 pages, LIFO batch:7
glxgears is now back at 16000 FPS on displays 1+2. I guess the smaller
number seen earlier was caused by a higher interrupt load due
to a limited DMA buffer for the first card.

Anyway, the system still freezes.

When the X-Server starts, I get a couple of the following messages:

Code:
NVRM: bad caching on address 0xc0426000: actual 0x163 != expected 0x173
NVRM: please see the README section on Cache Aliasing for more information
NVRM: bad caching on address 0xc0427000: actual 0x163 != expected 0x173
NVRM: bad caching on address 0xc0426000: actual 0x163 != expected 0x173
[...]
and when the X-Server starts looping, I get:

Code:
NVRM: Xid: 6, PE0000 0400 ff000000 00009328 00000000 ff989ea6
If my assumption that the DMA zone is uncached is true, then
change_page_attr() must still be doing something different ....

As always, any patient feedback is highly, highly appreciated :-)

regards

Bernhard
Old 04-04-06, 06:15 AM   #38
JaXXoN

Quote:
Originally Posted by zander
the GFP_DMA zone isn't uncached by default
What I mean is: when __get_free_pages() is called with GFP_DMA, then
the new pages will be taken from the first 16 MByte and mapped uncached.

When calling __get_free_pages() without GFP_DMA, the pages will be
mapped cached and preferably taken from other zones, but when these
zones can't fulfill the request, the pages will be taken from the
DMA zone (but also mapped cached).

Does that make sense?

regards

Bernhard
Old 04-04-06, 06:42 AM   #39
zander
NVIDIA Corporation
 
 
Join Date: Aug 2002
Posts: 3,740
Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

@JaXXoN: correct, in the core kernel, only AGPGART uses change_page_attr() currently.

When AGPGART (or NvAGP, for that matter) allocates AGP memory, it allocates it in 4KB pages and maps these pages into the AGP aperture via the GART; accesses to the memory through the aperture (either from the GPU or, if the chipset supports it, from the CPU) will bypass any caches. So if a cached mapping of the same memory exists, or existed and the CPU caches weren't flushed, it's possible that stale cache lines are written back asynchronously after the memory has been written to (by the driver) via the aperture. If a cached mapping exists, especially if it uses large pages, speculative CPU accesses to the kernel mapping near the memory mapped into the aperture can make this problem worse. The GPU will read undefined data in this case, which typically results in stability problems.

On PCI-E, system memory is frequently allocated to be mapped into user-space with a WC memory type, which translates to a situation analogous to the one described for AGP memory above.

When you allocate memory with __get_free_pages(), its kernel mapping will be cached by default, regardless of the zone you allocate from. The "bad caching" warning messages in the case you posted were printed by the driver when it found that the pages' kernel mappings' PTE entries hadn't been updated with the _PAGE_PCD flag, i.e. that the mappings were still cached. The performance increase you observed with a larger GFP_DMA zone was probably due to allocations succeeding that failed previously. Unless you disable PAT support on a PCI-E system, incurring a (potentially significant, as I said earlier) performance penalty and possibly other problems in the future, you really can't get around using change_page_attr()/global_flush_tlb().

Note that most PCI(-E) devices use snooped transactions, so the above isn't a problem for them.
Old 04-04-06, 07:38 AM   #40
JaXXoN

Quote:
Originally Posted by zander
Note that most PCI(-E) devices use snooped transactions, so the above isn't a problem for them.
Ok, that makes things clearer! But just to be 100% sure we are talking
about the same thing, please let me state my understanding of "PCI
snooping" as I learned it on a PPC platform:

* when a PCI card starts a DMA transfer *from* main memory
(to the card), the PCI host controller tells the cache controller
to flush any affected cache lines prior to the transfer (this will
stall the DMA transfer until the affected cache lines are flushed).

* when a PCI card is doing a DMA transfer *to* main memory,
the PCI host controller invalidates all affected cache lines after the
transfer.

On PPC, this is basically not a good strategy, because fine-grained
manual cache flushing/invalidation (flush_cache_range(),
invalidate_cache_range(), etc.) is less time consuming than PCI snooping,
which can stall DMA (at least on the platform where I worked
on that topic).

I hadn't considered that x86 might use snooping, as I had learned that this
is a "bad thing" :-) So I now understand that the pages allocated
with GFP_DMA are still cached and taken from the first 16 MByte
("DMA zone"), and the PCI host controller will ensure memory coherence.

Although you explained it in detail, I don't yet understand why snooping
doesn't work for PCIe nvidia chips - sorry for my ignorance.

Anyway: would it be a solution to add a "UC zone" of a certain
size that is exclusively used for calls to __get_free_pages() with
"GFP_UC" applied, which would allocate uncached pages in that zone?
(I guess this is what you indicated in an earlier post.)

regards

Bernhard
Old 04-04-06, 08:20 AM   #41
JaXXoN

Quote:
Originally Posted by zander
On PCI-E, system memory is frequently allocated to be mapped into user-space with a WC memory type
I think I'm starting to understand the situation:

First, cached pages are mapped to kernel space in nv_vm_malloc_pages().
Later on, these pages are mapped with WC enabled to user space using
NV_REMAP_PAGE_RANGE(). That means there are two virtual mappings for each
physical page, with different page caching attributes, right? I guess I need
to refer to the IA32 manuals to figure out the exact impact on
PCI snooping in this case and how to possibly fix it.

To be on the safe side, I guess both mappings would need to be
uncached. This would very likely come with a 3D performance
penalty, but it's not uncommon for real-time systems to sacrifice
throughput in favour of real-time behaviour.

regards

Bernhard
Old 04-04-06, 08:36 AM   #42
zander

@JaXXoN: by snooping, I was referring to the cache support mechanisms described in the PCI specification. This is supported by NVIDIA PCI/PCI-E GPUs, but there are cases where non-snooped transactions are used by PCI-E GPUs to improve performance; for AGP transactions, the AGP specification doesn't require hardware enforced coherency. Note that even in the latter two cases, snooped PCI transactions are performed.

To answer your question, I suppose a GFP_UC zone would help if you were primarily concerned with the additional latency incurred by the cached to uncached transitions, but it's not clear how to integrate such a zone with the zone allocator. Also, since such a zone isn't generally useful, I think it'd be difficult to argue for it. An alternative would be to allocate a sufficient amount of UC memory up-front in the driver and to service UC memory allocation requests with a suballocator. I don't think that'd be generally useful for the NVIDIA Linux graphics driver either, though.
Old 04-04-06, 08:42 AM   #43
zander

Quote:
Originally Posted by JaXXoN
First, cached pages are mapped to kernel space in nv_vm_malloc_pages().
Later on, these pages are mapped with WC enabled to user space using
NV_REMAP_PAGE_RANGE(). That means there are two virtual mappings for each
physical page, with different page caching attributes, right? I guess I need
to refer to the IA32 manuals to figure out the exact impact on
PCI snooping in this case and how to possibly fix it.

To be on the safe side, I guess both mappings would need to be
uncached. This would very likely come with a 3D performance
penalty, but it's not uncommon for real-time systems to sacrifice
throughput in favour of real-time behaviour.
Correct, on PCI-E systems, WC memory is allocated uncached (i.e. allocated with __get_free_pages() and its kernel mapping updated with change_page_attr()) and then mapped into user-space using remap_(page|pfn)_range() with the WC memory type (this requires PAT support; if PAT support is absent, driver-internal WC system memory allocations will fail). At this point the two mappings will have compatible memory types, i.e. UC and WC (uncached, write-combining). The alternative would be WB (write-back cached) and WB. If you omit the change_page_attr() step, you'll have WB and WC, an illegal combination.
Old 04-04-06, 09:37 AM   #44
JaXXoN

Quote:
Originally Posted by zander
by snooping, I was referring to the cache support mechanisms described in the PCI specification.
Not fully sure if this is the same thing I described; I will need to check in
detail. However, I guess the result is the same: pages are mapped
cached, and some magic ensures that data written by the CPU but still
residing in the cache will make its way to the PCI card (or vice versa).


Quote:
Originally Posted by zander
To answer your question, I suppose a GFP_UC zone would help if you were primarily concerned with the additional latency incurred by the cached to uncached transitions
IMHO, the -rt patch is only of limited use when wbinvd is causing
high latencies of up to several hundred microseconds. It would probably
be good enough for the professional audio community, but definitely
a no-no for a broad range of automation applications.


Quote:
Originally Posted by zander
but it's not clear how to integrate such a zone with the zone allocator.
Also, since such a zone isn't generally useful, I think it'd be difficult
to argue for it.
Apart from the question of how to add the zone and how to modify
__get_free_pages() to deliver uncached pages, I guess it would
make sense to have a kernel boot option that defines the size of
the "UC zone" (default = 0).


Quote:
Originally Posted by zander
An alternative would be to allocate a sufficient amount of UC memory up-front in the driver and to service UC memory allocation requests with a suballocator. I don't think that'd be generally useful for the NVIDIA Linux graphics driver either, though.
I guess adding a separate allocator in the nvidia driver would be quite
an implementation effort.

regards

Bernhard

Old 04-04-06, 09:55 AM   #45
JaXXoN

Quote:
Originally Posted by zander
The alternative would be WB (write-back cached) and WB. If you omit the change_page_attr() step, you'll have WB and WC, an illegal combination.
You mean it would be sufficient to make sure that nv_vmap_vmalloc()
re-maps the kernel pages WB instead of WC? However, I guess it would
still be necessary to take care of flushing the PTE entries changed
with set_pte()? But to avoid a full "wbinvd", do you think using a bunch
of "invlpg" instructions instead would be sufficient? (This instruction
invalidates an existing TLB entry for a given page.)

regards

Bernhard
Old 04-04-06, 10:42 AM   #46
zander

@JaXXoN: no, updating the open-source interface layer to just always retain the kernel mappings' WB memory type and to map WC memory to user-space with the WB memory type would be insufficient. As I said earlier, however, if you have a PCI-E system, you could disable PAT support, in which case driver-internal WC memory allocation attempts would fail and cache coherency would no longer be a problem. Depending on the system/configuration, the performance penalty incurred may be acceptable for your purposes.
Old 04-04-06, 08:09 PM   #47
JaXXoN

@Zander

So if I understand it correctly, then I need to

a) disable PAT support so that pages mapped to user space by
nv_vmap_vmalloc() will be WB rather than WC.

b) omit nv_set_page_attrib_cached() in nv_rm_malloc_pages() and
nv_set_page_attrib_uncached() in nv_rm_free_pages() to keep
pages WB rather than UC.

c) omit nv_flush_caches() in nv_rm_malloc_pages() and
nv_set_page_attrib_uncached() - not necessary since
change_page_attr() is not called.

How about nv_flush_caches() in nv_vmap_vmalloc()?
I guess it would be sufficient to use __flush_tlb_one()
per page, instead.

I will try that out and let you know the results!


Anyway, thanks again for your patience ... I learned a lot about
the MMU and caches in the (crappy) x86 architecture these days :-)

regards

Bernhard
Old 04-05-06, 06:17 AM   #48
zander

@JaXXoN: I think it's sufficient to disable PAT support, i.e. to load the NVIDIA Linux kernel module with nv_disable_pat=1; the driver ought to take care of the rest. Note that nv_vmap_vmalloc() creates virtually contiguous kernel mappings of individual pages; when necessary, user mappings are created via mmap(2) and nv_kern_mmap().