Old 06-09-06, 01:00 AM   #73
jubiliant_ankit
Registered User
 
Join Date: May 2006
Posts: 8
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Thanks, Jaxxon, for going through my results.

As you pointed out, lmbench and hackbench are not suitable benchmarks for testing the real-time capabilities of a system. Can you please suggest some good tests to actually characterise the real-time Linux kernel?

Some of the tests I know of are:
1> cyclictest.c, a fairly standard test used for this kind of characterisation.
2> rt-tester, from Ingo Molnar's FTP site.

The problem with both is that I don't know how to interpret the results I get from them.

Moreover, I think rt-tester is written in Python, so I don't know how to run it.

I will attach both of them in my next post.

Please help me in this regard.

Thanks in anticipation
ankit
Old 06-09-06, 01:14 AM   #74
jubiliant_ankit
Registered User
 
Join Date: May 2006
Posts: 8
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

The tests are attached to this post. Please try running them if possible.

Thanks in anticipation
ankit
Attached Files
File Type: zip cyclictest.zip (10.8 KB, 148 views)
File Type: zip rt-tester.zip (10.0 KB, 154 views)
Old 06-10-06, 06:52 AM   #75
erDiZz
Registered User
 
Join Date: Jun 2006
Posts: 3
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Hi all.

I've looked briefly at how wbinvd performs on my Pentium D 820, and it always seems to take at least 466000 cycles. Even if I execute several wbinvd instructions back to back, each one takes at least that many cycles. Prefetching a 4 MB array into the cache with the prefetcht0 instruction (enough to cover my CPU's 1 MB cache four times, to be sure) makes wbinvd's execution time stable. Without prefetching dummy data it may well take more than 800000 cycles. I think it takes so long because each cache line has to be checked for dirtiness and so on, so wbinvd is always expensive and undesirable.

In my case, though, there is only one realtime-restricted interrupt: the local APIC timer of the bootstrap processor. So one possible solution is to defer wbinvd until the end of the next LAPIC timer tick, which allows running at a 5 kHz frequency without delays caused by wbinvd. I've tested it and it works. With deferred wbinvd, glxgears shows garbled gears for 1-2 seconds at startup but then works fine. Disabling PAT is of course a more interesting solution.
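
To put the cycle counts above next to the 5 kHz tick, here is a small arithmetic sketch. It assumes the Pentium D 820's nominal 2.8 GHz clock (the thread does not state the clock rate) and that the cycle counts were taken at that rate; the function names are made up for illustration.

```c
#include <stdio.h>

/* Convert a cycle count to microseconds for a given clock frequency. */
double cycles_to_us(double cycles, double clock_hz)
{
    return cycles / clock_hz * 1e6;
}

/* Period of a periodic timer in microseconds. */
double period_us(double timer_hz)
{
    return 1e6 / timer_hz;
}

/* Print the two wbinvd costs from the post next to one 5 kHz tick period. */
void print_wbinvd_budget(void)
{
    const double cpu_hz = 2.8e9;   /* ASSUMPTION: nominal Pentium D 820 clock */
    const double tick_hz = 5000.0; /* 5 kHz LAPIC timer, from the post */

    printf("wbinvd, cache prefetched: %.1f us\n", cycles_to_us(466000.0, cpu_hz));
    printf("wbinvd, no prefetch:      %.1f us\n", cycles_to_us(800000.0, cpu_hz));
    printf("one timer period:         %.1f us\n", period_us(tick_hz));
}
```

Under these assumptions the prefetched case comes to roughly 166 us and the un-prefetched case to roughly 286 us, against a 200 us tick period. This ignores the time spent in the tick handler itself, so treat it as a rough budget only.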

And while I have the chance, I'd like to thank Bernhard for his article about realtime use of the local APIC timer's interrupts, which inspired me to write my own implementation. It's SMP-capable, ready to introduce realtime IO-APIC interrupts, and contains a very low-level debugging toolkit (hooks in the interrupt descriptor table), which helped me discover the wbinvd problem and find this thread by searching for 'nvidia wbinvd'. I was amazed to find Bernhard working on this problem. There's a screenshot of how it looked at http://gcode.blogspot.com, right at the top. I'm going to put it all on sourceforge.net this August.

Best regards, Dmitry.
Old 06-10-06, 08:48 AM   #76
JaXXoN
Registered User
 
Join Date: Jul 2005
Location: Munich
Posts: 910
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Quote:
Originally Posted by erDiZz
I've looked briefly at how wbinvd performs on my Pentium D 820 and it looks like it always takes not less than 466000 cycles.
Hi Dmitry!

Thanks for sharing your wbinvd analysis - so it looks like my
earlier idea of flushing the cache manually by reading in a lot of bulk
data before issuing the wbinvd instruction won't work!

Quote:
Originally Posted by erDiZz
Though in my case there is only one realtime-restricted interrupt: local APIC timer of the bootstrap processor, so one of the possible solutions is to defer wbinvd till the end of the next lapic timer tick thus allowing to work on 5 kHz frequency without delays caused by wbinvd. I've tested it and it works.
Cool! True, synchronizing the wbinvd instruction with the real time
interrupt (SA_NODELAY applied) is probably the best solution in this case.

Quote:
Originally Posted by erDiZz
Glxgears shows up with deferred wbinvd showing garbled gears for 1-2 seconds at the start but then works fine.
How exactly did you do the synchronisation?

Off the top of my head, I would suggest the following procedure:

1. right before the wbinvd instruction, perform a wait_for_completion(),
but only if the local apic timer is in use.

2. before the wait_for_completion(), give the process that is about
to wait a high priority (higher than normal interrupt processes,
but lower than application specific real time processes).

3. after the wbinvd instruction, restore the original priority
for the process.

4. in ack_local_apic_timer(), perform a complete().

This would ensure that the wbinvd instruction is invoked
after the local apic timer interrupt was processed (and
after all high priority processes associated with the local
apic timer tick have done their job).

Anyway, this procedure would only work for the "proprietary"
APIC timer interface (mentioned earlier in this thread). I guess
it should be possible to implement a similar scheme for the
standard high resolution timer interface.
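
The ordering in steps 1 and 4 can be tried out in user space. The sketch below is only an analogue: the completion is rebuilt from a pthread mutex and condition variable, a thread stands in for the local APIC timer path, and `flush_after_tick()` and `timer_tick()` are hypothetical names. The priority changes in steps 2 and 3 are left out because they need privileges, and no real wbinvd is executed.

```c
#include <pthread.h>

/* Minimal userspace stand-in for the kernel's struct completion. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    unsigned int    done;
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = 0;
}

static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->cond, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}

static void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

static struct completion tick_done;
static int tick_handled; /* written before complete(), read after the wait */

/* Stand-in for the timer tick path: do the tick's work, then step 4. */
static void *timer_tick(void *arg)
{
    (void)arg;
    tick_handled = 1;     /* the realtime work of this tick */
    complete(&tick_done); /* step 4: complete() at the end of the tick */
    return NULL;
}

/* Stand-in for the driver side. Returns 1 if the (simulated) flush
 * ran after the tick was handled, -1 if the thread could not start. */
int flush_after_tick(void)
{
    pthread_t t;
    int ordered;

    tick_handled = 0;
    init_completion(&tick_done);
    if (pthread_create(&t, NULL, timer_tick, NULL) != 0)
        return -1;

    /* step 2 (raise priority) would go here */
    wait_for_completion(&tick_done); /* step 1 */
    ordered = tick_handled;          /* wbinvd would execute here */
    /* step 3 (restore priority) would go here */

    pthread_join(t, NULL);
    return ordered;
}
```

Because the flag is written before `complete()` and read after `wait_for_completion()` returns, the simulated flush can never run ahead of the tick, which is the property the procedure is after.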

Quote:
Originally Posted by erDiZz
Disabling PAT is of course a more interesting solution.
As discussed, this comes with a small 3D performance penalty, but
that shouldn't be too much of an issue for most combined
3D/hard-real-time applications. However, I guess the best
variant would be to have the option to disable PAT and/or to
have the option to sync the real time task with the nvidia driver
concerning the wbinvd instruction.

Quote:
Originally Posted by erDiZz
which inspired me to write my own implementation. It's SMP-capable, ready to introduce realtime IO-APIC interrupts
Cool! This APIC timer interface is actually pretty old and I basically
abandoned it after Ingo Molnar came up with his -rt patches and
Thomas Gleixner added the rt-aware HRT interface. However, with
this new scheme, it is only possible to have timed real time user
space processes, but no timed real time timer interrupts (because
all timers are in use). Having such a SA_NODELAY timer interrupt
could be pretty handy in some cases, because the scheduler is still
causing a lot of overhead/latencies.

So your SMP-capable implementation sounds pretty interesting!

Quote:
Originally Posted by erDiZz
and contains a very low-level debugging toolkit (hooks in the interrupt descriptor table) which helped me to discover wbinvd problem and to find this thread by 'nvidia wbinvd' request. I was amazed to find Bernhard working on this problem There's a screenshot of how it was at http://gcode.blogspot.com right at the top.
Looks pretty comfortable - the way I'm doing the analysis is pretty
spartan :-) I guess having such a tool could get more people doing
real-time analysis, eliminating potential remaining latency issues.

Quote:
Originally Posted by erDiZz
I'm going to put it all at sourceforge.net this august.
Thanks for sharing - I'm looking forward to it!

regards

Bernhard
Old 06-10-06, 10:18 AM   #77
erDiZz
Registered User
 
Join Date: Jun 2006
Posts: 3
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Quote:
Originally Posted by JaXXoN
Hi Dmitry!

How exactly did you do the synchronisation?

Off the top of my head, I would suggest the following procedure:

1. right before the wbinvd instruction, perform a wait_for_completion(),
but only if the local apic timer is in use.

2. before the wait_for_completion(), give the process that is about
to wait a high priority (higher than normal interrupt processes,
but lower than application specific real time processes).

3. after the wbinvd instruction, restore the original priority
for the process.

4. in ack_local_apic_timer(), perform a complete().
Well, it looks like everything is fine at the moment: glxgears starts normally, doom3 plays fine and Xgl works as it should. I've probably fixed the issue by accident while fighting with routine problems.

I did the trick this way: the caller of wbinvd() increments a "wbinvd desired" counter and busy-loops until the counter is set back to its original value. The realtime interrupt path checks the counter right before 'iret' and, if it is nonzero, performs wbinvd and decrements the counter.

To be clear:

Code:
/* Guaranteed contiguous 4 MB, could be easily found elsewhere */
char rti_my_cache_bulk [4194304];

void rti_wbinvd (void)
{
        unsigned long flags;

        /* This irq_save disables non-privileged interrupts only,
         * i.e. IF flag is kept set and an external flag is set indicating that
         * non-realtime interrupts must be queued for this processor */
        local_irq_save (flags);

        if (rti_rt_on &&
            smp_processor_id () == 0)
        {
                int n = rti_wbinvd_pending;

                /* Prefetch a lot of data into the cache so that wbinvd
                 * will take less time (but still a lot) */
                asm volatile (
                        "movl $rti_my_cache_bulk, %%eax;"
                        "1:"
                        "prefetcht0 (%%eax);"
                        "addl $4, %%eax;"
                        "cmpl $(rti_my_cache_bulk + 4194301), %%eax;"
                        "jb 1b;"
                        : : : "cc", "eax", "memory");

                rti_wbinvd_pending ++;
                while (rti_wbinvd_pending != n)
                        __asm__ __volatile__ ("rep; nop" : : : "memory");
        } else
                __asm__ __volatile__ ("wbinvd" : : : "memory");

        local_irq_restore (flags);
}
entry.S, realtime interrupt path:

Code:
[...]
        cli
        cmpl $0, rti_wbinvd_pending
        je   2f
        decl rti_wbinvd_pending
        wbinvd
2:      iret
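
For readers who want to try the handshake outside the kernel, here is a user-space analogue. It assumes a POSIX system and swaps in a SIGALRM interval timer for the LAPIC timer interrupt, so the "interrupt" preempts the busy-looping caller on the same thread, much as in the single-CPU case above. `run_deferred_flushes()` and the other names are invented for this sketch, and no real wbinvd is executed.

```c
#define _XOPEN_SOURCE 700 /* for sigaction() and setitimer() under strict -std */
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t wbinvd_pending; /* stands in for rti_wbinvd_pending */
static volatile sig_atomic_t flushes_done;   /* counts simulated wbinvd runs */

/* Stand-in for the tail of the realtime interrupt path (the entry.S part). */
static void tick_handler(int signo)
{
    (void)signo;
    if (wbinvd_pending != 0) {
        wbinvd_pending--;
        flushes_done++; /* the real code executes wbinvd here */
    }
}

/* Stand-in for rti_wbinvd(): request a flush and spin until it is done. */
static void request_flush(void)
{
    sig_atomic_t n = wbinvd_pending;

    wbinvd_pending++;
    while (wbinvd_pending != n)
        ; /* the original spins with "rep; nop" */
}

/* Issue `requests` deferred flushes against a 5 kHz signal "tick".
 * Returns the number of simulated flushes, or -1 on setup failure. */
int run_deferred_flushes(int requests)
{
    struct sigaction sa;
    struct itimerval tick, off;
    int i;

    wbinvd_pending = 0;
    flushes_done = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = tick_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    memset(&tick, 0, sizeof tick);
    tick.it_interval.tv_usec = 200; /* 200 us period = 5 kHz */
    tick.it_value.tv_usec = 200;
    if (setitimer(ITIMER_REAL, &tick, NULL) != 0)
        return -1;

    for (i = 0; i < requests; i++)
        request_flush();

    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL); /* stop the tick */
    return (int)flushes_done;
}
```

Each request returns only after a tick has serviced it, so the expensive operation always runs at the end of a tick and never in the middle of the period.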
Quote:
Originally Posted by JaXXoN
Cool! This APIC timer interface is actually pretty old and I basically
abandoned it after Ingo Molnar came up with his -rt patches and
Thomas Gleixner added the rt-aware HRT interface. However, with
this new scheme, it is only possible to have timed real time user
space processes, but no timed real time timer interrupts (because
all timers are in use). Having such a SA_NODELAY timer interrupt
could be pretty handy in some cases, because the scheduler is still
causing a lot of overhead/latencies.
As far as I understand, -rt kernels are not suitable for automotive applications; at least the 100-microsecond target they are aiming for doesn't sound good enough for my particular application. Realtime interrupts already exist in several incarnations, though: RTLinux, RTAI/ADEOS, Fiasco/L4Linux (DROPS system). I'm doing my own because it's a hack that gives me more skill and freedom to experiment, and I find my approach (hooking into the IDT) quite promising: it has less to do with Linux and more with a realtime hypervisor.

Quote:
Originally Posted by JaXXoN
Looks pretty comfortable - the way I'm doing the analysis is pretty
spartan :-) I guess having such a tool could get more people doing
real-time analysis, eliminating potential remaining latency issues.

Thanks for sharing - I'm looking forward to it!
I hope it will be useful too. With this in mind, I've separated the tracing code from the realtime code in my patch, so that it would be possible to release a non-realtime, tracing-only variant that is less intrusive and easier to apply to other kernels. It will be August because my June is full of exams and I'll be away for the whole of July for obligatory military training.

Regards, Dmitry.
Old 06-14-06, 01:31 AM   #78
jubiliant_ankit
Registered User
 
Join Date: May 2006
Posts: 8
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Hello Jaxxon,
Sorry, but I didn't get any reply from you to my previous post.
Please help me with this, as I am quite a novice in this field.

Also, if you have some kind of documentation on which real-time features are built into the 2.6.16 kernel, please send it to me or attach it to your next post on this forum.

You can mail me at ankit_jain@iitb.ac.in.

Thanks in anticipation
ankit
Old 06-14-06, 05:55 AM   #79
JaXXoN
Registered User
 
Join Date: Jul 2005
Location: Munich
Posts: 910
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Quote:
Originally Posted by jubiliant_ankit
And also if you some kind of documentation as to what real time features are built in 2.6.16 kernel ,then please send it to me or attach it with your next post on this forum.
There are few provisions for hard real time in the vanilla kernel, but there
are quite a number of real time enhancements available as separate
patches such as RTAI, RTLinux and many others. I guess you could
write books on that subject :-) I recommend typing "Linux" and "realtime"
into Google for details - you will find tons of information.

BTW: did you apply the realtime preemption patch before you did
the measurements above? (available at
http://people.redhat.com/mingo/realtime-preempt/older/)

regards

Bernhard
Old 06-14-06, 07:17 AM   #80
jubiliant_ankit
Registered User
 
Join Date: May 2006
Posts: 8
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Thanks, Jaxxon, for your prompt reply.
I have tested the 2.6.16 kernel with Ingo Molnar's rt-29 patch.
The results I got are:

Interrupt latency: 24.62 microseconds (through APIC timer)
Scheduler latency: maximum was 2 milliseconds (through rttest.c)

The other benchmark results, i.e. lmbench and hackbench, are attached to this post.

Please have a look at them and comment on the validity of those results.

Thanks in anticipation
ankit
Attached Files
File Type: txt lmbench.txt (3.7 KB, 157 views)
File Type: txt hackbench_results.txt (1.0 KB, 160 views)

Old 06-14-06, 08:25 AM   #81
JaXXoN
Registered User
 
Join Date: Jul 2005
Location: Munich
Posts: 910
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Quote:
Originally Posted by jubiliant_ankit
Scheduler Latency :: maximum was 2 milli seconds.
I assume you have the apictimer patch applied in addition to -rt29?

Please note that in this case the HRT interface is limited to the
system clock tick resolution, so a 2 millisecond worst case scheduler
latency is plausible. Please try rttest again without the apictimer patch
applied in order to get (hopefully) sub 100 microsecond results.
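
As a rough illustration of what a "scheduler latency" figure like this means, the sketch below measures wake-up latency in the style of cyclictest: sleep until an absolute deadline and record how late the task actually wakes up. It is not the rttest.c used above, and `max_wakeup_latency_ns()` is a made-up name. A real measurement would also run under a real-time scheduling policy with memory locked, which is omitted here because it needs privileges.

```c
#define _POSIX_C_SOURCE 200112L /* for clock_nanosleep() under strict -std */
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute timespec by ns nanoseconds. */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Difference a - b in nanoseconds. */
static long long timespec_diff_ns(const struct timespec *a,
                                  const struct timespec *b)
{
    return (long long)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
           + (a->tv_nsec - b->tv_nsec);
}

/* Sleep `loops` times until a periodic absolute deadline and return the
 * worst observed lateness in nanoseconds, or -1 on error. */
long long max_wakeup_latency_ns(int loops, long interval_ns)
{
    struct timespec next, now;
    long long lat, worst = 0;
    int i, rc;

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (i = 0; i < loops; i++) {
        timespec_add_ns(&next, interval_ns);
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR); /* retry if a signal interrupted the sleep */
        if (rc != 0)
            return -1;

        clock_gettime(CLOCK_MONOTONIC, &now);
        lat = timespec_diff_ns(&now, &next); /* how late did we wake up? */
        if (lat > worst)
            worst = lat;
    }
    return worst;
}
```

For example, `max_wakeup_latency_ns(10000, 1000000L)` runs 10000 cycles of 1 ms each. If timers can only expire on system clock tick boundaries, the worst case will be in the range of the tick period regardless of how fast the scheduler itself is.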

regards

Bernhard
Old 06-16-06, 02:05 AM   #82
jubiliant_ankit
Registered User
 
Join Date: May 2006
Posts: 8
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Hello Jaxxon,
Thanks for the information. I ran the test again and got 460 us as the scheduler latency with APIC timer support disabled.

Can you also please comment on the lmbench and hackbench results?

One more request: I have figured out which real-time features are added by the rt-29 patch by comparing the menuconfig options of the vanilla and patched kernels. I have made a text file of those features together with their help texts (what each feature is about).

My problem is that I want to know which of those features are architecture dependent and which are architecture independent.
That is, do all of these features apply to arm, ppc, mips and i386?

If you can work this out, it would be a great help to me.
Please find the text file attached to this post.

Thanks in anticipation
ankit
Attached Files
File Type: txt real_time_features_rt29.txt (22.9 KB, 213 views)
Old 06-29-06, 11:23 AM   #83
JaXXoN
Registered User
 
Join Date: Jul 2005
Location: Munich
Posts: 910
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Quote:
Originally Posted by erDiZz
I did the trick this way: the caller of wbinvd() increments "wbinvd desired" counter and busy-loops until the counter is set back to its original value. Realtime interrupt path checks the counter right before 'iret' and if it is nonzero performs wbinvd and decrements the counter.
Hi Dmitry!

Can you please describe your hardware setup in more detail
(mainboard, CPU(s)), especially if it is a multicore/SMP system?

I'm currently trying to implement a similar scheme for the
non-apictimer realtime kernel (using the HR timer interface)
but without the need to modify the kernel.

Unfortunately, I have failed so far to synchronize the wbinvd instruction
with the real time application. The situation is that on a multicore
(SMP) system, wbinvd needs to be performed on both cores.
I added two high priority kernel threads (one running on each
core) that are woken up by the user space real time application
through an ioctl call. The threads then perform the wbinvd
instruction, and once both are complete, the nvidia driver
proceeds.
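
The "one thread per core, driver proceeds when all are done" part of this scheme can be sketched in user space as follows. This is an analogue only: pthreads stand in for the pinned high-priority kernel threads, thread creation stands in for the ioctl wake-up, no wbinvd is executed, and `flush_all_cores()` is a hypothetical name.

```c
#include <pthread.h>

#define MAX_CORES 8

struct flush_sync {
    pthread_mutex_t lock;
    pthread_cond_t  all_done;
    int remaining; /* cores that still have to flush */
    int flushed;   /* simulated wbinvd executions */
};

/* One worker per core; the real version would be pinned to its core. */
static void *core_flush_thread(void *arg)
{
    struct flush_sync *s = arg;

    /* wbinvd would execute here, outside the lock */

    pthread_mutex_lock(&s->lock);
    s->flushed++;
    if (--s->remaining == 0)
        pthread_cond_signal(&s->all_done);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Driver side: trigger a flush on every core and wait for all of them.
 * Returns the number of simulated flushes, or -1 on bad input. */
int flush_all_cores(int ncores)
{
    struct flush_sync s;
    pthread_t workers[MAX_CORES];
    int i, started = 0, flushed;

    if (ncores < 1 || ncores > MAX_CORES)
        return -1;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.all_done, NULL);
    s.remaining = ncores;
    s.flushed = 0;

    for (i = 0; i < ncores; i++) {
        if (pthread_create(&workers[i], NULL, core_flush_thread, &s) != 0)
            break;
        started++;
    }

    pthread_mutex_lock(&s.lock);
    s.remaining -= ncores - started; /* do not wait for threads that never started */
    while (s.remaining > 0)
        pthread_cond_wait(&s.all_done, &s.lock);
    flushed = s.flushed;
    pthread_mutex_unlock(&s.lock);

    for (i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    pthread_cond_destroy(&s.all_done);
    pthread_mutex_destroy(&s.lock);
    return flushed;
}
```

The sketch covers only the "wait for all cores" half; lining the flushes up with the real time application's idle window is the part described above as still unsolved.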

I'm not sure if your implementation would work on an SMP system!?

regards

Bernhard
Old 07-18-06, 06:56 AM   #84
erDiZz
Registered User
 
Join Date: Jun 2006
Posts: 3
Default Re: [PATCH, REALTIME] nvidia-1.0-8178 and Linux-2.6.16-rt11

Hello, JaXXoN.

As I've said before, I am far from home at a one-month military training near St. Petersburg, almost unable to work on anything other than moving my legs correctly while marching past an officer.

In my implementation a spinlock is held to synchronise wbinvd instructions, and it works fine with SMP. It seems I've already forgotten the details, so I can't offer any advice until the end of this month.

Regards, Dmitry.