Comment 233 for bug 263555

Revision history for this message
In , John (john-redhat-bugs) wrote :

It looks like the root cause of this problem has been found. Included here is the work-around for it as well as the reference to the 2.6.28-rc fix for the problem.

>---------- Forwarded message ----------
>From: Steven Rostedt <email address hidden>
>Date: Wed, Oct 15, 2008 at 3:21 PM
>Subject: [PATCH -stable] disable CONFIG_DYNAMIC_FTRACE due to possible
>memory corruption on module unload
>To: LKML <email address hidden>, <email address hidden>
>Cc: Linus Torvalds <email address hidden>, Andrew Morton
><email address hidden>, Arjan van de Ven <email address hidden>,
><email address hidden>, <email address hidden>, Thomas Gleixner
><email address hidden>, Ingo Molnar <email address hidden>
>
>
>
>While debugging the e1000e corruption bug with Intel, we discovered
>today that the dynamic ftrace code in mainline is the likely source of
>this bug.
>
>For the stable kernel we are providing the only viable fix
>patch: labeling
>CONFIG_DYNAMIC_FTRACE as broken. (see the patch below)
>
>We will follow up with a backport patch that contains the
>fixes. But since
>the fixes are not a one liner, the safest approach for now is to
>disable the code in question.
>
>The cause of the bug is due to the way the current code in mainline
>handles dynamic ftrace. When dynamic ftrace is turned on, it also
>turns on CONFIG_FTRACE which enables the -pg config in gcc that places
>a call to mcount at every function call. With just CONFIG_FTRACE this
>causes a noticeable overhead. CONFIG_DYNAMIC_FTRACE works to ease this
>overhead by dynamically updating the mcount call sites into nops.
>
>The problem arises when we trace functions and modules are unloaded.
>The first time a function is called, it will call mcount and the mcount
>call will call ftrace_record_ip. This records the calling site and
>stores it in a preallocated hash table. Later on a daemon will
>wake up and call kstop_machine and convert any mcount callers into
>nops.
>
>The evolution of this code first tried to do this without the
>kstop_machine
>and used cmpxchg to update the callers as they were called. But I
>was informed that this is dangerous to do on SMP machines if another
>CPU is running that same code. The solution was to do this with
>kstop_machine.
>
>We still used cmpxchg to test if the code that we are modifying is
>indeed code that we expect to be before updating it - as a final
>line of defense.
>
>But on 32bit machines, ioremapped memory and modules share the same
>address space. When a module would load its code into memory
>and execute
>some code, that would register the function.
>
>On module unload, ftrace incorrectly did not zap these functions from
>its hash (this was the bug). The cmpxchg could have saved us in most
>cases (via luck) - but with ioremap-ed memory that was exactly
>the wrong
>thing to do - the results of cmpxchg on device memory are undefined.
>(and will likely result in a write)
>
>The pending .28 ftrace tree does not have this bug anymore, as
>a general push
>towards more robustness of code patching, this is done
>differently: we do not
>use cmpxchg and we do a WARN_ON and turn the tracer off if
>anything deviates
>from its expected state. Furthermore, patch sites are
>statically identified
>during build time so there's no runtime discovery of dynamic code areas
>anymore, and no room for code unmaps to cause the hash to
>become out of date.
>
>We believe the fragility of dynamic patching has been sufficiently
>addressed in the development code via the static patching
>method, but further
>suggestions to make it more robust are welcome.
>
>Signed-off-by: Steven Rostedt <email address hidden>
>Acked-by: Ingo Molnar <email address hidden>
>Acked-by: Thomas Gleixner <email address hidden>
>---
> kernel/trace/Kconfig | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>Index: linux-compile.git/kernel/trace/Kconfig
>===================================================================
>--- linux-compile.git.orig/kernel/trace/Kconfig 2008-10-02
>10:18:49.000000000 -0400
>+++ linux-compile.git/kernel/trace/Kconfig 2008-10-15
>17:29:34.000000000 -0400
>@@ -103,7 +103,8 @@ config CONTEXT_SWITCH_TRACER
> all switching of tasks.
>
> config DYNAMIC_FTRACE
>- bool "enable/disable ftrace tracepoints dynamically"
>+ bool "enable/disable ftrace tracepoints dynamically (BROKEN)"
>+ depends on BROKEN
> depends on FTRACE
> depends on HAVE_DYNAMIC_FTRACE
> default y
>
>--
>To unsubscribe from this list: send the line "unsubscribe
>linux-kernel" in
>the body of a message to <email address hidden>
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>