Comment 42 for bug 541511

Daniel-ffwll (daniel-ffwll) wrote:

> --- Comment #40 from legolas558 <email address hidden> 2010-03-26 07:49:31 PST ---
> In all cases (even 16 whack pages and/or 1000/2000 retries), no more than 2
> failures are found in dmesg (because as you said it gives up after that).

I overlooked this earlier, but now that I've checked, it is _very_ curious.
With v6 you only ever see 2 chipset flush failures, no matter how hard you
abuse your machine?

In the three dmesgs you've posted, these two failures always belong to the
same chipset flush, just in opposite directions (gtt->cpu and cpu->gtt
transfers). They also coincide with the chipset flush timed out message.
Can you please check whether this is also the case for the other test runs
(using the other dmesgs you've got lying around)? Just compare the
"expected: xxx" value on each of the three backtraces.
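
If comparing the values by eye gets tedious, the check can be scripted. The
following is only a rough sketch: it assumes each failure report in a saved
dmesg contains the text "expected: " followed by a single value, and the
helper names are made up for illustration. The real message layout may
differ, so adjust the pattern to whatever your logs actually show.

```python
import re

# Assumption: every failure report contains "expected: <value>", where the
# value is one run of letters, digits or underscores (e.g. a hex number).
EXPECTED_RE = re.compile(r"expected:\s*(\w+)")


def expected_values(dmesg_text):
    """Return every "expected:" value in the order it appears in the log."""
    return EXPECTED_RE.findall(dmesg_text)


def same_flush(dmesg_text):
    """True if the log has at least one failure and all of its failures
    carry the same "expected:" value."""
    values = expected_values(dmesg_text)
    return len(values) > 0 and len(set(values)) == 1
```

Running same_flush() on the contents of each saved dmesg file shows at a
glance which runs have all their failures on a single value.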

This is strange because my code only gives up on the _current_ chipset
flush and doesn't bother to report any further timeouts. It still executes
all chipset flushes and still reports failed ones. So if your hw only ever
reports one failure where everything fails (timeout + paranoia check
failures in both directions) and never fails again, this would be
_very_ strange indeed.

> I am worried about this fact that our hardware, apparently the same, is not
> showing same behaviour...my .config is here:

I've compared our configs and tried changing a few relevant options to your
settings. I still can't reproduce your failures.