Comment 50 for bug 541511

legolas558 (legolas558) wrote:

(In reply to comment #48)
> > --- Comment #42 from legolas558 <email address hidden> 2010-03-26 14:49:10 PST ---
> > From my dmesg logs:
> > ~~ session1 - v6 patch
> > [ 79.983513] i8xx chipset flush failed, expected: 5807, cpu_read: 5806
> > [ 79.983771] i8xx chipset flush failed, expected: 5807, gtt_read: 5806
> > ~~ session2 - v6 patch
> > [ 101.807650] i8xx chipset flush failed, expected: 14194, cpu_read: 14193
> > [ 101.807844] i8xx chipset flush failed, expected: 14194, gtt_read: 14193
> > ~~ session3 - v5 patch
> > [ 2832.905107] i8xx chipset flush failed, expected: 113457, cpu_read: 113456
> > [ 2832.905315] i8xx chipset flush failed, expected: 113457, gtt_read: 113456
> > [ 2910.626579] i8xx chipset flush failed, expected: 215361, cpu_read: 215360
> > [ 2910.626872] i8xx chipset flush failed, expected: 215361, gtt_read: 215360
> > [ 2977.424469] i8xx chipset flush failed, expected: 308976, cpu_read: 308975
> > [ 2977.424746] i8xx chipset flush failed, expected: 308976, gtt_read: 308975
>
In session3, the patch I labelled v5 might actually have been v2.

> Yet again I was totally blind. All your failed flushes report an actual
> value that's only one off from the expected one. But since v2 I'm moving
> around the check value on the check page, so each position is only used
> every 1024th cache flush. Which means that if the flush doesn't work and
> the old value is still there, it should be "expected_value - 1024".
>
Well, I hope this will be useful for improving the patch.
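
To make the point about the rotating check value concrete, here is a rough sketch in Python (purely illustrative, not part of any patch). Only the figure of 1024 positions comes from the description quoted above; the function name and the three labels are made up for this example. It sorts each reported pair into "new value seen", "old value still there" (expected - 1024) or "neither".

```python
ROTATION = 1024  # check-page positions, per the description quoted above


def classify_flush(expected, observed, rotation=ROTATION):
    """Label one 'chipset flush failed' report (the labels are hypothetical)."""
    if observed == expected:
        return "ok"
    if observed == expected - rotation:
        # The value written 1024 flushes ago is still in place.
        return "stale"
    # Neither the new value nor the old one for this position.
    return "unexplained"


# (expected, cpu_read) pairs copied from the dmesg lines in comment #42
reports = [
    (5807, 5806),
    (14194, 14193),
    (113457, 113456),
    (215361, 215360),
    (308976, 308975),
]

for expected, observed in reports:
    print(expected, observed, observed - expected,
          classify_flush(expected, observed))
```

With the pairs from my logs, none of the reads matches either the expected value or expected - 1024, which is the oddity pointed out above.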

> Furthermore your system seems to be the only one where chipset flushes
> fail in pairs (always both directions in the same chipset flush). I
> haven't seen this on any other dmesg neither by me nor by any other
> tester.
>
Yes, I admit I have been feeling lonely recently... it would be nice to find someone else with my exact hardware.

My only custom option for the intel driver in xorg.conf is:
Option "XvMC" "true"

but I doubt this could be relevant.

> In other words I highly suspect that something is (very rarely) corrupting
> the last two bits of a 4 byte block. This would also explain why the
> correct value never shows up, even after extensive gtt whacking.
>
I have tried booting with acpi=off, but it seems that KMS depends on ACPI. The only explanations I can think of are that some ACPI event or an IRQ "gone wild" is causing the corruption, or that there is a broken GTT memory cell, as you hypothesized.
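
In case it helps when looking at more logs, below is a small Python sketch (again only an illustration) that pulls the numbers out of the "chipset flush failed" lines and shows which bits differ between the expected and the read value. The regular expression is written against the line format seen in my dmesg output; the function name is invented for this example.

```python
import re

# Matches the tail of lines such as:
#   i8xx chipset flush failed, expected: 5807, cpu_read: 5806
LINE_RE = re.compile(r"expected: (\d+), (cpu_read|gtt_read): (\d+)")


def differing_bits(line):
    """Return (direction, delta, xor_mask) for one log line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    expected = int(m.group(1))
    direction = m.group(2)
    observed = int(m.group(3))
    # delta is the arithmetic difference, xor_mask marks the bits that differ
    return direction, observed - expected, expected ^ observed


log = """\
[ 79.983513] i8xx chipset flush failed, expected: 5807, cpu_read: 5806
[ 101.807650] i8xx chipset flush failed, expected: 14194, cpu_read: 14193
[ 2977.424469] i8xx chipset flush failed, expected: 308976, cpu_read: 308975
"""

for line in log.splitlines():
    direction, delta, mask = differing_bits(line)
    print(f"{direction}: delta={delta:+d} xor={mask:#b}")
```

The delta column shows the arithmetic difference and the xor column shows the individual bits involved, so both views can be compared against the corruption hypothesis.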

> Please test your box with memtest86+. If that doesn't turn anything up
> I'll write a testpatch (memtest86+ doesn't check the gtt, wherein the
> problem might be, too).
>
I completed 2 passes of memtest86+ v4.00 (ECC off) and no errors were found in my 503M of RAM (I suppose the missing memory is shadowed). So the corruption might lie in the GTT area, but I don't know how to test that... and, if I have understood correctly, the i855GM does not make this kind of consistency check easy. I am waiting for your testpatch, because unfortunately I am far from being able to write such a GTT testpatch myself.