Comment 102 for bug 550559

Whit Blauvelt (whit-launchpad) wrote :

Found and fixed the problem. In my case the kernel was trying to read the spot where a secondary (backup) GUID partition table (GPT) would be, and there was a bad sector there: the second-to-last sector on the disk, outside all partitions. Except that my system has no GPT at all, primary or secondary; it uses a standard old msdos partition table, and parted has no problem seeing that. The kernel is another story. Some bright programmer decided that it should not only check for a primary GPT but, even after finding none and even after booting from the msdos partition table, keep obsessing over a bad sector where the secondary GPT would be if the system had one, reading that sector again and again and crippling the system for no good reason at all.
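
For anyone who wants to check whether their own bad sector sits where a backup GPT would live, here is a rough Python sketch. It assumes the common GPT layout (backup header in the very last sector, preceded by 128 entries of 128 bytes each) and 512-byte sectors; the function names are mine, not anything from the kernel, and I have not checked this against the kernel's actual probing code.

```python
def backup_gpt_region(total_sectors, entry_count=128, entry_size=128, sector_size=512):
    """Return (first_lba, last_lba) of the area a backup GPT would occupy.

    Assumes the usual layout: the backup header is the last sector on the
    disk and the backup partition entry array sits directly before it.
    """
    # Sectors needed for the entry array, rounded up (32 with the defaults).
    entry_sectors = (entry_count * entry_size + sector_size - 1) // sector_size
    header_lba = total_sectors - 1
    return header_lba - entry_sectors, header_lba


def in_backup_gpt_region(lba, total_sectors):
    """True if the given sector falls inside the would-be backup GPT area."""
    first, last = backup_gpt_region(total_sectors)
    return first <= lba <= last


# Toy disk with 1000 sectors: the second-to-last sector is LBA 998.
print(backup_gpt_region(1000))          # (967, 999)
print(in_backup_gpt_region(998, 1000))  # True
```

The total sector count can be read with something like "blockdev --getsz /dev/sda", which reports the size in 512-byte sectors.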

Anyway, "hdparm --repair-sector ######## --yes-i-know-what-i-am-doing /dev/sda" (with ######## being the number of the bad sector) totally fixed the problem.

In summary, GPT support has only shown up in more recent kernels. So if the other READ FPDMA QUEUED bug reports are right that more recent kernels are the ones implicated, there's some chance this brain-dead GPT code has bitten more people than me. The question to ask: does the sector you're having trouble with lie outside your partitions, before the first one or after the last? If so, that would explain why running e2fsck -c isn't going to fix it (e2fsck only tests blocks inside a filesystem), and why the problem will persist even if your disk looks clean. As it turns out, you can confirm the bad sector with smartctl, and it was folks on the mailing list for smartctl who pointed me towards the GPT problem, which of course I wasn't looking for because the system in question has never had a GPT.
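
To answer the "inside or outside my partitions" question without squinting at numbers, a small Python sketch like the following may help. It is only an illustration: the partition boundaries are made-up values that you would replace with the start and end sectors from "parted /dev/sda unit s print" or "fdisk -l", and the function name is hypothetical.

```python
def locate_sector(lba, partitions):
    """Say where a sector falls relative to a list of partitions.

    partitions is a list of (start_sector, end_sector) pairs, both inclusive,
    as listed by parted or fdisk when asked for sector units.
    """
    if not partitions:
        return "outside all partitions (none given)"
    for index, (start, end) in enumerate(partitions, start=1):
        if start <= lba <= end:
            return f"inside partition {index}"
    if lba < min(start for start, _ in partitions):
        return "before the first partition"
    if lba > max(end for _, end in partitions):
        return "after the last partition"
    return "in a gap between partitions"


# Made-up example layout; substitute your own numbers.
example_partitions = [(2048, 499711), (499712, 976771071)]
print(locate_sector(976773166, example_partitions))  # after the last partition
print(locate_sector(3000, example_partitions))       # inside partition 1
```

Anything other than "inside partition N" means a filesystem check on that partition will never touch the sector.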