Filesystems end up remounted read-only

Bug #515937 reported by William Grant
This bug affects 14 people
Affects          Status    Importance  Assigned to  Milestone
linux (Ubuntu)   Invalid   Medium      Unassigned

Bug Description

Sometimes (occasionally after resume, but also after just being left alone for a while) my /home LV will become read-only.

This is a kern.log snippet from a case where I'd left it alone for a few minutes:

[...snip irrelevant boot stuff...]
Feb 2 18:37:18 magrathea kernel: [ 107.120041] wlan0: no IPv6 routers present
Feb 2 20:07:21 magrathea kernel: [ 5510.246995] Monitor-Mwait will be used to enter C-3 state
Feb 2 20:07:21 magrathea kernel: [ 5510.260250] thinkpad_acpi: EC reports that Thermal Table has changed
Feb 2 20:07:21 magrathea kernel: [ 5510.271591] CPU0 attaching NULL sched-domain.
Feb 2 20:07:21 magrathea kernel: [ 5510.271596] CPU1 attaching NULL sched-domain.
Feb 2 20:07:21 magrathea kernel: [ 5510.300364] ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
Feb 2 20:07:21 magrathea kernel: [ 5510.300367] ata1.00: irq_stat 0x00400000, PHY RDY changed
Feb 2 20:07:21 magrathea kernel: [ 5510.300370] ata1: SError: { PHYRdyChg CommWake }
Feb 2 20:07:21 magrathea kernel: [ 5510.300373] ata1.00: failed command: FLUSH CACHE EXT
Feb 2 20:07:21 magrathea kernel: [ 5510.300377] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Feb 2 20:07:21 magrathea kernel: [ 5510.300378] res 50/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error)
Feb 2 20:07:21 magrathea kernel: [ 5510.300380] ata1.00: status: { DRDY }
Feb 2 20:07:21 magrathea kernel: [ 5510.300383] ata1: hard resetting link
Feb 2 20:07:21 magrathea kernel: [ 5510.311958] CPU0 attaching sched-domain:
Feb 2 20:07:21 magrathea kernel: [ 5510.311966] domain 0: span 0-1 level MC
Feb 2 20:07:21 magrathea kernel: [ 5510.311973] groups: 0 1
Feb 2 20:07:21 magrathea kernel: [ 5510.311985] CPU1 attaching sched-domain:
Feb 2 20:07:21 magrathea kernel: [ 5510.311990] domain 0: span 0-1 level MC
Feb 2 20:07:21 magrathea kernel: [ 5510.311996] groups: 1 0
Feb 2 20:07:22 magrathea kernel: [ 5511.050162] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Feb 2 20:07:22 magrathea kernel: [ 5511.052417] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Feb 2 20:07:22 magrathea kernel: [ 5511.052427] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Feb 2 20:07:22 magrathea kernel: [ 5511.052436] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
Feb 2 20:07:22 magrathea kernel: [ 5511.056971] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
Feb 2 20:07:22 magrathea kernel: [ 5511.056980] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
Feb 2 20:07:22 magrathea kernel: [ 5511.056988] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out
Feb 2 20:07:22 magrathea kernel: [ 5511.058819] ata1.00: configured for UDMA/133
Feb 2 20:07:22 magrathea kernel: [ 5511.058829] ata1.00: device reported invalid CHS sector 0
Feb 2 20:07:22 magrathea kernel: [ 5511.064408] ata1.00: configured for UDMA/133
Feb 2 20:07:22 magrathea kernel: [ 5511.064421] end_request: I/O error, dev sda, sector 0
Feb 2 20:07:22 magrathea kernel: [ 5511.064448] ata1: EH complete
Feb 2 20:07:22 magrathea kernel: [ 5511.119184] Aborting journal on device dm-1-8.
Feb 2 20:07:22 magrathea kernel: [ 5511.130652] journal commit I/O error
Feb 2 20:07:22 magrathea kernel: [ 5511.130798] EXT4-fs error (device dm-1): ext4_journal_start_sb: Detected aborted journal
Feb 2 20:07:22 magrathea kernel: [ 5511.130808] EXT4-fs (dm-1): Remounting filesystem read-only
Feb 2 20:07:22 magrathea kernel: [ 5511.131189] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-30]
Feb 2 20:07:22 magrathea kernel: [ 5511.131198] ecryptfs_write_end: Error encrypting page (upper index [0x0000000000000002])
[...snip lots of eCryptFS error repeats ...]

I think I first noticed this about three weeks ago. I've been rebooting to fix it.
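
To check whether other occurrences follow the same pattern, the log can be scanned for a failed FLUSH CACHE EXT that is followed shortly by a read-only remount. This is a minimal sketch, assuming the message formats shown in the snippet above; the function name `find_flush_remounts` and the 5-second `window` are illustrative choices, not part of any existing tool.

```python
import re

# Patterns copied from the kern.log snippet in this report.
FLUSH_FAIL = re.compile(r"(ata\d+\.\d+): failed command: FLUSH CACHE EXT")
REMOUNT_RO = re.compile(r"EXT4-fs \((\S+)\): Remounting filesystem read-only")
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def find_flush_remounts(lines, window=5.0):
    """Pair each failed FLUSH CACHE EXT with a read-only remount that
    follows within `window` seconds of kernel uptime (assumed threshold)."""
    events = []
    last_flush = None  # (uptime, ATA port) of the most recent failed flush
    for line in lines:
        ts = TIMESTAMP.search(line)
        if ts is None:
            continue  # no kernel timestamp on this line
        uptime = float(ts.group(1))
        flush = FLUSH_FAIL.search(line)
        if flush:
            last_flush = (uptime, flush.group(1))
            continue
        remount = REMOUNT_RO.search(line)
        if remount and last_flush and uptime - last_flush[0] <= window:
            events.append({
                "port": last_flush[1],
                "device": remount.group(1),
                "delay": round(uptime - last_flush[0], 3),
            })
            last_flush = None  # each failed flush is paired at most once
    return events
```

Passing it the lines of /var/log/kern.log (for example `find_flush_remounts(open("/var/log/kern.log"))`) returns one entry per matched pair, which makes it easier to see how often the remount is preceded by an ATA error.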

ProblemType: Bug
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: wgrant 3837 F.... pulseaudio
 /dev/snd/pcmC0D0p: wgrant 3837 F...m pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfc020000 irq 17'
   Mixer name : 'Conexant CX20561 (Hermosa)'
   Components : 'HDA:14f15051,17aa211c,00100000 HDA:14f12c06,17aa2122,00100000'
   Controls : 14
   Simple ctrls : 7
Date: Tue Feb 2 20:15:17 2010
DistroRelease: Ubuntu 10.04
EcryptfsInUse: Yes
Frequency: Once every few days.
HibernationDevice: RESUME=UUID=3fb818ec-307a-4a71-9b87-7d53e226fcae
InstallationMedia: Error: [Errno 13] Permission denied: '/var/log/installer/media-info'
MachineType: LENOVO 2764CTO
Package: linux-image-2.6.32-12-generic 2.6.32-12.16
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-12-generic root=/dev/mapper/hostname-root ro quiet splash
ProcEnviron:
 LANG=en_AU.utf8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-12.16-generic
Regression: Yes
RelatedPackageVersions: linux-firmware 1.28
Reproducible: No
SourcePackage: linux
TestedUpstream: No
Uname: Linux 2.6.32-12-generic x86_64
dmi.bios.date: 04/22/2009
dmi.bios.vendor: LENOVO
dmi.bios.version: 7VET66WW (2.16 )
dmi.board.name: 2764CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7VET66WW(2.16):bd04/22/2009:svnLENOVO:pn2764CTO:pvrThinkPadT400:rvnLENOVO:rn2764CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 2764CTO
dmi.product.version: ThinkPad T400
dmi.sys.vendor: LENOVO

William Grant (wgrant) wrote :
Andy Whitcroft (apw)
Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
Chase Douglas (chasedouglas) wrote :

This looks similar to an issue followed on this linux-kernel thread:

http://marc.info/?t=123795843900001&r=1&w=2

Please open a bug in the kernel bugzilla (http://bugzilla.kernel.org) to track this issue. When a fix is found upstream we may be able to pull the fix into Ubuntu.

Thanks

Changed in linux (Ubuntu):
status: Triaged → Incomplete
importance: High → Medium
GUmeR (marcin-bogdanski) wrote :

I have exactly the same issue, also Lenovo laptop :/

Did you open a bug on the kernel bugzilla? If so, what is the bug number?

For now I have added the following boot options: acpi=off noapic nolapic
source: https://bugzilla.redhat.com/show_bug.cgi?id=462425#c80

But to be honest I have no idea whether that will have any effect.

Emily Wind (emilywind) wrote :

I have the same issue as well, except I have an ASUS laptop running 64bit Ubuntu with an ext4 partition. Would be glorious to see this fixed, although it is hard to tell what has caused it to be broken.

Emily Wind (emilywind) wrote :

The kernel version I am using at the moment is the standard Ubuntu build of 2.6.32-22.

This error occurs about once a day during normal use, which for me means running a word processor, Empathy and other standard Ubuntu desktop applications, and sometimes a torrent client. It most commonly happens while I am watching a video, which involves constant data reading and probably increases the chance of hitting the conditions that trigger this bug.

Several occurrences ago I also replaced 'errors=remount-ro' in the fstab entry for my root partition with 'defaults', and tune2fs says the error behaviour should be to continue, but the filesystem still gets remounted read-only, which makes it look like a deep-rooted kernel issue. I would love to see this resolved soon, so please let me know if I can provide any further data to help.
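
For anyone who wants to confirm what state the filesystem actually ended up in, the live mount table can be checked for read-only ext filesystems together with any errors= option that is listed. This is only a sketch, assuming the usual /proc/mounts layout (device, mount point, type, comma-separated options); the helper name `read_only_mounts` is made up for illustration.

```python
def read_only_mounts(mounts_text):
    """Return (mount point, errors= policy) for each ext* filesystem that
    is currently mounted read-only. The policy is None when the mount
    table shows no errors= option for that filesystem."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed mount table line
        _device, mount_point, fstype, options = fields[:4]
        opts = options.split(",")
        if not fstype.startswith("ext") or "ro" not in opts:
            continue
        policy = next(
            (o.split("=", 1)[1] for o in opts if o.startswith("errors=")),
            None,
        )
        result.append((mount_point, policy))
    return result
```

On an affected machine it could be called as `read_only_mounts(open("/proc/mounts").read())` right after the problem occurs; an entry for the affected mount point confirms the remount, and the second value shows which error policy, if any, the kernel reports for it.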

Emily Wind (emilywind) wrote :

It seems this bug is related to https://bugs.launchpad.net/ubuntu/+source/linux/+bug/346691, which seems to affect some 64bit kernel releases and not others, apparently at random. This would explain why the reports about this on the Ubuntu forums date back as far as 2008. If the developers looked for a pattern in the patches applied to the affected kernels, that would likely be a good start.

I think the reason this recently started affecting me a lot might be the latest kernel (2.6.32-22). I did not have the issue at all with kernel 2.6.32-21, as GUmeR reports, so reverting to that is the best bet for avoiding this issue for now. Cheers.

GUmeR (marcin-bogdanski) wrote :

Here is the thread on the Ubuntu forums about this bug: http://ubuntuforums.org/showthread.php?t=1475124. Here is the bug report on the kernel bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16006. It looks like it was fixed in a newer kernel. I didn't check it, though.

Stuart (stuartneilson) wrote :

This does sound possible - is this commit in a mainline Ubuntu kernel, and if so, which?

Quote from https://bugzilla.kernel.org/show_bug.cgi?id=16006

Comment #4 From Tejun Heo 2010-05-19 14:40:31 -------

FLUSH_EXT timed out which shouldn't happen but can. libata as of 2.6.32
doesn't retry after any FLUSH failure and just returns the error to upper layer
leading to fs ro remounting the device. The reason for the behavior is that
FLUSH failure often indicates (abort by device always does) data loss so
continuing RW operation is likely to cause massive filesystem corruption. As
the behavior caused some spurious failures like this, EH was updated to
distinguish between various FLUSH failure modes and retry unless it's certain
that the device aborted it. So, in short, please upgrade to newer kernel or
tell your distro to backport the update.

Thanks.
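
The change described in that comment can be pictured as a difference in retry policy. The following is a toy model only, not libata code: the failure categories, the function names and the `retries_left` budget are all assumptions made for illustration.

```python
from enum import Enum


class FlushFailure(Enum):
    """Hypothetical failure categories for a FLUSH command."""
    TIMEOUT = "timeout"
    BUS_ERROR = "bus error"
    DEVICE_ABORT = "device abort"


def retry_before_fix(failure):
    """2.6.32 behaviour as described: never retry a failed FLUSH, so the
    error goes to the upper layer and the filesystem goes read-only."""
    return False


def retry_after_fix(failure, retries_left):
    """Updated behaviour as described: retry unless it is certain that
    the device itself aborted the FLUSH (which indicates data loss).
    The retry budget is an assumption of this sketch."""
    if failure is FlushFailure.DEVICE_ABORT:
        return False
    return retries_left > 0
```

Under this model, the bus error in the original report would be retried after the fix instead of being passed straight up to ext4.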

Stuart (stuartneilson) wrote :

Since 19 May I have been using kernel 2.6.34 (http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.34-lucid/) with Lucid and kernel 2.6.33 (http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.33/) with Karmic, and have not had a single freeze or failure to resume from suspend on either system.

tags: removed: regression-potential
alberich (alberich) wrote :

Hello,

I experience that behaviour about once or twice a day with one of my three 64bit machines (always the same). The other two are unaffected, although I run the same Ubuntu 12.04.1 LTS on all machines. The affected one is an i7-2700K CPU @ 3.50GHz. However, among the unaffected is an i7-2670QM CPU @ 2.20GHz. The SMART status of the HD is ok.

When this bus error happens, I cannot even run sudo shutdown and have to do a hard reset, which is quite unsatisfying.

al

alberich (alberich) wrote :

Hello,
I migrated to debian squeeze on the affected machine. The bug remains.
al

William Grant (wgrant)
Changed in linux (Ubuntu):
status: Incomplete → Invalid