Consistent repeating [ata1: link is slow to respond, please be patient ]

Bug #297058 reported by Ramon Buckland
This bug affects 36 people
Affects: linux (Ubuntu)
Status: Won't Fix
Importance: Undecided
Assigned to: Miguel A. Alvarado V.

Bug Description

I have a Dell Latitude D630.

With Ubuntu 8.04 and now with 8.10 I consistently receive the following error message(s).

Currently I am running a Hitachi 80GB SATA drive. I have also tried a Seagate 80GB and another Hitachi 80GB hard drive, and all of them exhibit the problem.

Nov 12 12:01:57 itasca kernel: [ 1226.128187] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:02:02 itasca kernel: [ 1231.112167] ata1: device not ready (errno=-16), forcing hardreset
Nov 12 12:02:02 itasca kernel: [ 1231.112191] ata1: soft resetting link
Nov 12 12:02:02 itasca kernel: [ 1231.292629] ata1.00: configured for PIO3
Nov 12 12:02:02 itasca kernel: [ 1231.292662] ata1: EH complete
Nov 12 12:08:27 itasca kernel: [ 1616.132171] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:08:32 itasca kernel: [ 1621.116175] ata1: device not ready (errno=-16), forcing hardreset
Nov 12 12:08:32 itasca kernel: [ 1621.116199] ata1: soft resetting link
Nov 12 12:08:32 itasca kernel: [ 1621.296651] ata1.00: configured for PIO3
Nov 12 12:08:32 itasca kernel: [ 1621.296684] ata1: EH complete
Nov 12 12:11:43 itasca kernel: [ 1812.128172] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:11:48 itasca kernel: [ 1817.112176] ata1: device not ready (errno=-16), forcing hardreset
Nov 12 12:11:48 itasca kernel: [ 1817.112202] ata1: soft resetting link
Nov 12 12:11:48 itasca kernel: [ 1817.292446] ata1.00: configured for PIO3
Nov 12 12:11:48 itasca kernel: [ 1817.292462] ata1: EH complete
Nov 12 12:13:03 itasca kernel: [ 1892.128178] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:13:08 itasca kernel: [ 1897.120164] ata1: device not ready (errno=-16), forcing hardreset
Nov 12 12:13:08 itasca kernel: [ 1897.120187] ata1: soft resetting link
Nov 12 12:13:08 itasca kernel: [ 1897.300640] ata1.00: configured for PIO3
Nov 12 12:13:08 itasca kernel: [ 1897.300671] ata1: EH complete
Nov 12 12:15:11 itasca kernel: [ 2020.069207] ata1.00: limiting speed to PIO0
Nov 12 12:15:16 itasca kernel: [ 2025.108176] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:15:21 itasca kernel: [ 2030.092169] ata1: device not ready (errno=-16), forcing hardreset
Nov 12 12:15:21 itasca kernel: [ 2030.092194] ata1: soft resetting link
Nov 12 12:15:21 itasca kernel: [ 2030.272660] ata1.00: configured for PIO0
Nov 12 12:15:21 itasca kernel: [ 2030.272690] ata1: EH complete

The message repeats every few minutes; in the excerpt above the gaps range from about 80 seconds to 6.5 minutes.
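
To put numbers on the repeat interval, the kernel timestamps in square brackets can be subtracted from each other. The following is only a minimal Python sketch and not part of the original report: the function name slow_link_gaps is made up for illustration, and the regular expression assumes the syslog layout shown in the excerpt above.

```python
import re

# Captures the kernel uptime stamp in front of a "link is slow to respond" line.
SLOW_LINK = re.compile(r"\[\s*(\d+\.\d+)\]\s+ata\d+: link is slow to respond")


def slow_link_gaps(log_text):
    """Return the gaps, in seconds, between consecutive slow-link messages."""
    stamps = [float(m.group(1)) for m in SLOW_LINK.finditer(log_text)]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# The five slow-link lines from the excerpt in the bug description.
SAMPLE = """\
Nov 12 12:01:57 itasca kernel: [ 1226.128187] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:08:27 itasca kernel: [ 1616.132171] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:11:43 itasca kernel: [ 1812.128172] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:13:03 itasca kernel: [ 1892.128178] ata1: link is slow to respond, please be patient (ready=0)
Nov 12 12:15:16 itasca kernel: [ 2025.108176] ata1: link is slow to respond, please be patient (ready=0)
"""

if __name__ == "__main__":
    print([round(gap) for gap in slow_link_gaps(SAMPLE)])  # [390, 196, 80, 133]
```

Feeding a longer syslog extract to the same function would show whether the interval settles on a fixed period or keeps drifting.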

I hoped that upgrading to 8.10 would resolve the issue, but it has not: I see the same problem as when I ran 8.04.

rbuckland@itasca:~$ uname -a
Linux itasca 2.6.27-7-generic #1 SMP Tue Nov 4 19:33:06 UTC 2008 x86_64 GNU/Linux

rbuckland@itasca:~$ cat /proc/version
Linux version 2.6.27-7-generic (buildd@crested) (gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) ) #1 SMP Tue Nov 4 19:33:06 UTC 2008

rbuckland@itasca:~$ lspci
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HEM (ICH8M) LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) IDE Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
03:01.0 CardBus bridge: O2 Micro, Inc. Cardbus bridge (rev 21)
03:01.4 FireWire (IEEE 1394): O2 Micro, Inc. Firewire (IEEE 1394) (rev 02)
09:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755M Gigabit Ethernet PCI Express (rev 02)
0c:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection (rev 61)

There does not seem to be any consensus on what causes this issue or how to resolve it. There are a lot of similarities to other reported bugs.

Things I have tried:
- changing the controller mode in the BIOS from ATA to AHCI
- upgrading the BIOS to the latest from Dell (July 2008)
- keeping a CD in the drive

Regards
Ramon

ProblemType: Bug
Architecture: amd64
DistroRelease: Ubuntu 8.10
Package: linux-image-2.6.27-7-generic 2.6.27-7.16
ProcCmdLine: root=LABEL=ithaca ro quiet splash
ProcEnviron:
 PATH=/home/username/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/opt/jdk/bin:/opt/apache-maven/bin:/opt/eclipse
 LANG=en_AU.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.27-7.16-generic
SourcePackage: linux

Tags: apport-bug
Kamil Wilczek (kamil-van-wilczek) wrote :

Hello,

I have a very similar problem: when I shut down or reset the system (Ubuntu 8.10 AMD64), it often (but not always) takes a minute or two, with similar messages. I do not see them again and again, though, only once or twice, something like this:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: link is slow to respond, please be patient (ready=0)
ata1: device not ready (errno=-16), forcing hardreset
ata1: soft resetting link
ata1: nv_mode_filter: 0x701f&0x701f->0x701f, BIOS=0x7000 (0xc000c600) ACPI=0x701f (60:900:0x11)
ata1.00: configured for UDMA/33
ata1: EH complete
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: link is slow to respond, please be patient (ready=0)
ata1: device not ready (errno=-16), forcing hardreset
ata1: soft resetting link
ata1: nv_mode_filter: 0x701f&0x701f->0x701f, BIOS=0x7000 (0xc000c600) ACPI=0x701f (60:900:0x11)
ata1.00: configured for UDMA/33
ata1: EH complete
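
The "errno=-16" in the "device not ready" lines is a negated kernel error number. As a rough sketch (not from the original comment), Python's standard errno module can translate it. The helper name describe_errno is hypothetical, and the result assumes the script runs on Linux, where the numbering matches the kernel's.

```python
import errno
import os


def describe_errno(value):
    """Map a kernel-style negative errno (e.g. -16) to (symbolic name, message)."""
    code = abs(value)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # -16 is the value printed in the "device not ready" lines above.
    print(-16, *describe_errno(-16))
```

On Linux this reports EBUSY for 16, i.e. the port was still reporting busy when the wait for the device timed out.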

I'm also using a Hitachi hard drive. My hardware is an ASUS A6m-Q024 notebook. Output of uname -a:

Linux kamil-laptop 2.6.27-7-generic #1 SMP Tue Nov 4 19:33:06 UTC 2008 x86_64 GNU/Linux

I'm also attaching some info.

Greetings!

Kamil Wilczek (kamil-van-wilczek) wrote :

Syslog.txt covers only the last hour; here are dmesg and lspci as well.

Giordano Battilana (jordan83) wrote :

This bug affects me too.

I have an Asus EEE Box that I use as a home server (with Ubuntu Server 8.10).
Sometimes the server stops working, I think because it cannot access the hard drive (Seagate ST9160310AS).

I got this impression from the fact that services such as apache and ssh respond to user input but they cannot proceed beyond the point where they have to load data from the hd.
For example, ssh asks me for the password but cannot proceed beyond that.

After the reboot I always read messages like these:

Mar 25 02:23:40 fatso -- MARK --
Mar 25 02:43:40 fatso -- MARK --
Mar 25 02:45:01 fatso kernel: [208911.040236] ata1: link is slow to respond, please be patient (ready=0)
Mar 25 02:45:06 fatso kernel: [208916.022648] ata1: device not ready (errno=-16), forcing hardreset
Mar 25 02:45:06 fatso kernel: [208916.022683] ata1: soft resetting link
Mar 25 02:45:07 fatso kernel: [208917.301094] ata1.00: configured for UDMA/133
Mar 25 02:45:07 fatso kernel: [208917.301157] ata1: EH complete
Mar 25 02:45:07 fatso kernel: [208917.302249] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Mar 25 02:45:07 fatso kernel: [208917.302688] sd 0:0:0:0: [sda] Write Protect is off
Mar 25 02:45:48 fatso kernel: [208917.303605] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 25 02:45:48 fatso kernel: [208952.360177] ata1: link is slow to respond, please be patient (ready=0)
Mar 25 02:45:48 fatso kernel: [208957.350166] ata1: device not ready (errno=-16), forcing hardreset
Mar 25 02:45:48 fatso kernel: [208957.350200] ata1: soft resetting link
Mar 25 02:45:48 fatso kernel: [208957.963604] ata1.00: configured for UDMA/133
Mar 25 02:45:48 fatso kernel: [208957.963662] ata1: EH complete
Mar 25 02:45:48 fatso kernel: [208957.975045] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Mar 25 02:45:48 fatso kernel: [208957.975179] sd 0:0:0:0: [sda] Write Protect is off
Mar 25 02:45:48 fatso kernel: [208957.975405] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 25 02:46:23 fatso kernel: [208993.010175] ata1: link is slow to respond, please be patient (ready=0)
Mar 25 02:46:28 fatso kernel: [208997.990181] ata1: device not ready (errno=-16), forcing hardreset
Mar 25 02:46:28 fatso kernel: [208997.990213] ata1: soft resetting link
Mar 25 02:46:28 fatso kernel: [208998.391096] ata1.00: configured for UDMA/133
Mar 25 02:46:28 fatso kernel: [208998.391159] ata1: EH complete
Mar 25 02:46:28 fatso kernel: [208998.392151] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Mar 25 02:46:28 fatso kernel: [208998.392575] sd 0:0:0:0: [sda] Write Protect is off
Mar 25 02:46:28 fatso kernel: [208998.393564] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 25 02:56:20 fatso kernel: [209589.702645] ata1: link is slow to respond, please be patient (ready=0)
Mar 25 02:56:26 fatso kernel: [209594.680132] ata1: device not ready (errno=-16), forcing hardreset
Mar 25 02:56:26 fatso kernel: [209594.680164] ata1: soft resetting link
Mar 25 02:56:26 fatso kernel: [209596.061012] ata1.00: configured for UDMA/133
Mar 25 02...


Wolm (torben-wolm) wrote :

I have the exact same issue with an Asus EEE Box, also with Ubuntu Server 8.10.

Sometimes the server runs for months before this happens, other times only a single day. I have two identical boxes, but only one of them (the busiest) shows this behaviour. The busy one has an external USB hard disk attached, if that matters.

Does anyone have an idea what to look for? I cannot see anything special in the messages log, other than the "link is slow to respond" messages suddenly appearing. It usually happens in the morning (around 7:32).

Giordano Battilana (jordan83) wrote :

Any news about this bug?
I found some posts (like this one http://ubuntuforums.org/archive/index.php/t-956040.htm ) where people claim that the "irqpoll" kernel option might be the solution to this issue.
I tested it but the problem did not disappear :-\
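
When testing a boot option such as "irqpoll", it helps to confirm that the running kernel was really started with it. Below is a small, hedged Python sketch. The helper names are invented for illustration, /proc/cmdline is the standard Linux location of the running kernel's command line, and the sample string is the ProcCmdLine value from the original report.

```python
def has_kernel_option(cmdline, option):
    """Return True if `option` appears as a whole token (or as option=value)."""
    for token in cmdline.split():
        if token == option or token.startswith(option + "="):
            return True
    return False


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # ProcCmdLine from the bug description; on a live system use running_cmdline().
    reported = "root=LABEL=ithaca ro quiet splash"
    print(has_kernel_option(reported, "irqpoll"))
```

Matching whole tokens avoids false positives from options that merely contain the search string.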

Leann Ogasawara (leannogasawara) wrote :

Hi Ramon,

Since you are the original bug reporter, would you be willing to test the latest Karmic 9.10 Alpha release and confirm if this issue remains? ISO CD images can be found at http://cdimage.ubuntu.com/releases/karmic/ . Please let us know your results. If the issue remains, can you attach an updated dmesg output? Thanks!

Changed in linux (Ubuntu):
status: New → Incomplete
Wolm (torben-wolm) wrote :

I stopped using the USB drive (and taking backups!) and have not had any breakdowns since. But I'm not sure whether the USB drive was the problem or whether it was caused by the massive disk activity during rsync. Needless to say, stopping backups is not a good solution, but I do not dare crash the system over and over again because of this problem.

I can only test it on production systems as I have no spare EEE boxes, so I'm not that happy to test the Alpha-version...

But when the release version of 9.10 comes out, I will test it again.

Thanks.

Giordano Battilana (jordan83) wrote :

I would like to add that one of the latest Ubuntu 9.04 kernel upgrades *apparently* solved the issue, at least for me.
My EEE Box has been running flawlessly for about 2 months now (maybe more) and I am confident that the problem has gone.

$ uname -a
Linux fatso 2.6.28-15-server #49-Ubuntu SMP Tue Aug 18 19:30:06 UTC 2009 i686 GNU/Linux

I'll let you know if the error pops up again.

Ramon Buckland (ramon-thebuckland) wrote : Re: [Bug 297058] Re: Consistent repeating [ata1: link is slow to respond, please be patient ]

Hi Leann,

I am downloading it this evening and will take a look. I'll let you know.

Ramon


Giordano Battilana (jordan83) wrote :

Today my EEE Box stopped responding to my inputs.
After a forced reboot I had a look in the logs, expecting the usual error messages.
Instead I found this error line only in /var/log/syslog:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen

I don't know if this is related to this bug report but I suspect so...

Cheers

Giordano Battilana (jordan83) wrote :

The problem is definitely there again :(((
2-3 months of non-stop operation of the EEE Box fooled me into thinking that the problem was gone.
I might try the beta version of Karmic; if I do, I'll let you know whether the issue is gone (which I very much hope.. :(( )

Cheers

Wolm (torben-wolm) wrote : Re: [Bug 297058] Re: Consistent repeating [ata1: link is slow to respond, please be patient ]

Thanks for the info! It is a really annoying problem...

Hopefully it is gone in the next version.

Thanks,
Torben


mkis62 (mihaikx62) wrote :

Same here on an Acer TravelMate with the latest Karmic 9.10, updated daily.
Hoping for lucky updates?

Ricardo Teixeira (ricardo-ctrler) wrote :

I had my disk set to sleep with hdparm -S 12 (60 seconds). It has no activity (and it's unmounted), so it should stay asleep for a very long time. Even so, the disk is constantly waking up (every 15 minutes or so) and then sleeping again, and I get these errors in dmesg:

[160245.988251] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[160245.988266] ata1.00: cmd b0/d1:01:00:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[160245.988268] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[160245.988273] ata1.00: status: { DRDY }
[160246.492039] ata1: soft resetting link
[160246.673890] ata1.00: configured for UDMA/100
[160246.673926] ata1: EH complete
[162046.000069] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[162046.000085] ata1.00: cmd b0/d1:01:00:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[162046.000087] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[162046.000092] ata1.00: status: { DRDY }
[162046.508034] ata1: soft resetting link
[162046.689875] ata1.00: configured for UDMA/100
[162046.689912] ata1: EH complete

The strange thing is that the disk wakes up, gives this error, then goes back to sleep almost immediately, in about 10 seconds, not respecting the 60-second standby timeout set with hdparm.
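
For reference, the value passed to hdparm -S is an encoded timeout rather than a plain number of seconds. The sketch below converts it according to the encoding described in the hdparm man page. The function name standby_timeout_seconds is hypothetical, and drives are free to interpret some values differently.

```python
def standby_timeout_seconds(value):
    """Translate an `hdparm -S` value into seconds, per the hdparm man page.

    Returns None where no fixed number of seconds applies
    (0 = timeout disabled, 253 = vendor-defined, 254 = reserved).
    """
    if not 0 <= value <= 255:
        raise ValueError("hdparm -S takes a value from 0 to 255")
    if 1 <= value <= 240:
        return value * 5                 # units of 5 seconds
    if 241 <= value <= 251:
        return (value - 240) * 30 * 60   # units of 30 minutes
    if value == 252:
        return 21 * 60                   # 21 minutes
    if value == 255:
        return 21 * 60 + 15              # 21 minutes plus 15 seconds
    return None


if __name__ == "__main__":
    print(standby_timeout_seconds(12))  # the value used in the comment above
```

With -S 12 this gives 60 seconds, matching the comment. Separately, going by their kernel timestamps, the two exceptions in the dmesg excerpt are about 1800 seconds apart.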

John Talbot (jwtalbot) wrote :

Affecting me too!...
;(

John Talbot (jwtalbot) wrote :

Forgot to add..
Latest Karmic as of today!

Wolm (torben-wolm) wrote :

So. It just happened again. My server crashed. This time I am sure it
has nothing to do with the USB drive I had since it is no longer attached.

It seems to be some unfortunate timing of a kernel(?) problem and
heavy disk use.

I just suddenly get these messages in the log:

Oct 23 00:56:13 matrix kernel: [14573759.262982] ata1: link is slow to respond, please be patient (ready=0)
Oct 23 00:56:13 matrix kernel: [14573764.242683] ata1: device not ready (errno=-16), forcing hardreset
Oct 23 00:56:13 matrix kernel: [14573764.242721] ata1: soft resetting link
Oct 23 00:56:13 matrix kernel: [14573765.081129] ata1.00: configured for UDMA/133
Oct 23 00:56:13 matrix kernel: [14573765.081188] ata1: EH complete
Oct 23 00:56:13 matrix kernel: [14573765.082422] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Oct 23 00:56:13 matrix kernel: [14573765.126583] sd 0:0:0:0: [sda] Write Protect is off
Oct 23 00:56:53 matrix kernel: [14573765.127506] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

These just repeat until about 01:19, and then it goes quiet until a final log entry at 7:54, when the server finally crashes (it just stops responding to network requests, the keyboard, and so on).

I just checked the kern.log, which has a lot of entries of:

Oct 23 00:54:12 matrix kernel: [14573754.220270] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Oct 23 00:56:13 matrix kernel: [14573754.220348] ata1.00: cmd ca/00:50:14:9f:8d/00:00:00:00:00/e1 tag 0 dma 40960 out
Oct 23 00:56:13 matrix kernel: [14573754.220352] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Oct 23 00:56:13 matrix kernel: [14573754.220465] ata1.00: status: { DRDY }
Oct 23 00:56:13 matrix kernel: [14573759.262982] ata1: link is slow to respond, please be patient (ready=0)
Oct 23 00:56:13 matrix kernel: [14573764.242683] ata1: device not ready (errno=-16), forcing hardreset
Oct 23 00:56:13 matrix kernel: [14573764.242721] ata1: soft resetting link
Oct 23 00:56:13 matrix kernel: [14573765.081129] ata1.00: configured for UDMA/133
Oct 23 00:56:13 matrix kernel: [14573765.081188] ata1: EH complete
Oct 23 00:56:13 matrix kernel: [14573765.082422] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Oct 23 00:56:13 matrix kernel: [14573765.126583] sd 0:0:0:0: [sda] Write Protect is off
Oct 23 00:56:13 matrix kernel: [14573765.126598] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 23 00:56:53 matrix kernel: [14573765.127506] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

This adds some more info about an exception?
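
The "cmd" and "res" lines of such an exception carry the ATA registers of the failed command and of the device's reply. The following Python sketch splits them into named fields. It is an illustration only: the field order is an assumption based on how libata's error report is laid out (command/feature:nsect:lbal:lbam:lbah/hob fields/device), the function name parse_taskfile is made up, and the opcode table covers just the commands that appear in this thread.

```python
import re

FIELD_NAMES = ["command", "feature", "nsect", "lbal", "lbam", "lbah",
               "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah",
               "device"]

# ATA opcodes seen in this thread (not a complete list).
KNOWN_COMMANDS = {0xA0: "PACKET", 0xB0: "SMART", 0xCA: "WRITE DMA",
                  0x61: "WRITE FPDMA QUEUED"}

# Twelve two-digit hex registers separated by '/' or ':'.
TASKFILE = re.compile(r"\b(cmd|res) ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def parse_taskfile(line):
    """Return the registers of a libata 'cmd' or 'res' line as a dict, or None."""
    match = TASKFILE.search(line)
    if match is None:
        return None
    kind, raw = match.groups()
    values = [int(part, 16) for part in re.split(r"[/:]", raw)]
    fields = dict(zip(FIELD_NAMES, values))
    if kind == "res":
        # In a result, the first two registers read back as status and error.
        fields["status"] = fields.pop("command")
        fields["error"] = fields.pop("feature")
    fields["kind"] = kind
    return fields


if __name__ == "__main__":
    line = "ata1.00: cmd ca/00:50:14:9f:8d/00:00:00:00:00/e1 tag 0 dma 40960 out"
    tf = parse_taskfile(line)
    print(KNOWN_COMMANDS.get(tf["command"], "unknown"), tf["nsect"] * 512)
```

For the line above this yields WRITE DMA with a sector count of 0x50, i.e. 80 sectors of 512 bytes, which agrees with the "dma 40960 out" at the end of the same log line. In other words, the command that timed out here was an ordinary write.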

Searching for these entries turns up a lot of people reporting the same problem.

And probably a solution: http://ubuntuforums.org/showthread.php?t=1145513
(The guy on that post wonders why there hasn't been many reports on this issue...)

Also:
https://bugzilla.redhat.com/show_bug.cgi?id=462425
https://bugzilla.redhat.com/show_bug.cgi?id=404851
http://lkml.org/lkml/2008/11/9/22
http://forums.fedoraforum.org/showthread.php?t=219746

I'm running kernel 2.6.27-11-server. Someone suggests running kernel-rt instead:

https://bugs.launchpad.net/ubuntu/+sour...


evuraan (evuraan) wrote :

Affects me too. I thought it was my PATA 2.5" HDD and replaced it twice. It still happens on my third drive.

[ 6015.274312] ata1.00: status: { DRDY ERR }
[ 6015.274316] ata1.00: error: { ABRT }
[ 6015.274399] ata1.00: NODEV after polling detection
[ 6015.274403] ata1.00: revalidation failed (errno=-2)
[ 6015.274407] ata1: failed to recover some devices, retrying in 5 secs
[ 6017.775854] ata1: soft resetting link
[ 6017.898195] ata1.00: configured for UDMA/100
[ 6017.898219] ata1: EH complete
[ 6017.915044] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
[ 6017.915188] sd 0:0:0:0: [sda] Write Protect is off
[ 6017.915192] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 6017.915555] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 6059.238436] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 6059.238445] ata1.00: BMDMA stat 0x65
[ 6059.238453] ata1.00: cmd ca/00:08:f1:02:0c/00:00:00:00:00/e0 tag 0 dma 4096 out
[ 6059.238455] res 51/04:08:71:02:0c/04:01:18:19:3f/60 Emask 0x1 (device error)
[

Christian Niederreiter (cndg) wrote :

This bug has been annoying me since Ubuntu 7.04 and periodically affects accesses to the hard drive (my computer: Samsung R65 notebook). Until 9.04 I was able to avoid it by disabling CD drive polling (sudo hal-disable-polling --device /dev/sr0), so the CD drive seems to influence the occurrence, but since I installed Karmic (yesterday) this no longer has any effect.

Nov 9 01:05:10 r65 kernel: [ 6597.046016] ata1: link is slow to respond, please be patient (ready=0)
Nov 9 01:05:15 r65 kernel: [ 6602.029064] ata1: device not ready (errno=-16), forcing hardreset
Nov 9 01:05:15 r65 kernel: [ 6602.029080] ata1: soft resetting link
Nov 9 01:05:15 r65 kernel: [ 6602.233383] ata1.00: configured for UDMA/100
Nov 9 01:05:15 r65 kernel: [ 6602.249391] ata1.01: configured for UDMA/33
Nov 9 01:05:15 r65 kernel: [ 6602.257800] ata1: EH complete

Christian Niederreiter (cndg) wrote :

Update: I managed to install a new firmware version as described in https://bugs.launchpad.net/linux/+bug/75295/comments/97 . Since my last reboot, hard disk accesses have not frozen any more. So this might have been sufficient.

Wolm (torben-wolm) wrote :

I'm now running 2.6.27-15-server, and the error is still there.

My server crashed November 23rd, November 25th, November 26th and November 29th.

So upgrading the kernel actually made it worse!

A Linux Server crashing every other day...

headlessspider (headlessspider) wrote :

running 9.10 linux kernel 2.6.31-16-generic
hard drive is 160 gig wd drive (laptop)

my system just "pauses" like it was considering the secrets of the universe and after anywhere from 5 - 45 seconds just continues with what it was doing. syslog and messages excerpts below. the drive is about two months old.

syslog:
Dec 17 08:10:00 ren kernel: [519506.000132] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 17 08:10:00 ren kernel: [519506.000162] ata1.01: cmd a0/00:00:00:08:00/00:00:00:00:00/b0 tag 0 pio 16392 in
Dec 17 08:10:00 ren kernel: [519506.000166] cdb 4a 01 00 00 10 00 00 00 08 00 00 00 00 00 00 00
Dec 17 08:10:00 ren kernel: [519506.000170] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
Dec 17 08:10:00 ren kernel: [519506.000178] ata1.01: status: { DRDY }
Dec 17 08:10:00 ren kernel: [519511.040076] ata1: link is slow to respond, please be patient (ready=0)
Dec 17 08:10:00 ren kernel: [519516.024087] ata1: device not ready (errno=-16), forcing hardreset
Dec 17 08:10:00 ren kernel: [519516.024106] ata1: soft resetting link
Dec 17 08:10:00 ren kernel: [519516.257478] ata1.00: configured for UDMA/100
Dec 17 08:10:00 ren kernel: [519516.288399] ata1.01: configured for UDMA/33
Dec 17 08:10:00 ren kernel: [519516.296848] ata1: EH complete

messages:
Dec 17 08:10:00 ren kernel: [519511.040076] ata1: link is slow to respond, please be patient (ready=0)
Dec 17 08:10:00 ren kernel: [519516.024087] ata1: device not ready (errno=-16), forcing hardreset
Dec 17 08:10:00 ren kernel: [519516.024106] ata1: soft resetting link
Dec 17 08:10:00 ren kernel: [519516.257478] ata1.00: configured for UDMA/100
Dec 17 08:10:00 ren kernel: [519516.288399] ata1.01: configured for UDMA/33
Dec 17 08:10:00 ren kernel: [519516.296848] ata1: EH complete

giorgio_fornara (giorgio-fornara) wrote :

Same as headlessspider on:
asus a8J
karmic 9.10
kernel 2.6.31-17-generic
Adding the option "irqpoll" to GRUB changes NOTHING; the freezes are still the same.

The syslog is below:

Dec 21 19:04:05 my2912071352 kernel: [ 410.000241] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 21 19:04:05 my2912071352 kernel: [ 410.000264] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
Dec 21 19:04:05 my2912071352 kernel: [ 410.000267] cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dec 21 19:04:05 my2912071352 kernel: [ 410.000270] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
Dec 21 19:04:05 my2912071352 kernel: [ 410.000277] ata1.01: status: { DRDY }
Dec 21 19:04:10 my2912071352 kernel: [ 415.040118] ata1: link is slow to respond, please be patient (ready=0)
Dec 21 19:04:15 my2912071352 kernel: [ 420.024123] ata1: device not ready (errno=-16), forcing hardreset
Dec 21 19:04:15 my2912071352 kernel: [ 420.024138] ata1: soft resetting link
Dec 21 19:04:16 my2912071352 kernel: [ 420.284569] ata1.00: configured for UDMA/100
Dec 21 19:04:16 my2912071352 kernel: [ 420.316407] ata1.01: configured for UDMA/33
Dec 21 19:04:16 my2912071352 kernel: [ 420.324541] ata1: EH complete
Dec 21 19:04:46 my2912071352 kernel: [ 450.989301] ata1.01: limiting speed to UDMA/25:PIO4
Dec 21 19:04:46 my2912071352 kernel: [ 450.989311] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 21 19:04:46 my2912071352 kernel: [ 450.989332] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
Dec 21 19:04:46 my2912071352 kernel: [ 450.989335] cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Dec 21 19:04:46 my2912071352 kernel: [ 450.989337] res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
Dec 21 19:04:46 my2912071352 kernel: [ 450.989345] ata1.01: status: { DRDY }
Dec 21 19:04:51 my2912071352 kernel: [ 456.028118] ata1: link is slow to respond, please be patient (ready=0)
Dec 21 19:04:56 my2912071352 kernel: [ 461.012120] ata1: device not ready (errno=-16), forcing hardreset
Dec 21 19:04:56 my2912071352 kernel: [ 461.012135] ata1: soft resetting link
Dec 21 19:04:57 my2912071352 kernel: [ 461.216567] ata1.00: configured for UDMA/100
Dec 21 19:04:57 my2912071352 kernel: [ 461.248405] ata1.01: configured for UDMA/25
Dec 21 19:04:57 my2912071352 kernel: [ 461.256540] ata1: EH complete

giorgio_fornara (giorgio-fornara) wrote :

linux RT is not a solution for my laptop. see bug #279693.

Fredrik Wahlberg (wahlis) wrote :

I have the exact same problem. Asus EEE Box 202 with a Seagate ST9160310AS. Every 6-8 weeks for the last year I've had a crash. There is no pattern to it and it does not seem to be connected to heavy loads.

headlessspider (headlessspider) wrote :

fyi. i just updated to linux kernel 2.6.31-17-generic and the system still "pauses".

not sure if this is because i'm using ext4 now.

giorgio_fornara (giorgio-fornara) wrote :

With today's latest update the laptop is no longer experiencing HD freezes,
maybe thanks to a temporarily good combination of kernel and other modules.
Anyway, NO kernel update has occurred since the last freezes, only updates to other modules.
Hope it stays stable...
linux 2.6.31-9-rt
using ext3 filesystem

giorgio_fornara (giorgio-fornara) wrote :

The freezes continue....
A very stupid possible solution is to leave a CD in the CD drive: it seems to work, no freezes occur afterwards.
I saw this idea in another post on the same issue.
But why? Doesn't this information give the developers any clues?

Revision history for this message
chuck-dtol (chuck-colford) wrote :

My apologies in advance if I'm violating any protocols - I'm a bit of a newbie on Linux, but a long-time geek. I may have some info that is useful. If not - you may disregard.

I've been running Ubuntu on my server since version 6. I switched to Ubuntu on all my 3 desktop systems last fall and upgraded to 9.10. Key for me was success with VirtualBox so I could migrate over with a few of my old windows Apps intact. All worked well. About a week before Christmas (2009), I had a hard disk corruption on my root ext3 partition. It was a real pain, but I had full backups and did a restore. I became paranoid and learned how to read and watch the logs. My faith in Linux was somewhat shaken. It didn't seem to be a hard drive failure.

Alas - I had another corruption a week ago. In my logs - I had the dreaded pattern discussed above. My sample (edited for brevity):

kernel: warning: `VirtualBox' uses 32-bit capabilities (legacy support in use)
kernel: device eth0 entered promiscuous mode
kernel: ata1: EH in SWNCQ mode,QC:qc_active 0xFFFC0 sactive 0xFFFC0
kernel: ata1: SWNCQ:qc_active 0x40 defer_bits 0xFFF80 last_issue_tag 0x6
kernel: dhfis 0x40 dmafis 0x0 sdbfis 0x0
kernel: ata1: ATA_REG 0x51 ERR_REG 0x4
kernel: ata1: tag : dhfis dmafis sdbfis sacitve
kernel: ata1: tag 0x6: 1 0 0 1
kernel: ata1.00: exception Emask 0x1 SAct 0xfffc0 SErr 0x0 action 0x6 frozen
kernel: ata1.00: Ata error. fis:0x41
kernel: ata1.00: cmd 61/08:30:5f:4e:77/00:00:1a:00:00/40 tag 6 ncq 4096 out
kernel: res 51/04:08:5f:4e:77/04:00:1a:00:00/40 Emask 0x1 (device error)
kernel: ata1.00: status: { DRDY ERR }
kernel: ata1.00: error: { ABRT }
...
kernel: ata1.00: status: { DRDY }
kernel: ata1: hard resetting link
kernel: ata1: nv: skipping hardreset on occupied port
kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
kernel: ata1.00: configured for UDMA/133
kernel: ata1: EH complete

I have been experimenting to get to the bottom of this so I can trust my filesystem. I have tried 3 different SATA drives, different SATA cables, a different PSU and even a different plugin SATA controller. I still see these SATA link errors on my dual Core AMD 64 bit system.

What I have found (on my system) is a strong correlation between these SATA errors and the use of VirtualBox (I'm using VirtualBox version PUEL v3.1.2). My logs are quiet for many days - until I start doing moderate IO on my Win32 XP Guest OS. Then I see SATA errors on my Linux (Dual Core AMD 64 bit) host. I've seen this on Kernels 2.6.31-17-generic x86_64 and back to 2.6.31-14.

By any chance - are those of you with this problem running VirtualBox and noticing this host SATA problem occurs or gets aggravated when you are doing file IO in the guest? If so - you are not alone. Again - sorry if this info is unhelpful.

Revision history for this message
headlessspider (headlessspider) wrote :

chuck: nope i don't run virtualbox. so i do not think that is the root of this.

Revision history for this message
chuck-dtol (chuck-colford) wrote :

Headlessspider: Thanks. I guess I'm looking for a different needle in the haystack. Good luck finding yours.

Revision history for this message
headlessspider (headlessspider) wrote :

chuck: well, it helps also. we know it's related to disk access and it doesn't matter if it's a sata drive or an ide drive. that's something. so i do think the problem is kernel-based. thanks and good luck with yours too.

Revision history for this message
Ralph (rbroom) wrote :

Same problems, hardware: Dell Inspiron 2200 (ATA disk). I replaced a drive because of the error and the new drive reports the same thing. I noticed the problems starting when I switched to Karmic (9.10).

SYSLOG excerpt:

Feb 15 16:24:09 foo kernel: [186477.800141] ata1.01: qc timeout (cmd 0xa0)
Feb 15 16:24:09 foo kernel: [186477.800161] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Feb 15 16:24:09 foo kernel: [186477.800177] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
Feb 15 16:24:09 foo kernel: [186477.800179] cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Feb 15 16:24:09 foo kernel: [186477.800180] res 51/20:03:00:00:00/00:00:00:00:00/b0 Emask 0x5 (timeout)
Feb 15 16:24:09 foo kernel: [186477.800185] ata1.01: status: { DRDY ERR }
Feb 15 16:24:09 foo kernel: [186477.800301] ata1: soft resetting link
Feb 15 16:24:09 foo kernel: [186478.030296] ata1.00: configured for UDMA/100
Feb 15 16:24:09 foo kernel: [186478.060395] ata1.01: configured for UDMA/25
Feb 15 16:24:09 foo kernel: [186478.060674] ata1: EH complete

A few lines from LSPCI:

00:1f.0 ISA bridge: Intel Corporation 82801FBM (ICH6M) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)

I tried adding "options libata noacpi=1" to /etc/modprobe.d/options.conf (per some discussion in the forums) but it did not resolve the issue.

Let me know if further data would be helpful.

Revision history for this message
zilog (gnuffel) wrote :

I found this errata for the ICH8 chip: http://www.intel.com/Assets/PDF/specupdate/313057.pdf (see errata #19). I have set my system not to use MSI (pci=nomsi on the kernel boot cmdline). Hopefully this solves it until a kernel fix is made.

I have just made this change to my system, and will probably need a few weeks to see if it really fixes anything.
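For anyone trying the same workaround, it helps to confirm that the option really reached the running kernel. The snippet below is a minimal sketch, assuming a Linux system where the boot command line is exposed in /proc/cmdline; the helper name `has_kernel_param` is made up for this illustration and is not part of any existing tool.

```python
def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token on a kernel command line.

    `cmdline` is the text of /proc/cmdline (or any string in the same
    space-separated format). Matching whole tokens avoids false hits on
    longer options that merely start with the same characters.
    """
    return param in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path, encoding="ascii", errors="replace") as fh:
        return fh.read().strip()


# Typical use on the affected machine (not executed here):
#   print(has_kernel_param(read_cmdline(), "pci=nomsi"))
```

If this reports False after a reboot, the boot loader configuration was probably not regenerated, and any "fix" observed since then cannot be attributed to the option.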

Revision history for this message
giorgio_fornara (giorgio-fornara) wrote :

upgraded to Linux version 2.6.31-20-generic.
ATA freezes continue.
Still using the "inserted CD" workaround.

Revision history for this message
Wolm (torben-wolm) wrote :

I again had three crashes over the last few days.

Running 2.6.27-17-server SMP on the eeebox, and thus it is not possible to try the inserted CD workaround...

Revision history for this message
Tom Walsh (ymmothslaw) wrote :

Same problem here as headlessspider in comment #26 -- computer freezes, some interaction is possible (mouse cursors change)... after 30 seconds or so, everything's back to normal. Usually happens about a minute or two after I "unsuspend" the computer.

Samsung NP-130 netbook.

$ lspci

00:00.0 Host bridge: Intel Corporation Mobile 945GME Express Memory Controller Hub (rev 03)
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GME Express Integrated Graphics Controller (rev 03)
00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02)
02:00.0 Network controller: Realtek Semiconductor Co., Ltd. Device 8192 (rev 01)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8101E/RTL8102E PCI Express Fast Ethernet controller (rev 02)

$ tail -18 /var/log/syslog

Mar 14 22:45:02 dinky wpa_supplicant[908]: CTRL-EVENT-SCAN-RESULTS
Mar 14 22:46:02 dinky wpa_supplicant[908]: CTRL-EVENT-SCAN-RESULTS
Mar 14 22:47:22 dinky wpa_supplicant[908]: CTRL-EVENT-SCAN-RESULTS
Mar 14 22:47:36 dinky kernel: [103609.000609] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 14 22:47:36 dinky kernel: [103609.000651] ata1.00: cmd ca/00:08:fa:d5:a1/00:00:00:00:00/eb tag 0 dma 4096 out
Mar 14 22:47:36 dinky kernel: [103609.000658] res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
Mar 14 22:47:36 dinky kernel: [103609.000673] ata1.00: status: { DRDY }
Mar 14 22:47:41 dinky kernel: [103614.040291] ata1: link is slow to respond, please be patient (ready=0)
Mar 14 22:47:46 dinky kernel: [103619.024318] ata1: device not ready (errno=-16), forcing hardreset
Mar 14 22:47:46 dinky kernel: [103619.024343] ata1: soft resetting link
Mar 14 22:47:46 dinky kernel: [103619.205328] ata1.00: configured for UDMA/133
Mar 14 22:47:46 dinky kernel: [103619.205350] ata1.00: device reported invalid CHS sector 0
Mar 14 22:47:46 dinky kernel: [103619.205385] ata1: EH complete
Mar 14 22:49:02 dinky wpa_supplicant[908]: CTRL-EVENT-SCAN-RESULTS
Mar 14 22:51:02 dinky wpa_supplicant[908]: CTRL-EVENT-SCAN-RE...

Revision history for this message
Charis Kouzinopoulos (charis) wrote :

Same problem here: the computer freezes periodically for 10-30 seconds each time, with this message in syslog:

[13377.952020] ata3.01: qc timeout (cmd 0xa0)
[13382.992510] ata3: link is slow to respond, please be patient (ready=0)
[13387.992012] ata3: device not ready (errno=-16), forcing hardreset
[13387.992019] ata3: soft resetting link
[13388.180795] ata3.00: configured for UDMA/133
[13388.196701] ata3.01: configured for UDMA/33
[13388.198995] ata3: EH complete
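The length of one stall can be read straight off the kernel timestamps in an excerpt like the one above. The following is only a rough sketch: `stall_seconds` is a name invented here, and it assumes every line carries the usual `[seconds.microseconds]` prefix.

```python
import re

# Kernel log timestamp such as "[13377.952020]" or "[ 450.989337]".
TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def stall_seconds(lines):
    """Return the time span (in seconds) covered by the timestamped lines."""
    stamps = []
    for line in lines:
        match = TS_RE.search(line)
        if match:
            stamps.append(float(match.group(1)))
    if not stamps:
        return 0.0
    return max(stamps) - min(stamps)


excerpt = [
    "[13377.952020] ata3.01: qc timeout (cmd 0xa0)",
    "[13382.992510] ata3: link is slow to respond, please be patient (ready=0)",
    "[13387.992012] ata3: device not ready (errno=-16), forcing hardreset",
    "[13387.992019] ata3: soft resetting link",
    "[13388.180795] ata3.00: configured for UDMA/133",
    "[13388.196701] ata3.01: configured for UDMA/33",
    "[13388.198995] ata3: EH complete",
]
print(round(stall_seconds(excerpt), 3))
```

For this excerpt the span from the first timeout line to "EH complete" is a little over 10 seconds, which is consistent with the low end of the reported 10-30 second freezes. The stall itself starts earlier, since the command had already been waiting before the timeout was logged.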

Revision history for this message
Charis Kouzinopoulos (charis) wrote :

Dmesg.log

Revision history for this message
Charis Kouzinopoulos (charis) wrote :

uname -a

Revision history for this message
Charis Kouzinopoulos (charis) wrote :

lspci output

Revision history for this message
Rob Jacobson (rob104) wrote :

Please forgive this intrusion, but on reading through this entire thread, I could not but help seeing how little good information must be out there, to help users interpret exceptions correctly. And I thought that perhaps I could offer a little guidance and a few suggestions, that hopefully will improve your own diagnostic efforts, as well as improve the issue reporting quality, and thereby possibly improve the general stability of the kernel and device code. I am not an expert, but have read a lot of syslogs, and tried to help a number of users.

An exception is just the report of something that appears unusual, could be nothing, or could be a symptom of something wrong. An exception handler has kicked in, and will try to report as much as it can, and may also attempt a few actions to resolve the issue, if it appears warranted. The reporting is a sequence of lines that start with a line beginning with "exception", includes various lines with additional information about the issue, and ends with a line with "EH complete" (Error Handler is finished). If there were error flags reported to it, then a verbose version of those flags will be listed. If there were SATA link errors (SErr is non-zero), then they will also be expanded in the following lines. A great resource for these is http://ata.wiki.kernel.org/index.php/Libata_error_messages.
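The report structure described above (an "exception" line, detail lines, then "EH complete") can be sketched as a small log grouper. This is only an illustrative sketch: the function name `group_eh_reports` and the regular expressions are assumptions made for this example, are not part of libata or any existing tool, and have only been matched against the line shapes quoted in this thread.

```python
import re

# "ata1: ..." or "ata1.00: ..." followed by the message text.
PORT_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: (.*)")
# Continuation lines without a port prefix, e.g. "res 40/00:..." or "cdb 00 00 ...".
CONT_RE = re.compile(r"\b(?:res|cdb) [0-9a-f]{2}[/ ]")


def group_eh_reports(lines):
    """Group log lines into reports spanning "exception" .. "EH complete".

    Returns a list of reports; each report is the list of lines belonging
    to one error-handler run on one port. Unfinished reports are dropped.
    """
    reports = []        # finished reports, in order of completion
    open_by_port = {}   # port name -> lines collected so far
    last_port = None    # port that most recently added a line to an open report
    for line in lines:
        match = PORT_RE.search(line)
        if match:
            port, msg = match.group(1), match.group(2)
            if msg.startswith("exception") and port not in open_by_port:
                open_by_port[port] = []
            if port in open_by_port:
                open_by_port[port].append(line)
                last_port = port
                if msg.startswith("EH complete"):
                    reports.append(open_by_port.pop(port))
                    last_port = None
        elif last_port is not None and CONT_RE.search(line):
            # "res"/"cdb" lines belong to the report that was just being written.
            open_by_port[last_port].append(line)
    return reports
```

Feeding it the lines of a syslog or dmesg dump yields one list per error-handler run, which makes it easier to post a single complete report instead of a fragment.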

So any exception is analogous to hearing an unusual noise from your car. Something may be wrong, but you need more data, and possibly an experienced mechanic to interpret whatever symptoms you have detected. Some of the messages are exactly what they sound like. For example, "link is slow to respond" and "timeout" and "frozen" just mean that a response did not occur within the normal time frame. They aren't bugs, just symptoms, an indication that something may be wrong. Analogy: your car unexpectedly feels sluggish, not responding as quickly as usual.

Unfortunately, many of the reports above do not have any errors reported, only symptoms of 'sluggishness' or a loss of communications. Something may very well be wrong, but it is not obvious from these reports, and there are a *lot* of very different causes. It could be the device itself (bad media, buggy firmware, too hot, etc), could be the cabling or connections (bad cable, bad or loose connectors, loose backplane, faulty power splitter, etc), could be the controller chipset, could be over-heated chipsets, could be power issues in the device, could be general power issues, could be a mis-configured device, could be incompatible hardware, could be a buggy 'driver' module, could even be bad memory, etc.

A last tip, the single most common (in my experience) issue, and the easiest to fix, is faulty cables. If you see the word ICRC and/or BadCRC within the error handler exception report, then replacing the cable with a good quality cable will (I believe) fix over 80% of these exceptions (perhaps over 95%). I doubt there is a more common reason for RMA'ing drives wrongly, than drive exceptions that actually were caused by bad cables. The next easy fix is check for loose connections in both the data and power cables and any splitters used, and in ...
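The ICRC/BadCRC check mentioned in this tip is easy to automate. Below is a minimal sketch, assuming the flags appear inside `{ ... }` blocks the way they do in the excerpts in this thread (for example `error: { ICRC ABRT }`); the name `cable_flags` is invented for this illustration.

```python
import re

# Any "{ FLAG FLAG ... }" block printed by the error handler.
FLAGS_RE = re.compile(r"\{([^}]*)\}")
# Flags that, per the tip above, most often point at cabling.
CABLE_FLAGS = {"ICRC", "BadCRC"}


def cable_flags(lines):
    """Return the set of CRC-related flags found in the given log lines."""
    found = set()
    for line in lines:
        for block in FLAGS_RE.findall(line):
            found.update(CABLE_FLAGS.intersection(block.split()))
    return found
```

An empty result does not prove the cable is good; it only means this particular hint is absent from the lines that were checked.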

Revision history for this message
Chris Le Sueur (thefishface) wrote :

Exactly the same as Tom Walsh: I get the same message as headlessspider, i.e. a timeout and then reset, during which time there is a level of unresponsiveness. Also exactly like Tom, this occurs fairly reliably a short while after unsuspending or powering on - last time it occurred there was a 4 minute delay between unsuspending and the timeout being reported in syslog. I won't post the output of dmesg unless someone wants it because it's pretty much the same as spider's, with the exception of the following line just before the EH finishes:

ata1.00: device reported invalid CHS sector 0

I am running the Lucid release candidate, 2.6.32-21-generic, the machine is a Samsung N140 netbook, and the disk is a Seagate ST9250315AS.

Revision history for this message
Acetone (hormone) wrote :

I changed the jumper from "master" to "CS" and the problem has been fixed.

However, I also disconnected the CD/DVD drive.

Revision history for this message
Ralph (rbroom) wrote :

Follow-up on my earlier post. The problem does NOT occur if the CD tray is ejected. The CD drive is still connected and available; the tray is just ejected. I've no idea why this would happen.

Revision history for this message
Ralph (rbroom) wrote :

Reconfirmed with 10.04 (Dell Inspiron 2200), kernel 2.6.32-24-generic.

The behavior is a system-freeze (I/O wait, actually) until the reset, after which all is OK. SMART indicates a healthy drive (I get this with multiple drives) and passes all self-tests.

As this is a laptop I'm limited in terms of replacing cables or changing drive-select options. I still feel this is a kernel/driver issue, and would welcome input from someone familiar with what's going on at this level.

Log:
Sep 3 13:18:26 foo kernel: [36907.836314] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 3 13:18:26 foo kernel: [36907.836323] sr 0:0:1:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Sep 3 13:18:26 foo kernel: [36907.836342] ata1.01: cmd a0/00:00:00:08:00/00:00:00:00:00/b0 tag 0 pio 16392 in
Sep 3 13:18:26 foo kernel: [36907.836344] res 51/20:03:00:00:00/00:00:00:00:00/b0 Emask 0x1 (device error)
Sep 3 13:18:26 foo kernel: [36907.836348] ata1.01: status: { DRDY ERR }
Sep 3 13:18:28 foo bluetoothd[1322]: Unable to add connection 44
Sep 3 13:18:31 foo kernel: [36912.836148] ata1.01: qc timeout (cmd 0xa0)
Sep 3 13:18:31 foo kernel: [36912.836168] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Sep 3 13:18:31 foo kernel: [36912.836177] sr 0:0:1:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Sep 3 13:18:31 foo kernel: [36912.836196] ata1.01: cmd a0/00:00:00:08:00/00:00:00:00:00/b0 tag 0 pio 16392 in
Sep 3 13:18:31 foo kernel: [36912.836198] res 51/20:03:00:00:00/00:00:00:00:00/b0 Emask 0x5 (timeout)
Sep 3 13:18:31 foo kernel: [36912.836202] ata1.01: status: { DRDY ERR }
Sep 3 13:18:36 foo kernel: [36917.880058] ata1: link is slow to respond, please be patient (ready=0)
Sep 3 13:18:41 foo kernel: [36922.864034] ata1: device not ready (errno=-16), forcing hardreset
Sep 3 13:18:41 foo kernel: [36922.864127] ata1: soft resetting link
Sep 3 13:18:41 foo kernel: [36923.105522] ata1.00: configured for UDMA/100
Sep 3 13:18:41 foo kernel: [36923.121459] ata1.01: configured for UDMA/25
Sep 3 13:18:41 foo kernel: [36923.121820] ata1: EH complete

Revision history for this message
headlessspider (headlessspider) wrote :

i agree with your assessment that it's a kernel/driver issue because it was
working properly when i started using ubuntu.

Revision history for this message
Don Rhummy (donrhummy) wrote :

I'm getting this same issue. Has anyone solved this?

Here's all the info I can think of:

1. I have two HD's, both set as cable select

2. I have a DVD/CD reader/writer drive

3. My drives are PATA

4. I'm getting slightly different error messages with reg. startup vs fail safe mode (I get the errors on every boot, as well as during running):

REGULAR STARTUP
---------------------------
[ 295.816174] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 295.816196] ata1.00: failed command: WRITE DMA
[ 295.816225] ata1.00: cmd ca/00:08:00:50:7f/00:00:00:00:00/e3 tag 0 dma 4096 out
[ 295.816232] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 295.816246] ata1.00: status: { DRDY }

FAIL SAFE STARTUP
---------------------------
[ 295.816174] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 295.816196] ata1.00: failed command: WRITE DMA
[ 295.816225] ata1.00: cmd ca/00:08:00:50:7f/00:00:00:00:00/e3 tag 0 dma 4096 out
[ 295.816232] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 295.816246] ata1.00: status: { DRDY }
[ 300.856096] ata1: link is slow to respond, please be patient (ready=0)
[ 305.844090] ata1: device not ready (errno=-16), forcing hardreset
[ 305.844114] ata1: soft resetting link
[ 306.024895] ata1.00: configured for UDMA/133
[ 306.024898] ata1.00: configured for UDMA/100
[ 306.024913] ata1.00: device reported invalid CHS sector 0
[ 306.024942] ata1: EH complete

Revision history for this message
Ramon Buckland (ramon-thebuckland) wrote :

Try disconnecting the CD/DVD drive (if it is a laptop with a removable drive, etc.).
Most people seem to have success with removing it. (Of course, that is not the
solution, but it would help the team(s) to know that this is a cause.)

r.

Revision history for this message
Ralph (rbroom) wrote :

Ok, I've solved this on my machine, but you aren't going to like it.

Removing the drive (or simply leaving the tray open) avoids the problem, but that just narrowed it down to the optical drive and isn't a fix.

I ended up updating the firmware on the optical drive, and the problem has not happened in four weeks.

I have a Dell laptop, and was checking for BIOS or other updates on the Dell support site. I saw a firmware update for the Philips optical drive from a few years ago and saw I was not up to date. After much grinding of teeth (the update process assumes Windows) I got it done. No obvious changes to performance, and now this problem is no longer happening on my Inspiron.

So if there's an update for your drive, give it a shot. I'm sorry this won't apply to everyone, but I hope it helps.

Revision history for this message
Kone (kone) wrote :

I also had similar errors in /var/log/messages:

Nov 16 23:30:22 hostname kernel: res 51/84:af:69:cc:f6/84:00:04:00:00/e0 Emask 0x10 (ATA bus error)
Nov 16 23:30:22 hostname kernel: ata1.00: status: { DRDY ERR }
Nov 16 23:30:22 hostname kernel: ata1.00: error: { ICRC ABRT }
Nov 16 23:30:22 hostname kernel: ata1: soft resetting link
Nov 16 23:30:23 hostname kernel: ata1.00: configured for UDMA/33
Nov 16 23:30:23 hostname kernel: ata1.01: configured for UDMA/100
Nov 16 23:30:23 hostname kernel: ata1: EH complete
Nov 16 23:30:23 hostname kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
Nov 16 23:30:23 hostname kernel: ata1.00: BMDMA stat 0x65
Nov 16 23:30:23 hostname kernel: ata1.00: failed command: READ DMA EXT
Nov 16 23:30:23 hostname kernel: ata1.00: cmd 25/00:30:e8:cb:f6/00:01:04:00:00/e0 tag 0 dma 155648 in
Nov 16 23:30:23 hostname kernel: res 51/84:af:69:cc:f6/84:00:04:00:00/e0 Emask 0x10 (ATA bus error)
Nov 16 23:30:23 hostname kernel: ata1.00: status: { DRDY ERR }
Nov 16 23:30:23 hostname kernel: ata1.00: error: { ICRC ABRT }
Nov 16 23:30:23 hostname kernel: ata1: soft resetting link
Nov 16 23:30:23 hostname kernel: ata1.00: configured for UDMA/33
Nov 16 23:30:23 hostname kernel: ata1.01: configured for UDMA/100
Nov 16 23:30:23 hostname kernel: ata1: EH complete
Nov 16 23:30:23 hostname kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
Nov 16 23:30:23 hostname kernel: ata1.00: BMDMA stat 0x65
Nov 16 23:30:23 hostname kernel: ata1.00: failed command: READ DMA EXT
Nov 16 23:30:23 hostname kernel: ata1.00: cmd 25/00:00:18:cd:f6/00:04:04:00:00/e0 tag 0 dma 524288 in
Nov 16 23:30:23 hostname kernel: res 51/84:1f:f9:cd:f6/84:03:04:00:00/e0 Emask 0x10 (ATA bus error)

I got these after about half an hour of reading a big file on the hard drive, maybe 1 GB or so. Then the machine started having freezes lasting from a few seconds to maybe a minute. At first I thought it must be a hardware problem. I rebooted and the freezing continued. I found this report and, crazy a solution as it is, inserting a DVD into the DVD drive restored things to normal.

Previously I had some weird issues playing DVDs with this drive and suspected there was some weirdness in the firmware. I did not pay more attention to it then, but now I'm pretty sure there is something wrong with the DVD drive. The system is Red Hat Linux 6 beta, kernel 2.6.32-44.2.el6.i686.
The DVD drive model is: TSSTcorp CDDVDW SN-T083.
This is from dmesg:

ata1.00: ATA-7: SAMSUNG HD103UJ, 1AA01113, max UDMA7
ata1.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.01: ATAPI: TSSTcorp CDDVDW SN-T083A, SB01, max UDMA/100
ata1.01: applying bridge limits
ata1.00: configured for UDMA/133
ata1.01: configured for UDMA/100

Revision history for this message
Wolm (torben-wolm) wrote :

As I'm using an eeebox without dvd-drive, I cannot put a disc in to fix this problem.

But I've found this: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/383632 which basically tells you to kill the udev process.

I have done that, and my box has been running for 13 days now without freezing. I'm a bit reluctant to call this a "solution" because I've had long periods earlier where everything seemed fine, and suddenly it started freezing every other day again.

I will report back when the box has run some more time or if it freezes again. But strange that the udev process would cause sata drives to fail...

Revision history for this message
Wolm (torben-wolm) wrote :

My eeebox has now been running for 40 days after killing the udev process...

Revision history for this message
DFOXpro (dfoxpro) wrote :

[Spanish, please translate]

I have had a similar problem for the last 3 months, almost every 30 seconds. I have tried Ubuntu 10.04 and 10.10 and the same problem occurs. I changed the disk and the cables and the problem continues; I think it may be a hardware problem on the motherboard, or damage affecting how the partitions are read. Another detail I have noticed is that this error occurs on installations with 2 operating systems installed (Windows and Ubuntu, for example). Finally, I have noticed that the problem also extends to other operating systems such as Windows, where it only locks up the computer for about 10 seconds or, in the worst case, produces a blue screen.

These errors can show up for 3 hours on one day of the week, or for entire days.

Jan 16 20:40:52 ubuntu kernel: [ 972.873266] ata1: soft resetting link
Jan 16 20:40:52 ubuntu kernel: [ 973.049102] ata1.00: configured for PIO0
Jan 16 20:40:52 ubuntu kernel: [ 973.049149] ata1: EH complete
Jan 16 20:41:06 ubuntu kernel: [ 986.877438] ata1: soft resetting link
Jan 16 20:41:06 ubuntu kernel: [ 987.053378] ata1.00: configured for PIO0
Jan 16 20:41:06 ubuntu kernel: [ 987.053434] ata1: EH complete

If not, it is this other error:

Jan 16 20:47:38 ubuntu kernel: [ 1379.344067] sd 0:0:0:0: [sda] Unhandled error code
Jan 16 20:47:38 ubuntu kernel: [ 1379.344080] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jan 16 20:47:38 ubuntu kernel: [ 1379.344088] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 08 00 00 00 08 00
Jan 16 20:47:38 ubuntu kernel: [ 1379.344900] sd 0:0:0:0: [sda] Unhandled error code
Jan 16 20:47:38 ubuntu kernel: [ 1379.344908] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jan 16 20:47:38 ubuntu kernel: [ 1379.344914] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 00 18 00 00 00 08 00

Or a combination of the above:

Jan 16 20:45:52 ubuntu kernel: [ 1273.176098] ata1: soft resetting link
Jan 16 20:45:55 ubuntu kernel: [ 1276.213091] ata1.00: configured for PIO0
Jan 16 20:45:55 ubuntu kernel: [ 1276.213146] ata1: EH complete
Jan 16 20:46:31 ubuntu kernel: [ 1312.144086] ata1: link is slow to respond, please be patient (ready=0)
Jan 16 20:46:36 ubuntu kernel: [ 1317.132082] ata1: device not ready (errno=-16), forcing hardreset
Jan 16 20:46:36 ubuntu kernel: [ 1317.132102] ata1: soft resetting link
Jan 16 20:46:41 ubuntu kernel: [ 1322.332094] ata1: link is slow to respond, please be patient (ready=0)
Jan 16 20:47:36 ubuntu kernel: [ 1377.240130] ata1.00: disabled
Jan 16 20:47:36 ubuntu kernel: [ 1377.240177] ata1: EH complete
Jan 16 20:47:36 ubuntu kernel: [ 1377.240280] sd 0:0:0:0: [sda] Unhandled error code
Jan 16 20:47:36 ubuntu kernel: [ 1377.240285] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jan 16 20:47:36 ubuntu kernel: [ 1377.240292] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 60 5b 55 00 00 80 00

Thank you for your attention.

Revision history for this message
Wolm (torben-wolm) wrote :

Update: My eeebox has now been running for 72 days after killing udev. There is not a single error message to be found in /var/log/messages...

Revision history for this message
Davidiam (hectorjerezano) wrote :

I have the same experience as Wolm, but when burning double-layer DVDs: after killing udevd I don't get any errors and can burn the DL DVD.

Revision history for this message
Wolm (torben-wolm) wrote :

Update: February 28th the machine crashed again. After 4 months without any errors, it suddenly decides the disk is slow to respond...

I give up. It seems Linux is not meant to run on this eeebox.

Revision history for this message
Merl'1 (nicolas-lhoir-launchpad) wrote :

Hello,
I have exactly the same problem on only one of my 3 eeeboxes.
Two are working fine with Debian lenny 5.0.8 and kernel 2.6.27.7-eeebox-r1.
Only my ubuntu eeebox 2.6.32-5-openvz-686 is faulty, with the same behaviour: 2-3 months fine (without killing udev), then I need 2-3 reboots in order to get rid of "ata1: link is slow to respond".
I am not able to exchange hard disks between them because they are in 3 different locations.
The Seagate firmware is the same, 0303.
Let me know what and where to investigate in order to solve this bug in Ubuntu.
Thanks

Revision history for this message
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix
Changed in linux (Ubuntu):
assignee: nobody → Miguel Alvarado (alvaradoma)
Revision history for this message
Miguel A. Alvarado V. (exodus) wrote :

I believe the problem here is that all your BIOSes are configured to use AHCI. The current Ubuntu kernel builds the AHCI driver as a module and therefore loads it very late. On top of this, the kernel has another ATA driver compiled in, which interacts with your disk controller and puts it into this bizarre wait state.

After the AHCI module is loaded it fixes everything.

Proposed workaround: Disable AHCI in your BIOS, enabling IDE or RAID as the alternative; this should stop the delay you seem to have. I tested it on my machine. Test it on yours so I can follow up.

Proposed fix: Compile AHCI into the kernel and not as a module.

Revision history for this message
Jorge Juan (jjchico) wrote :

The "bug" is still here in 14.04.

$ uname -a
Linux valhalla 3.13.0-44-generic #73-Ubuntu SMP Tue Dec 16 00:22:43 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ grep AHCI /boot/config-3.13.0-44-generic
CONFIG_SATA_AHCI=m
CONFIG_SATA_AHCI_PLATFORM=m
# CONFIG_AHCI_IMX is not set
CONFIG_SATA_ACARD_AHCI=m

Yes, a workaround is to set the BIOS to use IDE instead of AHCI, but isn't AHCI better than IDE? Hot-plug, queuing, etc. Especially if you have an e-SATA port.

Is it safe to hot-plug in IDE mode?
Is performance better in AHCI mode?
Why is it so problematic to include the AHCI driver compiled into the kernel?

Fortunately, I only reboot this computer from time to time :)
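
The `grep` output above can also be checked programmatically. The following is a rough sketch, not an official tool: `parse_kconfig` and `describe` are hypothetical helper names, and the sample text is the output quoted in this comment. It reports whether an option is built in (`y`), a loadable module (`m`), or not set:

```python
import re


def parse_kconfig(text: str) -> dict:
    """Map CONFIG_* names to their value; 'n' stands for 'is not set'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def describe(value) -> str:
    """Turn a kconfig value into a human-readable label."""
    labels = {"y": "built in", "m": "loadable module", "n": "disabled"}
    return labels.get(value, "not found")


# Sample taken from the grep output quoted above
SAMPLE = """\
CONFIG_SATA_AHCI=m
CONFIG_SATA_AHCI_PLATFORM=m
# CONFIG_AHCI_IMX is not set
CONFIG_SATA_ACARD_AHCI=m
"""

opts = parse_kconfig(SAMPLE)
print("CONFIG_SATA_AHCI:", describe(opts.get("CONFIG_SATA_AHCI")))
```

To check a real system, the contents of a file such as `/boot/config-3.13.0-44-generic` could be passed to `parse_kconfig` instead of `SAMPLE`. A value of `m` for `CONFIG_SATA_AHCI` matches the situation described in the previous comment, where the driver is loaded as a module rather than compiled in.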
