regression: root device not found (mptspi)

Bug #576302 reported by Anton Gyllenberg
This bug affects 5 people
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone:

Bug Description

After upgrading from Karmic to Lucid, my old Dell Poweredge no longer boots, but fails with:
Gave up waiting for root device. [...]
Alert! /dev/disk/by-uuid/d4cbfbe0-d0af-434f-aa73-40471e6cc040 does not exist. Dropping to a shell!

2.6.31-21-generic OK
2.6.32-21-generic ERROR
2.6.32-22-generic ERROR

I can boot 2.6.31-21-generic, but 2.6.32-21-generic and 2.6.32-22-generic both give the error. I recall seeing the error earlier as well, but I think it resolved itself after a reboot. Now it consistently works with 2.6.31-21 and fails with 2.6.32-21 and 2.6.32-22.

My kernel command line is: root=UUID=d4cbfbe0-d0af-434f-aa73-40471e6cc040 ro hpet=force clocksource=hpet acpi=off

Waiting and then exiting the shell, or setting rootdelay=120, doesn't help. The device is not found; /dev/sda1 never appears. The end of the kernel output looks like this:

[ 2.658887] mptspi 0000:03:0c.1: PCI->APIC IRQ transform: INT B -> IRQ 26
[ 2.659014] mptbase: ioc1: Initiating bringup
[ 17.656025] mptbase: ioc0: WARNING - Issuing Reset from mpt_config!!
[ 17.656031] mptbase: ioc0: Initiating recovery
[ 27.328017] ioc1: LSI53C1030 B2: Capabilities={Initiator}
[ 27.912022] mptbase: ioc1: WARNING - Unexpected doorbell active!
[ 67.920021] mptbase: ioc1: ERROR - Doorbell ACK timeout (count=4999), IntStatus=80000000!
[ 78.168019] mptbase: ioc1: WARNING - : alt-ioc Not ready WARNING!
[ 87.224023] mptbase: ioc0: Attempting Retry Config request type 0x4, page 0x1, action 2
[ 87.224078] mptbase: ioc0: Retry completed ret=0x0 timeleft=3750
[ 117.092017] mptbase: ioc1: ERROR - Doorbell INT timeout (count=4999), IntStatus=0!
[ 117.092022] mptbase: ioc1: ERROR - Handshake reply failure!
[ 117.092026] mptbase: ioc1: ERROR - Sending PortEnable failed(-1)!
[ 117.092038] mptbase: ioc1: ERROR - didn't initialize properly! (-4)
[ 117.092110] mptspi: probe of 0000:03:0c.1 failed with error -4
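
To pull just the driver failures out of a long boot log like the one above, a filter along these lines may help. This is only a minimal sketch: the function name mpt_errors is hypothetical, and it assumes the kernel log is fed on standard input (for example from dmesg or a saved dmesg.log).

```shell
# Hypothetical helper: keep only Fusion MPT driver lines that report a failure.
# Reads a kernel log on stdin, e.g.:  dmesg | mpt_errors
mpt_errors() {
    # First keep mptbase/mptspi lines, then keep those marked ERROR or "failed".
    grep -E 'mpt(base|spi)' | grep -E 'ERROR|failed'
}
```

Applied to the excerpt above, this would leave the Doorbell timeout, handshake, PortEnable and probe failure lines and drop the bringup and WARNING lines.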

Attached are lspci and dmesg output from booting 2.6.31-21-generic, which works, and from 2.6.32-22 after removing and reinserting mptspi etc.

---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
Architecture: i386
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: anton 2926 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'I82801DBICH4'/'Intel 82801DB-ICH4 with AD1981B at irq 17'
   Mixer name : 'Analog Devices AD1981B'
   Components : 'AC97a:41445374'
   Controls : 26
   Simple ctrls : 19
DistroRelease: Ubuntu 10.04
HibernationDevice: RESUME=
IwConfig:
 lo no wireless extensions.

 eth1 no wireless extensions.

 eth0 no wireless extensions.
MachineType: Dell Computer Corporation Precision WorkStation 450
Package: linux (not installed)
ProcCmdLine: root=UUID=d4cbfbe0-d0af-434f-aa73-40471e6cc040 ro hpet=force clocksource=hpet acpi=off
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-22.36-generic 2.6.32.11+drm33.2
Regression: Yes
RelatedPackageVersions: linux-firmware 1.34
Reproducible: Yes
RfKill:

Tags: lucid regression-release needs-upstream-testing
Uname: Linux 2.6.32-22-generic i686
UserGroups: adm admin audio cdrom dialout dip floppy fuse lpadmin plugdev scanner scard staff video
WpaSupplicantLog:

dmi.bios.date: 07/21/2003
dmi.bios.vendor: Dell Computer Corporation
dmi.bios.version: A03
dmi.board.name: 0F1263
dmi.board.vendor: Dell Computer Corp.
dmi.chassis.type: 3
dmi.chassis.vendor: Dell Computer Corporation
dmi.modalias: dmi:bvnDellComputerCorporation:bvrA03:bd07/21/2003:svnDellComputerCorporation:pnPrecisionWorkStation450:pvr:rvnDellComputerCorp.:rn0F1263:rvr:cvnDellComputerCorporation:ct3:cvr:
dmi.product.name: Precision WorkStation 450
dmi.sys.vendor: Dell Computer Corporation

Revision history for this message
Anton Gyllenberg (antong) wrote :
description: updated
tags: added: kj-triage
Revision history for this message
Tristan Moody (tmoody) wrote :

I had something similar happen booting Karmic today (2.6.31-21-generic) and discovered I was able to reboot if I unloaded mptspi, mptscsih, and mptbase and reloaded them before exiting the busybox shell...dunno if this will work for anyone else.

Revision history for this message
Anton Gyllenberg (antong) wrote :

Tristan, thanks for the tip! I got it to boot by doing just that. Had to wait quite long before mptspi was no longer in use and could be rmmoded.

Attaching a new dmesg.log now that I could boot 2.6.32-22-generic. I guess it is at t=262.064235 where I've done the modprobe mptspi and things start working.
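
For anyone else trying this, the unload/reload workaround might be sketched roughly as below, for use at the busybox (initramfs) prompt. The module names are the ones Tristan listed; everything else is an assumption for illustration: the function name reload_mpt, the RUN variable (set RUN=echo to print the commands instead of running them), the 5 second retry interval and the limit of 60 attempts.

```shell
# Hypothetical helper for the initramfs busybox shell.
# RUN is an assumed dry-run hook: RUN=echo prints commands instead of running them.
RUN=${RUN:-}

reload_mpt() {
    # Unload in dependency order: mptspi uses mptscsih, which uses mptbase.
    for mod in mptspi mptscsih mptbase; do
        tries=0
        # rmmod fails while the module is still in use, so retry for a while.
        until $RUN rmmod "$mod"; do
            tries=$((tries + 1))
            # Give up after 60 attempts (about 5 minutes); both numbers are assumptions.
            [ "$tries" -ge 60 ] && return 1
            sleep 5
        done
    done
    # modprobe loads mptspi together with the modules it depends on.
    $RUN modprobe mptspi
}
```

After defining the function, run reload_mpt, wait for the disks to appear, and then exit the shell to continue booting.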

description: updated
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi Anton,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 576302

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Anton Gyllenberg (antong) wrote : AlsaDevices.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
Anton Gyllenberg (antong) wrote : AplayDevices.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : ArecordDevices.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : BootDmesg.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : Card0.Amixer.values.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : Card0.Codecs.codec97.0.ac97.0.0.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : Card0.Codecs.codec97.0.ac97.0.0.regs.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : Lspci.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : Lsusb.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : PciMultimedia.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : ProcModules.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : UdevDb.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : UdevLog.txt

apport information

Revision history for this message
Anton Gyllenberg (antong) wrote : WifiSyslog.txt

apport information

tags: added: regression-release
removed: needs-kernel-logs needs-upstream-testing
Revision history for this message
Anton Gyllenberg (antong) wrote :

Yes, this is still an issue in Lucid with kernel 2.6.32-22-generic, which I believe is still the latest. I ran apport-collect as instructed.

Unfortunately the system on which this happens is in production and I do not wish to upgrade the whole system to a development release. I did test two new kernels however, with good results. As I didn't find info on which is the latest kernel in the current development release of Ubuntu, I took the newest I could find in http://archive.ubuntu.com/ubuntu/pool/main/l/linux/, that is linux-image-2.6.35-5-generic_2.6.35-5.6_i386.deb. I also tested the mainline release http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.35-rc1-lucid/linux-image-2.6.35-020635rc1-generic_2.6.35-020635rc1_i386.deb.

Both of these new kernels I tested worked in that they eventually recover from the mptspi error. It still takes longer than the default rootdelay, but in any case it does recover by itself and with these new kernels and tweaking rootdelay I can get the system to boot without manual intervention.
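
As an illustration of the rootdelay tweak, the snippet below adds or replaces rootdelay= in a kernel command line string. It is a sketch under stated assumptions: the helper name with_rootdelay is hypothetical, and the value 180 is only an example, since the exact delay needed is not given in this report.

```shell
# Hypothetical helper: print a kernel command line with rootdelay=SECONDS
# added, replacing any rootdelay= parameter that is already present.
with_rootdelay() {
    cmdline=$1
    seconds=$2
    # Remove existing rootdelay=... parameters and any leading spaces left behind.
    stripped=$(printf '%s\n' "$cmdline" | sed -E 's/(^| )rootdelay=[^ ]*//g; s/^ +//')
    printf '%s rootdelay=%s\n' "$stripped" "$seconds"
}

# Example with an assumed value of 180 seconds:
with_rootdelay "ro hpet=force clocksource=hpet acpi=off" 180
```

On a system booting with GRUB 2, the resulting parameters would typically go into GRUB_CMDLINE_LINUX in /etc/default/grub, followed by running update-grub. Whether the affected machine uses GRUB 2 or GRUB legacy is not stated here.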

Good news! So it seems the issue probably is resolved in the next Ubuntu release. But of course not in 10.04 LTS.

Changed in linux (Ubuntu):
status: Incomplete → New
Revision history for this message
Anton Gyllenberg (antong) wrote :

Thanks everybody for the help! For me personally, my system is now in a state where I get by. I can get it to boot by the rmmod/modprobe massage until I upgrade to the next stable Ubuntu release where the issue is resolved.

Changed in linux (Ubuntu):
status: New → Triaged
tags: added: kernel-needs-review kernel-uncat
Revision history for this message
evilonod (dolive) wrote :

My Dell Workstation 450 does the exact same thing. But rootdelay=120 works for me.

Revision history for this message
jbowen7 (jbowen7) wrote :

Affects my Poweredge 1750 also. Made installing 10.04 a pain.

Is this a problem with the fusion package? I noticed that when I add rootdelay=120 my system boots; otherwise it fails at initramfs and drops to a busybox shell. It complains about not finding my root device.

[ 1.652305] mptspi 0000:04:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[ 1.652594] mptbase: ioc0: Initiating bringup
[ 3.202850] mptspi 0000:04:05.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
[ 3.203077] mptbase: ioc1: Initiating bringup
[ 18.200021] mptbase: ioc0: WARNING - Issuing Reset from mpt_config!!, doorbell=0x24000000
[ 18.960020] mptbase: ioc0: Attempting Retry Config request type 0x4, page 0x1, action 2
[ 18.960174] mptbase: ioc0: Retry completed ret=0x0 timeleft=3750
[ 33.200014] mptbase: ioc1: ERROR - Wait IOC_READY state (0x20000000) timeout(15)!
[ 63.200012] mptbase: ioc1: ERROR - Wait IOC_READY state (0x20000000) timeout(15)!

Revision history for this message
rbhkamal (rbhkamal) wrote :

I've been having the exact same problem with 10.04 (CentOS was/is fine). Now 12.04 is out and the problem is still there. Exactly the same output as jbowen7.

I haven't tried rootdelay=120 yet... for some reason grub lets me pick the recovery entry but it doesn't recognize "e" for edit.

Revision history for this message
penalvch (penalvch) wrote :

Anton Gyllenberg, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

Also, could you please test the latest upstream kernel available following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Please do not test the daily kernel folder, but the one all the way at the bottom. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.12-rc1

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired