Slave nodes fail to boot from an NVMe disk

Bug #1474970 reported by Evgeny Kozhemyakin
This bug affects 1 person
Affects             Status        Importance  Assigned to        Milestone
Fuel for OpenStack  Fix Released  High        Alexander Gordeev
6.0.x               Won't Fix     High        MOS Maintenance
6.1.x               In Progress   High        Vitaly Sedelnik
7.0.x               Fix Released  High        Alexander Gordeev

Bug Description

"error : no such device: 44d1bf09-4e8a-4f46-aea6-09e364abf5cb.
Entering rescue mode...
grub_rescue>"

The upstream bugfix [1] has been backported by Canonical and released
with Ubuntu 14.04 [2] [3]

[1] http://savannah.gnu.org/bugs/?41883
[2] https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1275162
[3] https://launchpad.net/ubuntu/trusty/+source/grub2/+changelog

Changed in fuel:
milestone: none → 6.1-mu-2
tags: added: customer-found
Changed in fuel:
assignee: nobody → Fuel OSCI Team (fuel-osci)
importance: Undecided → High
Revision history for this message
Alexander Gordeev (a-gordeev) wrote :
Changed in fuel:
status: New → Confirmed
Roman Vyalov (r0mikiam)
Changed in fuel:
assignee: Fuel OSCI Team (fuel-osci) → Fuel build team (fuel-build)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-agent (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/205193

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote : Re: grub2 package should be upgraded due NVMe issue

> In order to support NVMe disks we should upgrade grub2 package to a more fresh version.

Nope, upgrading core OS components is not an option.

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

Also, booting from an NVMe drive with grub works only if the drive can be accessed via BIOS int 13h.
However, most such drives are available via UEFI calls only [1].

[1] http://download.intel.com/support/ssdc/hpssd/sb/nvme_boot_guide_332098001us.pdf
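
Since the BIOS vs. UEFI distinction decides whether grub can see the drive, it can help to check how a node was actually booted. A minimal sketch, assuming a Linux node where the kernel exposes /sys/firmware/efi only when started through UEFI; the function name and the sysfs_root parameter are illustrative and not part of Fuel:

```python
import os


def booted_via_uefi(sysfs_root="/sys"):
    """Return True if the running system was started through UEFI firmware.

    Linux creates <sysfs_root>/firmware/efi only on UEFI boots; on a legacy
    BIOS (int 13h) boot the directory is absent.
    sysfs_root is a parameter only so the check can be tested off-node.
    """
    return os.path.isdir(os.path.join(sysfs_root, "firmware", "efi"))


if __name__ == "__main__":
    mode = "UEFI" if booted_via_uefi() else "legacy BIOS"
    print("Node was booted in %s mode" % mode)
```

On a node booted in legacy BIOS mode, an NVMe drive is visible to grub only if the controller ships a legacy option ROM.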

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

> as discussed here http://savannah.gnu.org/bugs/?41883

"We have developed Legacy OptionROM for NVMe controller device"

I doubt we can do the same (for all NVMe drives out there).

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-9ubuntu1

---------------
grub2 (2.02~beta2-9ubuntu1) trusty; urgency=medium

  * Backport patches from upstream to make the network stack more responsive
    on busy networks (LP: #1314134).
  * Add support for nvme device in grub-mkdevicemap (thanks, Dimitri John
    Ledkov; closes: #746396, LP: #1275162).
 -- Colin Watson <email address hidden> Thu, 08 May 2014 13:09:46 +0100

Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

According to the Ubuntu changelog [1], the grub2 shipped with Ubuntu 14.04 is able to boot from NVMe devices.
Reassigning the bug to the Fuel provisioning team.

[1] https://launchpad.net/ubuntu/trusty/+source/grub2/+changelog

summary: - grub2 package should be upgraded due NVMe issue
+ Slave nodes fail to boot from an NVMe disk
Changed in fuel:
assignee: Fuel build team (fuel-build) → Fuel provisioning team (fuel-provisioning)
description: updated
Revision history for this message
Alexei Sheplyakov (asheplyakov) wrote :

Meanwhile the customer can try the classical provisioning which might "just work" with NVMe disks.

Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

Alexei Sheplyakov, the customer reported that the issue persists even with grub 2.02~beta2-15.

It looks like grub will never find an NVMe device without UEFI support. The size of the disk device doesn't matter either.

> Meanwhile the customer can try the classical provisioning which might "just work" with NVMe disks.

It will "just work" only if the node has at least one non-NVMe disk. That's not an option.

The proposed fix, https://review.openstack.org/#/c/205193/, does the same thing; it's just a workaround. We definitely need UEFI support to be implemented to resolve the issue.

Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

Hello mos-sustaining team, please take a look at the proposed fix for 7.0 and target it for updates to 6.1 if necessary.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-agent (master)

Reviewed: https://review.openstack.org/205193
Committed: https://git.openstack.org/cgit/stackforge/fuel-agent/commit/?id=355c08a04917f047b88f66242767049d2b1d0ff0
Submitter: Jenkins
Branch: master

commit 355c08a04917f047b88f66242767049d2b1d0ff0
Author: Alexander Gordeev <email address hidden>
Date: Thu Jul 23 20:21:46 2015 +0300

    Do not land /boot on NVMe disks

    There's no point to place /boot partition on NVMe disk
    as grub2 without UEFI support is unable to recognize it and
    read any data from it.

    Unfortunately, fuel-agent doesn't work with grub2-efi.
    It always installs BIOS version of grub2.

    It's just a work around that will allow to land /boot partition
    on non-NVMe disk.

    The absence of non-NVMe disks on a node will cause a failure of
    provisioning.

    DocImpact

    Change-Id: I166c94416ccb152ccd8d1dc780dfb21a774a4f1d
    Co-Authored-By: Atze de Vries <email address hidden>
    Related-Bug: #1474970
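
The placement rule described in the commit message can be sketched as follows. This is not the fuel-agent code; it is a minimal illustration, assuming NVMe namespaces follow the Linux kernel naming scheme /dev/nvme<controller>n<namespace>. The names is_nvme and pick_boot_disk are hypothetical, and the error text reuses the provisioning message quoted in this bug's verification steps:

```python
import re

# Assumption: NVMe namespaces are named /dev/nvme<ctrl>n<ns>, e.g. /dev/nvme0n1.
NVME_DEVICE_RE = re.compile(r"^/dev/nvme\d+n\d+$")


def is_nvme(device):
    """Return True if the device path looks like an NVMe namespace."""
    return bool(NVME_DEVICE_RE.match(device))


def pick_boot_disk(devices):
    """Return the first non-NVMe disk, which is where /boot should land.

    Raises ValueError when the node has only NVMe disks, mirroring the
    provisioning failure described in the commit message.
    """
    for device in devices:
        if not is_nvme(device):
            return device
    raise ValueError("/boot partition has not been created for some reasons")


if __name__ == "__main__":
    print(pick_boot_disk(["/dev/nvme0n1", "/dev/sda", "/dev/sdb"]))
```

Matching on the device name is only a heuristic; a real implementation might inspect the disk's driver or bus information instead.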

Revision history for this message
Evgeny Kozhemyakin (ekozhemyakin) wrote :

Sustaining team,
FYI, the patch has been successfully tested by the customer.

Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

The fix can't be verified without NVMe disks.

If NVMe disks are available, the steps to verify are as follows.
For the positive test case:
1) The slave node must have both types of disks: normal (SCSI/SATA/ATA/etc.) and NVMe.
2) Deploy the cluster.
Expected result:
The deployment succeeds.

For the negative test case:
1) The slave node must have only NVMe disks.
2) Deploy the cluster.
Expected result:
The deployment fails because provisioning of that slave node fails with the following message: '/boot partition has not been created for some reasons'

Roman Rufanov (rrufanov)
tags: added: support
Revision history for this message
Egor Kotko (ykotko) wrote :

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "7.0"
  openstack_version: "2015.1.0-7.0"
  api: "1.0"
  build_number: "301"
  build_id: "301"
  nailgun_sha: "4162b0c15adb425b37608c787944d1983f543aa8"
  python-fuelclient_sha: "486bde57cda1badb68f915f66c61b544108606f3"
  fuel-agent_sha: "50e90af6e3d560e9085ff71d2950cfbcca91af67"
  fuel-nailgun-agent_sha: "d7027952870a35db8dc52f185bb1158cdd3d1ebd"
  astute_sha: "6c5b73f93e24cc781c809db9159927655ced5012"
  fuel-library_sha: "5d50055aeca1dd0dc53b43825dc4c8f7780be9dd"
  fuel-ostf_sha: "2cd967dccd66cfc3a0abd6af9f31e5b4d150a11c"
  fuelmain_sha: "a65d453215edb0284a2e4761be7a156bb5627677"

Changed in fuel:
status: Fix Committed → Fix Released
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Won't Fix for 6.0-updates, as there is no channel for delivering Fuel fixes in 6.0.
