[Hyper-V][Mellanox] net/mlx4_core: Avoid delays during VF driver device shutdown

Bug #1672785 reported by Joshua R. Poulson
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
Joseph Salisbury
Xenial
Fix Released
Medium
Joseph Salisbury
Yakkety
Fix Released
Medium
Joseph Salisbury
Zesty
Fix Released
Medium
Joseph Salisbury

Bug Description

Mellanox has submitted the following patch upstream that's important for SR-IOV in Azure.

Please integrate it into the Mellanox mlx4 drivers for lts-xenial, HWE, Zesty, and Azure custom.

https://patchwork.ozlabs.org/patch/738305/

From: Jack Morgenstein <email address hidden>

Some Hypervisors detach VFs from VMs by instantly causing an FLR event
to be generated for a VF.

In the mlx4 case, this will cause that VF's comm channel to be disabled
before the VM has an opportunity to invoke the VF device's "shutdown"
method.

For such Hypervisors, there is a race condition between the VF's
shutdown method and its internal-error detection/reset thread.

The internal-error detection/reset thread (which runs every 5 seconds) also
detects a disabled comm channel. If the internal-error detection/reset
flow wins the race, we still get delays (while that flow tries repeatedly
to detect comm-channel recovery).

The cited commit fixed the command timeout problem when the
internal-error detection/reset flow loses the race.

This commit avoids the unneeded delays when the internal-error
detection/reset flow wins.

Fixes: d585df1c5ccf ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown")
Signed-off-by: Jack Morgenstein <email address hidden>
Reported-by: Simon Xiao <email address hidden>
Signed-off-by: Tariq Toukan <email address hidden>
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c | 11 +++++++++++
 drivers/net/ethernet/mellanox/mlx4/main.c | 11 +++++++++++
 include/linux/mlx4/device.h | 1 +
 3 files changed, 23 insertions(+)

CVE References

Revision history for this message
Joshua R. Poulson (jrp) wrote :

I will add the upstream commit when this patch goes into linux-next.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
Changed in linux (Ubuntu Xenial):
status: New → Triaged
importance: Undecided → Medium
tags: added: kernel-da-key kernel-hyper-v
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a Zesty test kernel with the patch. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1672785/zesty/

The patch does not apply cleanly to Xenial. I'll investigate and see if prereq commits are needed and backport. I'll post that kernel shortly.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I backported the patch for Xenial and build a test kernel. It can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1672785/xenial

Can you test out these two kernels?

Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Zesty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
status: Triaged → In Progress
Changed in linux (Ubuntu Zesty):
status: Triaged → In Progress
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

I built a Xenial test kernel with all the patches from the following bugs:

bug 1670518
  PCI: hv: Allocate physically contiguous hypercall params buffer
  PCI: hv: Make unnecessarily global IRQ masking functions static
  PCI: hv: Delete the device earlier from hbus->children for hot-remove
  PCI: hv: Fix hv_pci_remove() for hot-remove

bug 1672785
  net/mlx4_core: Avoid delays during VF driver device shutdown

bug 1667531
  tools: hv: Enable network manager for bonding scripts on RH
  [net-next] tools: hv: Add clean up function for Ubuntu config
  bcc5a76 tools: hv: Add a script to help bonding synthetic and VF NICs

bug 1667527
 4a9b0933bdfc PCI: hv: Use device serial number as PCI domain

bug 1667007
 d3de209 net/mlx4_core: Use cq quota in SRIOV when creating completion EQs

bug 1650058
 14c84da90b0d net/mlx4_en: Fix bad WQE issue
 c46100f413ca net/mlx4_core: Fix racy CQ (Completion Queue) free
 f4f73e2e6308 net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
 3c05ac20fe6e net/mlx4_core: Avoid command timeouts during VF driver device shutdown

The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/HyperVCombined/

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Were you able to test for this specific bug using the combined test kernel posted in comment #4? A SRU request will be submitted once testing is verified.

Revision history for this message
Adrian Suhov (asuhov) wrote :

I tested the kernel and no problems were found. You can submit the request.

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Yakkety):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Joseph Salisbury (jsalisbury)
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

During a review of the SRU request, it was found that the backport of commit 4cbe4dac82e423 did not remove the definition of MLX4_INTERFACE_STATE_SHUTDOWN. However, MLX4_INTERFACE_STATE_NOWAIT was added by the backport with the same value as MLX4_INTERFACE_STATE_SHUTDOWN.

What was needed is commit b4353708f5a as a prereq. This commit removes MLX4_INTERFACE_STATE_SHUTDOWN and all references to it.

I built a v2 Yakkety test kernel with these two commits, which can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1672785/yakkety/

If this test kernel fixes the bug, I will re-submit the SRU request.

Changed in linux (Ubuntu Xenial):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Yakkety):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Yakkety):
status: Fix Committed → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (21.0 KiB)

This bug was fixed in the package linux - 4.10.0-19.21

---------------
linux (4.10.0-19.21) zesty; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1680535

  * ADT regressions caused by "audit: fix auditd/kernel connection state
    tracking" (LP: #1680532)
    - SAUCE: Revert "audit: fix auditd/kernel connection state tracking"

  * Miscellaneous Ubuntu changes
    - [Config] updateconfigs to update CONFIG_GENERIC_CSUM for ppc64el
      This cleans up behind a Kconfig change that went undetected.

linux (4.10.0-18.20) zesty; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1680168

  * smartpqi driver needed in initram disk and installer (LP: #1680156)
    - UBUNU: [Config] Add smartpqi to d-i

linux (4.10.0-17.19) zesty; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1679718

  * Fix CVE-2017-7308 (LP: #1678009)
    - net/packet: fix overflow in check for priv area size
    - net/packet: fix overflow in check for tp_frame_nr
    - net/packet: fix overflow in check for tp_reserve

  * apparmor: oops on boot if parameters set on grub command line (LP: #1678048)
    - SAUCE: apparmor: fix parameters so that the permission test is bypassed at boot

  * apparmor: does not provide a way to detect policy updataes (LP: #1678032)
    - SAUCE: apparmor: add policy revision file interface

  * apparmor does not make support of query data visible (LP: #1678023)
    - SAUCE: apparmor: add label data availability to the feature set

  * apparmor query interface does not make supported query info available
    (LP: #1678030)
    - SAUCE: apparmor: add information about the query inteface to the feature set

  * change_profile incorrect when using namespaces with a compound stack
    (LP: #1677959)
    - SAUCE: apparmor: fix label parse for stacked labels

  * Zesty update to v4.10.8 stable release (LP: #1678930)
    - xfrm: policy: init locks early
    - xfrm_user: validate XFRM_MSG_NEWAE XFRMA_REPLAY_ESN_VAL replay_window
    - xfrm_user: validate XFRM_MSG_NEWAE incoming ESN size harder
    - KVM: nVMX: Fix nested VPID vmx exec control
    - KVM: x86: cleanup the page tracking SRCU instance
    - virtio_balloon: init 1st buffer in stats vq
    - pinctrl: qcom: Don't clear status bit on irq_unmask
    - c6x/ptrace: Remove useless PTRACE_SETREGSET implementation
    - h8300/ptrace: Fix incorrect register transfer count
    - mips/ptrace: Preserve previous registers for short regset write
    - sparc/ptrace: Preserve previous registers for short regset write
    - metag/ptrace: Preserve previous registers for short regset write
    - metag/ptrace: Provide default TXSTATUS for short NT_PRSTATUS
    - metag/ptrace: Reject partial NT_METAG_RPIPE writes
    - qla2xxx: Allow vref count to timeout on vport delete.
    - sched/rt: Add a missing rescheduling point
    - usb: musb: fix possible spinlock deadlock
    - Linux 4.10.8

  * [Hyper-V] pci-hyperv: Use device serial number as PCI domain (LP: #1667527)
    - net/mlx4_core: Use cq quota in SRIOV when creating completion EQs
    - PCI: hv: Use device serial number as PCI domain

  * Miscellaneous Ubuntu changes
    - [Config] flash-kernel should be a...

Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
Revision history for this message
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
tags: added: verification-needed-yakkety
Revision history for this message
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'. If the problem still exists, change the tag 'verification-needed-yakkety' to 'verification-failed-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (29.1 KiB)

This bug was fixed in the package linux - 4.4.0-75.96

---------------
linux (4.4.0-75.96) xenial; urgency=low

  * linux: 4.4.0-75.96 -proposed tracker (LP: #1684441)

  * [Hyper-V] hv: util: move waiting for release to hv_utils_transport itself
    (LP: #1682561)
    - Drivers: hv: util: move waiting for release to hv_utils_transport itself

linux (4.4.0-74.95) xenial; urgency=low

  * linux: 4.4.0-74.95 -proposed tracker (LP: #1682041)

  * [Hyper-V] hv: vmbus: Raise retry/wait limits in vmbus_post_msg()
    (LP: #1681893)
    - Drivers: hv: vmbus: Raise retry/wait limits in vmbus_post_msg()

linux (4.4.0-73.94) xenial; urgency=low

  * linux: 4.4.0-73.94 -proposed tracker (LP: #1680416)

  * CVE-2017-6353
    - sctp: deny peeloff operation on asocs with threads sleeping on it

  * vfat: missing iso8859-1 charset (LP: #1677230)
    - [Config] NLS_ISO8859_1=y

  * Regression: KVM modules should be on main kernel package (LP: #1678099)
    - [Config] powerpc: Add kvm-hv and kvm-pr to the generic inclusion list

  * linux-lts-xenial 4.4.0-63.84~14.04.2 ADT test failure with linux-lts-xenial
    4.4.0-63.84~14.04.2 (LP: #1664912)
    - SAUCE: apparmor: fix link auditing failure due to, uninitialized var

  * regession tests failing after stackprofile test is run (LP: #1661030)
    - SAUCE: fix regression with domain change in complain mode

  * Permission denied and inconsistent behavior in complain mode with 'ip netns
    list' command (LP: #1648903)
    - SAUCE: fix regression with domain change in complain mode

  * unexpected errno=13 and disconnected path when trying to open /proc/1/ns/mnt
    from a unshared mount namespace (LP: #1656121)
    - SAUCE: apparmor: null profiles should inherit parent control flags

  * apparmor refcount leak of profile namespace when removing profiles
    (LP: #1660849)
    - SAUCE: apparmor: fix ns ref count link when removing profiles from policy

  * tor in lxd: apparmor="DENIED" operation="change_onexec"
    namespace="root//CONTAINERNAME_<var-lib-lxd>" profile="unconfined"
    name="system_tor" (LP: #1648143)
    - SAUCE: apparmor: Fix no_new_privs blocking change_onexec when using stacked
      namespaces

  * apparmor oops in bind_mnt when dev_path lookup fails (LP: #1660840)
    - SAUCE: apparmor: fix oops in bind_mnt when dev_path lookup fails

  * apparmor auditing denied access of special apparmor .null fi\ le
    (LP: #1660836)
    - SAUCE: apparmor: Don't audit denied access of special apparmor .null file

  * apparmor label leak when new label is unused (LP: #1660834)
    - SAUCE: apparmor: fix label leak when new label is unused

  * apparmor reference count bug in label_merge_insert() (LP: #1660833)
    - SAUCE: apparmor: fix reference count bug in label_merge_insert()

  * apparmor's raw_data file in securityfs is sometimes truncated (LP: #1638996)
    - SAUCE: apparmor: fix replacement race in reading rawdata

  * unix domain socket cross permission check failing with nested namespaces
    (LP: #1660832)
    - SAUCE: apparmor: fix cross ns perm of unix domain sockets

  * Xenial update to v4.4.59 stable release (LP: #1678960)
    - xfrm: policy: init locks early
    - virtio_balloon: init ...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (14.5 KiB)

This bug was fixed in the package linux - 4.8.0-49.52

---------------
linux (4.8.0-49.52) yakkety; urgency=low

  * linux: 4.8.0-49.52 -proposed tracker (LP: #1684427)

  * [Hyper-V] hv: util: move waiting for release to hv_utils_transport itself
    (LP: #1682561)
    - Drivers: hv: util: move waiting for release to hv_utils_transport itself

linux (4.8.0-48.51) yakkety; urgency=low

  * linux: 4.8.0-48.51 -proposed tracker (LP: #1682034)

  * [Hyper-V] hv: vmbus: Raise retry/wait limits in vmbus_post_msg()
    (LP: #1681893)
    - Drivers: hv: vmbus: Raise retry/wait limits in vmbus_post_msg()

linux (4.8.0-47.50) yakkety; urgency=low

  * linux: 4.8.0-47.50 -proposed tracker (LP: #1679678)

  * CVE-2017-6353
    - sctp: deny peeloff operation on asocs with threads sleeping on it

  * CVE-2017-5986
    - sctp: avoid BUG_ON on sctp_wait_for_sndbuf

  * vfat: missing iso8859-1 charset (LP: #1677230)
    - [Config] NLS_ISO8859_1=y

  * [Hyper-V] pci-hyperv: Use device serial number as PCI domain (LP: #1667527)
    - net/mlx4_core: Use cq quota in SRIOV when creating completion EQs

  * Regression: KVM modules should be on main kernel package (LP: #1678099)
    - [Config] powerpc: Add kvm-hv and kvm-pr to the generic inclusion list

  * linux-lts-xenial 4.4.0-63.84~14.04.2 ADT test failure with linux-lts-xenial
    4.4.0-63.84~14.04.2 (LP: #1664912)
    - SAUCE: apparmor: fix link auditing failure due to, uninitialized var

  * regession tests failing after stackprofile test is run (LP: #1661030)
    - SAUCE: fix regression with domain change in complain mode

  * Permission denied and inconsistent behavior in complain mode with 'ip netns
    list' command (LP: #1648903)
    - SAUCE: fix regression with domain change in complain mode

  * unexpected errno=13 and disconnected path when trying to open /proc/1/ns/mnt
    from a unshared mount namespace (LP: #1656121)
    - SAUCE: apparmor: null profiles should inherit parent control flags

  * apparmor refcount leak of profile namespace when removing profiles
    (LP: #1660849)
    - SAUCE: apparmor: fix ns ref count link when removing profiles from policy

  * tor in lxd: apparmor="DENIED" operation="change_onexec"
    namespace="root//CONTAINERNAME_<var-lib-lxd>" profile="unconfined"
    name="system_tor" (LP: #1648143)
    - SAUCE: apparmor: Fix no_new_privs blocking change_onexec when using stacked
      namespaces

  * apparmor oops in bind_mnt when dev_path lookup fails (LP: #1660840)
    - SAUCE: apparmor: fix oops in bind_mnt when dev_path lookup fails

  * apparmor auditing denied access of special apparmor .null fi\ le
    (LP: #1660836)
    - SAUCE: apparmor: Don't audit denied access of special apparmor .null file

  * apparmor label leak when new label is unused (LP: #1660834)
    - SAUCE: apparmor: fix label leak when new label is unused

  * apparmor reference count bug in label_merge_insert() (LP: #1660833)
    - SAUCE: apparmor: fix reference count bug in label_merge_insert()

  * apparmor's raw_data file in securityfs is sometimes truncated (LP: #1638996)
    - SAUCE: apparmor: fix replacement race in reading rawdata

  * unix domain socket cross permission check failing with n...

Changed in linux (Ubuntu Yakkety):
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.