Kdump triggered manually after cpu offline operation fails to collect dump

Bug #1410817 reported by bugproxy
This bug affects 1 person
Affects          Status         Importance   Assigned to     Milestone
linux (Ubuntu)   Fix Released   Medium       Unassigned
Utopic           Fix Released   Medium       Chris J Arges

Bug Description

SRU Justification:

[Impact]
Kdump triggered manually after cpu offline operation fails to collect dump

[Test Case]
See Steps to Reproduce below.

[Fix]
$ git describe --contains c1caae3de46a072d0855729aed6e793e536a4a55
v3.19-rc3~1^2~1

--

---Problem Description---
Kdump triggered manually after cpu offline operation fails to collect dump

---uname output---
Linux ubuntu 3.18.0-9-generic #10-Ubuntu SMP Mon Jan 12 21:35:28 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = P8

---System Hang---
The LPAR has to be rebooted before we can access the machine again.

---Steps to Reproduce---
Install Ubuntu 15.04 on a PowerVM LPAR from the ISO using a virtual DVD.
Then take one of the machine's CPUs offline.

root@ubuntu:~# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 8
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 2
Model: IBM,8284-22A
Hypervisor vendor: pHyp
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-15
NUMA node2 CPU(s):

root@ubuntu:~# chcpu -d 15
CPU 15 disabled

root@ubuntu:~# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-14
Off-line CPU(s) list: 15
Thread(s) per core: 7
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 2
Model: IBM,8284-22A
Hypervisor vendor: pHyp
Virtualization type: para
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s): 0-14
NUMA node2 CPU(s):
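
The offline step above uses `chcpu -d 15`. A minimal sketch of the same operation done directly through the kernel's CPU hotplug control file, assuming the standard `/sys/devices/system/cpu/cpuN/online` path. The function name `set_cpu_online` and the `SYSFS_CPU_ROOT` override are hypothetical, introduced only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: write 0 or 1 to a CPU's hotplug control file.
# SYSFS_CPU_ROOT is an illustrative override so the function can be
# tried against a scratch directory instead of the real sysfs tree.
set_cpu_online() {
    cpu="$1"      # CPU number, e.g. 15
    state="$2"    # 0 = offline, 1 = online
    root="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"
    ctl="$root/cpu$cpu/online"

    case "$state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl (not root, or CPU not hotpluggable)" >&2
        return 1
    fi
    echo "$state" > "$ctl"
}
```

On a real system this would be called as root, for example `set_cpu_online 15 0`, and `lscpu` should then list CPU 15 as off-line as in the transcript above.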

Configure and enable kdump on the LPAR.

root@ubuntu:~# /etc/init.d/kdump-tools status
current state : ready to kdump
root@ubuntu:~# kdump-config load
Modified cmdline:BOOT_IMAGE=/boot/vmlinux-3.18.0-9-generic root=UUID=70957e56-8669-466f-b0e7-140f2ec39a04 ro splash quiet irqpoll maxcpus=1 nousb elfcorehdr=155072K
segment[0].mem:0x8000000 memsz:24510464
segment[1].mem:0x9760000 memsz:65536
segment[2].mem:0x9770000 memsz:65536
segment[3].mem:0x9780000 memsz:65536
segment[4].mem:0x9790000 memsz:22020096
segment[5].mem:0xec70000 memsz:196608
 * loaded kdump kernel
root@ubuntu:~#

root@ubuntu:~# kdump-config show
USE_KDUMP: 1
KDUMP_SYSCTL: kernel.panic_on_oops=1
KDUMP_COREDIR: /var/crash
crashkernel addr:
current state: ready to kdump

kexec command:
  /sbin/kexec -p --args-linux --command-line="BOOT_IMAGE=/boot/vmlinux-3.18.0-9-generic root=UUID=70957e56-8669-466f-b0e7-140f2ec39a04 ro splash quiet irqpoll maxcpus=1 nousb" --initrd=/boot/initrd.img-3.18.0-9-generic /boot/vmlinux-3.18.0-9-generic
root@ubuntu:~# kdump-config status
current state : ready to kdump
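
The "ready to kdump" state reported above can also be cross-checked by hand. The sketch below is an assumption about what a reasonable readiness check looks like, not a description of how `kdump-config` itself decides: it looks for a `crashkernel=` reservation on the running kernel's command line and reads `/sys/kernel/kexec_crash_loaded`. The function name `kdump_ready` and its optional path arguments are hypothetical:

```shell
#!/bin/sh
# Hypothetical readiness check. Both paths can be overridden so the
# function can be tried against sample files.
kdump_ready() {
    loaded_file="${1:-/sys/kernel/kexec_crash_loaded}"
    cmdline_file="${2:-/proc/cmdline}"

    # Memory for the capture kernel must have been reserved at boot.
    if ! grep -q 'crashkernel=' "$cmdline_file" 2>/dev/null; then
        echo "no crashkernel= reservation on kernel command line"
        return 1
    fi
    # The kernel reports 1 here once a crash kernel has been loaded.
    if [ "$(cat "$loaded_file" 2>/dev/null)" != "1" ]; then
        echo "crash kernel not loaded"
        return 1
    fi
    echo "ready to kdump"
}
```

Note that in this bug both checks would pass on the affected system. The capture kernel is loaded correctly, and the failure only appears later, in the crash path.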

root@ubuntu:~# sysctl -w kernel.sysrq=1
kernel.sysrq = 1
root@ubuntu:~# cat /proc/sys/kernel/sysrq
1

Trigger the crash manually using sysrq-trigger.

root@ubuntu:~# echo c > /proc/sysrq-trigger

root@ubuntu:~# [ 311.088315] SysRq : Trigger a crash
[ 311.088331] Unable to handle kernel paging request for data at address 0x00000000
[ 311.088336] Faulting instruction address: 0xc0000000005f9094
[ 311.088341] Oops: Kernel access of bad area, sig: 11 [#1]
[ 311.088344] SMP NR_CPUS=2048 NUMA pSeries
[ 311.088349] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter ip_tables x_tables pseries_rng rtc_generic binfmt_misc
[ 311.088372] CPU: 14 PID: 1705 Comm: bash Not tainted 3.18.0-9-generic #10-Ubuntu
[ 311.088377] task: c00000027773e470 ti: c0000002782d0000 task.ti: c0000002782d0000
[ 311.088381] NIP: c0000000005f9094 LR: c0000000005fa12c CTR: c0000000005f9060
[ 311.088385] REGS: c0000002782d39d0 TRAP: 0300 Not tainted (3.18.0-9-generic)
[ 311.088389] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28242822 XER: 00000001
[ 311.088401] CFAR: c0000000000084d8 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c0000000005fa12c c0000002782d3c50 c000000001426890 0000000000000063
GPR04: c000000001b85c28 c000000001b965e0 00000000000000ff c0000000015e71f0
GPR08: c000000000e76890 0000000000000001 0000000000000000 0000000000000001
GPR12: c0000000005f9060 c000000007b37e00 0000000000000000 0000000022000000
GPR16: 000000001016d6e8 0000010000088208 0000000010143eb8 00000000100c9390
GPR20: 0000000000000000 000000001017b008 0000000010143d18 0000000000000000
GPR24: 0000000010156c00 0000000010178868 c0000000013756a8 0000000000000004
GPR28: 0000000000000063 c00000000133f598 c000000001375a68 0000000000000000
[ 311.088459] NIP [c0000000005f9094] sysrq_handle_crash+0x34/0x50
[ 311.088463] LR [c0000000005fa12c] __handle_sysrq+0xec/0x280
[ 311.088467] Call Trace:
[ 311.088470] [c0000002782d3c50] [c000000000056604] ht64_call_hpte_insert1+0x4/0x3c (unreliable)
[ 311.088476] [c0000002782d3c70] [c0000000005fa12c] __handle_sysrq+0xec/0x280
[ 311.088481] [c0000002782d3d10] [c0000000005fa928] write_sysrq_trigger+0x78/0xa0
[ 311.088488] [c0000002782d3d40] [c000000000345a10] proc_reg_write+0xb0/0x110
[ 311.088494] [c0000002782d3d90] [c0000000002b954c] vfs_write+0xdc/0x260
[ 311.088499] [c0000002782d3de0] [c0000000002ba0ec] SyS_write+0x6c/0x110
[ 311.088504] [c0000002782d3e30] [c00000000000927c] syscall_exit+0x0/0x7c
[ 311.088508] Instruction dump:
[ 311.088511] 3842d830 7c0802a6 f8010010 f821ffe1 60000000 60000000 3d22001b 39491cdc
[ 311.088519] 39200001 912a0000 7c0004ac 39400000 <992a0000> 38210020 e8010010 7c0803a6
[ 311.088528] ---[ end trace 8543f2d87847eab7 ]---
[ 311.090822]
[ 311.090851] Sending IPI to other CPUs
[ 311.091870] IPI complete
[ 312.466826] Kernel panic - not syncing: Could not enable big endian exceptions
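
The log above ends in a second panic ("Could not enable big endian exceptions") after the IPI completes, and no dump is collected. When going through saved console logs from several test runs, a small classifier can help spot this signature. This is only a sketch: the function name `classify_crash_log` and its output labels are hypothetical, and the match strings are copied from the log in this report:

```shell
#!/bin/sh
# Hypothetical helper: label a saved console log by which of the
# messages from this report it contains.
classify_crash_log() {
    log="$1"
    if grep -q 'Kernel panic - not syncing: Could not enable big endian exceptions' "$log"; then
        # The failure described in this bug: panic in the crash path.
        echo "be-exception-panic"
    elif grep -q 'SysRq : Trigger a crash' "$log"; then
        # Crash was triggered, but the second panic is not in the log.
        echo "sysrq-crash-only"
    else
        echo "no-match"
    fi
}
```

A log labelled `sysrq-crash-only` does not by itself prove that the dump was saved; checking for a vmcore under the configured KDUMP_COREDIR (`/var/crash` in this report) is still needed.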

root@ubuntu:~# which kdump
/sbin/kdump
root@ubuntu:~# dpkg -S /sbin/kdump
kexec-tools: /sbin/kdump
root@ubuntu:~# dpkg --list | grep kexec
ii kexec-tools 1:2.0.7-5ubuntu1 ppc64el tools to support fast kexec reboots
ii pxe-kexec 0.2.4-3 ppc64el Fetch PXE configuration file and netboot using kexec

The fix is available upstream:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c1caae3de46a072d0855729aed6e793e536a4a55

Thanks
Hari


bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-120463 severity-critical targetmilestone-inin1504
Luciano Chavez (lnx1138)
affects: ubuntu → linux (Ubuntu)
tags: added: kernel-da-key
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: New → In Progress
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
Chris J Arges (arges) wrote :

Vivid will pick up this fix when it rebases to 3.19. SRU'ing the fix for 3.16.

description: updated
Changed in linux (Ubuntu Utopic):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → nobody
Changed in linux (Ubuntu Utopic):
status: New → In Progress
Changed in linux (Ubuntu):
status: In Progress → New
importance: Medium → Undecided
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: New → Triaged
Brad Figg (brad-figg)
Changed in linux (Ubuntu Utopic):
status: In Progress → Fix Committed
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: Triaged → Fix Committed
Chris J Arges (arges)
Changed in linux (Ubuntu):
importance: Undecided → Medium
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-utopic' to 'verification-done-utopic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-utopic
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2015-02-16 07:18 EDT-------
I have followed the steps provided in the link:
https://wiki.ubuntu.com/Testing/EnableProposed

root@ubuntu:~# cat /etc/apt/sources.list | tail
# deb-src http://archive.canonical.com/ubuntu vivid partner

## Uncomment the following two lines to add software from Ubuntu's
## 'extras' repository.
## This software is not part of Ubuntu, but is offered by third-party
## developers who want to ship their latest software.
# deb http://extras.ubuntu.com/ubuntu vivid main
# deb-src http://extras.ubuntu.com/ubuntu vivid main

deb http://ports.ubuntu.com/ubuntu-ports vivid-proposed restricted main multiverse universe

root@ubuntu:~# cat /etc/apt/preferences.d/proposed-updates
Package: *
Pin: release a=vivid-proposed
Pin-Priority: 400

root@ubuntu:~# apt-get update

root@ubuntu:~# apt-get install linux-generic/vivid-proposed
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Release 'vivid-proposed' for 'linux-generic' was not found

root@ubuntu:~# apt-get install linux-image-*/vivid-propsed
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'linux-image-3.18.0-11-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-extra-3.18.0-9-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-3.18.0-9-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-3.18.0-8-generic-dbgsym' for regex 'linux-image-*'
Note, selecting 'linux-image' for regex 'linux-image-*'
Note, selecting 'linux-image-virtual' for regex 'linux-image-*'
Note, selecting 'linux-image-3.18.0-11-generic-dbgsym' for regex 'linux-image-*'
Note, selecting 'linux-image-3.0' for regex 'linux-image-*'
Note, selecting 'linux-image-extra-3.18.0-12-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-3.18.0-12-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-extra-3.18.0-8-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-3.18.0-8-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-2.6-rt' for regex 'linux-image-*'
Note, selecting 'linux-image-extra-3.18.0-13-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-3.18.0-13-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-generic' for regex 'linux-image-*'
Note, selecting 'linux-image-extra-3.18.0-11-generic' for regex 'linux-image-*'
E: Release 'vivid-propsed' for 'linux-image' was not found
E: Release 'vivid-propsed' for 'linux-image-generic' was not found
E: Release 'vivid-propsed' for 'linux-image-3.18.0-13-generic' was not found
E: Release 'vivid-propsed' for 'linux-image-3.0' was not found
E: Release 'vivid-propsed' for 'linux-image-extra-3.18.0-13-generic' was not found
E: Release 'vivid-propsed' for 'linux-image-virtual' was not found
E: Release 'vivid-propsed' for 'linux-image-2.6-rt' was not found
E: Release 'vivid-propsed' for 'linux-image-3.18.0-9-generic' was not found
E: Release 'vivid-propsed' for 'linux-image-3.18.0-8-generic-dbgsym' was not found
E: Release 'vivid-propsed' for 'linux-image-3.18.0-11-generic-dbgsym' was not found
E: Release 'vivid-propsed' for 'linux-image-3....

tags: removed: verification-needed-utopic
Chris J Arges (arges) wrote :

Hi, can you verify with utopic-proposed instead of vivid-proposed?
Thanks,

tags: added: verification-needed-utopic
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-02-19 05:12 EDT-------
I have changed the /etc/apt/sources.list file to use utopic-proposed as requested.

root@ubuntu:~# cat /etc/apt/sources.list | tail

## Uncomment the following two lines to add software from Ubuntu's
## 'extras' repository.
## This software is not part of Ubuntu, but is offered by third-party
## developers who want to ship their latest software.
# deb http://extras.ubuntu.com/ubuntu vivid main
# deb-src http://extras.ubuntu.com/ubuntu vivid main

#deb http://ports.ubuntu.com/ubuntu-ports vivid-proposed restricted main multiverse universe
deb http://ports.ubuntu.com/ubuntu-ports utopic-proposed restricted main multiverse universe

root@ubuntu:~# apt-get install linux-generic/utopic-proposed
Reading package lists... Done
Building dependency tree
Reading state information... Done
Selected version '3.16.0.31.32' (Ubuntu:14.10/utopic-proposed [ppc64el]) for 'linux-generic'
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
linux-generic : Depends: linux-image-generic (= 3.16.0.31.32) but 3.18.0.13.13 is to be installed
Depends: linux-headers-generic (= 3.16.0.31.32) but 3.18.0.13.13 is to be installed
E: Unable to correct problems, you have held broken packages.

The above issue occurs because I am trying to install the utopic-proposed kernel images on the Ubuntu 15.04 daily build, i.e. on a system running the vivid kernel image.

Do you want me to test the utopic-proposed kernel on a system installed from the Ubuntu 14.10 GA ISO?

tags: removed: verification-needed-utopic
Chris J Arges (arges) wrote :

Yes, please install the utopic kernel on utopic. Installing the utopic kernel in a vivid install isn't a supported configuration. Thanks
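
The mismatch discussed above (a utopic-proposed kernel on a vivid install) can be caught before touching sources.list. A minimal sketch, assuming the host's codename is available as `DISTRIB_CODENAME` in `/etc/lsb-release`; the function name `proposed_line_for_host` is hypothetical, and the archive URL and component list are copied from the sources.list line shown earlier in this report:

```shell
#!/bin/sh
# Hypothetical helper: print the -proposed sources.list line for the
# requested release, but only if it matches the installed release.
proposed_line_for_host() {
    pocket_release="$1"                  # e.g. utopic
    lsb_file="${2:-/etc/lsb-release}"    # overridable for testing

    host_release=$(sed -n 's/^DISTRIB_CODENAME=//p' "$lsb_file" 2>/dev/null)
    if [ "$host_release" != "$pocket_release" ]; then
        echo "host is '$host_release', refusing to enable $pocket_release-proposed" >&2
        return 1
    fi
    echo "deb http://ports.ubuntu.com/ubuntu-ports $pocket_release-proposed restricted main multiverse universe"
}
```

On a vivid install, `proposed_line_for_host utopic` would print a refusal to stderr and return non-zero instead of producing a line that leads to the unmet-dependency error shown above.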

tags: added: verification-needed-utopic
Breno Leitão (breno-leitao) wrote :

Chris,

Is it possible to give us one or more days to test this fix? I expect to have it validated by Monday.

Breno Leitão (breno-leitao) wrote :

Checked internally at IBM. Marking it as verification done.

tags: added: verification-done-utopic
removed: verification-needed-utopic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.16.0-31.41

---------------
linux (3.16.0-31.41) utopic; urgency=low

  [ Seth Forshee ]

  * Release Tracking Bug
    - LP: #1419961

  [ Andy Whitcroft ]

  * [Debian] arm64 -- build ubuntu drivers
    - LP: #1411284
  * hyper-v -- fix comment handing in /etc/network/interfaces
    - LP: #1413020

  [ Ben Hutchings ]

  * SAUCE: rtsx_usb_ms: Use msleep_interruptible() in polling loop
    - LP: #1413149

  [ Brad Figg ]

  * SAUCE: Config IWLWIFI_UAPSD=N

  [ Kamal Mostafa ]

  * [Packaging] force "dpkg-source -I -i" behavior

  [ Kukjin Kim ]

  * SAUCE: (no-up) ARM: SAMSUNG: fix the CPU_ID for EXYNOS5440
    - LP: #1411062

  [ Leann Ogasawara ]

  * ubuntu: AUFS -- Resolve build failure union has no member named
    'd_child'

  [ Ming Lei ]

  * SAUCE: (no-up) ARM: EXYNOS: fix booting oops on exynos5440
    - LP: #1411062
  * SAUCE: (no-up) ARM: exynos5440-sd5v1: switch to fixed-link DT binding
    - LP: #1417339
  * SAUCE: (no-up) net: stmmac: add fixed_phy support via fixed-link DT
    binding
    - LP: #1417339

  [ Upstream Kernel Changes ]

  * Revert "[SCSI] mpt2sas: Remove phys on topology change."
    - LP: #1419125
  * Revert "[SCSI] mpt3sas: Remove phys on topology change"
    - LP: #1419125
  * Revert "ARM: 7830/1: delay: don't bother reporting bogomips in
    /proc/cpuinfo"
    - LP: #1419125
  * powerpc/powernv: Don't call generic code on offline cpus
    - LP: #1400411
  * powerpc/powernv: Return to cpu offline loop when finished in KVM guest
    - LP: #1400411
  * powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode
    - LP: #1400411
  * powerpc/powernv: Enable Offline CPUs to enter deep idle states
    - LP: #1400411
  * powernv/cpuidle: Redesign idle states management
    - LP: #1400411
  * powernv/powerpc: Add winkle support for offline cpus
    - LP: #1400411
  * powerpc/kdump: Ignore failure in enabling big endian exception during
    crash
    - LP: #1410817
  * powerpc/perf/hv-24x7: Use kmem_cache_free
    - LP: #1410519
  * powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack
    allocations
    - LP: #1410519
  * powerpc/perf/hv-24x7: Use per-cpu page buffer
    - LP: #1410519
  * power/perf/hv-24x7: Use kmem_cache_free() instead of kfree
    - LP: #1410519
  * KVM: x86: SYSENTER emulation is broken
    - LP: #1414651
    - CVE-2015-0239
  * powerpc/xmon: Fix another endiannes issue in RTAS call from xmon
    - LP: #1415919
  * HID: i2c-hid: call the hid driver's suspend and resume callbacks
    - LP: #1417363
  * HID: i2c-hid: Do not free buffers in i2c_hid_stop()
    - LP: #1417363
  * ALSA: hda - add mic mute led hook for dell machines
    - LP: #1418832
  * ALSA: hda - move DELL_WMI_MIC_MUTE_LED to the tail in the quirk chain
    - LP: #1381856, #1418832
  * ALSA: hda - fix the mic mute led problem for Latitude E5550
    - LP: #1381856, #1418832
  * drm/i915: don't warn if backlight unexpectedly enabled
    - LP: #1419125
  * drm/i915/dp: only use training pattern 3 on platforms that support it
    - LP: #1419125
  * udptunnel: Add SKB_GSO_UDP_TUNNEL during gro_complete.
    - LP: #1419125
  * s390/3215: fix hanging console issue
    - LP...

Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote : crash log offlin'ing cpu 5 with latest kernel in utopic-proposed

------- Comment on attachment From <email address hidden> 2015-02-20 13:55 EDT-------

With the fix included in the latest kernel from utopic-proposed,
the vmcore is saved successfully and this issue is no longer reproducible.

Thanks
Hari

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released