[SRU][Ubuntu 22.04.1]: Observed "Array Index out of bounds" Call Trace multiple times on Ubuntu 22.04.1 OS during boot

Bug #2008157 reported by Michael Reed
This bug affects 16 people
Affects          Status          Importance   Assigned to    Milestone
linux (Ubuntu)   In Progress     Medium       Michael Reed
Jammy            Fix Released    Medium       Michael Reed
Kinetic          Fix Committed   Medium       AceLan Kao

Bug Description

SRU Justification:

[Impact]

After installing Ubuntu 22.04.1 and booting into the OS, an "array-index-out-of-bounds" call trace appears multiple times in dmesg.

The call trace is as follows:
[ 6.125704] UBSAN: array-index-out-of-bounds in /build/linux-JjvoxS/linux-5.15.0/drivers/scsi/megaraid/megaraid_sas_fp.c:103:32
[ 6.125705] index 1 is out of range for type 'MR_LD_SPAN_MAP [1]'
[ 6.125707] CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.15.0-53-generic #59-Ubuntu
[ 6.125709] Hardware name: Dell Inc. , BIOS 11/08/2022
[ 6.125710] Workqueue: events work_for_cpu_fn
[ 6.125716] Call Trace:
[ 6.125718] <TASK>
[ 6.125720] show_stack+0x52/0x5c
[ 6.125725] dump_stack_lvl+0x4a/0x63
[ 6.125731] dump_stack+0x10/0x16
[ 6.125732] ubsan_epilogue+0x9/0x49
[ 6.125734] __ubsan_handle_out_of_bounds.cold+0x44/0x49
[ 6.125736] ? MR_PopulateDrvRaidMap+0x194/0x580 [megaraid_sas]
[ 6.125747] mr_update_load_balance_params+0xb9/0xc0 [megaraid_sas]
[ 6.125753] MR_ValidateMapInfo+0x8d/0x290 [megaraid_sas]
[ 6.125757] megasas_init_adapter_fusion+0x3ce/0x420 [megaraid_sas]
[ 6.125762] ? megasas_setup_reply_map+0x49/0xac [megaraid_sas]
[ 6.125768] megasas_init_fw.cold+0x87c/0x10c8 [megaraid_sas]
[ 6.125774] megasas_probe_one+0x15c/0x4e0 [megaraid_sas]
[ 6.125779] local_pci_probe+0x48/0x90
[ 6.125783] work_for_cpu_fn+0x17/0x30
[ 6.125785] process_one_work+0x228/0x3d0
[ 6.125786] worker_thread+0x223/0x420
[ 6.125787] ? process_one_work+0x3d0/0x3d0
[ 6.125788] kthread+0x127/0x150
[ 6.125790] ? set_kthread_struct+0x50/0x50
[ 6.125791] ret_from_fork+0x1f/0x30
[ 6.125796] </TASK>
[ 6.125796] ================================================================================

Steps to reproduce:
1. Connect PERC H355 controller to the system
2. Create RAID1 using drives connected to PERC Controller
3. Install Ubuntu 22.04.1 on VD
4. Boot into OS after installation
5. Multiple Call Traces of "array-index-out-of-bounds" are seen

Expected Behavior:
OS should boot without this Call Trace

[Fix]

[PATCH v3 0/6] Replace one-element arrays with flexible-array members
https://<email address hidden>/

48658213 scsi: megaraid_sas: Use struct_size() in code related to struct MR_PD_CFG_SEQ_NUM_SYNC

41e83026 scsi: megaraid_sas: Use struct_size() in code related to struct MR_FW_RAID_MAP

ee92366a scsi: megaraid_sas: Replace one-element array with flexible-array member in MR_PD_CFG_SEQ_NUM_SYNC

eeb3bab7 scsi: megaraid_sas: Replace one-element array with flexible-array member in MR_DRV_RAID_MAP

204a29a1 scsi: megaraid_sas: Replace one-element array with flexible-array member in MR_FW_RAID_MAP_DYNAMIC

ac23b92b scsi: megaraid_sas: Replace one-element array with flexible-array member in MR_FW_RAID_MAP
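
For readers unfamiliar with the pattern, the following is a minimal user-space sketch of what this kind of conversion does. It is not the driver's code: the struct layouts, field names, and helper functions (span_map, raid_map_old, raid_map_flex, raid_map_flex_size, raid_map_alloc, raid_map_span) are hypothetical, and raid_map_flex_size only imitates the idea behind the kernel's struct_size() (header size plus element count times element size, saturating on overflow). With a trailing array declared as [1], any access to index 1 or higher exceeds the declared bound, which is what UBSAN reports; a flexible-array member has no declared bound, so the allocation size is what governs how many elements exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-span record; not the driver's real layout. */
struct span_map {
    uint32_t first_row;
    uint32_t row_count;
};

/* Old style: a one-element array used as a variable-length tail.
 * The declared bound is 1, so indexing spans[1] is flagged by UBSAN. */
struct raid_map_old {
    uint32_t span_count;
    struct span_map spans[1];
};

/* New style: C99 flexible-array member with no declared bound. */
struct raid_map_flex {
    uint32_t span_count;
    struct span_map spans[];
};

/* Illustrative analogue of struct_size(): header plus span_count
 * elements, returning SIZE_MAX if the computation would overflow. */
static size_t raid_map_flex_size(size_t span_count)
{
    size_t max_spans = (SIZE_MAX - sizeof(struct raid_map_flex)) /
                       sizeof(struct span_map);

    if (span_count > max_spans)
        return SIZE_MAX;
    return sizeof(struct raid_map_flex) +
           span_count * sizeof(struct span_map);
}

/* Allocate a zeroed map with room for span_count trailing elements. */
static struct raid_map_flex *raid_map_alloc(uint32_t span_count)
{
    size_t bytes = raid_map_flex_size(span_count);
    struct raid_map_flex *map;

    if (bytes == SIZE_MAX)
        return NULL;
    map = calloc(1, bytes);
    if (map)
        map->span_count = span_count;
    return map;
}

/* Bounds-checked accessor: the runtime count, not a declared array
 * bound, decides which indices are valid. */
static struct span_map *raid_map_span(struct raid_map_flex *map, uint32_t i)
{
    assert(i < map->span_count);
    return &map->spans[i];
}
```

A caller would allocate with raid_map_alloc(2), reach the second span through raid_map_span(map, 1), and release the map with free(). Note that in this sketch sizeof(struct raid_map_flex) no longer includes one element, which is why size computations that relied on the old [1] layout have to be adjusted together with the declaration.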

[Test Plan]

1. Connect PERC H355 controller to the system
2. Create RAID1 using drives connected to PERC Controller
3. Install Ubuntu 22.04.1 on VD
4. Boot into OS after installation
Expected result: the OS boots without the call trace listed in the [Impact] section.

[Where problems could occur]

[Other Info]
https://code.launchpad.net/~mreed8855/ubuntu/+source/linux/+git/jammy/+ref/array_bounds_lp_2008157


Michael Reed (mreed8855) wrote :

I have created a test kernel. Please test it and provide feedback.

https://people.canonical.com/~mreed/dell/lp_1999503_array_index/

Michael Reed (mreed8855) wrote :

The test kernel was tested on 01-13-2023, and the issue was not observed.

Michael Reed (mreed8855)
description: updated
Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Libera.chat.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/2008157/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
Michael Reed (mreed8855)
Changed in ubuntu:
assignee: nobody → Michael Reed (mreed8855)
importance: Undecided → Medium
status: New → In Progress
affects: ubuntu → linux (Ubuntu)
Stefan Bader (smb)
Changed in linux (Ubuntu Jammy):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.15.0-72.79 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux verification-needed-jammy
AceLan Kao (acelankao)
Changed in linux (Ubuntu Kinetic):
status: New → In Progress
assignee: nobody → AceLan Kao (acelankao)
Stefan Bader (smb)
Changed in linux (Ubuntu Kinetic):
importance: Undecided → Medium
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.15.0-72.79

---------------
linux (5.15.0-72.79) jammy; urgency=medium

  * jammy/linux: 5.15.0-72.79 -proposed tracker (LP: #2016548)

  * Add split lock detection for EMR (LP: #2015855)
    - x86/split_lock: Enumerate architectural split lock disable bit

  * selftest: fib_tests: Always cleanup before exit (LP: #2015956)
    - selftest: fib_tests: Always cleanup before exit

  * Add support for intel EMR cpu (LP: #2015372)
    - platform/x86: intel-uncore-freq: add Emerald Rapids support
    - perf/x86/intel/cstate: Add Emerald Rapids
    - perf/x86/rapl: Add support for Intel Emerald Rapids
    - intel_idle: add Emerald Rapids Xeon support
    - tools/power/x86/intel-speed-select: Add Emerald Rapid quirk
    - tools/power turbostat: Introduce support for EMR
    - powercap: intel_rapl: add support for Emerald Rapids
    - EDAC/i10nm: Add Intel Emerald Rapids server support

  * Kernel livepatch ftrace graph fix (LP: #2013603)
    - kprobes: treewide: Remove trampoline_address from
      kretprobe_trampoline_handler()
    - kprobes: treewide: Make it harder to refer kretprobe_trampoline directly
    - kprobes: Add kretprobe_find_ret_addr() for searching return address
    - s390/unwind: recover kretprobe modified return address in stacktrace
    - s390/unwind: fix fgraph return address recovery

  * Jammy update: v5.15.98 upstream stable release (LP: #2015600)
    - Linux 5.15.98

  * Jammy update: v5.15.97 upstream stable release (LP: #2015599)
    - ionic: refactor use of ionic_rx_fill()
    - Fix XFRM-I support for nested ESP tunnels
    - arm64: dts: rockchip: drop unused LED mode property from rk3328-roc-cc
    - ARM: dts: rockchip: add power-domains property to dp node on rk3288
    - HID: elecom: add support for TrackBall 056E:011C
    - ACPI: NFIT: fix a potential deadlock during NFIT teardown
    - btrfs: send: limit number of clones and allocated memory size
    - ASoC: rt715-sdca: fix clock stop prepare timeout issue
    - IB/hfi1: Assign npages earlier
    - neigh: make sure used and confirmed times are valid
    - HID: core: Fix deadloop in hid_apply_multiplier.
    - x86/cpu: Add Lunar Lake M
    - staging: mt7621-dts: change palmbus address to lower case
    - bpf: bpf_fib_lookup should not return neigh in NUD_FAILED state
    - net: Remove WARN_ON_ONCE(sk->sk_forward_alloc) from sk_stream_kill_queues().
    - vc_screen: don't clobber return value in vcs_read
    - scripts/tags.sh: Invoke 'realpath' via 'xargs'
    - scripts/tags.sh: fix incompatibility with PCRE2
    - usb: dwc3: pci: add support for the Intel Meteor Lake-M
    - USB: serial: option: add support for VW/Skoda "Carstick LTE"
    - usb: gadget: u_serial: Add null pointer check in gserial_resume
    - USB: core: Don't hold device lock while reading the "descriptors" sysfs file
    - Linux 5.15.97

  * Jammy update: v5.15.96 upstream stable release (LP: #2015595)
    - drm/etnaviv: don't truncate physical page address
    - wifi: rtl8xxxu: gen2: Turn on the rate control
    - drm/edid: Fix minimum bpc supported with DSC1.2 for HDMI sink
    - clk: mxl: Switch from direct readl/writel based IO to regmap based IO
    - ...

Changed in linux (Ubuntu Jammy):
status: Fix Committed → Fix Released
Vinay HM (vinay-hm)
tags: added: verification-done-jammy
removed: verification-needed-jammy
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-riscv-5.15/5.15.0-1034.38~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-riscv-5.15 verification-needed-focal
Vadim Sukhomlinov (vsukhoml) wrote :

I verified that 5.15.0-72.79 fixes this issue on 22.04. Thanks!

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-intel-iotg/5.15.0-1031.36 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-intel-iotg verification-needed-jammy
removed: verification-done-jammy
tags: added: verification-done-jammy
removed: verification-needed-jammy
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws/5.15.0-1038.43 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-aws verification-needed-jammy
removed: verification-done-jammy
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-azure/5.15.0-1040.47 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-azure
Stefan Bader (smb)
Changed in linux (Ubuntu Kinetic):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux/5.19.0-47.49 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-kinetic-linux verification-needed-kinetic
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-aws-5.15/5.15.0-1046.51~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal-linux-aws-5.15' to 'verification-done-focal-linux-aws-5.15'. If the problem still exists, change the tag 'verification-needed-focal-linux-aws-5.15' to 'verification-failed-focal-linux-aws-5.15'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-focal-linux-aws-5.15-v2 verification-needed-focal-linux-aws-5.15
Vitaly Protsko (atanw) wrote :

Bug is still here. 5.15.0-91-generic

Nov 16 21:15:29 mon-host kernel: [ 101.739280] ================================================================================
Nov 16 21:15:29 mon-host kernel: [ 101.785597] UBSAN: array-index-out-of-bounds in /build/linux-90ta4T/linux-5.15.0/drivers/edac/i5000_edac.c:956:20
Nov 16 21:15:29 mon-host kernel: [ 101.786940] IPMI message handler: version 39.2
Nov 16 21:15:29 mon-host kernel: [ 101.836146] index 4 is out of range for type 'u16 [4]'
Nov 16 21:15:29 mon-host kernel: [ 101.836152] CPU: 0 PID: 447 Comm: systemd-udevd Not tainted 5.15.0-91-generic #101-Ubuntu
Nov 16 21:15:29 mon-host kernel: [ 101.836156] Hardware name: Dell Inc. PowerEdge 1950/0D8635, BIOS 2.7.0 10/30/2010
Nov 16 21:15:29 mon-host kernel: [ 101.836158] Call Trace:
Nov 16 21:15:29 mon-host kernel: [ 101.836162] <TASK>
Nov 16 21:15:29 mon-host kernel: [ 101.836166] show_stack+0x52/0x5c
Nov 16 21:15:29 mon-host kernel: [ 101.836175] dump_stack_lvl+0x4a/0x63
Nov 16 21:15:29 mon-host kernel: [ 101.836182] dump_stack+0x10/0x16
Nov 16 21:15:29 mon-host kernel: [ 101.836184] ubsan_epilogue+0x9/0x36
Nov 16 21:15:29 mon-host kernel: [ 101.836187] __ubsan_handle_out_of_bounds.cold+0x44/0x49
Nov 16 21:15:29 mon-host kernel: [ 101.836190] ? i5000_get_mc_regs.isra.0+0x14c/0x1c0 [i5000_edac]
Nov 16 21:15:29 mon-host kernel: [ 101.836197] i5000_probe1+0x506/0x5c0 [i5000_edac]
Nov 16 21:15:29 mon-host kernel: [ 101.836201] ? pci_bus_read_config_byte+0x40/0x70
Nov 16 21:15:29 mon-host kernel: [ 101.862944] ? do_pci_enable_device+0x54/0x110
Nov 16 21:15:29 mon-host kernel: [ 101.862948] i5000_init_one+0x26/0x30 [i5000_edac]
Nov 16 21:15:29 mon-host kernel: [ 101.862952] local_pci_probe+0x4b/0x90
Nov 16 21:15:29 mon-host kernel: [ 101.862956] pci_device_probe+0x119/0x1f0
Nov 16 21:15:29 mon-host kernel: [ 101.862960] really_probe+0x222/0x420
Nov 16 21:15:29 mon-host kernel: [ 101.862964] __driver_probe_device+0xe8/0x140
Nov 16 21:15:29 mon-host kernel: [ 101.862966] driver_probe_device+0x23/0xc0
Nov 16 21:15:29 mon-host kernel: [ 101.862969] __driver_attach+0xf7/0x1f0
Nov 16 21:15:29 mon-host kernel: [ 101.862971] ? __device_attach_driver+0x140/0x140
Nov 16 21:15:29 mon-host kernel: [ 101.862974] bus_for_each_dev+0x7f/0xd0
Nov 16 21:15:29 mon-host kernel: [ 101.862978] driver_attach+0x1e/0x30
Nov 16 21:15:29 mon-host kernel: [ 101.862980] bus_add_driver+0x148/0x220
Nov 16 21:15:29 mon-host kernel: [ 101.862982] ? vunmap_range_noflush+0x3d5/0x470
Nov 16 21:15:29 mon-host kernel: [ 101.862987] driver_register+0x95/0x100
Nov 16 21:15:29 mon-host kernel: [ 101.862990] ? 0xffffffffc03d8000
Nov 16 21:15:29 mon-host kernel: [ 101.862993] __pci_register_driver+0x68/0x70
Nov 16 21:15:29 mon-host kernel: [ 101.862996] i5000_init+0x36/0x1000 [i5000_edac]
Nov 16 21:15:29 mon-host kernel: [ 101.863000] do_one_initcall+0x49/0x1e0
Nov 16 21:15:29 mon-host kernel: [ 101.863005] ? kmem_cache_alloc_trace+0x19e/0x2e0
Nov 16 21:15:29 mon-host kernel: [ 101.863011] do_init_module+0x52/0x260
Nov 16 21:15:29 mon-host kernel: [ 101.863016] load_module+0xb2b/0xbc0
Nov 16 21:15:29 mon-host kernel: [ 101.8630...

Frantisek K. (frantisekk) wrote :

Bug is still here...

Ubuntu 22.04 jammy

Broadcom MegaRAID 9580-8i8e

Linux version 6.2.0-37-generic (buildd@bos03-amd64-055) (x86_64-linux-gnu-gcc-11 (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2 (Ubuntu 6.2.0-37.38~22.04.1-generic 6.2.16)

...
[ 3.869311] megasas: 07.719.03.00-rc1
[ 3.869764] megaraid_sas 0000:43:00.0: BAR:0x0 BAR's base_addr(phys):0x0000028080f00000 mapped virt_addr:0x0000000065c51147
[ 3.869770] megaraid_sas 0000:43:00.0: FW now in Ready state
[ 3.869775] megaraid_sas 0000:43:00.0: 63 bit DMA mask and 63 bit consistent mask
[ 3.869973] megaraid_sas 0000:43:00.0: firmware supports msix : (128)
[ 3.872925] megaraid_sas 0000:43:00.0: requested/available msix 72/72 poll_queue 0
[ 3.872928] ================================================================================
[ 3.872951] UBSAN: array-index-out-of-bounds in /build/linux-hwe-6.2-I50pf3/linux-hwe-6.2-6.2.0/arch/x86/include/asm/topology.h:72:28
[ 3.872979] index -1 is out of range for type 'cpumask *[1024]'
[ 3.872995] CPU: 33 PID: 538 Comm: systemd-udevd Not tainted 6.2.0-37-generic #38~22.04.1-Ubuntu
[ 3.872999] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.5 09/08/2022
[ 3.873001] Call Trace:
[ 3.873004] <TASK>
[ 3.873007] dump_stack_lvl+0x48/0x70
[ 3.873015] dump_stack+0x10/0x20
[ 3.873018] __ubsan_handle_out_of_bounds+0xa2/0x100
[ 3.873024] ? __pfx_default_calc_sets+0x10/0x10
[ 3.873030] megasas_alloc_irq_vectors+0x215/0x220 [megaraid_sas]
[ 3.873043] megasas_init_fw+0x617/0x1320 [megaraid_sas]
[ 3.873057] megasas_probe_one+0x18d/0x5a0 [megaraid_sas]
[ 3.873069] local_pci_probe+0x4b/0xb0
[ 3.873075] pci_call_probe+0x55/0x190
[ 3.873080] pci_device_probe+0x84/0x120
[ 3.873084] ? srso_return_thunk+0x5/0x10
[ 3.873090] really_probe+0x1ed/0x450
[ 3.873095] __driver_probe_device+0x8a/0x190
[ 3.873099] driver_probe_device+0x23/0xd0
[ 3.873102] __driver_attach+0x10f/0x220
[ 3.873106] ? __pfx___driver_attach+0x10/0x10
[ 3.873109] bus_for_each_dev+0x83/0xe0
[ 3.873114] driver_attach+0x1e/0x30
[ 3.873116] bus_add_driver+0x152/0x250
[ 3.873119] ? srso_return_thunk+0x5/0x10
[ 3.873124] driver_register+0x83/0x160
[ 3.873127] __pci_register_driver+0x68/0x80
[ 3.873131] megasas_init+0xdb/0xff0 [megaraid_sas]
[ 3.873143] ? __pfx_init_module+0x10/0x10 [megaraid_sas]
[ 3.873153] do_one_initcall+0x49/0x240
[ 3.873159] ? srso_return_thunk+0x5/0x10
[ 3.873162] ? kmalloc_trace+0x2a/0xb0
[ 3.873168] do_init_module+0x52/0x240
[ 3.873173] load_module+0xb96/0xd60
[ 3.873177] ? security_kernel_post_read_file+0x5c/0x80
[ 3.873183] ? srso_return_thunk+0x5/0x10
[ 3.873186] ? kernel_read_file+0x25c/0x2b0
[ 3.873194] __do_sys_finit_module+0xcc/0x150
[ 3.873197] ? srso_return_thunk+0x5/0x10
[ 3.873200] ? __do_sys_finit_module+0xcc/0x150
[ 3.873209] __x64_sys_finit_module+0x18/0x30
[ 3.873213] do_syscall_64+0x5c/0x90
[ 3.873217] ? srso_return_thunk+0x5/0x10
[ 3.873221] ? syscall_exit_to_user_mode+0...

DronKram (dronkram) wrote :

The bug is still present.
I have checked on 22.04.3 (kernel 5.15.0-91)
DELL PowerEdge R350
PERC H755

During boot, "array-index-out-of-bounds" and "index -1 is out of range" messages appear, along with a stack trace. Part of the dmesg output is available at https://pastebin.com/raw/Zfe19x12

Is it safe to use the server and store data with this issue present or is it better to wait for a fix?

Thanks.

DronKram (dronkram) wrote :

Hello.
Any updates on this?

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-mtk/5.15.0-1030.34 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-jammy-linux-mtk' to 'verification-done-jammy-linux-mtk'. If the problem still exists, change the tag 'verification-needed-jammy-linux-mtk' to 'verification-failed-jammy-linux-mtk'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: kernel-spammed-jammy-linux-mtk-v2 verification-needed-jammy-linux-mtk
James Dingwall (a-james-launchpad) wrote :

I have also encountered the same trace reported in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2008157/comments/15 on a Dell R450 system:

# lspci -vv -s 65:00.0
65:00.0 RAID bus controller: Broadcom / LSI MegaRAID 12GSAS/PCIe Secure SAS38xx
        DeviceName: SL3 RAID
        Subsystem: Dell MegaRAID 12GSAS/PCIe Secure SAS38xx
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at bbc00000 (64-bit, prefetchable) [size=1M]
        Region 2: Memory at bbd00000 (64-bit, prefetchable) [size=1M]
        Region 4: Memory at bbe00000 (32-bit, non-prefetchable) [size=1M]
        Region 5: I/O ports at a000 [size=256]
        Expansion ROM at <ignored> [disabled]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000 Data: 0000
                Masking: 00000000 Pending: 00000000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 16GT/s, Width x8, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 16GT/s (ok), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP-, LTR-
                         10BitTagComp+, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS-, TPHComp-, ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
                ...

James Dingwall (a-james-launchpad) wrote :

I've had a look at the x86 code where this error comes from:

/* Returns a pointer to the cpumask of CPUs on Node 'node'. */
static inline const struct cpumask *cpumask_of_node(int node)
{
        return node_to_cpumask_map[node];
}

So it seems the error arises because the function is asked for a negative index into the node_to_cpumask_map array. Interestingly, the sparc macro of the same name handles the case where the argument is -1:

#define cpumask_of_node(node) ((node) == -1 ? \
                               cpu_all_mask : \
                               &numa_cpumask_lookup_table[node])

and so does powerpc:

#define cpumask_of_node(node) ((node) == -1 ? \
                               cpu_all_mask : \
                               node_to_cpumask_map[node])

In drivers/base/arch_numa.c there is a debug version of this function which, like sparc and powerpc, returns cpu_all_mask for NUMA_NO_NODE (include/linux/numa.h: #define NUMA_NO_NODE (-1)):

#ifdef CONFIG_DEBUG_PER_CPU_MAPS

/*
 * Returns a pointer to the bitmask of CPUs on Node 'node'.
 */
const struct cpumask *cpumask_of_node(int node)
{

        if (node == NUMA_NO_NODE)
                return cpu_all_mask;

        if (WARN_ON(node < 0 || node >= nr_node_ids))
                return cpu_none_mask;

        if (WARN_ON(node_to_cpumask_map[node] == NULL))
                return cpu_online_mask;

        return node_to_cpumask_map[node];
}
EXPORT_SYMBOL(cpumask_of_node);

#endif

Based on these alternative implementations, it seems as though cpumask_of_node for x86 should be something like:

/* Returns a pointer to the cpumask of CPUs on Node 'node'. */
static inline const struct cpumask *cpumask_of_node(int node)
{
        return node == -1 ? cpu_all_mask : node_to_cpumask_map[node];
}

I have insufficient knowledge about this stuff to say that is definitely the case though.
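
To make the effect of such a guard concrete, here is a small user-space model of the lookup. It is only a sketch under stated assumptions: struct cpumask is reduced to a toy bitmask, MAX_NODES and every name ending in _model are hypothetical stand-ins for the kernel objects, and the fallback for out-of-range or unpopulated nodes is simplified to an empty mask (the debug version quoted above returns cpu_online_mask for a NULL entry). It says nothing about whether the x86 helper or its caller in megaraid_sas is the right place for a fix.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)
#define MAX_NODES 4            /* hypothetical table size for the model */

/* Toy cpumask: one bit per CPU. The kernel type is more elaborate. */
struct cpumask {
    unsigned long bits;
};

static const struct cpumask cpu_all_mask_model  = { ~0UL };
static const struct cpumask cpu_none_mask_model = { 0UL };
static const struct cpumask node0_mask_model    = { 0x0fUL };
static const struct cpumask node1_mask_model    = { 0xf0UL };

/* Only nodes 0 and 1 are populated; the remaining entries are NULL. */
static const struct cpumask *node_to_cpumask_map_model[MAX_NODES] = {
    &node0_mask_model,
    &node1_mask_model,
};

/* Guarded lookup: NUMA_NO_NODE means "no preference", so return every
 * CPU instead of indexing the table with -1. */
static const struct cpumask *cpumask_of_node_model(int node)
{
    if (node == NUMA_NO_NODE)
        return &cpu_all_mask_model;

    if (node < 0 || node >= MAX_NODES ||
        node_to_cpumask_map_model[node] == NULL)
        return &cpu_none_mask_model;

    return node_to_cpumask_map_model[node];
}

/* Usage example: callable from any test harness or main(). */
static void cpumask_of_node_model_check(void)
{
    assert(cpumask_of_node_model(NUMA_NO_NODE) == &cpu_all_mask_model);
    assert(cpumask_of_node_model(1)->bits == 0xf0UL);
    assert(cpumask_of_node_model(3) == &cpu_none_mask_model);
}
```

In this model an unguarded lookup with node == -1 would read node_to_cpumask_map_model[-1], which is the same kind of out-of-bounds index UBSAN reports in the traces above.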
