Alignment trap/Unhandled fault errors on boot

Bug #494831 reported by Paul Larson
This bug affects 1 person
Affects                  Status        Importance  Assigned to  Milestone
linux-mvl-dove (Ubuntu)  Fix Released  Critical    Eric Miao
Karmic                   Fix Released  Undecided   Unassigned
Lucid                    Fix Released  Critical    Eric Miao

Bug Description

Booting the Alpha 1 image, I get stuck at a point where the console shows these messages:

...
 * Starting init crypto disks...
Alignment trap: not handling instruction ed9f9b93 at [<000626f2>]
Unhandled fault: alignment exception (0x011) at 0x00062942
(The previous two messages repeat 13 more times with the same addresses)

I can press Alt-F2 to switch to another console. dmesg shows the same messages, plus one more at the end:
Recursive core dump detected, aborting

I was able to manually bring up the network, but trying to scp anything such as dmesg info across to another machine resulted in a few more Alignment trap/Unhandled fault errors, and ssh failed.

Some commands, such as apport-cli, fail immediately with these errors.
I straced apport-cli as an example, and the end of the trace showed:
close(3)
munmap(0x40020000, 4096)
Alignment trap: not handling instruction eddf7a13 at [<0002b8d2>]
Unhandled fault: alignment exception (0x011) at 0x0002b922
--- SIGILL (Illegal instruction) @ 0 (0) ---
Alignment trap: not handling instruction eddf7a13 at [<0002b8d2>]
Unhandled fault: alignment exception (0x011) at 0x0002b922

Does this look more likely to be a kernel problem or a toolchain problem? I did not see it on a different SoC with Alpha 1.

Paul Larson (pwlars)
Changed in linux-mvl-dove (Ubuntu):
assignee: nobody → Eric Miao (eric.y.miao)
importance: Undecided → Critical
tags: added: iso-testing
Alexander Sack (asac) wrote :

Does a lucid chroot in karmic work?

Changed in linux-mvl-dove (Ubuntu Lucid):
status: New → Confirmed
milestone: none → lucid-alpha-2
tags: added: armel armv7
Paul Larson (pwlars) wrote :

Behavior is similar under chroot, except that instead of errors showing up in dmesg, commands such as ssh just return immediately with "Illegal instruction".

Dave Martin (dave-martin-arm) wrote :

See this bug for details of our current understanding on these SIGILLs:

https://bugs.launchpad.net/ubuntu/+source/squashfs/+bug/494667

What is the kernel version currently in use on the Marvell boards? If it's already 2.6.31, I would not expect you to be hitting this problem: the imx51 buildds exhibit it because they're running a 2.6.28 kernel.

If unaligned pointer use can be eliminated from the affected packages easily, it might be worth doing. (For squashfs-tools, it looks difficult and would require a lot of rewriting.)

To sanity-check whether our understanding of the problem is correct, can you:
  a) run under gdb, then when you get a SIGILL, do x/i $pc-2 (I would expect an ldm/stm/ldrd/strd instruction to have caused the fault)
  b) build the affected packages with -marm and see if the problem goes away? If not, something different may be going on...

In any case, we need to patch the kernel if it turns out to be missing some alignment fixup support after all, since we can't guarantee that every package will be fixed to stop playing fast and loose with unaligned pointers...
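
As an illustration of what eliminating unaligned pointer use means in practice, here is a minimal sketch. It is not taken from squashfs-tools or any other affected package, and the helper names load_u32 and store_u32 are hypothetical. The idea is to replace a cast-and-dereference of a byte pointer with a memcpy into an aligned local, which is well defined at any address:

```c
#include <stdint.h>
#include <string.h>

/*
 * Risky pattern (shown as a comment only):
 *
 *     uint32_t v = *(const uint32_t *)p;
 *
 * If p is not suitably aligned this is undefined behaviour in C, and the
 * compiler is free to use multi-word or doubleword loads that do not
 * tolerate unaligned addresses.
 */

/* Read a 32-bit value from any address by copying it byte-wise
 * into an aligned local variable. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to any address, again without assuming alignment. */
static void store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

For example, store_u32(buf + 1, x) followed by load_u32(buf + 1) round-trips x even though buf + 1 is an odd address. Compilers usually reduce the memcpy to a plain load or store where the target permits it, so the cost is typically small.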

Paul Larson (pwlars) wrote :

Kernel version is 2.6.31-701. Unfortunately, I cannot run anything under gdb. Simply typing gdb by itself also produces errors like those I posted earlier.

Dave Martin (dave-martin-arm) wrote :

Can you send me a coredump?

Checking the kernel sources for mvl, the required alignment fixup support appears to be present, so it looks like the problem is not caused by missing patches:

linux/arch/arm/mm/alignment.c:639:
/*
 * Convert Thumb-2 32 bit LDM, STM, LDRD, STRD to equivalent instruction
 * handlable by ARM alignment handler, also find the corresponding handler,
 * so that we can reuse ARM userland alignment fault fixups for Thumb.
 *
 * @pinstr: original Thumb-2 instruction; returns new handlable instruction
 * @regs: register context.
 * @poffset: return offset from faulted addr for later writeback
 *
 * NOTES:
 * 1. Comments below refer to ARMv7 DDI0406A Thumb Instruction sections.
 * 2. Register name Rt from ARMv7 is same as Rd from ARMv6 (Rd is Rt)
 */
static void *
do_alignment_t32_to_handler(unsigned long *pinstr, struct pt_regs *regs,
                            union offset_union *poffset)

[...]
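
For readers matching the trap messages against this code path: a Thumb-2 instruction is 32 bits wide when the top five bits of its first halfword are 0b11101, 0b11110 or 0b11111. The sketch below is not kernel code. The names is_thumb32 and thumb32_word are made up for illustration, and it assumes the word printed in the "Alignment trap" message has the first halfword in its upper 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if hw1 is the first halfword of a 32-bit Thumb-2 instruction. */
static bool is_thumb32(uint16_t hw1)
{
    unsigned top5 = hw1 >> 11;

    return top5 == 0x1d || top5 == 0x1e || top5 == 0x1f;
}

/* Combine the two halfwords into one 32-bit word, first halfword on top
 * (assumed to match the layout of the value in the trap message). */
static uint32_t thumb32_word(uint16_t hw1, uint16_t hw2)
{
    return ((uint32_t)hw1 << 16) | hw2;
}
```

Under that assumption, the reported word ed9f9b93 splits into the halfwords 0xed9f and 0x9b93, and is_thumb32(0xed9f) is true, so the faulting instruction is a 32-bit Thumb-2 encoding.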

Dave Martin (dave-martin-arm) wrote :

^ Note, that was 2.6.31-701.2

Eric Miao (eric.y.miao) wrote :

This seems to be related to a Thumb-2 erratum on the Dove processor; a patch is attached. A kernel built with the patch seems to work fine within a lucid chroot. I'll get it merged once I have verified that it also works on a live system rather than only in a chroot.

Changed in linux-mvl-dove (Ubuntu Lucid):
status: Confirmed → In Progress
Paul Larson (pwlars) wrote :

Eric, could you please also provide us with a build that contains this patch?

Eric Miao (eric.y.miao) wrote :
Paul Larson (pwlars) wrote :

Tested; this seems to clear up the problems I was seeing in this bug.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-mvl-dove - 2.6.31-701.4

---------------
linux-mvl-dove (2.6.31-701.4) lucid; urgency=low

  [ Raymond Huang ]

  * SAUCE: dove: erratum of VLDR instruction bug in Thumb-2 mode
    - LP: #494831
 -- Andy Whitcroft <email address hidden> Thu, 17 Dec 2009 08:56:58 +0000
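
The changelog names VLDR, and the instruction words from the original report can be checked against that. The following is a rough sketch of a decoder for the VLDR encoding as described in the ARMv7 architecture manual. The struct and function names are hypothetical, and it assumes the kernel prints a Thumb-2 instruction with its first halfword in the upper 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of a VLDR (VFP register load). Names are illustrative. */
struct vldr {
    bool     is_double; /* true: Dn register, false: Sn register */
    unsigned reg;       /* VFP register number */
    unsigned rn;        /* base register (15 means PC) */
    int      offset;    /* signed byte offset from the base */
};

/* Return true and fill *out if insn matches the VLDR encoding. */
static bool decode_vldr(uint32_t insn, struct vldr *out)
{
    unsigned d, vd, imm8;

    /* Top nibble 1111 is a different encoding space; reject it. */
    if ((insn >> 28) == 0xf)
        return false;
    /* VLDR: bits 27:24 = 1101, bits 21:20 = 01, bits 11:9 = 101. */
    if ((insn & 0x0f300e00u) != 0x0d100a00u)
        return false;

    d    = (insn >> 22) & 1;
    vd   = (insn >> 12) & 0xf;
    imm8 = insn & 0xff;

    out->is_double = (insn >> 8) & 1;
    /* Doubles number the register as D:Vd, singles as Vd:D. */
    out->reg = out->is_double ? ((d << 4) | vd) : ((vd << 1) | d);
    out->rn = (insn >> 16) & 0xf;
    /* The immediate counts words; bit 23 (U) selects add or subtract. */
    out->offset = (int)(imm8 * 4);
    if (!((insn >> 23) & 1))
        out->offset = -out->offset;
    return true;
}
```

Fed the two words from this report, 0xed9f9b93 decodes as vldr d9, [pc, #588] and 0xeddf7a13 as vldr s15, [pc, #76]. Both are PC-relative VFP loads, which is consistent with the VLDR erratum named in the changelog.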

Changed in linux-mvl-dove (Ubuntu Lucid):
status: In Progress → Fix Released
Martin Pitt (pitti) wrote : Please test proposed package

Accepted linux into karmic-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Changed in linux-mvl-dove (Ubuntu Karmic):
status: New → Fix Committed
Martin Pitt (pitti) wrote :

Sorry, previous comment should have said "Accepted linux-mvl-dove ..."

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-mvl-dove - 2.6.31-210.21

---------------
linux-mvl-dove (2.6.31-210.21) karmic-proposed; urgency=low

  [ Raymond Huang ]

  * SAUCE: dove: erratum of VLDR instruction bug in Thumb-2 mode
    - LP: #494831

linux-mvl-dove (2.6.31-210.20) karmic-proposed; urgency=low

  [ Upstream Kernel Changes ]

  * ARM: Fix signal restart issues with NX and OABI compat
    - LP: #453682

linux-mvl-dove (2.6.31-210.19) karmic-proposed; urgency=low

  [ Upstream Kernel Changes ]

  * Rebased to 2.6.31-17.54

  [ Ubuntu: 2.6.31-17.54 ]

  * Same as unreleased 2.6.31-17.53 with security release merged.

  [ Ubuntu: 2.6.31-16.53 ]

  * ext4: Fix insufficient checks in EXT4_IOC_MOVE_EXT
    - LP: #492659
    - CVE-2009-4131

linux-mvl-dove (2.6.31-210.18) karmic-proposed; urgency=low

  [ Stefan Bader ]

  * Rebase to 2.6.31-17.53

  [ Ubuntu: 2.6.31-17.53 ]

  * SAUCE: AppArmor: Fix oops after profile removal
    - LP: #475619
  * SAUCE: AppArmor: Fix Oops when in apparmor_bprm_set_creds
    - LP: #437258
  * SAUCE: AppArmor: Fix cap audit_caching preemption disabling
    - LP: #479102
  * SAUCE: AppArmor: Fix refcounting bug causing leak of creds
    - LP: #479115
  * SAUCE: AppArmor: Fix oops there is no tracer and doing unsafe
    transition.
    - LP: #480112
  * Revert "[Upstream] (drop after 2.6.31) usb-storage: Workaround devices
    with bogus sense size"
    - LP: #461556
  * Revert "[Upstream] (drop after 2.6.31) Input: synaptics - add another
    Protege M300 to rate blacklist"
    - LP: #480144
  * [Config] udeb: Add squashfs to fs-core-modules
    - LP: #352615
  * Revert "e1000e: swap max hw supported frame size between 82574 and
    82583"
    - LP: #461556
  * Revert "drm/i915: Fix FDI M/N setting according with correct color
    depth"
    - LP: #480144
  * Revert "agp/intel: Add B43 chipset support"
    - LP: #480144
  * Revert "drm/i915: add B43 chipset support"
    - LP: #480144
  * Revert "ACPI: Attach the ACPI device to the ACPI handle as early as
    possible"
    - LP: #327499, #480144
  * SCSI: Retry ADD_TO_MLQUEUE return value for EH commands
    - LP: #461556
  * SCSI: Fix protection scsi_data_buffer leak
    - LP: #461556
  * SCSI: sg: Free data buffers after calling blk_rq_unmap_user
    - LP: #461556
  * ARM: pxa: workaround errata #37 by not using half turbo switching
    - LP: #461556
  * tracing/filters: Fix memory leak when setting a filter
    - LP: #461556
  * x86/paravirt: Use normal calling sequences for irq enable/disable
    - LP: #461556
  * USB: ftdi_sio: remove tty->low_latency
    - LP: #461556
  * USB: ftdi_sio: remove unused rx_byte counter
    - LP: #461556
  * USB: ftdi_sio: clean up read completion handler
    - LP: #461556
  * USB: ftdi_sio: re-implement read processing
    - LP: #461556
  * USB: pl2303: fix error characters not being reported to ldisc
    - LP: #461556
  * USB: digi_acceleport: Fix broken unthrottle.
    - LP: #461556
  * USB: serial: don't call release without attach
    - LP: #461556
  * USB: option: Toshiba G450 device id
    - LP: #461556
  * USB: ipaq: fix oops when device is plugged in
    - LP: #461556
  * USB: cp210x: Add support for the DW700 UA...

Changed in linux-mvl-dove (Ubuntu Karmic):
status: Fix Committed → Fix Released