Usually misses 2nd processor

Bug #97554 reported by Sam Liddicott
This bug affects 1 person
Affects                        Status      Importance   Assigned to   Milestone
linux (Ubuntu)                 Expired     Medium       Unassigned
linux-source-2.6.20 (Ubuntu)   Won't Fix   Undecided    Unassigned

Bug Description

Binary package hint: linux-image-2.6.20-13-generic

Usually when I boot an SMP kernel, it fails to detect the 2nd CPU (hyper-threading).
Maybe it succeeds 10% of the time.
Here's the dmesg output; what else is needed?

[ 0.000000] Linux version 2.6.20-13-lowlatency (root@palmer) (gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)) #2 SMP PREEMPT Sun Mar 25 00:23:53 UTC 2007 (Ubuntu Unofficial)
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] sanitize start
[ 0.000000] sanitize end
[ 0.000000] copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1
[ 0.000000] copy_e820_map() type is E820_RAM
[ 0.000000] copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2
[ 0.000000] copy_e820_map() start: 00000000000f0000 size: 0000000000010000 end: 0000000000100000 type: 2
[ 0.000000] copy_e820_map() start: 0000000000100000 size: 000000001fef0000 end: 000000001fff0000 type: 1
[ 0.000000] copy_e820_map() type is E820_RAM
[ 0.000000] copy_e820_map() start: 000000001fff0000 size: 0000000000003000 end: 000000001fff3000 type: 4
[ 0.000000] copy_e820_map() start: 000000001fff3000 size: 000000000000d000 end: 0000000020000000 type: 3
[ 0.000000] copy_e820_map() start: 00000000fec00000 size: 0000000001400000 end: 0000000100000000 type: 2
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
[ 0.000000] BIOS-e820: 000000001fff0000 - 000000001fff3000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000001fff3000 - 0000000020000000 (ACPI data)
[ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[ 0.000000] 0MB HIGHMEM available.
[ 0.000000] 511MB LOWMEM available.
[ 0.000000] found SMP MP-table at 000f53a0
[ 0.000000] Entering add_active_range(0, 0, 131056) 0 entries of 256 used
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0 -> 4096
[ 0.000000] Normal 4096 -> 131056
[ 0.000000] HighMem 131056 -> 131056
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0 -> 131056
[ 0.000000] On node 0 totalpages: 131056
[ 0.000000] DMA zone: 32 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 4064 pages, LIFO batch:0
[ 0.000000] Normal zone: 991 pages used for memmap
[ 0.000000] Normal zone: 125969 pages, LIFO batch:31
[ 0.000000] HighMem zone: 0 pages used for memmap
[ 0.000000] DMI 2.3 present.
[ 0.000000] ACPI: RSDP (v000 IntelR ) @ 0x000f6d80
[ 0.000000] ACPI: RSDT (v001 IntelR AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff3000
[ 0.000000] ACPI: FADT (v001 IntelR AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff3040
[ 0.000000] ACPI: MADT (v001 IntelR AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1fff7240
[ 0.000000] ACPI: DSDT (v001 INTELR AWRDACPI 0x00001000 MSFT 0x0100000e) @ 0x00000000
[ 0.000000] ACPI: PM-Timer IO Port: 0x408
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] Processor #0 15:2 APIC version 20
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] Processor #1 15:2 APIC version 20
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: IRQ0 used by override.
[ 0.000000] ACPI: IRQ2 used by override.
[ 0.000000] ACPI: IRQ9 used by override.
[ 0.000000] Enabling APIC mode: Flat. Using 1 I/O APICs
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
[ 0.000000] Detected 2793.118 MHz processor.
[ 26.233694] Built 1 zonelists. Total pages: 130033
[ 26.233698] Kernel command line: root=UUID=983492f3-99eb-4dc4-a47d-afd1bde6ea94 ro quiet splash
[ 26.233844] mapped APIC to ffffd000 (fee00000)
[ 26.233846] mapped IOAPIC to ffffc000 (fec00000)
[ 26.233849] Enabling fast FPU save and restore... done.
[ 26.233852] Enabling unmasked SIMD FPU exception support... done.
[ 26.233862] Initializing CPU#0
[ 26.233926] PID hash table entries: 2048 (order: 11, 8192 bytes)
[ 26.235253] Console: colour VGA+ 80x25
[ 26.235469] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 26.235666] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 26.245691] Memory: 510104k/524224k available (2018k kernel code, 13548k reserved, 899k data, 328k init, 0k highmem)
[ 26.245700] virtual kernel memory layout:
[ 26.245701] fixmap : 0xfff4e000 - 0xfffff000 ( 708 kB)
[ 26.245702] pkmap : 0xff800000 - 0xffc00000 (4096 kB)
[ 26.245703] vmalloc : 0xe0800000 - 0xff7fe000 ( 495 MB)
[ 26.245704] lowmem : 0xc0000000 - 0xdfff0000 ( 511 MB)
[ 26.245705] .init : 0xc03df000 - 0xc0431000 ( 328 kB)
[ 26.245706] .data : 0xc02f8a95 - 0xc03d96d4 ( 899 kB)
[ 26.245707] .text : 0xc0100000 - 0xc02f8a95 (2018 kB)
[ 26.245710] Checking if this processor honours the WP bit even in supervisor mode... Ok.
[ 26.304798] Calibrating delay using timer specific routine.. 5588.37 BogoMIPS (lpj=2794187)
[ 26.304844] Security Framework v1.0.0 initialized
[ 26.304851] SELinux: Disabled at boot.
[ 26.304872] Mount-cache hash table entries: 512
[ 26.305023] CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000
[ 26.305036] CPU: Trace cache: 12K uops, L1 D cache: 8K
[ 26.305038] CPU: L2 cache: 512K
[ 26.305040] CPU: Physical Processor ID: 0
[ 26.305043] CPU: After all inits, caps: bfebfbff 00000000 00000000 00003080 00004400 00000000 00000000
[ 26.305053] Compat vDSO mapped to ffffe000.
[ 26.305056] Remapping vsyscall page to ffffe000
[ 26.305068] Checking 'hlt' instruction... OK.
[ 26.308877] SMP alternatives: switching to UP code
[ 26.309153] Early unpacking initramfs... done
[ 26.528095] ACPI: Core revision 20060707
[ 26.533985] ACPI: Looking for DSDT in initramfs... error, file /DSDT.aml not found.
[ 26.537539] CPU0: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 09
[ 26.537565] SMP alternatives: switching to SMP code
[ 26.537679] Booting processor 1/1 eip 3000
[ 31.669074] Not responding.
[ 31.669077] Inquiring remote APIC #1...
[ 31.669079] ... APIC #1 ID: failed
[ 31.669183] ... APIC #1 VERSION: failed
[ 31.669287] ... APIC #1 SPIV: failed
[ 31.669390] CPU #1 not responding - cannot use it.
[ 31.669423] Total of 1 processors activated (5588.37 BogoMIPS).
[ 31.669564] ENABLING IO-APIC IRQs
[ 31.669759] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 31.781595] Brought up 1 CPUs
[ 31.782046] Booting paravirtualized kernel on bare hardware
[ 31.782150] Time: 7:32:07 Date: 02/28/107
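
A saved dmesg can be checked for this failure by looking for the "Booting processor", "Not responding." and "Brought up N CPUs" lines that appear in the log above. The following is a minimal Python sketch of such a check. The function name `summarize_smp_boot` and the shape of the returned dictionary are assumptions made for illustration, and the patterns only cover the message wording seen in this report.

```python
import re

# Patterns match the message wording seen in the logs in this report only;
# other kernel versions may phrase these lines differently.
BOOTING_RE = re.compile(r"Booting processor (\d+)/(\d+)")
BROUGHT_UP_RE = re.compile(r"Brought up (\d+) CPUs")
NOT_RESPONDING_RE = re.compile(r"\]\s*Not responding\.")


def summarize_smp_boot(dmesg_text):
    """Summarize secondary-CPU bring-up from dmesg text (hypothetical helper)."""
    attempted = 0        # "Booting processor ..." lines seen
    not_responding = 0   # "Not responding." lines seen
    brought_up = None    # N from "Brought up N CPUs", if present
    for line in dmesg_text.splitlines():
        if BOOTING_RE.search(line):
            attempted += 1
        elif NOT_RESPONDING_RE.search(line):
            not_responding += 1
        else:
            match = BROUGHT_UP_RE.search(line)
            if match:
                brought_up = int(match.group(1))
    return {
        "attempted": attempted,
        "not_responding": not_responding,
        "brought_up": brought_up,
    }
```

Feeding it the output of `dmesg` from a boot like the one above should report one attempted secondary CPU, one that did not respond, and one CPU brought up in total.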

Cesare Tirabassi (norsetto) wrote :

Is this a self-compiled kernel?

Sam Liddicott (sam-liddicott) wrote :

It's a standard feisty kernel, generic or low-latency.

The edgy kernels actually crashed unless it was a cold boot, hence my early switch to feisty.

Cesare Tirabassi (norsetto) wrote :

If you are 100% sure it is not a hardware failure, then your best bet is to contact the kernel gurus directly (http://kernel.org).

Cesare Tirabassi (norsetto) wrote :

Before you ask ... they do accept bug reports on non MM kernels.

Brian Murray (brian-murray) wrote :

Thanks for taking the time to report this bug and helping to make Ubuntu better. The output of 'sudo lspci -vv' and 'sudo lspci -vvn' may also be helpful. What type of hardware is this? Thanks in advance.

Changed in linux-source-2.6.20:
assignee: nobody → brian-murray
status: Unconfirmed → Needs Info
Sam Liddicott (sam-liddicott) wrote :

lspci -vv attached

Sam Liddicott (sam-liddicott) wrote :

lspci -vvn attached

thanks for looking at this

Changed in linux-source-2.6.20:
assignee: brian-murray → ubuntu-kernel-team
status: Needs Info → Confirmed
Sam Liddicott (sam-liddicott) wrote :

It booted with 2 CPUs this time, so here are the same logs for comparison.

Sam Liddicott (sam-liddicott) wrote :

lspci -vv

Sam Liddicott (sam-liddicott) wrote :

lspci -vvn
with 2 CPUs active

Sam Liddicott (sam-liddicott) wrote :

The main observations are that when only 1 CPU is found, it takes an extra 10 seconds to build the zonelists, and another 5 seconds pass before the other CPU is declared not responding.
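
Stalls like these can be spotted by comparing the timestamps of consecutive dmesg lines. Below is a minimal Python sketch under stated assumptions: the function name `find_boot_stalls` and the 1-second default threshold are illustrative choices, and the regular expression only handles the `[ seconds.micros]` prefix format seen in the logs in this report.

```python
import re

# dmesg lines in this report look like:
#   "[ 26.537679] Booting processor 1/1 eip 3000"
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def find_boot_stalls(dmesg_text, threshold=1.0):
    """Return (gap_seconds, previous_message, next_message) for each large gap.

    Hypothetical helper: a gap is reported when two consecutive timestamped
    lines are at least `threshold` seconds apart.
    """
    stalls = []
    previous = None  # (timestamp, message) of the last timestamped line
    for line in dmesg_text.splitlines():
        match = TIMESTAMP_RE.match(line.strip())
        if not match:
            continue  # skip lines without a kernel timestamp
        stamp, message = float(match.group(1)), match.group(2)
        if previous is not None and stamp - previous[0] >= threshold:
            gap = round(stamp - previous[0], 6)
            stalls.append((gap, previous[1], message))
        previous = (stamp, message)
    return stalls
```

Running this over the logs of a 1-CPU boot and a 2-CPU boot and comparing the reported gaps would make the two delays described above easy to see side by side.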

Launchpad Janitor (janitor) wrote : This bug is now reported against the 'linux' package

Beginning with the Hardy Heron 8.04 development cycle, all open Ubuntu kernel bugs need to be reported against the "linux" kernel package. We are automatically migrating this bug to the new "linux" package. However, development has already begun for the upcoming Intrepid Ibex 8.10 release. It would be helpful if you could test the upcoming release and verify whether this is still an issue - http://www.ubuntu.com/testing . If the issue still exists, please update this report by changing the Status of the "linux" task from "Incomplete" to "New". We appreciate your patience and understanding as we make this transition. Thanks!

Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Sam Liddicott (sam-liddicott) wrote : Re: [Bug 97554] Re: Usually misses 2nd processor

I've moved to Intrepid Ibex and I still have this problem, even with
kernel 2.6.27-2-generic.

Often only 1 CPU is recognized, although on Sunday 2 CPUs were
recognized.

Sam

Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: Incomplete → Triaged
Sam Liddicott (sam-liddicott) wrote :

Still a problem with: 2.6.27-4-generic

Sample dmesg:
[ 0.004383] CPU: Physical Processor ID: 0
[ 0.004402] Checking 'hlt' instruction... OK.
[ 0.022333] ACPI: Core revision 20080609
[ 0.024434] ACPI: Checking initramfs for custom DSDT
[ 0.387239] ENABLING IO-APIC IRQs
[ 0.387426] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.427304] CPU0: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 09
[ 0.428026] Booting processor 1/1 ip 6000
[ 5.529911] Not responding.
[ 5.529967] Inquiring remote APIC #1...
[ 5.529971] ... APIC #1 ID: failed
[ 5.530075] ... APIC #1 VERSION: failed
[ 5.530179] ... APIC #1 SPIV: failed
[ 5.530371] Brought up 1 CPUs
[ 5.530376] Total of 1 processors activated (5585.99 BogoMIPS).
[ 5.530401] CPU0 attaching sched-domain:
[ 5.530408] domain 0: span 0 level CPU
[ 5.530413] groups: 0
[ 5.530838] net_namespace: 840 bytes
[ 5.530851] Booting paravirtualized kernel on bare hardware

Launchpad Janitor (janitor) wrote : Kernel team bugs

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Trey Blancher (ectospasm) wrote :

I'm still getting the bug in 2.6.27-11, though at the moment all four of my cores have been recognized.

Trey Blancher (ectospasm) wrote :

Attaching what it should look like.

Trey Blancher (ectospasm) wrote :

My problem has returned, and thus far I have been unsuccessful in returning to the desired functionality.

Sam Liddicott (sam-liddicott) wrote : RE: [Bug 97554] Re: Usually misses 2nd processor

I've given Jaunty a few goes now, and it still often misses the second core.

Sam

Trey Blancher (ectospasm) wrote :

This may be a fluke, but I was trying to see if this would work, and it did: I booted into Windows, to verify that it saw all four of my cores. It did (Task Manager/Performance showed four graphs for CPU usage), and when I rebooted into Intrepid, all four cores were recognized by the kernel. I also noticed that the kernel didn't pause on APIC loading this time, either (I have taken out the "quiet" kernel parameter, so I have better insight as to what's going on with the booting kernel).

Just FYI. I will hold off on Jaunty until it's fully released, then I intend to reinstall this machine with an x86_64 system, to see if that makes a difference.

nicolas314 (nicolas314-deactivatedaccount) wrote :

I observed the same kind of defect on an AMD Phenom 9550 Quad Core: most of the time only one CPU is recognized and the others are reported as "not responding", and sometimes all four are present. Disabling or enabling ACPI does not change anything; the cores are not recognized any more often.

One thing that changed the deal was to switch off the Ubuntu splash boot and remove any vga=xxx option from the kernel arguments. For some reason the boot process is much faster that way and has always caught all four CPUs so far, but that probably needs more experimentation.

N.

nicolas314 (nicolas314-deactivatedaccount) wrote :

Finally got it to work by trial and error. If I boot the machine with 'noapic' and remove all splash and vga= options I systematically get all cores recognized. Seems like 'noapic' is the real key here. Without it, booting can bring up 1, 2 or 4 cores in a seemingly random fashion.
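
The change to the kernel arguments described above can be sketched as a small string transformation. This is only an illustration in Python, with a hypothetical function name `apply_noapic_workaround`. It does not edit any boot loader configuration, and a later comment in this thread reports that the workaround did not hold across a reboot on another machine.

```python
def apply_noapic_workaround(cmdline):
    """Return a kernel command line with 'noapic' added and splash/vga= removed.

    Hypothetical helper: it only rewrites the string it is given.
    """
    kept = []
    for option in cmdline.split():
        if option == "splash" or option.startswith("vga="):
            continue  # options the commenter removed
        kept.append(option)
    if "noapic" not in kept:
        kept.append("noapic")  # the option the commenter added
    return " ".join(kept)
```

For example, passing the command line from the original report (`root=UUID=... ro quiet splash`) yields the same line without `splash` and with `noapic` appended.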

Trey Blancher (ectospasm) wrote :

> Finally got it to work by trial and error. If I boot the machine with 'noapic' and remove all splash and
> vga= options I systematically get all cores recognized. Seems like 'noapic' is the real key here.
> Without it, booting can bring up 1, 2 or 4 cores in a seemingly random fashion.

See, that's weird, because it used to be that disabling APIC would disable SMP. Maybe it doesn't work that way anymore. I'll give it a shot. Thanks for the workaround!

Trey Blancher (ectospasm) wrote :

OK, after setting the kernel line with "noapic" and removing "splash", it boots with all cores. I haven't done more than this boot, although... the real first boot gave me some dreaded "not responding" messages, then it hung on initializing APIC.

I don't understand why the splash screen makes a difference here, but then again I'm neither a developer nor a PC architecture geek.

Trey Blancher (ectospasm) wrote :

Well, I thought it was fixed, but I had to reboot due to a power outage, and now I only have one core active (as seen through /proc/cpuinfo).

I haven't upgraded to Jaunty yet; I was planning on doing a reinstall with x86_64 to see if that makes a difference. I'll let you know what results I get.
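
Counting the active cores through /proc/cpuinfo, as mentioned above, can be scripted. The following Python sketch assumes the x86 layout of that file, where each logical CPU has its own `processor : N` line. The function names `count_online_cpus` and `read_cpu_count` are made up for this example.

```python
def count_online_cpus(cpuinfo_text):
    """Count 'processor : N' entries in /proc/cpuinfo-style text.

    Assumes the x86 layout, with one 'processor' line per logical CPU.
    """
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


def read_cpu_count(path="/proc/cpuinfo"):
    """Read the given file (Linux only) and count the processors listed in it."""
    with open(path) as handle:
        return count_online_cpus(handle.read())
```

Logging the result of `read_cpu_count()` after each boot would give a simple record of how often the kernel brings up all cores.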

Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Triaged a while ago but has not had any updated comments for quite some time. Please let us know if this issue remains in the current Ubuntu release, http://www.ubuntu.com/getubuntu/download . If the issue remains, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-triage
Changed in linux (Ubuntu):
status: Triaged → Incomplete
Trey Blancher (ectospasm) wrote :

This problem still exists, in both Jaunty and Karmic. See my extensive troubleshooting in Karmic here:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/515270

I've moved this machine to another purpose; it now runs Windows 7. I'm too much of a Linux geek to have figured out how to verify that the Win7 machine is running with four cores, so I haven't ruled out the possibility of a hardware problem. In my previous testing, Windows XP showed four cores (before I reimaged it).

Sam Liddicott (sam-liddicott) wrote :

Still a problem for me.

Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Expired
Sam Liddicott (sam-liddicott) wrote :

Jeremy, you marked this as incomplete on 2010-03-13 but you didn't say what information you still wanted.

10 hours ago, you said (in an automated fashion) that the bug "has not had any updated comments for quite some time", but Trey commented with a link to extensive troubleshooting on the same day you marked it as incomplete.

I commented a month later to say I still had the trouble too.

Please justify the change from triaged to incomplete and say what information you still need.

Changed in linux (Ubuntu):
status: Expired → New
Sam Liddicott (sam-liddicott) wrote :

I will run "apport-collect -p linux 97554" this week with a fresh install

Andy Whitcroft (apw) wrote :

@Sam -- it doesn't look like you ever got to this apport-collect? Is this issue still present? Please do confirm that and get the apport information you indicated above.

Changed in linux (Ubuntu):
status: New → Incomplete
Sam Liddicott (sam-liddicott) wrote :

10.10 live CD detects both processors, so I think you can close this one

Sam Liddicott (sam-liddicott) wrote :

Do you still want the apport data? If so, can you take it from the live CD, or do you want it from an installed system?

Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu development release http://cdimage.ubuntu.com/daily-live/current/ . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

Changed in linux (Ubuntu):
status: Incomplete → Expired