Comment 9 for bug 791850

Stefan Bader (smb) wrote : Re: oneiric cluster compute instances do not boot

All the data structures look ok. cpu#0 has queued the mtrr_work_handler for all other cpus (for simplicity I only looked at vcpu=2 here) and went into

        while (atomic_read(&data.count))
                cpu_relax();

which translates into:

0xffffffff81022a6b <set_mtrr+203>: nopl 0x0(%rax,%rax,1)
/home/smb/oneiric-amd64/ubuntu-2.6/arch/x86/include/asm/processor.h: 704
0xffffffff81022a70 <set_mtrr+208>: pause
/home/smb/oneiric-amd64/ubuntu-2.6/arch/x86/include/asm/atomic.h: 25
0xffffffff81022a72 <set_mtrr+210>: mov (%rax),%edx
/home/smb/oneiric-amd64/ubuntu-2.6/arch/x86/kernel/cpu/mtrr/main.c: 274
0xffffffff81022a74 <set_mtrr+212>: test %edx,%edx
0xffffffff81022a76 <set_mtrr+214>: jne 0xffffffff81022a70 <set_mtrr+208>
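
To make the wait above easier to reason about, here is a minimal userspace sketch of the same rendezvous pattern. It is an analogy only, not the kernel code: `struct rendezvous`, `worker`, `run_rendezvous` and `MAX_WORKERS` are hypothetical names, pthreads stand in for the other cpus, and `sched_yield()` stands in for `cpu_relax()`.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_WORKERS 16  /* arbitrary limit for this sketch */

/* Hypothetical stand-in for the shared data the spinning cpu polls. */
struct rendezvous {
    atomic_int count;    /* workers that have not checked in yet */
    atomic_int handled;  /* how many times the handler actually ran */
};

/* Plays the role of the handler queued for each other cpu. */
static void *worker(void *arg)
{
    struct rendezvous *r = arg;

    atomic_fetch_add(&r->handled, 1);
    atomic_fetch_sub(&r->count, 1);   /* check in */
    return NULL;
}

/*
 * Plays the role of cpu#0: start one worker per "other cpu", then spin
 * until every worker has checked in. Returns how many handlers ran.
 */
static int run_rendezvous(int nworkers)
{
    pthread_t tid[MAX_WORKERS];
    int started[MAX_WORKERS] = {0};
    struct rendezvous r;

    assert(nworkers >= 0 && nworkers <= MAX_WORKERS);
    atomic_init(&r.count, nworkers);
    atomic_init(&r.handled, 0);

    for (int i = 0; i < nworkers; i++) {
        started[i] = (pthread_create(&tid[i], NULL, worker, &r) == 0);
        if (!started[i])
            atomic_fetch_sub(&r.count, 1);  /* it will never check in */
    }

    /* Same shape as the loop above: poll the counter until it hits zero. */
    while (atomic_load(&r.count))
        sched_yield();

    for (int i = 0; i < nworkers; i++)
        if (started[i])
            pthread_join(tid[i], NULL);

    return atomic_load(&r.handled);
}
```

Calling `run_rendezvous(1)` mirrors the two-cpu case. The point of the sketch is the failure mode: if a worker is counted but never gets to run, the counter never reaches zero and the caller spins forever.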

There does not seem to be a sensible way for cpu#0 to end up in a deeper call chain, as it appears to have done. And cpu#1 should start the mtrr_work_handler via its migration task, which also does not seem to happen...

   PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
      0      0   0  ffffffff81c0b020  RU   0.0       0      0  [swapper]
>     0      2   1  ffff8805abd38000  RU   0.0       0      0  [kworker/0:0]
>     1      0   0  ffff8805abd00000  RU   0.0       0      0  [swapper]
      2      0   0  ffff8805abd016f0  IN   0.0       0      0  [kthreadd]
      3      2   0  ffff8805abd02de0  IN   0.0       0      0  [ksoftirqd/0]
      4      2   0  ffff8805abd044d0  IN   0.0       0      0  [kworker/0:0]
      5      2   0  ffff8805abd05bc0  IN   0.0       0      0  [kworker/u:0]
      6      2   0  ffff8805abd20000  IN   0.0       0      0  [migration/0]
      7      2   1  ffff8805abd216f0  SW   0.0       0      0  [migration/1]
      8      2   1  ffff8805abd22de0  SW   0.0       0      0  [kworker/1:0]
      9      2   1  ffff8805abd244d0  SW   0.0       0      0  [ksoftirqd/1]
     10      2   0  ffff8805abd25bc0  IN   0.0       0      0  [kworker/0:1]