Comment 31 for bug 1011792

Rick Branson (rbranson) wrote :

We've been able to reproduce the bug in a more isolated environment.

I wrote a Python script (pgslam.py) that generates load similar enough to our production traffic. I also wrote a bash script that sets up a hi1.4xlarge EC2 instance to reproduce the issue. During the tests, I launched pgslam.py from another instance and pointed it at the instance prepared with the bash script.

Running this command locks up the EC2 instance built with that script in under a minute:

$ python pgslam.py 'host=10.10.10.10 user=pgslam password=pgslam' 800

These messages appear in the console log:

[706342.844192] BUG: soft lockup - CPU#7 stuck for 23s! [postgres:9266]
[706342.844272] Stack:
[706342.844296] Call Trace:
[706342.844409] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[706370.844190] BUG: soft lockup - CPU#7 stuck for 23s! [postgres:9266]
[706370.844519] Stack:
[706370.844549] Call Trace:
[706370.844916] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[706371.320186] INFO: rcu_sched detected stalls on CPUs/tasks: { 0 11 13} (detected by 7, t=15002 jiffies)
[706406.844191] BUG: soft lockup - CPU#7 stuck for 24s! [postgres:9266]
[706406.844293] Stack:
[706406.844330] Call Trace:
[706406.844461] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[706434.844191] BUG: soft lockup - CPU#7 stuck for 22s! [postgres:9266]
[706434.844273] Stack:
[706434.844297] Call Trace:
[706434.844411] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[706462.844192] BUG: soft lockup - CPU#7 stuck for 22s! [postgres:9266]
[706462.844273] Stack:
[706462.844297] Call Trace:
[706462.844412] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
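
For anyone comparing console logs across instances, the soft lockup lines can be tallied with a small script. This is only a rough sketch and not part of pgslam.py or the setup script. The function names (parse_soft_lockups, summarize) are made up for illustration, and the regular expression assumes the message format shown in the log above.

```python
import re
from collections import Counter

# Matches console lines of the form seen above, e.g.
# [706342.844192] BUG: soft lockup - CPU#7 stuck for 23s! [postgres:9266]
# The leading "[" is optional in case it was cut off when the log was pasted.
SOFT_LOCKUP = re.compile(
    r"\[?\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(log_text):
    """Return one dict per soft lockup line found in log_text (hypothetical helper)."""
    events = []
    for line in log_text.splitlines():
        m = SOFT_LOCKUP.search(line)
        if m:
            events.append({
                "timestamp": float(m.group("ts")),
                "cpu": int(m.group("cpu")),
                "stuck_seconds": int(m.group("secs")),
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
            })
    return events


def summarize(events):
    """Count soft lockup reports per (cpu, comm, pid)."""
    return Counter((e["cpu"], e["comm"], e["pid"]) for e in events)
```

Fed the console log above, this would group all five soft lockup reports under CPU 7 and postgres PID 9266.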