DRBD 8.0.11 is unusably slow
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Fix Released | Undecided | Unassigned | | |
Hardy | Invalid | Medium | Unassigned | | |
Bug Description
I'm using DRBD as a backend for PostgreSQL in a two-node HA setup and I'm
experiencing severe slowdowns. An analysis follows below. All results
were obtained with Linux-HA switched off and nothing resource-intensive
running on the servers. I executed each benchmark at least three times
to make sure I'm not falling prey to statistical outliers.
Upgrading to 8.0.13 (I rebuilt the Ubuntu package from the new upstream sources) and setting the no-disk-flushes and no-md-flushes options solved the problem. Writing 1000 512-byte chunks now runs at about 3.3 MB/s.
The 8.0.11 version does not support those options, which leaves people on server hardware with an unusably slow DRBD setup. (A battery-backed write cache is REQUIRED to enable these options.) The upgrade to 8.0.13 is very easy, and it is a bugfix-only release.
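For readers who want to try the same workaround, the fragment below is a minimal sketch of where the two options would go. It assumes they belong in the disk section of drbd.conf, as in later DRBD 8.x releases; the resource name r0 is a placeholder, and on-io-error detach is carried over from the configuration dumps in this report. As stated above, this is only safe with a battery-backed write cache.

```
resource r0 {            # "r0" is a placeholder resource name
  disk {
    on-io-error detach;  # taken from the configuration shown in this report
    no-disk-flushes;     # do not issue flushes to the backing device
    no-md-flushes;       # do not issue flushes for the DRBD metadata
  }
}
```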
Please note the following two performance figures below:
Speed for: sudo dd if=/dev/zero of=/dev/drbd0 bs=512
Disconnected: 3.5 MB/s
Connected: 3.5 kB/s
= Network latency =
asterix02@
192.168.1.1
TCP latency using 192.168.1.1: 0.2479 microseconds
asterix01@
192.168.1.2
TCP latency using 192.168.1.2: 0.2463 microseconds
= Network throughput =
asterix01@
M -c 192.168.1.2
-------
Client connecting to 192.168.1.2, TCP port 5001
TCP window size: 0.02 MByte (default)
-------
[ 3] local 192.168.1.1 port 39381 connected with 192.168.1.2 port 5001
[ 3] 0.0-10.0 sec 1116 MBytes 112 MBytes/sec
= Local disk (IBM Serveraid RAID 1) =
asterix01@
oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 9.97766 s, 108 MB/s
asterix02@
oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 9.84385 s, 109 MB/s
asterix01@
count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.136389 s, 3.8 MB/s
asterix02@
count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.140179 s, 3.7 MB/s
= DRBD default configuration =
asterix02@
disk {
size 0s _is_default; # bytes
on-io-error detach;
fencing dont-care _is_default;
}
syncer {
rate 33792k; # bytes/second
after -1 _is_default;
al-extents 127 _is_default;
}
_this_host {
device "/dev/drbd0";
disk "/dev/sda5";
meta-disk internal;
}
== Disconnected DRBD ==
asterix02@
oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 43.7656 s, 24.5 MB/s
asterix02@
count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.14615 s, 3.5 MB/s
== Connected DRBD (no resync happening) ==
asterix02@
oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 53.9678 s, 19.9 MB/s
asterix02@
count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 144.54 s, 3.5 kB/s
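The two 1000 x 512-byte runs above are easier to compare as time per write than as throughput. The following is a small sketch that divides the elapsed time reported by dd by the number of writes; the helper name per_write_ms is made up for illustration, and the figures are copied from the disconnected and connected runs in this section.

```shell
#!/bin/sh
# Average time per write in milliseconds, from dd's elapsed time.
# $1 = elapsed seconds reported by dd, $2 = number of writes (dd count=)
per_write_ms() {
    awk -v s="$1" -v n="$2" 'BEGIN { printf "%.3f\n", s * 1000 / n }'
}

# Figures copied from the dd output in this report (default configuration).
disconnected=$(per_write_ms 0.14615 1000)
connected=$(per_write_ms 144.54 1000)

echo "disconnected: ${disconnected} ms per write"
echo "connected:    ${connected} ms per write"
```

This works out to roughly 0.15 ms per write when disconnected and roughly 145 ms per write when connected, so each small synchronous write is about a thousand times slower once the peer is attached.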
= Optimised DRBD =
disk {
size 0s _is_default; # bytes
on-io-error detach;
fencing dont-care _is_default;
}
net {
timeout 20; # 1/10 seconds
max-buffers 8192;
connect-int 10 _is_default; # seconds
ping-int 1; # seconds
sndbuf-size 131070 _is_default; # bytes
ko-count 0 _is_default;
rr-conflict disconnect _is_default;
}
syncer {
rate 33792k; # bytes/second
after -1 _is_default;
al-extents 2129;
}
protocol C;
_this_host {
device "/dev/drbd0";
disk "/dev/sda5";
meta-disk internal;
address 192.168.1.2:7788;
}
_remote_host {
address 192.168.1.1:7788;
}
asterix02@
version: 8.0.11 (api:86/proto:86)
GIT-hash: b3fe2bdfd3b9f7c
2008-02-12 11:56:43
0: cs:Connected st:Primary/
ns:1048576 nr:0 dw:32129175 dr:66621614 al:2934 bm:578 lo:0 pe:0
ua:0 ap:0
resync: used:0/31 hits:196447 misses:193 starving:0 dirty:0
changed:193
act_log: used:0/2129 hits:218622 misses:256 starving:0 dirty:0
changed:256
== Disconnected DRBD ==
asterix02@
oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 8.48373 s, 127 MB/s
asterix02@
count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.145206 s, 3.5 MB/s
== Connected DRBD (no resync happening) ==
asterix02@
oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 20.0519 s, 53.5 MB/s
asterix02@
count=1000 oflag=direct
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 144.431 s, 3.5 kB/s
Changed in drbd8:
importance: Undecided → Medium
status: New → In Progress
affects: drbd8 (Ubuntu) → linux (Ubuntu)
Unfortunately, upgrading drbd8 to a new upstream version is not an option for hardy, because it is a long-term support (LTS) release. However, intrepid has 8.2.6, and you could ask for a backport.
Regards
chuck