Houston, we have a problem

Since putting in the new colo box, we (myself and the two “tenants” on the Xen user domains (domU)) have noticed it being very slow. At times it seems like the first time you try something it will be very slow, but if you try again immediately it will run quickly. For instance, sometimes a page load will time out, but you hit refresh and it will load quite quickly. I’ve started to suspect the problem is the disks, because the CPU is pretty fast. In order to pin point the problem, I’ve decided to try and benchmark the colo box against my home computer. Both computers have SATA 3Gb/s disks in a software RAID-1 (mirror). Both computers have dual core CPUs (although the home one is a Core2 Duo at 1.86GHz and the colo is a Xeon at 3.0GHz). However, the colo is also running tons of other stuff and it’s running in a Xen domU, so that might slow things down a bit.

For CPU benchmarking, I used openssl speed. On my home computer:


OpenSSL 0.9.8k 25 Mar 2009
built on: Thu Feb 10 01:45:33 UTC 2011
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr2) 
compiler: cc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wa,--noexecstack -g -Wall -DMD32_REG_T=int -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1233.34k     2506.47k     3384.51k     3707.98k     3831.21k
mdc2                 0.00         0.00         0.00         0.00         0.00 
md4              36174.15k   112076.35k   277944.27k   437643.17k   530925.48k
md5              28182.23k    84092.25k   191095.81k   280655.53k   325096.79k
hmac(md5)        25911.56k    79791.57k   187327.49k   279393.28k   328818.69k
sha1             29258.92k    82639.24k   173652.82k   239959.04k   271652.18k
rmd160           19419.02k    48166.40k    90977.71k   117022.04k   127735.13k
rc4             220315.24k   236429.10k   241096.79k   243352.92k   244665.00k
des cbc          35345.48k    36484.82k    36806.04k    36817.08k    36894.14k
des ede3         13849.45k    14087.21k    14170.88k    14210.39k    14218.58k
idea cbc             0.00         0.00         0.00         0.00         0.00 
seed cbc             0.00         0.00         0.00         0.00         0.00 
rc2 cbc          18850.88k    19148.14k    19207.19k    19135.83k    19188.39k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00 
blowfish cbc     60789.70k    64889.00k    65634.30k    65987.24k    66166.78k
cast cbc         49565.96k    52046.28k    52523.07k    52824.70k    52819.29k
aes-128 cbc      83657.27k   114224.51k   125890.65k   129412.10k   130304.68k
aes-192 cbc      73978.66k    98561.69k   106482.04k   108560.78k   109460.46k
aes-256 cbc      67350.22k    86850.75k    92235.60k    94101.13k    94601.11k
camellia-128 cbc        0.00         0.00         0.00         0.00         0.00 
camellia-192 cbc        0.00         0.00         0.00         0.00         0.00 
camellia-256 cbc        0.00         0.00         0.00         0.00         0.00 
sha256           20433.66k    47227.73k    84241.41k   105419.09k   113986.22k
sha512           14730.78k    58787.35k   102487.16k   152436.27k   177884.21k
aes-128 ige     102774.68k   112141.99k   113067.61k   115188.39k   115731.11k
aes-192 ige      89810.17k    97180.77k    96548.70k    98161.72k    98565.92k
aes-256 ige      78696.62k    84907.28k    86186.26k    86598.55k    86936.57k
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000172s 0.000016s   5814.4  63256.8
rsa 1024 bits 0.000839s 0.000045s   1191.8  22346.4
rsa 2048 bits 0.005198s 0.000155s    192.4   6451.4
rsa 4096 bits 0.036703s 0.000578s     27.2   1730.2
                  sign    verify    sign/s verify/s
dsa  512 bits 0.000167s 0.000173s   5994.7   5797.1
dsa 1024 bits 0.000442s 0.000502s   2263.4   1991.3
dsa 2048 bits 0.001501s 0.001734s    666.3    576.6

And on the colo box:

OpenSSL 0.9.8g 19 Oct 2007
built on: Sun Dec  5 14:49:39 UTC 2010
options:bn(64,64) md2(int) rc4(ptr,char) des(idx,cisc,16,int) aes(partial) blowfish(ptr2) 
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -O3 -Wa,--noexecstack -g -Wall -DMD32_REG_T=int -DMD5_ASM
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2               1728.59k     3527.49k     4775.17k     5231.65k     5377.87k
mdc2                 0.00         0.00         0.00         0.00         0.00 
md4              51041.10k   169654.17k   427971.84k   697366.53k   849767.08k
md5              39189.35k   120937.99k   288182.87k   437397.16k   520276.65k
hmac(md5)        42188.92k   126917.10k   299906.48k   439377.65k   517373.95k
sha1             39513.83k   112423.55k   241473.79k   338144.26k   381433.17k
rmd160           29294.66k    74915.86k   144438.45k   187218.50k   205974.19k
rc4             321875.09k   359484.53k   364062.19k   366331.90k   366250.67k
des cbc          51954.89k    54163.67k    54521.77k    54981.63k    54719.83k
des ede3         20366.52k    20704.36k    20754.26k    20727.84k    20802.03k
idea cbc             0.00         0.00         0.00         0.00         0.00 
seed cbc             0.00         0.00         0.00         0.00         0.00 
rc2 cbc          29953.81k    30576.43k    30882.99k    30934.02k    30911.15k
rc5-32/12 cbc        0.00         0.00         0.00         0.00         0.00 
blowfish cbc     98816.99k   105788.44k   108073.22k   108569.26k   108849.83k
cast cbc         77625.50k    83221.25k    84764.93k    85218.38k    85251.41k
aes-128 cbc     137038.37k   153613.38k   158563.10k   156533.98k   157990.91k
aes-192 cbc     125592.77k   135880.02k   139036.76k   136925.87k   138809.49k
aes-256 cbc     110799.76k   121811.20k   123537.58k   122071.04k   123587.24k
camellia-128 cbc        0.00         0.00         0.00         0.00         0.00 
camellia-192 cbc        0.00         0.00         0.00         0.00         0.00 
camellia-256 cbc        0.00         0.00         0.00         0.00         0.00 
sha256           27242.51k    61915.07k   108322.82k   133711.53k   143127.89k
sha512           19178.29k    78770.56k   130188.71k   191564.46k   221899.43k
aes-128 ige     143272.78k   159565.76k   163319.21k   162576.38k   163086.12k
aes-192 ige     127164.53k   140307.03k   143258.97k   141633.88k   142940.81k
aes-256 ige     114340.36k   123836.44k   126530.56k   125999.79k   126656.51k
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000212s 0.000013s   4724.8  75053.7
rsa 1024 bits 0.000751s 0.000036s   1331.9  27723.3
rsa 2048 bits 0.004329s 0.000113s    231.0   8864.5
rsa 4096 bits 0.027680s 0.000392s     36.1   2553.7
                  sign    verify    sign/s verify/s
dsa  512 bits 0.000137s 0.000150s   7303.4   6675.7
dsa 1024 bits 0.000354s 0.000423s   2827.9   2364.4
dsa 2048 bits 0.001100s 0.001291s    908.9    774.7

What we see there is the colo box is a bit faster, although not as good as you’d expect from the difference in CPU speeds, but I haven’t tried to control for the other processes running.

Next for disk benchmarking, I used tiobench, a program that does reading and writing from multiple threads.

Here’s my home computer:

Unit information
================
File size = megabytes
Blk Size  = bytes
Rate      = megabytes per second
CPU%      = percentage of CPU used during the test
Latency   = milliseconds
Lat%      = percent of requests that took longer than X seconds
CPU Eff   = Rate divided by CPU% - throughput per cpu load

Sequential Reads
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.32-29-generic             2000  4096    1   60.02 6.452%     0.065       99.95   0.00000  0.00000   930
2.6.32-29-generic             2000  4096    2   79.61 15.96%     0.092      369.12   0.00000  0.00000   499
2.6.32-29-generic             2000  4096    4   68.04 27.04%     0.202      912.16   0.00000  0.00000   252
2.6.32-29-generic             2000  4096    8   92.42 77.07%     0.314      843.82   0.00000  0.00000   120

Random Reads
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.32-29-generic             2000  4096    1    1.25 0.557%     3.137       46.57   0.00000  0.00000   223
2.6.32-29-generic             2000  4096    2    1.07 0.480%     5.877      147.51   0.00000  0.00000   223
2.6.32-29-generic             2000  4096    4    0.79 0.407%    17.564      690.39   0.00000  0.00000   195
2.6.32-29-generic             2000  4096    8    1.27 6.769%    18.965     1016.62   0.00000  0.00000    19

Sequential Writes
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.32-29-generic             2000  4096    1   40.34 19.74%     0.063     6055.90   0.00020  0.00000   204
2.6.32-29-generic             2000  4096    2   46.33 43.27%     0.108     1246.39   0.00000  0.00000   107
2.6.32-29-generic             2000  4096    4   32.59 58.54%     0.369     8947.65   0.00215  0.00000    56
2.6.32-29-generic             2000  4096    8   27.81 104.9%     0.542     6347.88   0.00508  0.00000    27

Random Writes
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.32-29-generic             2000  4096    1    1.41     0%     0.008        0.04   0.00000  0.00000     0
2.6.32-29-generic             2000  4096    2    1.28     0%     0.012        6.51   0.00000  0.00000     0
2.6.32-29-generic             2000  4096    4    1.13 0.288%     0.022       15.57   0.00000  0.00000   391
2.6.32-29-generic             2000  4096    8    1.17 2.178%     0.032       25.06   0.00000  0.00000    54

And on my colo box:


Unit information
================
File size = megabytes
Blk Size  = bytes
Rate      = megabytes per second
CPU%      = percentage of CPU used during the test
Latency   = milliseconds
Lat%      = percent of requests that took longer than X seconds
CPU Eff   = Rate divided by CPU% - throughput per cpu load

Sequential Reads
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.26-2-xen-amd64            400   4096    1   24.87 0.870%     0.157      747.91   0.00000  0.00000  2857
2.6.26-2-xen-amd64            400   4096    2   26.72 1.816%     0.278     1272.15   0.00000  0.00000  1471
2.6.26-2-xen-amd64            400   4096    4   51.55 8.763%     0.295      759.98   0.00000  0.00000   588
2.6.26-2-xen-amd64            400   4096    8   55.49 22.41%     0.519     1036.44   0.00000  0.00000   248

Random Reads
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.26-2-xen-amd64            400   4096    1    0.19 0.009%    20.471     4484.46   0.20000  0.00000  1953
2.6.26-2-xen-amd64            400   4096    2    0.26 0.127%    26.839     2869.00   0.10000  0.00000   206
2.6.26-2-xen-amd64            400   4096    4    0.33 0.537%    36.469     1297.61   0.00000  0.00000    62
2.6.26-2-xen-amd64            400   4096    8    0.39 1.406%    61.470     2777.42   0.20000  0.00000    28

Sequential Writes
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.26-2-xen-amd64            400   4096    1   10.36 2.186%     0.332     2917.16   0.00586  0.00000   474
2.6.26-2-xen-amd64            400   4096    2    9.75 4.009%     0.758     7747.94   0.00781  0.00000   243
2.6.26-2-xen-amd64            400   4096    4   10.01 8.086%     1.283     4718.16   0.00879  0.00000   124
2.6.26-2-xen-amd64            400   4096    8    7.62 10.96%     3.347     7253.95   0.04883  0.00000    69

Random Writes
                              File  Blk   Num                   Avg      Maximum      Lat%     Lat%    CPU
Identifier                    Size  Size  Thr   Rate  (CPU%)  Latency    Latency      >2s      >10s    Eff
---------------------------- ------ ----- ---  ------ ------ --------- -----------  -------- -------- -----
2.6.26-2-xen-amd64            400   4096    1    0.09 0.014%    28.788   112760.08   0.05000  0.02500   651
2.6.26-2-xen-amd64            400   4096    2    0.09 0.013%    21.702    77408.58   0.07500  0.02500   651
2.6.26-2-xen-amd64            400   4096    4    0.09 0.162%   101.883   143896.67   0.12500  0.07500    53
2.6.26-2-xen-amd64            400   4096    8    0.08 0.059%    73.916   121588.99   0.20000  0.12500   139

It looks to me like the read and write speeds are worse on the colo box, but more importantly, look at the maximum latencies, especially on Random Writes. When a write pauses for 12 seconds, you know somebody is going to notice!

I think it’s time to rip out those Caviar Green disks and put in something faster. I’d been told that they were a mistake for a server box, but I didn’t believe it. I figured it would be too busy for them to do any of their power saving mumbo-jumbo. It appears I was wrong.

11 thoughts on “Houston, we have a problem”

  1. Can this excess latency be explained by the aggressive power-down policy of the ‘caviar green’ products?

  2. Ga, this post is killing anyone who reads your LJ reposting ‘cos of massively over length lines inside pre sections 🙁

  3. I’m not sure I believe those latency values; “121588.99” milliseconds is 121 seconds – 2 minutes.

    Spinup delay, alone, can not explain that.

    Are you seeing sata errors in dom0 ?

    1. I’m not seeing any errors. The other possibility is that I did something really really wrong when I set up Xen on this box, but I did pretty much the same thing I did as on the old colo box.

      There are a bunch of errors from 04:54 this morning about “ext3_orphan_cleanup”. Other than that, nothing.

      1. Orphan cleanup would imply some form of unclean shutdown at some point, leading to filesystem corruption.

        But that shouldn’t cause 120second latency, either!

        Does “smartctl” show a large number of stop/starts (as per rw’s post)? If not then the “green” factor isn’t likely to be contributing. If it is then you can try disabling it.

        1. smartctl shows a lot of start/stops, however the number hasn’t increased since yesterday. It does, however, show two bad sectors. That could be a problem. Could it be that’s tripping the software RAID to rebuild or something?

          SMART Self-test log structure revision number 1
          Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
          # 1 Extended offline Completed: read failure 10% 3350 1952874952
          # 2 Conveyance offline Completed without error 00% 2124 –
          # 3 Extended offline Completed without error 00% 2120 –
          # 4 Extended offline Aborted by host 90% 2117 –

          1. The start/stop count suddenly not increasing for a day would make me think of ongoing mirror rebuilds with WD “green” disks. What does syslog have to say about that, mdadm at least dutifully logs such stuff:
            /var/log/syslog.3.gz:Mar 20 00:31:10 eeegitt mdadm[792]: Fail event detected on md device /dev/md0, component device /dev/sdc1
            /var/log/syslog.3.gz:Mar 20 00:47:51 eeegitt mdadm[792]: SpareActive event detected on md device /dev/md0, component device /dev/sdc1

            And the SMART error would make me run to the next shop and replace the disk – at least my disks at home all had the grace of hiding the “normal” bad sectors from SMART until they ran out of internal replacement sectors.

Comments are closed.