Rsync ate my disk

Rsync has saved my life on various occasions, and then again it has brought the whole place to a grinding halt. Essentially the cause is that sometimes BackupPC decides it's time for a complete full backup. This entails asking rsync to compute the hash (checksum/md5sum) of every file on storage. Needless to say, the NFS server ends up with an I/O wait of nearly 100% and the entire place grinds to a halt.

Various solutions revolving around ionice completely fail to resolve the cache overload. The storage system here uses an SSD cache in front of RAIDed spinning metal, so by the time ionice intervenes the damage has long since been done.
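For the record, the sort of thing that gets tried looks roughly like this (the rsync source and destination below are placeholders, not my actual paths):

# Drop an already-running rsync into the idle I/O class
pgrep -x rsync | xargs -r ionice -c 3 -p
# Or launch the transfer at idle priority from the start
ionice -c 3 rsync -a /source/ /destination/

That only reprioritises requests queued at the local block devices, which, as noted above, is too late once the SSD cache has already been overwhelmed.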

The solution comes from Russia: https://habr.com/ru/post/332614/

Essentially the very clever idea is to use sar to measure I/O wait. Once it rises above a usable threshold, pkill sends a STOP signal to rsync to pause it; when the I/O wait drops back below the 30% limit, a CONT signal is sent to let it carry on. Simple as that!
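For reference, the number the script parses is the sixth field of the Average line in sar's CPU report, which is %iowait (the figures below are invented, purely to show the layout that awk '{print $6}' picks apart):

$ LANG=C sar 1 10 | grep Average
Average:        all      2.31      0.00      1.74     42.80      0.12     53.03

The watchdog script itself: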

#!/bin/bash
# Watch the I/O wait reported by sar and pause/resume rsync accordingly:
# SIGSTOP when the disks are drowning, SIGCONT once they recover.

INTERVAL=10     # seconds sar averages over on each pass
CNTR=0          # consecutive HIGH intervals seen so far

while :
do
    # Field 6 of sar's "Average:" CPU line is %iowait; tag it HIGH or LOW
    # against the 30% limit.
    CUR_IOWAIT=$(LANG=C sar 1 $INTERVAL | grep Average | awk '{print $6}' | perl -pe 'if ($_ > 30) { print "HIGH " } else { print "LOW " }')
    echo "$CUR_IOWAIT"
    MARKER=$(echo "$CUR_IOWAIT" | awk '{print $1}')
    if [ "$MARKER" = "LOW" ]
    then
        CNTR=0
        pkill -x --signal CONT rsync
        continue
    else
        CNTR=$((CNTR + 1))
        pkill -x --signal STOP rsync
    fi
    # After 5 HIGH intervals in a row, wake rsync for 2 seconds so the
    # connection doesn't die of a socket timeout, then stop it again.
    if [ "$CNTR" = "5" ]
    then
        echo "CNTR = $CNTR - CONT / 2 sec / STOP to avoid socket timeouts"
        CNTR=0
        pkill -x --signal CONT rsync
        sleep 2
        pkill -x --signal STOP rsync
    fi
done
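
One low-tech way to run it, sketched here with a made-up path (save it wherever and call it whatever you like):

chmod +x /usr/local/sbin/io-throttle.sh
nohup /usr/local/sbin/io-throttle.sh >>/var/log/io-throttle.log 2>&1 &

An @reboot cron entry or a small systemd unit will bring it back after a reboot; the script keeps no state, so it can be killed and restarted at any time.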