rdiff-backup/rsync +ssh performance comparison


Vadim Kouzmine
As a follow-up to my earlier message:

Subject:  [rdiff-backup-users] installation on Trustix 2.2, performance,
possible bug
Date:  Fri, 03 Feb 2006 15:29:38 -0500

I ran tests copying my home directory (~1GB, 21668 files) on a
workstation (Gentoo Linux, rdiff-backup 1.0.1, python 2.4.2, rsync
2.6.0, OpenSSH_4.2p1/no_compression):

1) rdiff-backup /src -> /dst:              3m 11s

2) rsync /src -> /dst:                     2m 45s

3) rdiff-backup localhost::/src -> /dst:  12m 5s

4) rsync localhost:/src -> /dst:           3m 9s

So rdiff-backup is a little slower than rsync when working on local
files, and about 4 times slower when working through ssh.
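
For reference, the ratios behind that summary can be worked out from the
four timings above. This is only a quick arithmetic sketch in Python; the
helper name to_seconds and the dictionary labels are made up for
illustration and are not part of either tool.

```python
# Timings copied from the four test runs above, as (minutes, seconds).
timings = {
    "rdiff-backup local": (3, 11),
    "rsync local": (2, 45),
    "rdiff-backup ssh": (12, 5),
    "rsync ssh": (3, 9),
}


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair to total seconds."""
    return minutes * 60 + seconds


secs = {name: to_seconds(*t) for name, t in timings.items()}

# How many times slower rdiff-backup was than rsync in each setup.
local_ratio = secs["rdiff-backup local"] / secs["rsync local"]
ssh_ratio = secs["rdiff-backup ssh"] / secs["rsync ssh"]

print(f"local: {local_ratio:.2f}x, ssh: {ssh_ratio:.2f}x")
```

That works out to roughly 1.16x for the local copy and 3.84x through ssh.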

Any idea?

Thanks,
--
Vadim Kouzmine <[hidden email]>



_______________________________________________
rdiff-backup-users mailing list at [hidden email]
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki

Re: rdiff-backup/rsync +ssh performance comparison

Douglas Bollinger
On Mon, 06 Feb 2006 17:23:50 -0500
Vadim Kouzmine <[hidden email]> wrote:

> Performed tests copying my home directory (~1GB, 21668 files) on
> workstation (Gentoo Linux, rdiff-backup 1.0.1, python 2.4.2, rsync
> 2.6.0, OpenSSH_4.2p1/no_compression)
>
> 1)rdiff-backup /src -> /dst  3m 11s
>
> 2)rsync /src -> /dst         2m 45s
>
> 3)rdiff-backup localhost::/src -> /dst:  12m 5s
>
> 4)rsync localhost:/src -> /dst: 3m 9s
>
> So rdiff-backup is a little slower than rsync when working on local
> files, and 4 times slower when working through ssh.

Looking in the mailing list archives, it seems that rsync is always quite a
bit faster than rdiff-backup over a network, at least the initial run.
There's been some discussion about this, but I haven't seen a real
definitive answer on whether rdiff-backup can be tweaked to be closer to
rsync performance.

My GUI has a display that shows network throughput out of eth0.  The most
I've ever seen rdiff-backup push through eth0 is 2 MB/s, while this hardware
has no problem getting 11.8 MB/s with scp.  While I understand there is quite
a bit of overhead associated with syncing a bunch of small files, I've always
wondered why big files, say 600 MB, still only transfer at 2 MB/s with
rdiff-backup.

--
How do I type "for i in *.dvi do xdvi $i done" in a GUI?
        -- Discussion in comp.os.linux.misc on the intuitiveness of interfaces



Re: rdiff-backup/rsync +ssh performance comparison

Vadim Kouzmine
On Mon, 2006-02-06 at 19:52 -0500, Douglas Bollinger wrote:

> Looking in the mailing list archives, it seems that rsync is always quite a
> bit faster than rdiff-backup over a network, at least the initial run.
> There's been some discussion about this, but I haven't seen a real
> definitive answer on whether rdiff-backup can be tweaked to be closer to
> rsync performance.
>
> My GUI has a display that shows network throughput out of eth0.  The most
> I've ever seen rdiff-backup push through eth0 is 2 MB/s, while this hardware
> has no problem getting 11.8 MB/s with scp.  While I understand there is quite
> a bit of overhead associated with syncing a bunch of small files, I've always
> wondered why big files, say 600 MB, still only transfer at 2 MB/s with
> rdiff-backup.

Let me summarize the information I've gathered during my tests, on two
platforms (Gentoo, Trustix 2.2) and different versions of python (2.2.3,
2.3.5, 2.4.2):

- on 3GHz P4 Xeon I get ~2MB/s steady transfer through ssh, on P4 3GHz
it's ~1.9MB/s;
- I made tests ssh-ing to LOCALHOST, so no network is involved here.
Although on gigabit lan I got the same results;
- the 2MB/s rate doesn't actually depend on file size, or at least it seems
so. We all expect it to be fast on big files and much slower on small
files, but it looks like it's limited to 2MB/s on big files, and the
hardware I used is fast enough that small files don't make it worse;
- initial and incremental transfer rates seem to be almost equal;
- I see 20-35MB/s transfer rate on the same hardware with scp/rsync (on
big files of course);
- system resources remain practically IDLE when rdiff-backup is working
through ssh: I observe 3-6% cpu usage, almost no context switches, ~2MB/s
disk read/write activity, etc.;
- vmstat with a big enough interval (5-10 sec) shows a constant ~2MB/s read
rate from disk, never more;
- I tried to monitor the running rdiff-backup server/client processes and
the ssh process they spawned, attaching strace and measuring the time spent
in system routines over a 30-second interval. All 3 processes make a normal
number of system calls, and no kind of system call takes more than a
fraction of a second in total.

I understand that rsync and rdiff-backup are different, share no code
and may serve different purposes. But why does rdiff-backup work so
much slower through ssh, refusing to use the system resources
available, when rsync works just fine in this case?


Thanks,
--
Vadim Kouzmine <[hidden email]>




Re: rdiff-backup/rsync +ssh performance comparison

dean gaudet-4
On Mon, 6 Feb 2006, Vadim Kouzmine wrote:

> Let me summarize the information I've gathered during my tests, on two
> platforms (Gentoo, Trustix 2.2) and different versions of python (2.2.3,
> 2.3.5, 2.4.2):
>
> - on 3GHz P4 Xeon I get ~2MB/s steady transfer through ssh, on P4 3GHz
> it's ~1.9MB/s;
> - I made tests ssh-ing to LOCALHOST, so no network is involved here.

actually we can get rid of ssh entirely as well and just use pipes
between the rdiff-backup and "rdiff-backup --server"... to reduce the
variables even further.

when i compare:

        rdiff-backup src dst1

with:

        rdiff-backup --remote-schema '%s' src 'rdiff-backup --server'::dst2

i'm seeing an avg of 0.9s real-time for the first and 1.4s real-time
for the second using an extracted copy of rdiff-backup-1.0.4.tar.gz as
the src.  (the results are quite repeatable so i didn't bother with a
larger src.)

my suspicion would be something serializing in the protocol...
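
To illustrate why serialization in a protocol would matter, here is a toy
model in Python. It assumes a strict request/response exchange in which each
block must be acknowledged before the next one is sent. That is an
assumption made purely for illustration, not a description of how
rdiff-backup's protocol is known to work. The function name, block size,
link speed and round-trip times are all hypothetical.

```python
def stop_and_wait_throughput(block_bytes, bandwidth_bps, round_trip_s):
    """Bytes per second when every block waits for a reply before the next.

    Each block costs its wire time (block_bytes / bandwidth_bps) plus one
    full round trip spent idle waiting for the acknowledgement.
    """
    time_per_block = block_bytes / bandwidth_bps + round_trip_s
    return block_bytes / time_per_block


MB = 1024 * 1024

# Hypothetical numbers: 32 KiB blocks over a link that can carry 100 MB/s.
block = 32 * 1024
link = 100 * MB

for rtt_ms in (0.0, 0.1, 1.0, 10.0):
    rate = stop_and_wait_throughput(block, link, rtt_ms / 1000.0)
    print(f"round trip {rtt_ms:5.1f} ms -> {rate / MB:6.2f} MB/s")
```

In this model the link is never saturated once the round trip dominates the
per-block wire time, and cpu and disk stay mostly idle while each side
waits, which would be consistent with the low resource usage reported
earlier in the thread.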

-dean



Re: rdiff-backup/rsync +ssh performance comparison

Vadim Kouzmine
> On Mon, 6 Feb 2006, Vadim Kouzmine wrote:
>
> > Let me summarize the information I've gathered during my tests, on two
> > platforms (Gentoo, Trustix 2.2) and different versions of python (2.2.3,
> > 2.3.5, 2.4.2):
> >
> > - on 3GHz P4 Xeon I get ~2MB/s steady transfer through ssh, on P4 3GHz
> > it's ~1.9MB/s;
> > - I made tests ssh-ing to LOCALHOST, so no network is involved here.

Suddenly it's changed for me. Now I get about 4MB/s average on the same
hardware, same data, through the network. This is what's changed:

1) rdiff-backup 1.1.5 vs. 1.0.4
2) python 2.4.2 vs 2.2.3
3) called as:
 "--remote-schema 'ssh -c blowfish %s /usr/local/Python/bin/rdiff-backup
--server --restrict-read-only /'"
vs
"--ssh-no-compression --restrict-read-only /"
4) openssh 4.3 vs 4.2
5) destination is external USB drive (actually slow) vs local RAID array

I don't have time to identify what exactly made it faster, but it's now
2 (!) times faster and it's great!

>
> actually we can get rid of ssh entirely as well and just use pipes
> between the rdiff-backup and "rdiff-backup --server"... to reduce the
> variables even further.
>
> when i compare:
>
> rdiff-backup src dst1
>
> with:
>
> rdiff-backup --remote-schema '%s' src 'rdiff-backup --server'::dst2
>
> i'm seeing an avg of 0.9s real-time for the first and 1.4s real-time
> for the second using an extracted copy of rdiff-backup-1.0.4.tar.gz as
> the src.  (the results are quite repeatable so i didn't bother with a
> larger src.)
>
> my suspicion would be something serializing in the protocol...

I'll check it with netcat tunnels, and probably with netcat through an
ssh-forwarded port, and post the results. Any other ideas for making a tunnel?

Vadim




Re: rdiff-backup/rsync +ssh performance comparison

Dave Kempe
Vadim Kouzmine wrote:
> 1) rdiff-backup 1.1.5 vs. 1.0.4
> 2) python 2.4.2 vs 2.2.3

I noticed a speed up (just visually) between python 2.2 and python 2.3

dave



Re: rdiff-backup/rsync +ssh performance comparison

Vadim Kouzmine
On Mon, 2006-02-20 at 21:11 -0800, dean gaudet wrote:

> actually we can get rid of ssh entirely as well and just use pipes
> between the rdiff-backup and "rdiff-backup --server"... to reduce the
> variables even further.
>
> when i compare:
>
> rdiff-backup src dst1
>
> with:
>
> rdiff-backup --remote-schema '%s' src 'rdiff-backup --server'::dst2
>
> i'm seeing an avg of 0.9s real-time for the first and 1.4s real-time
> for the second using an extracted copy of rdiff-backup-1.0.4.tar.gz as
> the src.  (the results are quite repeatable so i didn't bother with a
> larger src.)
>
> my suspicion would be something serializing in the protocol...
>
> -dean

Some test results below. The backup src is a single 512MB file of data
from /dev/urandom. The servers are 3GHz P4 Xeons on a gigabit lan, running
rdiff-backup 1.1.5 / python 2.4.2. Times are in seconds.

1) rdiff-backup src dst
16.9s

2) rdiff-backup --remote-schema '%s' src 'rdiff-backup --server'::dst
21.3s

3) rdiff-backup --remote-schema 'ssh %s rdiff-backup --server' server2::src dst
80.0s   (76.7s with blowfish and 75.9s with arcfour)

4) rdiff-backup through netcat (no ssh/no encryption)
76.4s   (just like with ssh?!)

5) rdiff-backup --remote-schema 'ssh -C %s rdiff-backup --server' server2::src dst
129.1s   (compression makes it so much slower!)

6) scp server2:src/bigfile dst/
13.6s

7) rsync server2:src/ dst/
14.1s

So, yes, something is wrong with the serialization, and it looks like it's
seriously affected by latency: compare 2) with 3) and 4).
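
Converting those timings into throughput makes the gap easier to see. This
is just arithmetic on the numbers reported above, sketched in Python; the
labels are shorthand invented for this example.

```python
SIZE_MB = 512  # size of the single test file, as stated above

# Elapsed seconds for each numbered test, copied from the results above.
results = {
    "1) rdiff-backup local": 16.9,
    "2) rdiff-backup via pipe": 21.3,
    "3) rdiff-backup via ssh": 80.0,
    "4) rdiff-backup via netcat": 76.4,
    "5) rdiff-backup via ssh -C": 129.1,
    "6) scp": 13.6,
    "7) rsync": 14.1,
}

# Average throughput in MB/s for each test.
throughput = {name: SIZE_MB / secs for name, secs in results.items()}

for name, rate in throughput.items():
    print(f"{name:28s} {results[name]:6.1f} s  {rate:5.1f} MB/s")
```

By this arithmetic the remote rdiff-backup runs sit at roughly 6.4 to 6.7
MB/s (about 4 MB/s with ssh compression), against about 24 MB/s over a
local pipe and 36 to 38 MB/s for scp and rsync.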

------
Just in case, this is how I ran it with netcat. There must surely be an
easier way :)

On src (remote server2):

mkfifo afifo    # named pipe that carries the server's replies back into nc
nc -l -p 5555 <afifo | rdiff-backup --server >afifo

On dst (local receiver):

rdiff-backup --remote-schema 'nc %s 5555' server2::src dst

--
Vadim <[hidden email]>


