adding --resume back


Marco Mariani
Hi!

I'm looking at possible ways to add the resume feature to backups (either
initial or incremental, ideally both) that were interrupted due to an
unreliable network.

AFAIK the feature was added in 2002, with version 0.5.0, and removed 10 months
later, in version 0.11.1, with the CHANGELOG comment

     All "resume" related functionality, like --checkpoint-interval:
     This was complicated to implement, and didn't seem to work all
     that well.

Later, the same feature has been requested a few times, and denied:

https://lists.gnu.org/archive/html/rdiff-backup-users/2012-05/msg00006.html

 > Not that I know of. Because of the complexity of the underlying archive,
 > rdiff-backup does not like a failed or interrupted previous backup attempt at
 > all and tries to remove one if it finds it. Otherwise the risk would be that
 > you corrupt the archive and lose your data history. Although it might be
 > possible in theory for rdiff-backup to continue a previously-interrupted
 > session, the code to do this doesn't exist and isn't likely to be written.


A more extensive, though speculative, rationale is given in

http://nongnu.13855.n7.nabble.com/can-rdiff-backup-be-stopped-paused-restarted-HOWTO-td89419.html

 > As I see it, the problem is that rdiff-backup saves increment files as it
 > goes along updating the remote repository. It does this in such a way that
 > it can undo the increments if necessary, with --check-destination-dir, but
 > I think it might not be able (currently) to:
 >
 > * determine which increments have already been applied when restarting the
 > backup, and not apply them again; and
 >
 > * handle the case where a file that was incremented during the last run
 > has subsequently changed and needs to be incremented again (merging
 > increments); and
 >
 > * handle the case where the increments created so far do not match the log
 > file written so far (because the two cannot be updated atomically in
 > step).


Now I can add some constraints, and avoid content changes between a failed
backup and a resume.
If I take care of points 1) and 3) above and don't care about increment
merging, does the idea of saving snapshots and reloading with an explicit
--resume option become viable?
Has anybody attempted to do that since February 17, 2002?
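
To make points 1) and 3) concrete, here is a rough sketch of one possible approach: a small append-only journal that records each path only after its increment has been written, so a resumed run can skip what is already done. This is not how rdiff-backup works or ever worked; `ResumeJournal`, `resume_backup` and the `apply` callback are hypothetical names, and the sketch assumes that `apply` is safe to repeat for the one file that may have been in flight when the link dropped.

```python
import json
import os
from pathlib import Path


class ResumeJournal:
    """Append-only record of paths whose increments are safely on disk (hypothetical)."""

    def __init__(self, path):
        self.path = Path(path)

    def completed(self):
        """Return the set of paths recorded as done, ignoring a torn last line."""
        done = set()
        if not self.path.exists():
            return done
        with self.path.open() as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except ValueError:
                    break  # half-written record left by a crash: stop here
                done.add(record["path"])
        return done

    def mark_done(self, rel_path):
        """Record a path and force the record to disk before returning."""
        with self.path.open("a") as fh:
            fh.write(json.dumps({"path": rel_path}) + "\n")
            fh.flush()
            os.fsync(fh.fileno())


def resume_backup(paths, journal, apply):
    """Apply each path once; paths already in the journal are skipped."""
    done = journal.completed()
    for rel_path in paths:
        if rel_path in done:
            continue
        apply(rel_path)              # must be repeatable: it may be redone after a crash
        journal.mark_done(rel_path)  # written only after the increment itself
```

Because the journal entry is written after the increment, the log can lag behind the increments by at most one file but never run ahead of them, which is the weaker guarantee I would be relying on instead of an atomic update of both.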


A different approach has also been proposed:

http://lists.gnu.org/archive/html/rdiff-backup-users/2009-03/msg00083.html

 > Here is what I propose: when regressing a repository prior to a backup,
 > rdiff-backup takes all "new" files (files that have been added during the
 > failed backup) and moves them to a temporary location inside of the
 > rdiff-backup-data folder. Then, when the backup runs, if it encounters a new
 > file, it first checks to see if the file exists in this temporary location, and
 > if it does, it diffs against that file (or moves it to the target location,
 > then diffs; I don't know which would be easier). At the end of any backup run,
 > rdiff-backup empties out this folder.
 >
 > Thoughts/reactions?


Reactions were basically "use rsync + rdiff-backup", but it's not an option
for me, unless there is a way to avoid doubling the required disk space.
Hard links won't work.

The second proposal makes sense to me, and seems easier to implement than the
checkpoints.
Am I missing something obvious?
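
As a rough illustration of that second proposal, the file handling could look something like the sketch below. It only covers moving files in and out of a stash directory; the regress and diff logic of rdiff-backup is not modelled. The directory name `resume-stash` and the three function names are made up for this example and do not exist in rdiff-backup.

```python
import shutil
from pathlib import Path
from typing import Iterable

STASH_DIRNAME = "resume-stash"  # hypothetical name, not part of rdiff-backup


def stash_new_files(repo: Path, new_files: Iterable[str]) -> Path:
    """While regressing, move files added by the failed run into the stash."""
    stash = repo / "rdiff-backup-data" / STASH_DIRNAME
    for rel in new_files:
        src = repo / rel
        if src.is_file():
            dst = stash / rel
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
    return stash


def reuse_stashed(repo: Path, stash: Path, rel: str) -> bool:
    """On the next run, move a stashed copy back so it can serve as the diff basis."""
    candidate = stash / rel
    if not candidate.is_file():
        return False
    target = repo / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(candidate), str(target))
    return True


def empty_stash(stash: Path) -> None:
    """At the end of any backup run, discard whatever is left in the stash."""
    shutil.rmtree(stash, ignore_errors=True)
```

This follows the "move it to the target location, then diff" variant from the quoted proposal; diffing directly against the stashed copy would avoid one move but needs the backup code to know about two possible basis locations.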


I am open to evaluating other backup solutions, but I have some
non-negotiable requirements:

  - must support both pull and push
  - must efficiently store big files with binary delta
  - must be open source
  - must work on unreliable networks

This leaves out 99% of the alternatives, and I am willing to implement the
last point for rdiff-backup. Suggestions are much appreciated.


Regards,
Marco


_______________________________________________
rdiff-backup-users mailing list at [hidden email]
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki

Re: adding --resume back

Marco Mariani
On 09/03/2014 05:37 PM, Marco Mariani wrote:

> Now I can add some constraints, and avoid content changes between a
> failed
> backup and a resume.

To be clear, unlike some of the previous proponents of the resume feature, I
do not care about files changing in the folder that is being backed up.
But I do care about preserving partially transferred files. My connections
are not only slow, they can also drop unexpectedly. I reckon this may not be
a common use case.


Regards,
Marco



Re: adding --resume back

Dominic Raferd-3
Hello Marco

If you want to do some coding work on rdiff-backup then that would be welcome, but I think you will be on your own. rdiff-backup is not under active development at present, sadly.

An alternative is to see if you can prevent the problem of a broken archive arising in the first place.

Automatic regression
As you probably know, if rdiff-backup is interrupted so that the archive is left in an inconsistent state then on the next run it should automatically 'regress' the archive back to its previous consistent state. (There is no official way to force this regression unless rdiff-backup considers that there has been an error, but there is an unofficial way: <http://www.timedicer.co.uk/programs/help/rdiff-backup-regress.sh.php>.) So in theory your archive can always be restored to its previous consistent state, but I am not sure if this can be relied upon if you pile failed session upon failed session. Even if it can, it isn't much use if your communication drops are so frequent that you rarely complete a single backup session, because your last usable backup data might be too old for your purpose.

Lengthening session time-out interval
It has been reported that rdiff-backup can work reliably over an unreliable internet connection if you set

ServerAliveInterval 300

in the (Linux) client machine's ~/.ssh/config file; similarly you can set

ClientAliveInterval 300

in the (Linux) server machine's /etc/ssh/sshd_config. Both machines will then keep an ssh connection open for 5 minutes (300 seconds) which might be enough to give a very high probability that any temporary service drop will not affect the ongoing rdiff-backup session.
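
If editing the ssh configuration files is not convenient, the client-side setting can probably also be passed for a single run through rdiff-backup's --remote-schema option, whose default is `ssh -C %s rdiff-backup --server`. The helper below only builds the command line and is a sketch: the function name and the example paths are hypothetical, and I have not checked this against every rdiff-backup version.

```python
def keepalive_backup_command(source, destination, interval=300):
    """Build an rdiff-backup command that asks ssh to send keepalives.

    `interval` is passed to ssh as ServerAliveInterval (seconds).
    """
    schema = "ssh -C -o ServerAliveInterval={} %s rdiff-backup --server".format(interval)
    return ["rdiff-backup", "--remote-schema", schema, source, destination]


if __name__ == "__main__":
    # Hypothetical host and paths, for illustration only.
    print(keepalive_backup_command("/data", "backup.example::/srv/repo"))
```

The returned list can be handed to `subprocess.call` as is.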

Backup to a snapshot
If your rdiff-backup repository is on a non-root LVM-based volume, you could try the following:
  • On the server (destination), create a read/write LVM snapshot of the volume holding your (consistent / good) rdiff-backup repository (discarding any pre-existing snapshot, and allowing the snapshot plenty of additional space)
  • Mount this snapshot and run rdiff-backup to the repository therein i.e. on the snapshot - not the original repository location
  • If and only if rdiff-backup completes ok, use LVM's lvconvert --merge function to replace the original data with the snapshot data - for details see http://www.thegoldfish.org/2011/09/reverting-to-a-previous-snapshot-using-linux-lvm/
  • Conversely, if rdiff-backup fails for any reason, you can discard the snapshot and the original repository remains unaffected
  • The risk of rdiff-backup failure has been replaced by the risk of LVM merge failure, but as this activity is entirely local the risk should be much lower
  • For the very cautious, make/update a separate backup (e.g. with rsync) of the rdiff-backup repository onto a separate physical volume before starting the procedure above
  • As an alternative to LVM (upon which you can mount any filesystem), I think btrfs offers its own built-in snapshot capability
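
The procedure above could be scripted roughly as follows. This sketch only builds the command lists and does not run anything; the volume group, snapshot name, size and mount point are placeholder assumptions to adapt, and it leaves out the step of discarding a pre-existing snapshot. Note also that lvconvert --merge only starts merging once neither the origin nor the snapshot is in use, so the origin volume may need to be unmounted as well.

```python
def snapshot_backup_plan(vg, origin_lv, source, repo_subdir,
                         snap_size="10G", snap_name="rdiff_snap",
                         mountpoint="/mnt/rdiff_snap"):
    """Return the commands for a backup into a writable LVM snapshot.

    All default values are illustrative assumptions, not recommendations.
    """
    snap = "{}/{}".format(vg, snap_name)
    return {
        # Create and mount a read/write snapshot of the repository volume.
        "prepare": [
            ["lvcreate", "--snapshot", "--size", snap_size,
             "--name", snap_name, "{}/{}".format(vg, origin_lv)],
            ["mount", "/dev/" + snap, mountpoint],
        ],
        # Run the backup against the repository inside the snapshot.
        "backup": ["rdiff-backup", source, "{}/{}".format(mountpoint, repo_subdir)],
        # Only if the backup succeeded: fold the snapshot into the origin.
        "on_success": [["umount", mountpoint], ["lvconvert", "--merge", snap]],
        # Otherwise: throw the snapshot away, the origin is untouched.
        "on_failure": [["umount", mountpoint], ["lvremove", "--force", snap]],
    }
```

A wrapper would run `prepare`, then `backup`, and pick `on_success` or `on_failure` depending on the exit status of rdiff-backup.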

Dominic

--
TimeDicer: Free File Recovery from Whenever


Re: adding --resume back

Marco Mariani
On 09/04/2014 12:21 PM, Dominic Raferd wrote:

> If you want to do some coding work on rdiff-backup then that would be welcome,
> but I think you will be on your own. rdiff-backup is not under active
> development at present, sadly.

Yes, I know that, but any suggestion is useful.

> <http://www.timedicer.co.uk/programs/help/rdiff-backup-regress.sh.php>.) So in
> theory your archive can always be restored to its previous consistent state, but
> I am not sure if this can be relied upon if you pile failed session upon failed
> session. Even if it does, it isn't much use if your communication drops are so
> frequent that you rarely complete a single backup session, because your last
> usable backup data might be too old for your purpose.

It is not only a matter of whole backup sessions: I must also consider single
files that are so big they won't go through in a single session. The network
may be mobile or in a hostile environment. It may be up only part of the day.
Servers may be restarted or have power failures once per day.


> _*Backup to a snapshot*_
> If your rdiff-backup repository is on a non-root LVM-based volume, you could try
> the following:
>
>    * On the server (destination), create a read/write LVM snapshot of the volume
>      holding your (consistent / good) rdiff-backup repository (discarding any
>      pre-existing snapshot, and allowing the snapshot plenty of additional space)
> [...]

If I had enough space for the LVM snapshot, I would probably rsync the current
data and run rdiff-backup locally on the destination every time rsync succeeds.
This would provide - in our setup - the same protection as LVM with respect
to broken increments, but would also resume a partial session after network
outages and server restarts.
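
What I have in mind is roughly the loop below: keep retrying rsync with --partial, so that partially transferred files are kept between attempts, and only run rdiff-backup locally once rsync has exited cleanly. It is a sketch; the function name, retry count and delay are arbitrary assumptions, and the command runner is injectable only so that the logic can be tried without touching real data.

```python
import subprocess
import time


def sync_then_backup(remote_source, staging_dir, repository,
                     run=subprocess.call, max_attempts=10,
                     delay_seconds=60, sleep=time.sleep):
    """Retry rsync until it succeeds, then run rdiff-backup locally.

    Returns True only if both rsync and rdiff-backup exit with status 0.
    `remote_source` should end with "/" so that rsync copies its contents.
    """
    rsync_cmd = ["rsync", "--archive", "--delete", "--partial",
                 remote_source, staging_dir]
    for attempt in range(1, max_attempts + 1):
        if run(rsync_cmd) == 0:
            # The staging copy is complete: take the increment locally.
            return run(["rdiff-backup", staging_dir, repository]) == 0
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait for the link to come back
    return False
```

The cost, as said, is that the staging copy takes as much disk space as the current data.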

Thanks a lot for your reply. I think that my only option is to change the code.


Regards,
Marco


Re: adding --resume back

Dominic Raferd-3
On 04/09/2014 13:01, Marco Mariani wrote:

>> Even if it does, it isn't much use if your communication drops are so
>> frequent that you rarely complete a single backup session, because your
>> last usable backup data might be too old for your purpose.
>
> Not only at the level of backup sessions, I must consider the case of
> single files that are so big they won't go though in a single session.
> The network may be mobile or in a hostile environment. It may be up only
> part of the day. Servers may be restarted or have power failures once per
> day.
>
>> _*Backup to a snapshot*_
>> If your rdiff-backup repository is on a non-root LVM-based volume, you
>> could try the following:
>>
>>    * On the server (destination), create a read/write LVM snapshot of
>>      the volume holding your (consistent / good) rdiff-backup repository
>>      (discarding any pre-existing snapshot, and allowing the snapshot
>>      plenty of additional space)
> [...]
>
> If I had enough space for the LVM snapshot, I would probably rsync the
> current data and run rdiff-backup locally on the destination every time
> rsync succeeds. This would provide - in our setup - the same protection
> as LVM with respect to broken increments, but also resume a partial
> session after network shortage and server restarts.
>
> Thanks a lot for your reply, I think that my only option is to change
> the code.
>

You are facing quite a tough situation! You didn't comment on the idea
of lengthening the ssh timeouts, but given the severity of the
situations you have to allow for, maybe this can't help. I should point
out that using an LVM snapshot should not need nearly as much space as
rsync because it only has to store the differences between the old
rdiff-backup archive and the new, and it does not have to persist once
the backup is complete. Still rsync is a simpler (and more familiar)
solution and surely the lack of disk space is cheaper to fix than the
value of your time recoding rdiff-backup?

My practice FWIW is to run rdiff-backup locally and use rsync with
--link-dest to synchronise a remote copy of the rdiff-backup repository.

If you do decide to work on the code, I think you should start from the
1.3.3 (or 1.2.8) codebase. I expect that the latest in cvs has been
messed around with over the last 5 years and its consistency is unknown
whereas 1.3.3 and 1.2.8 are stable. I am sure all users would be
grateful for your contribution.

Dominic


Re: adding --resume back

Chris Wilson-5
Hi all,

On Thu, 4 Sep 2014, Dominic Raferd wrote:

>> If I had enough space for the LVM snapshot, I would probably rsync the
>> current data and run rdiff-backup locally on the destination every time
>> rsync succeeds. This would provide - in our setup - the same protection as
>> LVM with respect to broken increments, but also resume a partial session
>> after network shortage and server restarts.
>
> You are facing quite a tough situation! You didn't comment on the idea of
> lengthening the ssh timeouts, but given the severity of the situations you
> have to allow for, maybe this can't help. I should point out that using an
> LVM snapshot should not need nearly as much space as rsync because it only
> has to store the differences between the old rdiff-backup archive and the
> new, and it does not have to persist once the backup is complete. Still rsync
> is a simpler (and more familiar) solution and surely the lack of disk space
> is cheaper to fix than the value of your time recoding rdiff-backup?

That gives me an interesting idea. Since the rdiff-backup destination is a
mirror of the source, we can rsync over it. Of course if we did this on a
real repository it would destroy it, but we could safely do this:

* create a writable LVM snapshot containing the rdiff-backup repository,
* delete the rdiff-backup-data directory from it,
* rsync from the target over the snapshot's rdiff directory,
* run rdiff-backup from the snapshot back to the original location,
* then discard the snapshot.
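
Spelled out as commands, the steps above might look like the sketch below. It only returns the command list; the LVM names, snapshot size and mount point are placeholder assumptions, and "the target" is taken to mean the machine being backed up.

```python
def rsync_over_snapshot_plan(vg, origin_lv, remote_source, repo_subdir,
                             original_repo, snap_size="10G",
                             snap_name="rdiff_rsync_snap",
                             mountpoint="/mnt/rdiff_rsync_snap"):
    """Return, in order, the commands for the rsync-over-snapshot idea.

    Default names and sizes are illustrative assumptions only.
    """
    snap = "{}/{}".format(vg, snap_name)
    snap_repo = "{}/{}".format(mountpoint, repo_subdir)
    return [
        # 1. Writable snapshot of the volume holding the repository.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, "{}/{}".format(vg, origin_lv)],
        ["mount", "/dev/" + snap, mountpoint],
        # 2. Drop the history from the snapshot copy, leaving a plain mirror.
        ["rm", "-rf", snap_repo + "/rdiff-backup-data"],
        # 3. Bring the mirror up to date; this step can be rerun after a drop.
        ["rsync", "--archive", "--delete", "--partial",
         remote_source, snap_repo + "/"],
        # 4. Local rdiff-backup from the updated mirror into the real repository.
        ["rdiff-backup", snap_repo, original_repo],
        # 5. Discard the snapshot.
        ["umount", mountpoint],
        ["lvremove", "--force", snap],
    ]
```

Only step 3 crosses the network, so if the link drops it should be enough to repeat that step against the still-mounted snapshot before carrying on.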

Cheers, Chris.
--
_____ __     _
\  __/ / ,__(_)_  | Chris Wilson <[hidden email]> Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Ruby/Perl/SQL Developer |
\__/_/_/_//_/___/ | We are GNU : free your mind & your software |


Re: adding --resume back

Dominic Raferd-3

On 04/09/2014 15:02, Chris Wilson wrote:

> Hi all,
>
> On Thu, 4 Sep 2014, Dominic Raferd wrote:
>
>>> If I had enough space for the LVM snapshot, I would probably rsync
>>> the current
>>> data and run rdiff-backup locally on the destination every time
>>> rsync succeeds.
>>> This would provide - in our setup - the same protection as LVM with
>>> respect
>>> to broken increments, but also resume a partial session after
>>> network shortage
>>> and server restarts.
>>
>> You are facing quite a tough situation! You didn't comment on the
>> idea of lengthening the ssh timeouts, but given the severity of the
>> situations you have to allow for, maybe this can't help. I should
>> point out that using an LVM snapshot should not need nearly as much
>> space as rsync because it only has to store the differences between
>> the old rdiff-backup archive and the new, and it does not have to
>> persist once the backup is complete. Still rsync is a simpler (and
>> more familiar) solution and surely the lack of disk space is cheaper
>> to fix than the value of your time recoding rdiff-backup?
>
> That gives me an interesting idea. Since the rdiff-backup destination
> is a mirror of the source, we can rsync over it. Of course if we did
> this on a real repository it would destroy it, but we could safely do
> this:
>
> * create a writable LVM snapshot containing the rdiff-backup repository,
> * delete the rdiff-backup-data directory from it,
> * rsync from the target over the snapshot's rdiff directory,
> * run rdiff-backup from the snapshot back to the original location,
> * then discard the snapshot.
>
> Cheers, Chris.

I like that idea. But whereas the initial snapshot takes up almost no
disk space, I think that deleting its rdiff-backup-data directory would
cause it to swell in size by the size of the rdiff-backup-data
directory, or perhaps somewhat more (since snapshotting works at block
level). Better just to rsync to it ignoring the rdiff-backup-data
directory on the destination (I think using rsync --exclude
/rdiff-backup-data should do it). Then, as you suggest, run rdiff-backup
from the snapshot back to the original location, again ignoring
/rdiff-backup-data on the source (rdiff-backup --exclude
/rdiff-backup-data) - or maybe rdiff-backup would ignore it anyway?

Cheers, Dominic



Re: adding --resume back

Dominic Raferd-3
On 04/09/2014 15:39, Dominic Raferd wrote:

> I like that idea. But whereas the initial snapshot takes up almost no
> disk space, I think that deleting its rdiff-backup-data directory
> would cause it to swell in size by the size of the rdiff-backup-data
> directory, or perhaps somewhat more (since snapshotting works at block
> level). Better just to rsync to it ignoring the rdiff-backup-data
> directory on the destination (I think using rsync --exclude
> /rdiff-backup-data should do it). Then, as you suggest, run
> rdiff-backup from the snapshot back to the original location, again
> ignoring /rdiff-backup-data on the source (rdiff-backup --exclude
> /rdiff-backup-data) - or maybe rdiff-backup would ignore it anyway?
> Cheers, Dominic

To correct what I wrote above, by experiment I have realised that
deleting data from an LVM snapshot does *not* cause substantial swelling
of the snapshot. So Chris's suggested strategy should work fine.

Dominic
--
*TimeDicer* <http://www.timedicer.co.uk>: Free File Recovery from Whenever
