Backup recovery downloads all signatures

Backup recovery downloads all signatures

duplicity-talk mailing list
Hello,

Since July last year, I have been running nightly backups on a server; they
are encrypted and sent to a NAS with rsync. The script forces a full
backup every three months (3M). The duplicity command looks like this:

duplicity --name $BACKUP_NAME --encrypt-sign-key $KEY_ID \
    --include-filelist $tmplist --full-if-older-than $FULL_EVERY \
    / $DESTINATION

Today I had to recover a couple of files (from another machine), which I
did with this command:

duplicity --encrypt-sign-key $KEY_ID --file-to-restore etc/systemd \
    $DESTINATION /root/backup

When running this command, I noticed that duplicity seems to download
all the signatures since the first backup. Below you can see an extract
of the output.

Why is that? I had understood that one reason for doing regular full
backups was precisely to avoid this. Downloading all these signatures
takes ages. Am I missing something?

Thanks for your help,
Romain

*OUTPUT EXTRACT:*

Copying duplicity-full-signatures.20170725T131721Z.sigtar.gpg to local cache.
Copying duplicity-full-signatures.20171024T031102Z.sigtar.gpg to local cache.
Copying duplicity-full-signatures.20180122T041102Z.sigtar.gpg to local cache.
Copying duplicity-full-signatures.20180423T031102Z.sigtar.gpg to local cache.
Copying duplicity-full.20170725T131721Z.manifest.gpg to local cache.
Copying duplicity-full.20171024T031102Z.manifest.gpg to local cache.
Copying duplicity-full.20180122T041102Z.manifest.gpg to local cache.
Copying duplicity-full.20180423T031102Z.manifest.gpg to local cache.
Copying duplicity-inc.20170725T131721Z.to.20170726T031101Z.manifest.gpg to local cache.
Copying duplicity-inc.20170726T031101Z.to.20170727T031101Z.manifest.gpg to local cache.
Copying duplicity-inc.20170727T031101Z.to.20170728T031101Z.manifest.gpg to local cache.

[../..]

Copying duplicity-inc.20180628T031102Z.to.20180629T031102Z.manifest.gpg to local cache.
Copying duplicity-inc.20180629T031102Z.to.20180630T031103Z.manifest.gpg to local cache.
Copying duplicity-inc.20180630T031103Z.to.20180701T031102Z.manifest.gpg to local cache.
Copying duplicity-inc.20180701T031102Z.to.20180702T031102Z.manifest.gpg to local cache.
Copying duplicity-new-signatures.20170725T131721Z.to.20170726T031101Z.sigtar.gpg to local cache.
Copying duplicity-new-signatures.20170726T031101Z.to.20170727T031101Z.sigtar.gpg to local cache.
Copying duplicity-new-signatures.20170727T031101Z.to.20170728T031101Z.sigtar.gpg to local cache.

[../..]

Copying duplicity-new-signatures.20180629T031102Z.to.20180630T031103Z.sigtar.gpg to local cache.
Copying duplicity-new-signatures.20180630T031103Z.to.20180701T031102Z.sigtar.gpg to local cache.
Copying duplicity-new-signatures.20180701T031102Z.to.20180702T031102Z.sigtar.gpg to local cache.
Warning, found incomplete backup sets, probably left from aborted session
Last full backup date: Mon Apr 23 05:11:02 2018
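
To put a number on how many of the copied signature files belong to chains older than the last full backup, a saved copy of this output can be filtered by the timestamps in the file names. This is only a rough sketch: the function name `old_signature_count` and the idea of feeding the log on standard input are assumptions for illustration, not part of duplicity. It relies on the fact that the `YYYYMMDDTHHMMSSZ` timestamps compare correctly as plain strings.

```shell
#!/bin/sh
# Hypothetical helper, not a duplicity feature.
# Reads duplicity output on stdin and prints how many signature files
# (full or incremental) start before the timestamp given as $1.
old_signature_count() {
    cutoff=$1
    # Keep only "<prefix>signatures.<start timestamp>" from each matching line.
    grep -o 'duplicity-[a-z-]*signatures\.[0-9TZ]*' |
        # Field 2 is the start timestamp; compare it as a string.
        awk -F. -v cutoff="$cutoff" '$2 < cutoff { n++ } END { print n + 0 }'
}
```

For example, `old_signature_count 20180423T031102Z < restore.log` (with `restore.log` being a hypothetical file holding the output above) would count the signature files that precede the last full backup.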



_______________________________________________
Duplicity-talk mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/duplicity-talk
Re: Backup recovery downloads all signatures

duplicity-talk mailing list
On 03.07.2018 16:11, Romain Thouvenin via Duplicity-talk wrote:
> When running this command, I noticed that duplicity seems to download
> all the signatures since the first backup. Below you can see an extract
> of the output.
>
> Why is that? I had understood that one reason for doing regular full
> backups was precisely to avoid this.

nope. the reason for shorter chains is that one corrupt volume might render the whole chain useless. at least, all files that were initially or partly stored in it will fail to restore.

> Downloading all these signatures
> takes ages. Am I missing something?

well, yes. duplicity always works with a complete archive folder, regardless of whether it is actually needed. it is simply the way it was designed.

you can easily work around it though by assigning a new subfolder as backup target every 3 months. this way the resynced archive folder will hold only this time span.

i am not sure we really need the refreshed archive dir for a plain restore though. or, if we do, we could at least limit it to the chain for the given restore date.
@Ken: what do you think?
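
The subfolder workaround could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested recipe: the quarter-based label (`2018-Q3`), the `BASE_URL` variable, and the per-quarter `--name` suffix are all made up for illustration. It leaves out `--full-if-older-than` and relies on duplicity making a full backup when the target holds no previous backup. The function only prints the command so it can be inspected before use.

```shell
#!/bin/sh
# Sketch only: BASE_URL, BACKUP_NAME, KEY_ID and the quarter naming
# scheme are assumptions for illustration.

# Print a label such as 2018-Q3 for a year and a month (01..12 or 1..12).
quarter_label() {
    year=$1
    month=${2#0}    # drop a leading zero so 08 and 09 are not read as octal
    echo "${year}-Q$(( (month + 2) / 3 ))"
}

# Print (do not run) the duplicity command for one nightly backup.
# Each quarter gets its own target subfolder and its own --name, so the
# local archive folder for that name only covers that quarter.
backup_command() {
    label=$(quarter_label "$1" "$2")
    echo "duplicity --name ${BACKUP_NAME}-${label} --encrypt-sign-key ${KEY_ID} / ${BASE_URL}/${label}"
}
```

A nightly script could call `backup_command "$(date +%Y)" "$(date +%m)"` and run the printed command once it looks right.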

..ede/duply.net

Re: Backup recovery downloads all signatures

duplicity-talk mailing list
Hi Romain,

When duplicity starts it checks to see that it has a complete local cache of metadata.  It prepares the cache for any command that may be present.  It could be optimized by command, but it's not.  The best way to avoid this is to always leave the cache in place.  The second-best is to follow ede's recommendation and make each full backup in its own directory, under its own backup name.  This may require writing a script so the 'full' command is issued every three months after creating the new directory.  Be sure all the 'inc' commands use the same '--name' as their 'full'.
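
As a rough illustration of keeping the cache usable, a restore can be pointed at the same `--name` and `--archive-dir` that the backups use, so an existing local cache is reused instead of being rebuilt. This is a sketch only: the `restore_command` helper, its argument order, and the `ARCHIVE_DIR` variable are assumptions for illustration. On a machine that never ran the backup there is no cache to reuse, so the metadata is still downloaded once.

```shell
#!/bin/sh
# Sketch only: ARCHIVE_DIR and the argument order are assumptions.
# Print (do not run) a restore command that uses an explicit backup
# name and archive directory, matching the ones used for the backups.
restore_command() {
    name=$1      # same value as --name in the backup command
    url=$2       # backup location
    file=$3      # path inside the backup, relative to its root
    target=$4    # local directory or file to restore into
    echo "duplicity restore --name ${name} --archive-dir ${ARCHIVE_DIR:-$HOME/.cache/duplicity} --file-to-restore ${file} ${url} ${target}"
}
```

For example, `restore_command "$BACKUP_NAME" "$DESTINATION" etc/systemd /root/backup` prints a command equivalent to the restore earlier in this thread, with the name and archive directory made explicit.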

...Thanks,
...Ken


Re: Backup recovery downloads all signatures

duplicity-talk mailing list

Thanks both for your answers.

I understand now why the whole cache was rebuilt. If optimizing this is not too complex, I think it would be a great addition to duplicity.
The reason I didn't have a cache in place is that the server where the backups run had become unavailable, so I was recovering from another server. I imagine this is a common case in disaster recovery.

Thanks for the tip on renaming the backup for each full backup. The command is already scripted, so it shouldn't be difficult to implement it.

Thanks for your work,
Romain



Re: Backup recovery downloads all signatures

duplicity-talk mailing list
On 04.07.2018 15:00, Romain Thouvenin wrote:
> I understand now why the whole cache was rebuilt. If optimizing this is not too complex, I think it would be a great addition to duplicity.

go ahead and strengthen your python fu ;) every contribution is welcome.. ede/duply.net
