Directory changes when mid-backup?

duplicity-talk mailing list
Hi,

I've been using duplicity for a while to back up daily-changing
directories of logs and results.

I have always done this in the dead of night and locked the processes
which would usually write to the parent directory being backed up for the
duration of the backup.

However there is now a requirement to add to (and potentially even
modify) this directory 24/7.  I've done a few tests and running
duplicity on a directory whilst that directory is being written to
doesn't seem to fail.

My question is - is backing up a potentially changing directory supported?

Does duplicity initially snapshot the directory structure - so it will
ignore any new files or directories created whilst it is running?  How
does it handle files potentially being changed or deleted during the
backup - if at all?

It strikes me that adding files shouldn't really be a problem (they will
simply be ignored until the next incremental backup) - but deleting or
modifying files mid-backup may be difficult to handle.

Any advice welcome!

Thanks,
Phil.

_______________________________________________
Duplicity-talk mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/duplicity-talk

Re: Directory changes when mid-backup?

duplicity-talk mailing list
Hi Phil,

I asked a similar question a few months ago which might interest you: https://lists.nongnu.org/archive/html/duplicity-talk/2020-06/msg00003.html

(In reply to Phil's message of 3 Nov 2020, 11:18.)

Re: Directory changes when mid-backup?

duplicity-talk mailing list
If your host uses LVM or a similar mechanism that supports filesystem snapshots, you can take a snapshot immediately before the backup starts, mount and back up the snapshot, and then remove it. That gives you an atomic, point-in-time view of the filesystem while you're backing up. No files will change underneath duplicity.
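
The snapshot, mount, back up, remove sequence could be scripted roughly like this. It is only a sketch: the volume group `vg0`, logical volume `data`, snapshot name and size, mount point and target URL are made-up placeholders, and it assumes root privileges, an existing mount point and enough free space in the volume group to hold the snapshot. Encryption settings for duplicity are left out, and the exact duplicity invocation should be checked against the man page of your version. Some filesystems need extra mount options for a snapshot (XFS, for instance, typically wants `nouuid`).

```python
import subprocess


def snapshot_backup(vg="vg0", lv="data", snap="data_snap", size="1G",
                    mountpoint="/mnt/data_snap",
                    target="file:///srv/backups/data",
                    run=subprocess.run):
    """Back up a read-only LVM snapshot of vg/lv, then drop the snapshot.

    All default names are hypothetical placeholders. `run` is injectable
    so the command sequence can be dry-run without touching the system.
    """
    snap_dev = f"/dev/{vg}/{snap}"
    # 1. Freeze a point-in-time view of the logical volume.
    run(["lvcreate", "--snapshot", "--size", size,
         "--name", snap, f"/dev/{vg}/{lv}"], check=True)
    try:
        # 2. Mount the snapshot read-only so nothing can modify it.
        run(["mount", "-o", "ro", snap_dev, mountpoint], check=True)
        try:
            # 3. Point duplicity at the frozen copy, not the live directory.
            run(["duplicity", mountpoint, target], check=True)
        finally:
            # 4. Always unmount, even if the backup failed.
            run(["umount", mountpoint], check=True)
    finally:
        # 5. Always remove the snapshot so it does not fill up over time.
        run(["lvremove", "--force", snap_dev], check=True)
```

Passing a function that only prints or records each command as `run` gives a dry run, which is a cheap way to check the sequence before letting it loose on a real volume group.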

(In reply to Oliver Dunk's message of Tue, Nov 3, 2020, 6:21 AM.)

Re: Directory changes when mid-backup?

duplicity-talk mailing list
In reply to this post by duplicity-talk mailing list
Thanks Oliver - this largely answers my question.

For deletes (where read locking isn't a thing) - will duplicity just
ignore any files that are deleted between retrieving the directory
structure and actually reading each file for archiving (rather than
erroring because an expected file is now missing)?

(I'm assuming that the directory structure is snapshotted up front, before
any archiving starts. Otherwise I imagine constantly re-interrogating the
filesystem would lead to a whole host of race conditions in very actively
written-to directories - e.g. if a new, large-enough file was written
every n seconds, the backup might never end!)


(In reply to Oliver Dunk's message.)

Re: Directory changes when mid-backup?

duplicity-talk mailing list
Phil,

afair duplicity just iterates over the source folder, matches against the in-/exclusions and backs up what it finds. given that, whether the file is backed up as deleted or as is depends on whether the folder it is located in has already been processed.

hope that makes sense.. ede/duply.net
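
A toy illustration of that point, not duplicity's actual traversal code: Python's `os.walk` stands in for any walker that lists each folder only when it reaches it. The folder and file names are invented for the demo, and the "concurrent writer" is simulated inline.

```python
import os
import tempfile


def walk_lazily(root):
    """Yield file paths, listing each directory only when it is reached."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fixed visiting order, just to keep the demo deterministic
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def demo():
    with tempfile.TemporaryDirectory() as root:
        for sub in ("a", "b"):
            os.mkdir(os.path.join(root, sub))
        for rel in (("a", "1.log"), ("b", "2.log"), ("b", "3.log")):
            open(os.path.join(root, *rel), "w").close()

        seen = []
        for path in walk_lazily(root):
            seen.append(os.path.relpath(path, root))
            if path.endswith("1.log"):
                # Simulated concurrent writer, acting while folder "a" is
                # being processed and folder "b" has not been listed yet.
                open(os.path.join(root, "a", "late.log"), "w").close()
                os.remove(os.path.join(root, "b", "2.log"))
        return seen
```

In this run the file added to the already-listed folder `a` is missed until the next backup, while the file removed from the not-yet-listed folder `b` is simply never seen, so it would be recorded as deleted rather than cause an error.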

(In reply to Phil's message of 11/4/2020, 12:52.)

Re: Directory changes when mid-backup?

duplicity-talk mailing list
Thanks Edgar.

I think I understand - so duplicity walks over each folder, and
retrieves a separate file list for each folder in turn (rather than
walking the whole nested directory structure and producing a single
structure of all files and directories in one go)?

So if there was a race condition where duplicity retrieves the list of
files in a folder, but then a delete sneaks in before one of those files
is archived, will it handle this, or will the surprise of the now-missing
file cause it to fall over?

Thanks for the help and sorry for labouring the point :-)


(In reply to edgar.soldin's message.)

Re: Directory changes when mid-backup?

duplicity-talk mailing list
In reply to this post by duplicity-talk mailing list
Thanks Rob - reading up on LVM, it sounds very useful for a host of things.  Have you found it
to be reliable?  I've come across some anecdotal evidence that it can fall foul of some nasty bugs.


(In reply to Rob Hasselbaum's message.)

Re: Directory changes when mid-backup?

duplicity-talk mailing list
The Linux implementation is very mature and solid. I've used it for over a decade professionally and personally.

(In reply to Phil's message of Fri, Nov 6, 2020, 4:29 AM.)

Re: Directory changes when mid-backup?

duplicity-talk mailing list
In reply to this post by duplicity-talk mailing list
Most backup tools will simply report an error at the end that some files
could not be backed up. The backup should usually complete, but it
depends on how picky a specific backup tool is.
For this reason you definitely want some sort of snapshot mechanism
available on your system. Some things you can take a look at are
BTRFS, ZFS and LVM. BTRFS and ZFS are filesystems, and LVM is a volume
manager for flexible management of partitions. However, all of these are
things that need to be set up at the beginning, when setting up your
system. Adding them after the fact, while totally doable, can be quite
tricky for inexperienced users.
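
The "carry on and report at the end" behaviour can be sketched in a few lines. This is a generic illustration under simple assumptions, not how duplicity or any particular tool is implemented: `archive_tolerantly` and the in-memory `store` dict are hypothetical stand-ins for a real archiver.

```python
import os
import tempfile


def archive_tolerantly(paths, store):
    """Read each file into `store`; collect failures instead of aborting."""
    failed = []
    for path in paths:
        try:
            with open(path, "rb") as fh:
                store[path] = fh.read()
        except OSError as exc:  # covers FileNotFoundError, PermissionError, ...
            failed.append((path, exc.strerror))
    return failed


def demo():
    with tempfile.TemporaryDirectory() as root:
        keep = os.path.join(root, "keep.log")
        gone = os.path.join(root, "gone.log")
        for path in (keep, gone):
            with open(path, "w") as fh:
                fh.write("data")
        # The file list is taken first ...
        listing = sorted(os.path.join(root, n) for n in os.listdir(root))
        # ... then a delete sneaks in before the file is read.
        os.remove(gone)
        store = {}
        failed = archive_tolerantly(listing, store)
        stored_names = [os.path.basename(p) for p in store]
        failed_names = [os.path.basename(p) for p, _ in failed]
        return stored_names, failed_names
```

The vanished file ends up in the failure list and the remaining file is still archived, which is the outcome described above for a tolerant tool. A stricter tool would raise on the first missing file instead.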


(In reply to Phil's message of 04.11.20, 17:04.)