Doubled dates in old repositories

EricZolf
Hi,

we've received an issue report [322], and we're not sure how the problem
arose or how often it occurs. Since I know that some of you have many
long-standing repos, you might want to check them before you upgrade to
2.0.0. I'd also like to know whether the problem is a one-off (manual
intervention could have caused it) or widespread.

Basically, there are two entries with exactly the same date and time in
the repository. This shouldn't happen, and it makes rdiff-backup 2.0.0
choke (rdiff-backup 1.2/1.3, which actually runs on Python 2, doesn't
choke, but its reaction is unpredictable, which isn't really better).

How to detect (under Linux): run `cd MY_BACKUP_REPO; ls -1
rdiff-backup-data/mirror_metadata.* | sed -e 's/^.*mirror_metadata\.//'
-e 's/\.[a-z]*\.gz$//' | uniq -cd`. If the output is NOT empty, you
have the issue.
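
For anyone who prefers a script to the shell one-liner, here is a minimal Python sketch of the same check. It assumes the metadata files are named `mirror_metadata.<timestamp>.<type>.gz`, as the sed expressions above imply. The function name `duplicated_timestamps` is made up for illustration and is not part of rdiff-backup.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed file name layout: mirror_metadata.<timestamp>.<type>.gz
_METADATA_RE = re.compile(r"^mirror_metadata\.(?P<stamp>.+)\.[a-z]+\.gz$")


def duplicated_timestamps(repo):
    """Return {timestamp: count} for timestamps that have more than one metadata file."""
    data_dir = Path(repo) / "rdiff-backup-data"
    counts = Counter()
    for entry in data_dir.glob("mirror_metadata.*"):
        match = _METADATA_RE.match(entry.name)
        if match:
            counts[match.group("stamp")] += 1
    # Only timestamps seen at least twice indicate the issue.
    return {stamp: n for stamp, n in counts.items() if n > 1}
```

An empty dictionary corresponds to empty output from the shell command, meaning the repository does not show the issue.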

How to fix: using 1.2/1.3, delete the older entries in your repo(s) up to
and including the dates given by the previous command.
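
To choose a cut-off for that deletion, one option discussed later in this thread is the date of the increment just after the newest duplicated one, passed to --remove-older-than. The sketch below only picks that date and does not run rdiff-backup. It assumes the timestamps are ISO 8601 strings with a UTC offset, as in the metadata file names, and the helper name `removal_cutoff` is hypothetical.

```python
from datetime import datetime


def removal_cutoff(all_timestamps, duplicated):
    """Return the timestamp of the increment just after the newest duplicated one.

    Returns None if nothing is duplicated or if no later increment exists.
    """
    if not duplicated:
        return None
    # Parse instead of comparing strings, so differing UTC offsets sort correctly.
    newest_dup = max(datetime.fromisoformat(t) for t in duplicated)
    for stamp in sorted(set(all_timestamps), key=datetime.fromisoformat):
        if datetime.fromisoformat(stamp) > newest_dup:
            return stamp
    return None
```

Given the list of all increment dates and the duplicated ones, the result is a candidate cut-off. It is worth trying the deletion on a copy of the repository first.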

As I said, brief feedback to help gauge the extent of the issue would
be great. I'll work around the issue to make rdiff-backup 2 more robust
in this regard.

Thanks, Eric

[322] https://github.com/rdiff-backup/rdiff-backup/issues/322

Re: Doubled dates in old repositories

Dominic Raferd-3
On Mon, 20 Apr 2020 at 06:46, Eric L. Zolf <[hidden email]> wrote:

> we've got a report for an issue [322] where we're not sure how it
> happened and if it happens often. As I know that some of you have many
> and long standing repos, you might want to have a look before you
> upgrade to 2.0.0. Also I'd like to know if the problem is unique (manual
> intervention could have caused the issue) or generalized.
>
> Basically, there are two entries with exactly the same date and time in
> the repository, this shouldn't be and this makes rdiff-backup 2.0.0
> choke on it (rdiff-backup 1.2/1.3 -actually Python 2- doesn't choke but
> its reaction is unpredictable, which isn't really better).
>
> How to detect (under Linux): run `cd MY_BACKUP_REPO; ls -1
> rdiff-backup-data/mirror_metadata.* | sed -e 's/^.*mirror_metadata\.//'
> -e 's/\.[a-z]*.gz$//' | uniq -cd` -> if the output is NOT empty, you
> have the issue.
>
> How to fix: delete with 1.2/1.3 the older entries in your repo(s) up and
> including to the dates given by the previous command.
>
> As said, small feedback to understand the extension of the issue would
> be great. I'll work around the issue to make rdiff-backup 2 more robust
> in this regard.
>

I have checked our 108 repositories which go back a long way (I am still
using v1.2.8). I find this issue in one repository for some 27 dates
(mostly but not all consecutive) in January and February 2009. They are
almost (but not quite) the oldest entries in this repo. How do you
recommend I delete - using --remove-older-than? By the way, verification of
all our repositories fails earlier than 2013 but I have left the earlier
data in place because it might still be somewhat recoverable - I am
wondering if I should remove it all before upgrading to v2.

Thanks for all your (and others') great work on updating rdiff-backup.

Dominic

Re: Doubled dates in old repositories

EricZolf
Hi Dominic,

On 20/04/2020 08:40, Dominic Raferd wrote:
> I have checked our 108 repositories which go back a long way (I am still
> using v1.2.8). I find this issue in one repository for some 27 dates
> (mostly but not all consecutive) in January and February 2009. They are
> almost (but not quite) the oldest entries in this repo. How do you
> recommend I delete - using --remove-older-than? By the way, verification of

Correct, taking the date of the increment just after the one duplicated.

> all our repositories fails earlier than 2013 but I have left the earlier
> data in place because it might still be somewhat recoverable - I am
> wondering if I should remove it all before upgrading to v2.

How do you mean, it fails? A repository not touched since 2013 fails on
--verify, or all your repositories fail on `--verify-at-time
<time-2013-or-before>`, or even something else?

Nothing happened in the code between 2009 and 2019, so the date 2013 is
strange...

KR, Eric

Re: Doubled dates in old repositories

Dominic Raferd-3
On Mon, 20 Apr 2020 at 19:10, Eric L. Zolf <[hidden email]> wrote:

> Hi Dominic,
>
> On 20/04/2020 08:40, Dominic Raferd wrote:
> > I have checked our 108 repositories which go back a long way (I am still
> > using v1.2.8). I find this issue in one repository for some 27 dates
> > (mostly but not all consecutive) in January and February 2009. They are
> > almost (but not quite) the oldest entries in this repo. How do you
> > recommend I delete - using --remove-older-than? By the way, verification of
>
> Correct, taking the date of the increment just after the one duplicated.
>
Thanks

>
> > all our repositories fails earlier than 2013 but I have left the earlier
> > data in place because it might still be somewhat recoverable - I am
> > wondering if I should remove it all before upgrading to v2.
>
> How do you mean, it fails? A repository not touched since 2013 fails on
> --verify, or all your repositories fail on `--verify-at-time
> <time-2013-or-before>`, or even something else?
>

Sorry, yes: I mean that verification fails for (I think all) repositories
at any time before that date.
The date probably relates to some problem that occurred on our system at
that time.

Re: Doubled dates in old repositories

Gregor Zattler
In reply to this post by EricZolf
Hi Eric,
* "Eric L. Zolf" <[hidden email]> [2020-04-20; 07:43]:
> How to detect (under Linux): run `cd MY_BACKUP_REPO; ls -1
> rdiff-backup-data/mirror_metadata.* | sed -e 's/^.*mirror_metadata\.//'
> -e 's/\.[a-z]*.gz$//' | uniq -cd` -> if the output is NOT empty, you
> have the issue.

with this command line I see no output on two repos dating from
2012 for which rdiff-backup -l | wc -l shows 2386 and 2499 lines
respectively.

rdiff-backup is now on version 1.2.8 and both repos are local
ones (meaning there was never a version mismatch of local and
remote rdiff-backup programs involved).

Ciao; Gregor
--
 -... --- .-. . -.. ..--.. ...-.-


Re: Doubled dates in old repositories

Eric Lavarde
In reply to this post by Dominic Raferd-3
Hi,

I've created a fix for the issue at [328], in case someone is willing to
test it on an impacted repo, or rather on a _copy_ of it.

On 20/04/2020 22:04, Dominic Raferd wrote:
>> How do you mean, it fails? A repository not touched since 2013 fails on
>> --verify, or all your repositories fail on `--verify-at-time
>> <time-2013-or-before>`, or even something else?
>
> Sorry yes I mean that verification fails for (all I think) repositories
> before that date.
> The date probably relates to some problem that occurred on our system at
> that time.

The last snapshot can still be recovered by hand, simply with cp or
rsync, but indeed, without more information, it doesn't sound like
something we can fix without a lot of work. Depending on how important
the data is to you, and how much disk space you need to recover, I would
get rid of those repositories altogether or "transform" them into
simple file repos by removing the "rdiff-backup-data" directory.
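
As a more cautious variant of that "transform" step, the sketch below copies the current mirror to a new location and leaves out the top-level "rdiff-backup-data" directory, so the original repository stays untouched. This swaps the in-place removal for a copy. The function name `copy_mirror_without_metadata` is hypothetical, and the destination is assumed not to exist yet.

```python
import os
import shutil


def copy_mirror_without_metadata(repo, destination):
    """Copy the latest mirror to `destination`, skipping the top-level metadata directory."""
    repo_root = os.path.abspath(repo)

    def skip_metadata(directory, names):
        # Only skip rdiff-backup-data directly under the repository root;
        # a backed-up directory with the same name deeper down is kept.
        if os.path.abspath(directory) == repo_root:
            return [name for name in names if name == "rdiff-backup-data"]
        return []

    # copytree fails if `destination` already exists, which guards against overwrites.
    return shutil.copytree(repo_root, destination, symlinks=True, ignore=skip_metadata)
```

The copy contains only the files of the last snapshot. All increments and history remain in the original repository until you decide to delete it.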

KR, Eric

[328] https://github.com/rdiff-backup/rdiff-backup/pull/328

Re: Doubled dates in old repositories

Eric Lavarde
In reply to this post by Gregor Zattler


On 20/04/2020 22:54, Gregor Zattler wrote:

> Hi Eric,
> * "Eric L. Zolf" <[hidden email]> [2020-04-20; 07:43]:
>> How to detect (under Linux): run `cd MY_BACKUP_REPO; ls -1
>> rdiff-backup-data/mirror_metadata.* | sed -e 's/^.*mirror_metadata\.//'
>> -e 's/\.[a-z]*.gz$//' | uniq -cd` -> if the output is NOT empty, you
>> have the issue.
>
> with this command line I see no output on two repos dating from
> 2012 for which rdiff-backup -l | wc -l shows 2386 and 2499 lines
> respectively.
>
> rdiff-backup is now on version 1.2.8 and both repos are local
> ones (meaning there was never a version mismatch of local and
> remote rdiff-backup programs involved).


OK, noted. Thanks.

So we currently have a timespan of 2009-2011 for this issue.

KR, Eric


Re: Doubled dates in old repositories

Dominic Raferd-3
On Tue, 21 Apr 2020 at 07:17, Eric Lavarde <[hidden email]> wrote:

>
>
> On 20/04/2020 22:54, Gregor Zattler wrote:
> > Hi Eric,
> > * "Eric L. Zolf" <[hidden email]> [2020-04-20; 07:43]:
> >> How to detect (under Linux): run `cd MY_BACKUP_REPO; ls -1
> >> rdiff-backup-data/mirror_metadata.* | sed -e 's/^.*mirror_metadata\.//'
> >> -e 's/\.[a-z]*.gz$//' | uniq -cd` -> if the output is NOT empty, you
> >> have the issue.
> >
> > with this command line I see no output on two repos dating from
> > 2012 for which rdiff-backup -l | wc -l shows 2386 and 2499 lines
> > respectively.
> >
> > rdiff-backup is now on version 1.2.8 and both repos are local
> > ones (meaning there was never a version mismatch of local and
> > remote rdiff-backup programs involved).
>
>
> OK, noted. Thanks.
>
> So we have currently a timespan of 2009-2011 for this issue.
>
> KR, Eric
>

Despite the failed verifications, I have successfully recovered a largish
(>150MB) file from December 2008 in the impacted repository. As there are
2806 later versions of this file (with changes in this file occurring
between most of these versions), the recovery presumably involved applying
2806 reverse-diffs to the current file - pretty impressive work by
rdiff-backup! (Sorry, still on v1.2.8)

Re: Doubled dates in old repositories

EricZolf
Hi,

On 21/04/2020 13:46, Dominic Raferd wrote:
> Despite the failed verifications, I have successfully recovered a largish
> (>150MB) file from December 2008 in the impacted repository. As there are
> 2806 later versions of this file (with changes in this file occurring
> between most of these versions), the recovery presumably involved applying
> 2806 reverse-diffs to the current file - pretty impressive work by
> rdiff-backup! (Sorry, still on v1.2.8)

As only the metadata are impacted, I'm not too surprised, but I also
can't rule out that it might go wrong.

KR, Eric