Missing files from backup


Thomas Witzel

Hello,

I hope this is not a duplicate of a previous issue, but I couldn't find my
exact scenario.

I'm using rdiff-backup 1.2.8 on an Ubuntu machine and everything is
working great, except that for one of my nightly backups 43 of 9101 files
are always missing. If there were no changes, my statistics look like this:


StartTime 1401379711.00 (Thu May 29 12:08:31 2014)
EndTime 1401379724.83 (Thu May 29 12:08:44 2014)
ElapsedTime 13.83 (13.83 seconds)
SourceFiles 9101
SourceFileSize 335423184 (320 MB)
MirrorFiles 9101
MirrorFileSize 335423184 (320 MB)
NewFiles 0
NewFileSize 0 (0 bytes)
DeletedFiles 0
DeletedFileSize 0 (0 bytes)
ChangedFiles 43
ChangedSourceSize 0 (0 bytes)
ChangedMirrorSize 0 (0 bytes)
IncrementFiles 43
IncrementFileSize 0 (0 bytes)
TotalDestinationSizeChange 0 (0 bytes)
Errors 0

So there are always 43 IncrementFiles, but those files are missing from the
backup, and for those 43 files I get messages like this:

UpdateError xxx.txt Updated mirror temp file /xxxx/rdiff-backup.tmp.77 does not match source

I have no idea what mismatch this is referring to. I can manually copy
the source to rdiff-backup.tmp.77, and with none of the tools at my
disposal can I find a difference. I also can't see what's different about
those 43 files compared with the other 9000; they are all very similar,
with similar names. Of the 43 affected files, the most recent change
occurred 9 months ago in one of them, so it's not that they changed during
the backup. How do I go about debugging this?

Thank you,
Thomas Witzel



The information in this e-mail is intended only for the person to whom it is
addressed. If you believe this e-mail was sent to you in error and the e-mail
contains patient information, please contact the Partners Compliance HelpLine at
http://www.partners.org/complianceline . If the e-mail was sent to you in error
but does not contain patient information, please contact the sender and properly
dispose of the e-mail.


_______________________________________________
rdiff-backup-users mailing list at [hidden email]
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki

Re: Missing files from backup

Mike Fleetwood
On 29 May 2014 17:14, Thomas Witzel <[hidden email]> wrote:

> [...]

Hi Thomas,

Try temporarily applying the attached patch and running rdiff-backup
with -v 7 or greater.  It will log why rdiff-backup decided the file
changed during the backup.

Thanks,
Mike

Attachment: rdiff-backup-debug-causes-of-update-error-d2.patch (2K)

Re: Missing files from backup

Thomas Witzel

Thanks Mike,

I tried your patch and actually had created something similar myself right
before you replied, but made sure to try your patch as well, since my
python skills are at best rudimentary.

Anyhow, both modifications lead to the result that apparently the size
doesn't match. Over the last few days the number of affected files has
increased to 52, even though the "new" four files with this error also
haven't changed in months.

Anyhow, when I copy these files from the source by hand (using the "cp"
command in my shell), the size matches. Another curious observation
is that, starting from the second error, the size attribute of the first
argument is always identical to the size attribute of the second argument
of the previous error. It's too weird to be a coincidence. The affected
files are not in the same directories, nor do they seem to have many other
connections, other than being ASCII files of the same format and
extension (but so are the other 9000)....

Is there a patch so that the tmp file will NOT be deleted in case an error
occurs? I believe taking a look at these files will be quite instructive.

Thanks,
Thomas


On Thu, 29 May 2014, Mike Fleetwood wrote:

> [...]



Backup of a backup

Greg Torrance
Hi all,

What I would like to do for backups is the following: Rather than plug
an external drive into my machine every day, I would like to do daily
rdiff-backups to local storage (since I have more than enough). Then,
once a week, I would like to back up that backup to an external drive.

My original thought was to use rdiff-backup for the local backup,
thereby allowing me to have incremental versions of files. And then,
back up the rdiff-backup to an external drive using, say, rsync.

The problem I have is the following: Since I don't trust that file
timestamps will be sufficient to determine whether or not a file has
changed (since I have some large files that are updated with NO
timestamp change), I would have to use rsync in the mode that
hashes/checksums each file on the source and destination. And since the
destination is an external drive, it will be extremely time-consuming,
as it means fully reading every file across slow USB (before even
attempting any updates).

Do I understand correctly that rdiff-backup does hashes/checksums of
every file?

And do I understand correctly that it stores this information in the
metadata, such that JUST the metadata needs to be read back from an
external drive, thereby greatly increasing backup speed?

If I am correct about the above assumptions, it seems that rdiff-backup
might be a good option for both my local backup and the backup (of the
local backup) to the external drive.

However, at this point I'm unsure if an rdiff-backup of an rdiff-backup
is even possible. Is it? How will it handle the backup of the metadata?
Will the files conflict?

A related question: Is there any way to have rdiff-backup create the
metadata in a separate location from the actual files? It would seem
that might be quite a useful option. Just curious.

[I am running 1.2.8 on a Linux Mint machine.]

I'd really appreciate any help or suggestions you can offer.

Thanks,
Greg


Re: Backup of a backup

Mike Fleetwood
On 1 June 2014 20:40, Greg Torrance <[hidden email]> wrote:

> [...]

It is unusual for an application to deliberately restore the previous
mtime after writing to a file.  Testing rdiff-backup ...

Initial backup
# cd /var/tmp
# mkdir test
# echo foo > test/foo
# rdiff-backup test /mnt/1/backup-test
# zcat /mnt/1/backup-test/rdiff-backup-data/mirror_metadata.*.snapshot.gz
...
File foo
  Type reg
  Size 4
  SHA1Digest f1d2d2f924e986ac86fdf7b36c94bcdf32beec15
  ModTime 1401728418
...
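
The metadata listing above is plain text, so the recorded size and digest can be read back with a few lines of Python. This is a minimal sketch that assumes only the layout visible in the excerpt (a "File <name>" line followed by indented "Key value" lines). The names parse_metadata and SAMPLE are illustrative, not part of rdiff-backup, and real metadata files may contain fields or quoting that this does not handle.

```python
def parse_metadata(text):
    """Parse 'File <name>' records with indented 'Key value' lines.

    Assumption: the layout matches the excerpt shown above; anything
    else (the '...' lines, unknown formats) is silently skipped.
    """
    records = {}
    current = None
    for line in text.splitlines():
        if line.startswith("File "):
            # Start a new record keyed by the file name.
            current = {}
            records[line[len("File "):]] = current
        elif line.startswith("  ") and current is not None:
            # Indented attribute line: first word is the key, rest is the value.
            key, _, value = line.strip().partition(" ")
            current[key] = value
    return records


# Sample copied from the listing above.
SAMPLE = """\
File foo
  Type reg
  Size 4
  SHA1Digest f1d2d2f924e986ac86fdf7b36c94bcdf32beec15
  ModTime 1401728418
"""

if __name__ == "__main__":
    for name, attrs in parse_metadata(SAMPLE).items():
        print(name, attrs["Size"], attrs["SHA1Digest"])
```

To read a real snapshot, the text would presumably come from the gzip-compressed file that zcat reads above, for example via Python's gzip.open(path, "rt").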

Change file without changing mtime
# touch -r test/foo ref
# echo bar > test/foo
# touch -r ref test/foo

Second backup
# rdiff-backup test /mnt/1/backup-test
# zcat /mnt/1/backup-test/rdiff-backup-data/mirror_metadata.*.snapshot.gz
...
File foo
  Type reg
  Size 4
  SHA1Digest f1d2d2f924e986ac86fdf7b36c94bcdf32beec15
  ModTime 1401728418
...
# cat test/foo
bar
# cat /mnt/1/backup-test/foo
foo
# rdiff-backup --compare-full test /mnt/1/backup-test
metadata the same, data changed: foo
# echo $?
1

This looks like rdiff-backup is using mtime to determine if the source
files have changed and need to be backed up or not.
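
The same effect can be reproduced outside rdiff-backup. The sketch below is only an illustration of why a size-and-mtime check misses this kind of change while a content hash catches it; it is not rdiff-backup's actual code. The helper names quick_signature, sha1_of and demo are made up for the example, and the fixed timestamp is simply the ModTime from the listing above.

```python
import hashlib
import os
import tempfile


def quick_signature(path):
    """Cheap change check: size and whole-second mtime only."""
    st = os.stat(path)
    return (st.st_size, int(st.st_mtime))


def sha1_of(path):
    """Full-content check: SHA-1 of the file's bytes."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Rewrite a file with same-sized content and restore its mtime."""
    fixed_time = 1401728418  # ModTime taken from the listing above
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "foo")
        with open(path, "wb") as handle:
            handle.write(b"foo\n")
        os.utime(path, (fixed_time, fixed_time))
        before = (quick_signature(path), sha1_of(path))

        # Same length, different content, mtime put back afterwards.
        with open(path, "wb") as handle:
            handle.write(b"bar\n")
        os.utime(path, (fixed_time, fixed_time))
        after = (quick_signature(path), sha1_of(path))
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("size/mtime unchanged:", before[0] == after[0])
    print("digest unchanged:    ", before[1] == after[1])
```

The size and mtime pair is identical before and after, while the digests differ. The first digest is the f1d2d2f9... value recorded in the metadata above, since the content is the same four bytes.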

Whatever you decide, make sure you test backup and restore in your scenario.

Thanks,
Mike


Re: Missing files from backup

Mike Fleetwood
On 1 June 2014 19:57, Thomas Witzel <[hidden email]> wrote:

> [...]
> Is there a patch that the tmp file will NOT be deleted in case an error
> occurs? I believe taking a look at these files will be quite instructive.
> [...]

Please post the verbose messages for the first few problem files.

I don't know of any such patch.  You might try putting a return as the
first line of TempFile.py, class TempFile, def delete().  Never tried
this so I don't know what it will do.

Mike


Re: Missing files from backup

Thomas Witzel

Hi, it's kind of difficult to post this here, because even the file names
are [sensitive] and it would be a bit of work to change all that. The
attribute error is like this:

attribute differs: first.data['size']=350 != second.data['size']=340

I managed to patch rpath.py so the tmp file doesn't get deleted. I found
that the tmp file is in fact identical to the source, and the true size is
340 (for both the tmp file and the source). Strangely, as of now it's
only 24 files that show this behaviour. I'm really going crazy with this.
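
A quick way to cross-check sizes like the 350 and 340 in the message above is to compare the size the filesystem reports for a file with the number of bytes that can actually be read from it. The sketch below does only that. The names size_report and find_mismatches are hypothetical helpers, and it says nothing about where rdiff-backup takes its own size values from.

```python
import os


def size_report(path):
    """Return (size reported by lstat, bytes actually read) for one file."""
    stat_size = os.lstat(path).st_size
    read_size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            read_size += len(chunk)
    return stat_size, read_size


def find_mismatches(paths):
    """List (path, stat_size, read_size) for files where the two sizes differ."""
    mismatches = []
    for path in paths:
        stat_size, read_size = size_report(path)
        if stat_size != read_size:
            mismatches.append((path, stat_size, read_size))
    return mismatches


if __name__ == "__main__":
    import sys

    # Usage: pass the affected file names on the command line.
    for path, stat_size, read_size in find_mismatches(sys.argv[1:]):
        print(f"{path}: lstat says {stat_size}, read {read_size} bytes")
```

If this prints nothing for the affected files, the source side agrees with itself, which would point at the size recorded elsewhere (for example on the mirror side) rather than at the source files.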

Thomas

On Mon, 2 Jun 2014, Mike Fleetwood wrote:

> [...]



Re: Missing files from backup

Leland Best
Hi Thomas,

I know this is kind of grasping at straws but ...

On Mon, 2014-06-02 at 20:57 -0400, Thomas Witzel wrote:
> Hi, its kind of difficult to post this here, because even the file names
> are and its be a bit of work to change all that. The attribute error is
> like this:
>
> attribute differs: first.data['size']=350 != second.data['size']=340
[...]

I use rdiff-backup fairly often and have not had this problem.  However,
I recently got a few errors on an 'rsync' run.  The errors occurred on
only four or five seemingly unrelated files and, while I no longer have
the exact error message, I do recall that they were quite misleading.
After several hours of "poking around" it turned out my backup hard
drive was failing.  So, as I said, it's a shot in the dark but ... have
you checked your drive?  I didn't find my problem until I ran 'e2fsck'
with '-c' (non-destructive bad block check).  Running 'e2fsck
-f ...' (i.e. without '-c') said everything was fine.

Cheers
Leland
--
-------------------------------------------------------------------------------
Leland C. Best      | Creationists make it sound as though a 'theory' is
[hidden email] | something you dreamt up after being drunk all night.
                    | -- Isaac Asimov
-------------------------------------------------------------------------------

