How does rdiff-backup really detect changed files?
Dear wizards of rdiff-backup,
as a user of rdiff-backup and main author of the German Wikipedia page
on rdiff-backup I want to ask you a few questions because the manual
pages are not very clear on this:
- Is SHA-1 always used to determine if a file has been modified? I
assume this since the file "mirror_metadata.*.diff.gz" contains the
SHA-1 digests of the files that have changed since the last backup.
- What is the exact meaning of the --verify, --compare-hash and
--compare-full options? They all seem to deal with SHA-1 digests or making
sure that changed files are indeed copied to the backup directory. This
is a bit confusing - users might choose the --compare-hash option, but
rdiff-backup --compare-hash <source> <destination> does not actually do
With the very best thanks for this great yet extremely simple backup
Re: How does rdiff-backup really detect changed files?
On 04/10/2016 09:15 AM, David Croll wrote:
> Dear wizards of rdiff-backup,
> as a user of rdiff-backup and main author of the German Wikipedia page
> on rdiff-backup I want to ask you a few questions because the manual
> pages are not very clear on this:
> - Is SHA-1 always used to determine if a file has been modified?
Absolutely not! That would require running sha1sum over every file in
the source directory, which could take many hours or even days for a
complete system with terabytes of data. What is compared instead are the
inode fields that are also stored in the latest mirror_metadata file:
type, size, mtime, uid, gid, and permissions. For non-directory files
with more than one hard link, the comparison also includes the device and
inode numbers, unless the --no-compare-inode option is used.
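
To make the idea concrete, here is a rough Python sketch of that kind of
metadata-only check. It is not rdiff-backup's actual code: the names
Snapshot, take_snapshot and looks_changed are made up for illustration, and
the compare_inode parameter only mimics the effect of --no-compare-inode.

```python
import os
import stat
from typing import NamedTuple, Optional, Tuple


class Snapshot(NamedTuple):
    """Hypothetical record of the inode fields named in the text above."""
    file_type: int
    size: int
    mtime: int
    uid: int
    gid: int
    perms: int
    # Only filled in for non-directory files with more than one hard link.
    dev_inode: Optional[Tuple[int, int]]


def take_snapshot(path: str, compare_inode: bool = True) -> Snapshot:
    st = os.lstat(path)  # lstat: describe a symlink itself, not its target
    multiply_linked = not stat.S_ISDIR(st.st_mode) and st.st_nlink > 1
    use_inode = compare_inode and multiply_linked
    return Snapshot(
        file_type=stat.S_IFMT(st.st_mode),
        size=st.st_size,
        mtime=int(st.st_mtime),
        uid=st.st_uid,
        gid=st.st_gid,
        perms=stat.S_IMODE(st.st_mode),
        dev_inode=(st.st_dev, st.st_ino) if use_inode else None,
    )


def looks_changed(path: str, previous: Snapshot,
                  compare_inode: bool = True) -> bool:
    """True if any recorded field differs; file contents are never read."""
    return take_snapshot(path, compare_inode) != previous
```

Note what such a check cannot see: if a file is rewritten with the same
size and its mtime is then reset to the old value, every field still
matches and the file looks unchanged.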
> - What is the exact meaning of the --verify, --compare-hash and
> --compare-full options?
The --verify and --verify-at-time options check the internal consistency
of the archive. Every file is reconstructed from the mirror + diffs, and
its sha1sum is compared with the stored value.
The --compare-hash and --compare-full options (and their *-at-time
variants) compare the current contents of the source with the
archive. The --compare-hash option just computes the sha1sum of the
source files and compares that against the metadata stored in the
archive. The --compare-full option actually reconstructs each file from
the mirror + diffs and compares that file byte-by-byte with the source.
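
The difference between the two comparison styles can be sketched in
Python as follows. This is only an illustration under assumptions, not how
rdiff-backup implements it: the function names are invented, stored_sha1
stands for a digest taken from the archive metadata, and reconstructed_path
stands for a file that has already been rebuilt from the mirror + diffs.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_matches(source_path, stored_sha1):
    """Hash-style comparison: only the source file is read."""
    return sha1_of_file(source_path) == stored_sha1.lower()


def contents_identical(source_path, reconstructed_path, chunk_size=1 << 20):
    """Full-style comparison: both files are read and compared byte by byte."""
    with open(source_path, "rb") as a, open(reconstructed_path, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files ended at the same point
                return True
```

The hash variant trusts the digest stored in the metadata, so it never has
to touch the archived data itself. The full variant needs the reconstructed
file, which costs more but does not depend on the stored digest.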
Bob Nichols "NOSPAM" is really part of my email address.
Do NOT delete it.