Strange reported size (--no-compression usage)


duplicity-talk mailing list
Hi everyone,

On a 67 GB data set (size reported by the "du" command), I get a strange
backup size:

--------------[ Backup Statistics ]--------------
StartTime 1542206341.16 (Wed Nov 14 15:39:01 2018)
EndTime 1542213693.41 (Wed Nov 14 17:41:33 2018)
ElapsedTime 7352.25 (2 hours 2 minutes 32.25 seconds)
SourceFiles 15785
SourceFileSize 437413719036 (407 GB)
NewFiles 15785
NewFileSize 437413719036 (407 GB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 15785
RawDeltaSize 71926352063 (67.0 GB)
TotalDestinationSizeChange 23016875024 (21.4 GB)
Errors 0
-------------------------------------------------

It is a full backup, made with the --no-compression option.

Why is my backup only 21.4 GB when my data is 67 GB?

And what is the "407 GB" size? I don't have that much data!

The data is on a CephFS filesystem, mounted with FUSE, and the backup
destination is multi (par2 + B2 + S3).

Thank you.

Florent


_______________________________________________
Duplicity-talk mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/duplicity-talk

Re: Strange reported size (--no-compression usage)

duplicity-talk mailing list
Do you have sparse files?


Re: Strange reported size (--no-compression usage)

duplicity-talk mailing list
Maybe (it is e-mail storage in mdbox format), but du reports sparse files with their real disk usage (67 GB). That could explain the 407 GB, but not the 21.4 GB backup size, could it?


Re: Strange reported size (--no-compression usage)

duplicity-talk mailing list
Can you give the complete command line you are using?


Re: Strange reported size (--no-compression usage)

duplicity-talk mailing list
If you’re willing to share the output of ‘find -ls’ it might yield some insight too. 


Re: Strange reported size (--no-compression usage)

duplicity-talk mailing list
Okay, so there are two things going on.

1. Looking at your `find` output, your data set contains some very large
*directories*.

1099525753158      1 drwx------   4 vmail    vmail    53698251933 Nov 15 10:18 <name>

The directory itself is "sparse"; it has a size of 54 GB but only one
block allocated.  But duplicity includes the full size (as reported by
`stat`) in its statistics.  So that explains the 407 GB, which you can
just ignore.  This doesn't really have any direct impact on what goes in
the backup, since duplicity would just `readdir` through the directory and
collect metadata for the entries that it actually contains.

2. The --no-compression flag doesn't have any effect when you are
encrypting your backup.  The encryption is done by gpg, which by default
also compresses the data before encrypting.  duplicity will do its own
compression only if you are using --no-encryption, and in that case only,
--no-compression would turn it off.  So your data is in fact getting
compressed, which explains the 21.4 GB.
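
Both points can be checked by hand. The sketch below is not taken from
duplicity; it is a small Python illustration, with hypothetical helper names
(`apparent_and_allocated`, `to_binary_gb`), of the difference between the size
that `stat` reports (`st_size`) and the space actually allocated (`st_blocks`,
counted in 512-byte units on Linux), followed by the arithmetic on the figures
from the statistics quoted in this thread. Reading the report's "GB" as 2**30
bytes is an inference from those figures, not something stated here, and
whether the temporary file really ends up sparse depends on the filesystem.

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for path, as stat sees them."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512


def to_binary_gb(n_bytes):
    """Convert a byte count to units of 2**30 bytes."""
    return n_bytes / 2**30


# Figures copied from the Backup Statistics in this thread.
SOURCE_FILE_SIZE = 437413719036
RAW_DELTA_SIZE = 71926352063
DESTINATION_SIZE_CHANGE = 23016875024

if __name__ == "__main__":
    # Create a file with a large apparent size and no data written to it.
    with tempfile.TemporaryDirectory() as tmp:
        sparse = os.path.join(tmp, "sparse.bin")
        with open(sparse, "wb") as f:
            f.truncate(100 * 1024 * 1024)  # 100 MiB hole
        size, allocated = apparent_and_allocated(sparse)
        print("apparent:", size, "allocated:", allocated)

    print("SourceFileSize: %.1f GB" % to_binary_gb(SOURCE_FILE_SIZE))
    print("RawDeltaSize:   %.1f GB" % to_binary_gb(RAW_DELTA_SIZE))
    print("Destination:    %.1f GB" % to_binary_gb(DESTINATION_SIZE_CHANGE))
    # Fraction of the data read that ended up on the destination.
    print("stored / read:  %.2f" % (DESTINATION_SIZE_CHANGE / RAW_DELTA_SIZE))
```

The last line gives the ratio between what was written to the destination and
what was read from the source, which is the figure that --no-compression was
expected to keep close to 1.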

If you *really* want encryption but no compression, I believe you could do
`--gpg-options "--compress-level=0"`.  But I don't know why you would want
to do that.  Your data apparently compresses pretty well, so why not take
advantage?  Turning off compression will not only waste space and
bandwidth, but probably also take a lot more time and CPU; AFAIK
encryption is many times slower than compression, so it's a win to
compress the data first and have less to encrypt.
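
For anyone who does want to try it: the command in this thread already passes
`--gpg-options="--always-trust"`, so the two gpg options would presumably have
to share one quoted string. The following is only a sketch of how such an
argument list could be put together. The helper names, paths and key ID are
placeholders invented for the illustration, nothing is executed, and whether
gpg then really skips compression has not been tested here.

```python
import shlex


def merge_gpg_options(*options):
    """Join individual gpg options into one string for --gpg-options."""
    return " ".join(options)


def build_duplicity_args(source, target, encrypt_key, gpg_options):
    """Assemble an argument list; this function does not run anything."""
    return [
        "duplicity",
        "--encrypt-key", encrypt_key,
        "--gpg-options", gpg_options,
        source,
        target,
    ]


if __name__ == "__main__":
    opts = merge_gpg_options("--always-trust", "--compress-level=0")
    args = build_duplicity_args(
        source="/path/to/data/",          # placeholder
        target="file:///path/to/backup",  # placeholder
        encrypt_key="ABCD1234",           # placeholder key ID
        gpg_options=opts,
    )
    # Print a shell-quoted command line for inspection.
    print(shlex.join(args))
```

Building the list first and printing it with `shlex.join` makes it easy to
check that the gpg options arrive as a single argument before running a real
backup.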

On Thu, 15 Nov 2018, Florent B wrote:

> Hi,
>
> The full command I used is:
>
> /usr/bin/duplicity incr --name $NAME --no-compression --archive-dir
> $CACHE_DIR --tempdir $TEMP_DIR --allow-source-mismatch
> --full-if-older-than 1M --gpg-options="--always-trust" --encrypt-key
> $ENCRYPT_KEY --sign-key $SIGN_KEY --asynchronous-upload
> --exclude-if-present ".nobackup" --volsize $VOLSIZE $BACKUP_DIR/
> "par2+multi://$MULTI_CONFIG?mode=mirror&onfail=abort"
>
> I sent the result of "find -ls" command to you ;)
>
> Florent
>
--
Nate Eldredge
[hidden email]


Re: Strange reported size (--no-compression usage)

duplicity-talk mailing list
OK, thank you Nate, that explains everything!
I wanted to disable compression, thinking it would save some CPU time.
Thank you!
Florent
