zstd compression

zstd compression

duplicity-talk mailing list
Hi,

Is there a way to compress backups using [zstd]? I know that normally in duplicity the compression is left to gpg, but gpg's selection of compression algorithms is quite limited and basically supports only gzip and bzip2. Zstd outperforms both in terms of compression ratio and speed.

Ideally, I would like to give duplicity a command through which it pipes the tar archives before passing them on to gpg.

[zstd]: https://facebook.github.io/zstd/
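
Conceptually, something like the following pipeline, just a rough sketch to illustrate what I mean (the key id, file names and chunk size are made up):

```python
import subprocess

# Illustrative pipeline only: tar data -> zstd -> gpg -> backup volume.
# "MYKEYID" and the file names are placeholders for this example.
with open("vol1.difftar.zst.gpg", "wb") as out:
    zstd = subprocess.Popen(["zstd", "-q"],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    gpg = subprocess.Popen(["gpg", "--batch", "--encrypt", "-r", "MYKEYID"],
                           stdin=zstd.stdout, stdout=out)
    zstd.stdout.close()  # gpg now owns the read end of the pipe
    with open("vol1.difftar", "rb") as tar:
        while True:
            chunk = tar.read(64 * 1024)
            if not chunk:
                break
            zstd.stdin.write(chunk)
    zstd.stdin.close()   # EOF lets zstd flush, which in turn lets gpg finish
    gpg.wait()
    zstd.wait()
```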


Thanks,
Guy

Re: zstd compression

duplicity-talk mailing list
You are the second to request this.  See the bug report at https://bugs.launchpad.net/bugs/1860200.  Perhaps you and Byron could collaborate to make this happen.  That would be greatly appreciated.

...Thanks,
...Ken

Re: zstd compression

duplicity-talk mailing list
Hi Ken,

Thanks. I read the discussion in the bug report, and I saw quite a bit of it revolves around a new multi-processing architecture. I think I can go forward with a simple design that works for now and can be broken out into multiprocessing in the future. The idea is to add a new file-obj wrapper that takes an additional stream-compressor binary (for example zstd or gzip) and applies it to the tar file before passing it on to the gpg backend (rough sketch below, after my questions).
From a quick look at the code, it seems the changes should go into `duplicity/gpg.py`, right?

0. Does the above approach sound reasonable?
1. Where do you think this should go in the code?
2. I see GzipWriteFile, but where is the corresponding GzipReadFile? Just asking to make sure I understand the architecture correctly.
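
Here is roughly what I have in mind for the wrapper; it is only a sketch that assumes an external compressor binary on PATH, and the class name and interface are made up rather than duplicity's existing API:

```python
import subprocess
import threading

class CompressorWriteFile:
    """Sketch only: a file-like wrapper that pipes everything written to it
    through an external stream compressor (zstd, gzip, ...) and hands the
    compressed bytes to an underlying file object."""

    def __init__(self, fileobj, compressor_cmd=("zstd", "-q")):
        self._out = fileobj
        self._proc = subprocess.Popen(list(compressor_cmd),
                                      stdin=subprocess.PIPE,
                                      stdout=subprocess.PIPE)
        # Drain the compressor's stdout in the background so its pipe buffer
        # never fills up and deadlocks our writes.
        self._pump = threading.Thread(target=self._drain, daemon=True)
        self._pump.start()

    def _drain(self):
        for chunk in iter(lambda: self._proc.stdout.read(64 * 1024), b""):
            self._out.write(chunk)

    def write(self, data):
        self._proc.stdin.write(data)

    def close(self):
        self._proc.stdin.close()  # EOF makes the compressor flush and exit
        self._pump.join()
        self._proc.wait()
```

The tar volume would be written into this wrapper, and `fileobj` would be whatever gpg-bound file object is used today, so the data is compressed before gpg ever sees it.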

Thanks,
Guy


Re: zstd compression

duplicity-talk mailing list
Sorry for the late reply.

0. Sounds reasonable to me, except that duplicity does not do both compression and encryption; it's one or the other, or plain.
1. Probably in the gpg.py file unless you want to refactor the code.  I would appreciate that.
2. There is no corresponding GzipReadFile.  It's done in restore() in dup_main.py.  We could probably centralize the IO functions.
3. Don't forget file_naming.py.  We need different suffixes for zstd.
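
For illustration, a minimal read-side counterpart could look something like this; the class name and interface are hypothetical rather than an existing duplicity class, and it assumes the wrapped file object exposes a real file descriptor:

```python
import subprocess

class DecompressorReadFile:
    """Sketch only: wrap a file object holding zstd-compressed data and
    expose read() calls that return the decompressed bytes."""

    def __init__(self, fileobj, decompressor_cmd=("zstd", "-d", "-q")):
        # The decompressor reads directly from the underlying file, so the
        # wrapped object needs a real OS-level file descriptor.
        self._proc = subprocess.Popen(list(decompressor_cmd),
                                      stdin=fileobj,
                                      stdout=subprocess.PIPE)

    def read(self, size=-1):
        return self._proc.stdout.read(size)

    def close(self):
        self._proc.stdout.close()
        self._proc.wait()
```

Keeping something like this next to GzipWriteFile would be one way to centralize the IO functions mentioned in point 2.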

...Thanks,
...Ken


Re: zstd compression

duplicity-talk mailing list
hey Guy,

On 17.04.2020 17:51, Kenneth Loafman via Duplicity-talk wrote:
> Sorry for the late reply.
>
> 0. Sounds reasonable to me, except that duplicity does not do both
> compression and encryption; it's one or the other, or plain.

i understand you are trying to achieve a zstd-compressed, gpg-encrypted backup, and that is currently not possible, as Ken wrote. but extending that, you may of course, similar to our backends, wrap file objects so that the resulting stream is compressed first and only encrypted after.
it's not implemented that way now, but it is possible given the effort.

> 1. Probably in the gpg.py file unless you want to refactor the code.  I
> would appreciate that.

agreed. having those file objects in there always seemed like a quick-and-dirty hack someone did in the ancient past out of sheer convenience.

> 2. There is no corresponding GzipReadFile.  It's done in restore() in
> dup_main.py.  We could probably centralize the IO functions.

same as 1.

> 3. Don't forget file_naming.py.  We need different suffixes for zstd.

right. and provided you take the wrapping approach, you may end up with *.zst.gpg or *.gz.gpg files on the backend.
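
purely for illustration, the suffix bookkeeping could end up looking something like this (not the real file_naming.py logic; the names are made up):

```python
# Illustrative suffix bookkeeping only -- not the real file_naming.py logic.
SUFFIXES = {"zstd": ".zst", "gzip": ".gz", None: ""}

def volume_name(base, compressor=None, encrypted=True):
    """Compose a hypothetical volume file name from compression/encryption state."""
    return base + ".difftar" + SUFFIXES[compressor] + (".gpg" if encrypted else "")

print(volume_name("duplicity-full.vol1", "zstd"))         # duplicity-full.vol1.difftar.zst.gpg
print(volume_name("duplicity-full.vol1", "gzip", False))  # duplicity-full.vol1.difftar.gz
```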

regards.. ede/duply.net



_______________________________________________
Duplicity-talk mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/duplicity-talk