Comments (22)
I'd argue that dynamic compression should remain outside the scope of this library.
Instead, an approach similar to that of realtime video compression libraries (like x264) would be more appropriate: focus on providing an API with calls for changing things like compression complexity on the fly, and let others build tools that implement algorithms for specific scenarios.
from zstd.
The --adapt feature is now present in zstd v1.3.6.
from zstd.
I've been thinking about the same thing for about a year and came up with the following algorithm to optimize for the slowest resource in the chain:
Create n+2 threads: n compression threads, 1 input thread, and 1 output thread.
- the input-thread reads data blocks from the input and puts these in an input buffer
- the output-thread reads compressed blocks from an output buffer and writes them to the output
- the compression threads take blocks from the input buffer and compress them, where the compression level is selected based on how full the input and output buffers are.
When the input buffer is below a certain level or the output buffer is near-full, a higher compression level is selected. With a full input buffer or an empty output buffer, a faster compression level is used, et cetera.
When used over a network connection, whenever there is congestion in the network or on the receiving side (CPU or I/O), the output buffer will fill up and the compression threads will switch to a higher compression ratio. This optimizes for the highest end-to-end throughput.
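A minimal sketch of this buffer-pressure rule in C (the thresholds, function name, and level range are illustrative assumptions, not from any actual implementation):

```c
#include <assert.h>

#define LEVEL_MIN 1
#define LEVEL_MAX 19

/* Pick the next compression level from the current one and the fill
 * percentages (0-100) of the input and output buffers. */
static int adapt_level(int current, int input_fill_pct, int output_fill_pct)
{
    /* Input starving or output congested: compression is the fast side,
     * so spend more CPU for a better ratio. */
    if (input_fill_pct < 25 || output_fill_pct > 75) {
        if (current < LEVEL_MAX) return current + 1;
    }
    /* Input backing up or output drained: compression is the bottleneck,
     * so drop to a faster level. */
    else if (input_fill_pct > 75 || output_fill_pct < 25) {
        if (current > LEVEL_MIN) return current - 1;
    }
    return current;  /* buffers balanced: keep the current level */
}
```

Each compression thread would call this right before compressing a block, so the level tracks whichever resource is currently the slowest.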
regards, René
from zstd.
I've recently been working on a new utility that acts as a wrapper for ZSTD and provides an algorithm for adapting the compression level based on data flow. The utility is currently included in the repository under the contrib directory as adaptive-compression.
The algorithm works by breaking up data into 40 MB chunks and creating three threads (reading, compressing, writing). Each 40 MB chunk becomes a compression "job". For each job, a set of two buffers is used. The reader thread fills the first buffer with 40 MB worth of data, the compression thread compresses that into the second buffer, and finally the writer thread writes from the second buffer. Because the reader and writer threads operate on separate buffers, they run concurrently and only have to wait on the compression thread. Conversely, the compression thread must wait on both the reader and the writer threads.
The compression level is determined in the compression thread right before actually compressing the data. At a high level, if the compression thread was previously waiting, compression is too fast and we can increase the compression level. Similarly, if the writer or reader thread is waiting, we are compressing too slowly and should decrease the compression level. I have also implemented a few mechanisms to determine how much of its job each thread had completed before another had to wait. This can be helpful in situations such as when the compression thread was waiting on the writer thread, but the writer thread was already 99% finished.
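The wait-based rule could be sketched like this (a simplified illustration with made-up names and thresholds, not the actual contrib/adaptive-compression source):

```c
#include <assert.h>

/* Per-job statistics recorded by the three threads. */
typedef struct {
    int compressor_waited;   /* compression thread idled on reader/writer */
    int io_waited;           /* reader or writer idled on the compressor */
    double other_progress;   /* 0.0-1.0: how finished the waited-on thread was
                                when the wait began */
} job_stats;

static int next_level(int level, const job_stats *s)
{
    /* Compression thread waited on I/O that still had real work left:
     * compression is clearly too fast, raise the level. If the I/O side
     * was already e.g. 99% done, treat the pipeline as balanced. */
    if (s->compressor_waited && s->other_progress < 0.95 && level < 19)
        return level + 1;
    /* Reader or writer waited on the compressor: compression is the
     * bottleneck, lower the level. */
    if (s->io_waited && level > 1)
        return level - 1;
    return level;
}
```

The 0.95 cutoff encodes the "99% finished" refinement: a wait that happens right at the end of a job is not evidence that the level should change.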
So far, I have tested the tool with artificial tests (limited pipe speeds), as well as with some ssh tests that copy files over a network. In scenarios where the static compression level is too low (compression faster than the pipeline speed), the algorithm improves the compression ratio. Furthermore, the tool does not bottleneck the pipeline (which would occur when the static compression level is too high).
Since this is a long-standing request, it would be interesting to see if anyone has any feedback, questions, or thoughts on the tool.
from zstd.
It would be nice to have the contrib/adaptive-compression program combined with contrib/pzstd.
from zstd.
The feature will be integrated into the zstd regular command line utility, with the command --adapt.
See #1327.
from zstd.
In my particular case, I think it may be possible to proceed without preserving the compression context: just choose "best" params for the next chunk (100 MB of input, or 5 MB of output) and start from scratch.
from zstd.
Another use case would be a backup project where some statistics are computed from a buffer to be compressed, and depending on the statistics, one would use either the low or the high compression ratio (maybe 2 instances?). In that case, you don't select based on time spent or FIFO pressure but rather on the data themselves. If the entropy is high, it's very likely the input data is already compressed (or not compressible), and it's pointless to burn CPU on the high compression algorithm. Conversely, if the data is redundant, then it makes sense to try to compress it harder.
Said differently, spend your effort where it matters most.
from zstd.
The code is available in contrib/adaptive-compression with an included Makefile for building it.
It does not compile (several errors with the latest dev branch).
from zstd.
Yes, I've had this idea in mind for quite some time.
I believe it is first necessary to implement multi-threading, in order to perform I/O and compression in parallel. Then it becomes possible to measure and compare both speeds.
That's not an easy feature, so it will take some time to get it right.
from zstd.
We do this with:
- a monotone list of compressor+parameter pairs along the Pareto convex frontier, e.g. raw, lz4+256, lz4+32, lz4+2, zstd, lzlib+0, lzlib+3
- an idle-time goal, e.g. 5%
- 16-step pulse-width modulation between adjacent members of the list, e.g.

typedef unsigned char  u08;
typedef unsigned short u16;
typedef unsigned int   u32;

typedef union {
    u32 level;
    struct {
        u08 smooth2 : 4;  /* smoothing */
        u08 blend2  : 4;  /* blending adjacent compressors */
        u08 basec2  : 8;  /* base compressor */
        u16 fill2   : 16; /* unused */
    } bit2;
} blender;
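For illustration, the 4-bit blend field could drive block-by-block selection between two adjacent frontier members like this (a hypothetical helper, not from the commenter's actual code):

```c
#include <assert.h>

/* 16-step PWM between adjacent compressors: out of every 16 blocks,
 * `blend` (0-15) of them use the next (stronger) frontier member and the
 * rest use the base member, spread evenly across the cycle. Returns the
 * index into the frontier list to use for this block. */
static int pwm_pick(int base, int blend, int block_index)
{
    int phase = block_index % 16;
    /* Bresenham-style spread: a step up exactly `blend` times per cycle. */
    return ((phase + 1) * blend / 16) > (phase * blend / 16) ? base + 1 : base;
}
```

With blend=8 the selection alternates between the two compressors; blend=0 stays on the base member, and the smoothing field would then damp how fast blend itself moves toward the idle-time goal.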
from zstd.
@renevdzee 's algorithm is very good, and basically what I had in mind.
That being said, in a streaming scenario, the set of changes that can be made to compression parameters while preserving the compression context is very small. Currently, among the multiple parameters accessible through the _advanced() functions, only one, searchLog, can be safely modified between blocks while preserving the frame compression context. Another one, strategy, can sometimes be changed, but not always, which makes it more difficult to adapt. This still gives some room for adapting speed, but it cannot range from 1 to 20.
Once these restrictions are accepted, it becomes possible to adapt @renevdzee 's methodology.
from zstd.
@Cyan4973: What do you mean by "strategy, can sometimes be changed, but not always"? Are there any specific conditions?
from zstd.
Oh yes, it's a bit complex though.
"fast" and "btlazy2" have their own specific table layout, incompatible with other strategies. So it's not possible to swap them "on the fly".
The 3 remaining strategies have compatible table layouts, so could be swapped.
Now that I'm thinking about it again, "btlazy2" has additional requirements: it's possible to reduce "searchLog" on the fly without breaking the tables, but increasing this value is less clear.
I believe it's possible, because each search ends by cleaning the tail of the tree, effectively reducing the search length for future searches and inserts, so it should be okay; but without a test, some risks remain...
from zstd.
I do not care much about btlazy2, but I'm going to start from "fast" by default.
from zstd.
Yes, 100 MB is a fairly large size; it's pretty reasonable to use that value to cut the input stream into independent chunks.
from zstd.
I wrote 100 MB first, but then realized that my data (core dumps) is heterogeneous, and it's more reasonable to split by output size instead.
from zstd.
Is it possible to reuse only the dictionary instead of the full compression context?
from zstd.
@annulen : in theory, yes, it is.
The problem is that loading a dictionary is, in itself, a costly operation.
The higher the compression level, the more it costs.
In the end, most of the time might be spent just loading dictionaries, so it does not seem interesting as a speed optimisation.
from zstd.
Do you have any source code publicly available?
In my use case (compression of core dumps), data is read from a pipe. I wonder if it's still a good idea to use a separate reading thread in this case. Also, the chunk size needs to be tunable.
from zstd.
The code is available in contrib/adaptive-compression with an included Makefile for building it.
from zstd.
Fair enough.
We haven't integrated the /contrib projects into CI, and as a consequence, do not regularly check whether they still compile. That could be improved.
from zstd.