Comments (9)
The LZMA API allows the library user to pass custom alloc/free functions. On embedded systems, such a custom alloc function can simply return pointers into a preallocated static buffer.
from zstd.
Implemented in v1.3.0
from zstd.
Thanks Yann for taking static allocation into consideration. This compatibility would allow use on embedded systems; I target the STM32.
from zstd.
This objective looks a bit difficult today.
For decompression, it looks remotely possible, but difficult.
ZSTD_DCtx currently has a static size, yes, but a fairly large one.
Should this amount of memory be considered too large for embedded systems, reducing it is possible, especially when working with small data blocks. But then the required DCtx memory becomes variable, which is exactly what must be avoided.
Over the long term, it's possible to sharply reduce the size of the DCtx, but that would require a "dedicated" version, handling compressed data differently. Nothing technically impossible, just a different version, with the burden of maintaining it. So my feeling is that it must be carefully pondered whether it needs to be part of the reference library.
For compression, the situation starts out worse: the amount of memory to reserve depends on the size of the data to compress and on the selected compression level. So it is variable, by construction.
The only solution I can think of would be to allow enforcement of a maximum memory size, in which case the compression algorithm would refuse to start if it requires more than the allowed memory. Rather than selecting a compression level, the caller would do better to use the _advanced() variants, which allow control over individual parameters, in order to precisely control the amount of memory spent.
That being said, with the current source code, the final amount of memory reserved can still be quite high even for small blocks, from an embedded-system perspective. I haven't made a precise calculation, but I suspect it requires a minimum close to 100 KB. Going lower is possible, but once again it would require a specific version, designed to be frugal with memory. Technically achievable, but a specific effort, with dedicated maintenance. Same question as for the decompression part.
For discussion
from zstd.
That's a good point @Bulat-Ziganshin.
Also, as noticeable progress on this front, v0.6 now makes it possible to select fairly small memory sizes for decompression. Decoding tables are a mere 5 KB, and work buffers are now guaranteed to be <= block size, so a block size of 4 KB, for example, induces a pretty small memory budget for the decompressor.
from zstd.
This is great news for the future. I am not working on this subject right now, but I am sure I will use it in the near future.
from zstd.
Indirectly related to this topic: the latest update in the dev branch uses a lot less stack memory. It is expected to be compatible with systems that tolerate only <= 10 KB of stack space usage.
from zstd.
With the latest update of the dev branch, there is now a way to implement zstd compression and decompression using static allocation. A set of functions has been added for this usage:
ZSTD_initStaticCCtx()
: https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L525
ZSTD_initStaticDCtx()
: https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L635
They both work the same way: provide a buffer of yours, and the compressor/decompressor will use that space to do its job. In order to know how much space the object will need, call the relevant ZSTD_estimate?CtxSize() function: https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L502
The size needed by ZSTD_CCtx is variable. It depends primarily on the compression level, and sometimes also on the size of the buffer to compress. If at any moment ZSTD_CCtx requires more memory than it is allowed to use, it will simply fail with a memory_allocation error.
Note that, as long as ZSTD_estimateCCtxSize() is called with parameters created from the largest expected compression level and the largest expected source size, the estimation provides the maximum possible memory requirement, so ZSTD_CCtx memory needs will remain within the intended limit.
ZSTD_estimateCCtxSize() is currently a bit complex, as it requires compression parameters, which can be created with ZSTD_getCParams() (https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L587). While this gives finer control, it's also more complex than just providing a compression level. I might have to revisit that design, so that there is at least one version that is easier to use.
Another internal change is that CCtx and CStream objects are now the same (same thing for DCtx and DStream, by the way). So a context created with ZSTD_initStaticCCtx() can also do streaming operations.
But the memory budget is different, due to additional buffers, so it's necessary to evaluate it using ZSTD_estimateCStreamSize() (https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L509).
Having 2 different names (CStream, CCtx) might be confusing though. So I'm wondering if it could be improved, either by providing a ZSTD_initStaticCStream(), which would technically be exactly the same but would make the naming (CStream) more consistent, or by naming everything CCtx and adding a parameter such as streamingEnabled to the ZSTD_estimateCCtxSize() function.
The initStatic implementation has 2 limits:
- It's only compatible with single-threaded mode. Multi-threading requires more memory and scatters it in a less predictable way, so there is no short-term adaptation. In the longer term though, it might be possible to be more restrictive/selective in the way memory is allocated and used, and thus become more compatible with a static allocation strategy. But nothing simple.
- It's not able to "create an internal dictionary for bulk operation". That's a complex concept, but to simplify, it's something which happens transparently in ZSTD_initCStream_usingDict(), so this function is not compatible. The workaround is to create the dictionary explicitly, using ZSTD_createCDict(), and then just reference it with ZSTD_initCStream_usingCDict().
I guess that if static allocation is needed for ZSTD_CCtx, it's likely going to be needed for ZSTD_CDict too. So there is another set of functions to create dictionaries in static memory:
ZSTD_initStaticCDict()
: https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L568
ZSTD_initStaticDDict()
: https://github.com/facebook/zstd/blob/dev/lib/zstd.h#L675
Well, as you can guess, this API is in its infancy, and feedback is welcome.
from zstd.
The API has been slightly updated in the latest dev branch update to simplify its usage.
Most importantly, the associated ZSTD_estimate*Size() functions have been simplified, so that it's easier to request a size without detailed knowledge of the algorithm (a compression level is enough). More powerful and complex variants still exist, under the _advanced() label.
Also, ZSTD_initStaticCStream() and ZSTD_initStaticDStream() have been added. Even though they perform the same action as their {C,D}Ctx equivalents, it's cleaner from a naming perspective, since the code flow can then manipulate only {C,D}Stream names.
from zstd.