Comments (8)
Thanks for your reply! Adding a small regularization is a good idea. As mentioned in my previous comment, I don't have access to that device for now. If I get more detailed information, I will update it here.

---

Thank you for reporting! On arm, are you using the precompiled `constriction` package from pypi (for new Macs with arm64 chips), or did you compile `constriction` from source? If you compiled from source, did you compile in `--release` mode? The error message that you see should appear in a debug build, but it should not appear in a release build (and the official packages on pypi should all be release builds, unless I messed something up in the CI).

In case you indeed compiled from source: I'd be happy to provide precompiled python packages for your platform going forward, but I'd need someone to test them, at least initially. Would you mind telling me what platform you're developing for, and would you consider testing a precompiled python package on it?
## Background

The second symbol in your example is encoded with a Laplace distribution with scale parameter `b = 0.0`, which is not a well-defined distribution (the pdf of a Laplace distribution is $p(x) = \frac{1}{2b} \exp(-|x - \mu| / b)$, which is undefined for $b = 0$). `constriction` relies on the `probability` crate, which checks for `b > 0.0` using a `should!` macro, which is defined to be equivalent to `debug_assert!`. Unless special compiler flags are used, `debug_assert!` only checks in debug builds and is ignored in release builds.
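For readers coming from python land, the built-in `assert` statement behaves analogously: it is checked by default, but stripped when the interpreter runs in optimized mode (`python -O`), just as `debug_assert!` is stripped in release builds. A minimal sketch of this analogy (illustration only; `laplace_pdf` is a hypothetical helper, not constriction's actual code):

```python
import math

def laplace_pdf(x, mu, b):
    # Hypothetical helper for illustration only (not part of constriction).
    # The `assert` below plays the role of Rust's `debug_assert!`: it fires
    # in normal runs but is stripped when python is invoked with `-O`.
    assert b > 0.0, "scale parameter b must be positive"
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

print(laplace_pdf(0.0, 0.0, 1.0))  # well-defined: prints 0.5

try:
    laplace_pdf(0.0, 0.0, 0.0)     # b = 0.0 trips the check (in checked mode)
except AssertionError as e:
    print("rejected:", e)
```

Just as with a Rust release build, running this with `python -O` would skip the assertion and instead hit an undefined `0.0 / 0.0` deeper in the computation.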
## Notes to Self

- We should at least document that the Laplace distribution requires `b > 0.0`, and maybe the python API should check for it even in release builds. In fact, the provided example works on amd64 only by coincidence. It would fail to reconstruct the original message if we changed the second symbol in `message` to anything other than `10` [update on 2023-11-25: I can't reproduce this anymore. Maybe I tried it out in debug mode myself when I tested this originally], because `probability` (justifiably) does not provide a valid quantile function for Laplace distributions with `b = 0.0`.
- I guess the readme should document that, when compiling python packages for production use from source, one should compile in `--release` mode. This may be obvious to rust developers, but these instructions are mainly for developers who are at home in python land and not necessarily very familiar with rust.
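A release-build check in the python API could look roughly like the following sketch (`validate_scales` is a hypothetical helper for illustration, not an existing constriction function):

```python
import numpy as np

def validate_scales(scales):
    # Hypothetical release-build check (not part of constriction's actual API):
    # reject any Laplace scale parameter that is not strictly positive.
    scales = np.asarray(scales, dtype=np.float64)
    if not np.all(scales > 0.0):
        raise ValueError("Laplace scale parameters must satisfy b > 0.0")
    return scales

validate_scales([10.0, 10.0, 10.0])    # fine

try:
    validate_scales([10.0, 0.0, 0.0])  # second and third scales are degenerate
except ValueError as e:
    print("rejected:", e)
```

Unlike an `assert`, an explicit `raise` like this survives in optimized/release builds.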

---

Thanks for your reply!

I compiled from source following

> compile in release mode (i.e., with optimizations) and run the benchmarks: `cargo bench`

and

> build the Python module: `poetry run maturin develop --features pybindings`

Because I am pretty new to rust, I am not very sure whether I compiled the package in release mode at that moment. Unfortunately, I don't have the original device to test the solution now. If the behavior is reproducible on other platforms in debug mode, I think this may be the main reason.
By the way, I made another test on amd64:
```python
import constriction
import numpy as np

message = np.array([10, 5, 6], dtype=np.int32)
entropy_model = constriction.stream.model.QuantizedLaplace(-20, 20 + 1)

encoder = constriction.stream.queue.RangeEncoder()
encoder.encode(message, entropy_model, np.array([10., 10., 10.]), np.array([10., 0., 0.]))
compressed = encoder.get_compressed()
print(f"compressed representation: {compressed}")
print(f"(in binary: {[bin(word) for word in compressed]})")

decoder = constriction.stream.queue.RangeDecoder(compressed)
decoded = decoder.decode(entropy_model, np.array([10., 10., 10.]), np.array([10., 0., 0.]))
print(decoded)
assert np.all(decoded == message)  # (verifies correctness)
```
the result is

```
compressed representation: [2042752312 556892819]
(in binary: ['0b1111001110000011110110100111000', '0b100001001100011000001010010011'])
[10 5 6]
```
Maybe there is some fallback when the scale parameter is `0.0`?

---

You are right, leakily quantized Laplace distributions with `scale = 0.0` do seem to work in release builds, effectively falling back to a (quantized) delta distribution.
> I compiled from source following
> [...] build the Python module: `poetry run maturin develop --features pybindings`
> Because I am pretty new to rust, I am not very sure whether I compiled the package in release mode at that moment.
This compiles in debug mode (i.e., it will result in a python package that contains more checks and fewer run-time optimizations). This explains the difference in behavior between your two builds.

(For completeness, if you want to compile in release mode, use: `poetry run maturin develop --features pybindings --release`; I updated the readme accordingly.)
I'm a bit unsure what the conclusion should be. On the one hand, one could interpret a Laplace distribution with scale $b = 0$ as a delta distribution, which would make it a legitimate (if degenerate) entropy model. On the other hand, there are two reasons against allowing $b = 0$:
- parity with the rust API: while all provided precompiled python modules are compiled in release mode, users of the rust API will probably develop in debug mode, and there the `probability` crate (which is not under my control) forbids $b = 0$ (and I think that's the right thing to do for a general purpose probability crate).
- There's an edge case where quantizing a delta distribution is not well-defined: when the delta peak is right at the boundary between two bins. In this case, when the leaky quantizer integrates over one of the two adjacent bins, it's not clear whether the peak should be counted towards the probability mass. As far as I can tell, the way the Laplace distribution is implemented in the `probability` crate, it should evaluate `0.0 / 0.0` for this case on this line, which should result in `NaN` for both debug and release builds. Weirdly, I can't manage to trigger this issue in practice, so it seems like it's not an issue, but I want to understand why it's not an issue before relying on it.
[update (2023-11-26): in this edge case, the delta peak is counted entirely towards the right bin. The `NaN` result from `Laplace::distribution` is cast into an integer here and here, which results in `0` as per the rust reference, so that the cdf still evaluates to zero at the delta peak. It still feels dangerous to rely on this behavior, especially if the easier solution is to just add a tiny amount of regularization to $b$.]
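The `0.0 / 0.0` edge case can be reproduced with a plain-numpy sketch of the textbook Laplace cdf (an illustration mirroring the formula, not the `probability` crate's actual implementation):

```python
import numpy as np

def laplace_cdf(x, mu, b):
    # Textbook Laplace cdf. For b = 0 and x == mu this evaluates 0.0 / 0.0,
    # which yields NaN, mirroring the edge case discussed above.
    x, mu, b = np.float64(x), np.float64(mu), np.float64(b)
    with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
        z = (x - mu) / b
        return np.where(x < mu, 0.5 * np.exp(z), 1.0 - 0.5 * np.exp(-z))

print(laplace_cdf(0.0, 0.5, 0.0))  # 0.0: all mass lies to the right of x = 0.0
print(laplace_cdf(1.0, 0.5, 0.0))  # 1.0: all mass lies to the left of x = 1.0
print(laplace_cdf(0.5, 0.5, 0.0))  # nan: the delta peak sits exactly at x
```

So away from the peak, the degenerate cdf is still well-behaved; only a query exactly at the peak (i.e., a delta peak exactly on a bin boundary) produces the `NaN` whose integer cast release builds then happen to tolerate.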
## Request for feedback

To help me decide, could you give me some more background on why you tested with a Laplace distribution with scale `b = 0.0`?

---

Thanks for your explanations! I am doing some research on Cool-Chic and encountered the case where the scale parameter of the Laplace distribution is exactly 0. Maybe you could add a `strict=true` option in `constriction.stream.model.QuantizedLaplace` to enforce the check that `b > 0.0`, and a `strict=false` option for a more flexible interpretation.

---

Thanks for the background information. I like the idea about a `strict=true` argument to `QuantizedLaplace`, but I decided against it for the following reasons (this list is not meant as criticism, it's just to document and communicate my reasoning):

- I think it would add unnecessary complexity to the python API where a simpler solution exists (see below);
- it would still break the promise that the python API is a subset of the rust API, and would thus prevent users from porting python prototypes to exactly equivalent rust, at least for rust debug builds; and
- it relies on somewhat brittle behavior that might break, e.g., with future versions of the `probability` crate (which are out of my control).
## Proposed Solution

The problem can be avoided by adding a tiny amount of regularization (e.g., `1e-16`) to the `scale` parameter. I added this recommendation to the python API documentation (e.g., here), and I added unit tests to assert that quantizing continuous distributions with even much smaller scales behaves as expected, without any numerical issues.
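The recommendation above can be sketched as follows (`regularize_scales` is an illustrative helper, not part of constriction's API; the encoder call it would feed into is shown only as a comment):

```python
import numpy as np

def regularize_scales(scales, eps=1e-16):
    # Clamp every scale parameter to at least `eps`, so that a predicted
    # scale of exactly 0 becomes a tiny (but valid) positive value.
    return np.maximum(np.asarray(scales, dtype=np.float64), eps)

scales = np.array([10.0, 0.0, 0.0])   # second and third scales are degenerate
safe_scales = regularize_scales(scales)
assert np.all(safe_scales > 0.0)

# The regularized scales can then be passed to the entropy coder, e.g.:
# encoder.encode(message, entropy_model, means, safe_scales)
```

With scales this small, essentially all probability mass of the quantized Laplace still lands in a single bin, so the regularization does not measurably hurt compression performance.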
## Regarding Precompiled Packages for arm

Unrelated to the specific issue reported here, please let me know if you'd be willing to test a precompiled python package for your computer architecture, so that I can upload official `pip` packages to pypi going forward. This would also make things easier for other people with your compute architecture. I'd need to know your host triple to do this (i.e., what `rustc --version --verbose` says after "host").

---
