Comments (8)
I thought about this a little when I was working out how to get binary dependencies for a Rust project. In the end, I decided that the build script should try the following, in order (note: this was in Python, before Cargo had build scripts):
- Check a standard drop location to see if the necessary files are already present (local override).
- Run any system-specific locators that might help (pkg-config on *nix, shrug and give up on Windows).
- Try to download a pre-compiled binary from the official website, for the current platform, to a reasonable cache location.
- Try to check-out the source from the official repository, cross its fingers, and hope the user has the necessary software to build it (probably after prompting them).
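The first two steps of that list can be sketched in Rust roughly as follows. This is only an illustration under stated assumptions, not the original Python script: the names `Located`, `local_override`, `pkg_config_libs`, and `locate` are hypothetical, and the download and build-from-source steps are left out.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Where a usable copy of the library was found (hypothetical type).
#[derive(Debug, PartialEq)]
enum Located {
    /// Found in the standard drop location (local override).
    LocalOverride(PathBuf),
    /// Found by a system locator; holds the linker flags it reported.
    System(String),
    /// Neither of the first two steps succeeded.
    NotFound,
}

/// Step 1: check a drop directory for a file that is already present.
fn local_override(drop_dir: &Path, file_name: &str) -> Option<PathBuf> {
    let candidate = drop_dir.join(file_name);
    if candidate.is_file() {
        Some(candidate)
    } else {
        None
    }
}

/// Step 2: ask pkg-config for linker flags; give up on Windows.
fn pkg_config_libs(name: &str) -> Option<String> {
    if cfg!(windows) {
        return None;
    }
    // If pkg-config is not installed, `output()` fails and we return None.
    let output = Command::new("pkg-config")
        .arg("--libs")
        .arg(name)
        .output()
        .ok()?;
    if output.status.success() {
        Some(String::from_utf8_lossy(&output.stdout).trim().to_string())
    } else {
        None
    }
}

/// Try each step in order and stop at the first one that succeeds.
fn locate(name: &str, drop_dir: &Path, file_name: &str) -> Located {
    if let Some(path) = local_override(drop_dir, file_name) {
        return Located::LocalOverride(path);
    }
    if let Some(flags) = pkg_config_libs(name) {
        return Located::System(flags);
    }
    // Steps 3 and 4 (download a pre-built binary, build from source)
    // would go here; they are omitted from this sketch.
    Located::NotFound
}
```

Returning an enum rather than a bare path keeps track of which step succeeded, which would matter when deciding whether to warn the user that a system copy is being ignored.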
I've always felt that just compiling the source is dicey as Windows doesn't have a C compiler by default. Since Rust no longer depends on GCC, you can't even assume that is present on Windows. Besides which, it basically ignores any version installed on the system, which might cause surprising behaviour ("but, I updated libsplang on my system to close the security vulnerability; how'd I get exploited?!", or "why can't prog-a and prog-b share files? They're both using libsplang!").
It might be worth having a standard sysdep package that abstracts all this, so it doesn't have to be re-engineered for every project.
from crates.io.
Another possible route here would be to compress with xz or bzip2. For me it shaves 10MB off the size of the cld2 directory packed up. In general though @steveklabnik was right on reddit in that we don't want to let this get out of control too fast.
@DanielKeep: I'd use system packages for cld2, but it's not a very widely-packaged library. Plus, I need a build solution for Heroku, where I have no control over the installed libraries.
lifthrasiir has just sent me emk/rust-cld2#1, which removes cld2's documentation, deletes some unused data tables, and strips comments from the source code (which substantially boosts compression performance). This gets the rust-cld2 crate under 10MB, at least for this version, though the recent update to the upstream project may break it.
Is there any way to run a custom script during the packaging process? If not, maybe I need to fork cld2 and produce a stripped down git repo. Or cache tarballs on S3, but I'm trying to avoid that.
I'd love to find a good solution here.
@alexcrichton If a crate has data whose inherent entropy exceeds 10MB, we are left with no choice but workarounds.
In the particular case of cld2, the main source of excess entropy is a comment (with UTF-8-encoded words for each entry), and removing comments really helps. But the table itself already exceeds 10MB, and no common general-purpose compressor can easily pack it. (My estimate is that the actual entropy is some 7 or 8MB, as about 40% of the data can be somewhat correlated with other data. But that correlation wouldn't be very easy to infer.)
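For readers who want a rough feel for this kind of estimate, here is a minimal sketch of an order-0 (per-byte) Shannon entropy calculation in Rust. It is not the method used for the 7 or 8MB figure above, and the function names are hypothetical. Because it treats every byte as independent, it ignores exactly the cross-entry correlation mentioned above, so it would tend to overestimate what a smarter model could reach.

```rust
/// Order-0 Shannon entropy of `data`, in bits per byte.
/// Each byte is treated as an independent symbol.
fn entropy_bits_per_byte(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Count how often each byte value occurs.
    let mut counts = [0u64; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let n = data.len() as f64;
    // Sum -p * log2(p) over the byte values that actually occur.
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Rough compressed size in bytes for a coder that models each
/// byte independently (entropy per byte times length, over 8 bits).
fn order0_size_estimate(data: &[u8]) -> f64 {
    entropy_bits_per_byte(data) * data.len() as f64 / 8.0
}
```

Running `order0_size_estimate` over a table file before and after stripping comments would give a crude indication of how much of the size is comment text, though a real compressor such as xz exploits repeated sequences and will usually do better than this per-byte figure.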
@lifthrasiir we've got to draw the line somewhere in terms of package upload size, or otherwise it'll get out of hand. Some crates will always fall on the other side of the line (and this one may, for example).
Yeah, I can see there's an obvious tension between:
- Wanting reproducible builds coming entirely from inside crates.io.
- Keeping crate sizes reasonable.
- Packaging libraries according to the *-sys convention (and therefore being able to easily deploy them to Heroku, etc).
cld2 is a very interesting case, because it legitimately needs large data tables to do its job, and the official version is an unpackaged Subversion repository. On the other hand, it's a pretty useful library and I have some server-side Rails projects that use it quite successfully in production.
Then there are the semi-evil solutions, including breaking cld2 up into multiple packages by language detected, or some such. I'm going to try to figure out how these tables fit together, and see if I can find a clever solution.
Using @lifthrasiir's well-researched patch as a starting point, I've created a new git mirror of the upstream cld2 repository, stripped the comments as proposed, and built an exclude list in my Cargo.toml file. With all these tweaks, the cld2-sys package is now down to 6.5MB.
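For anyone unfamiliar with the mechanism, an exclude list is the exclude key in the [package] section of Cargo.toml, which takes a list of glob patterns for files to leave out when the crate is packaged. The fragment below is only a sketch: the paths are hypothetical placeholders, not the actual entries used in cld2-sys.

```toml
[package]
name = "cld2-sys"
# ... other fields omitted ...

# Hypothetical patterns, for illustration only: leave documentation and
# unused data tables out of the packaged crate.
exclude = [
    "cld2/docs/*",
    "cld2/internal/some_unused_table.cc",
]
```

Running cargo package --list shows which files would end up in the packaged crate, which is a quick way to check that the patterns match what was intended.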
There are a bunch of table files which aren't getting included in the current build, and I'll need to look into those later. So maybe we'll see this problem again in the future.
But at least for now, for this one package, we appear to have a workable solution. Thank you to everybody who helped out, especially to @lifthrasiir for figuring out how to cut down the package size.
With the change I just merged, just contact me over IRC/email/whatnot and I can raise the limit for crates individually.