Comments (8)
I was going to say go with BSON, but then I saw your message on Slack... Maybe read/write is still faster though.
I think zipping the JSON would be perfect, although I don't know how parsing performance (of default libraries) compares to XML.
from ace.jl.
The route that @albapa and I have taken is to write a "fat" file with everything in it, even the original training data, i.e. enough to rerun the training with a future version of the code. The file is structured so that it can be transformed, simply by removing lines, into a "thin" format that holds just enough to evaluate the potential, possibly with restricted versions of the code (in our case, versions newer than the one that wrote the file, but you could be even thinner than that).
We write the fat version by default, because users often don't mind large files and it helps debugging. If a utility is provided to transform fat files into thin files, users don't need to carry around large files when that is a problem. Developers who create a huge number of potential files in a short space of time during development will know how to switch on the thin writer.
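To make the fat-to-thin idea concrete, here is a minimal sketch in Python using a JSON-style dictionary. It is not the format described above: the key names (`format_version`, `code_version`, `model`, `training_data`, `fit_options`) and the helper `to_thin` are hypothetical, chosen only to show that a thin file can be a strict subset of a fat one.

```python
import json

# Hypothetical top-level keys; the thread does not specify the real layout.
# Only these entries are assumed to be needed to evaluate the potential.
THIN_KEYS = {"format_version", "code_version", "model"}


def to_thin(fat: dict) -> dict:
    """Return a copy of `fat` that keeps only the evaluation-time entries."""
    return {key: value for key, value in fat.items() if key in THIN_KEYS}


# A toy "fat" record: everything needed to rerun the fit, including training data.
fat = {
    "format_version": 1,
    "code_version": "0.0.0",  # placeholder value, not a real release
    "model": {"coefficients": [0.5, -1.25, 3.0]},
    "training_data": [{"positions": [[0.0, 0.0, 0.0]], "energy": -1.0}],
    "fit_options": {"solver": "lsq"},
}

thin = to_thin(fat)
thin_text = json.dumps(thin, indent=2)
```

Because the thin record is obtained purely by dropping entries, a reader that understands the fat layout can load the thin one without a separate code path, as long as it treats the dropped entries as optional.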
Ok so that sounds like some form of mixed thin/fat format would be ideal.
We used to use a binary format, which was very fast but a pain in all other respects. Then we went for XML, with CDATA lines for the metadata, i.e. training configurations and command-line options. I think this was a very good choice, for the reasons above. We also have the option of companion files to store large numbers of reals, which are slow and cumbersome to read from XML; these are read in C. The training code has an option to omit the training data, which is useful for explorations and quick tests; for distribution we use the full version.
I guess today we would use a JSON file.
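The companion-file approach mentioned above could look roughly like the following Python sketch: a small JSON header carries the metadata and points to a raw binary file that holds the reals. The file names (`potential.json`, `coefficients.bin`), the header fields and both helper functions are assumptions made for illustration, not the layout used by any of the codes discussed here. The binary file is written in the machine's native byte order, so a real format would also need to record or fix the endianness.

```python
import json
import os
import tempfile
from array import array


def write_with_companion(directory, meta, values):
    """Write a JSON header plus a raw float64 companion file (hypothetical layout)."""
    data_name = "coefficients.bin"  # hypothetical companion file name
    with open(os.path.join(directory, data_name), "wb") as fh:
        array("d", values).tofile(fh)  # native byte order, 8 bytes per value
    header = dict(meta, data_file=data_name, count=len(values), dtype="float64")
    with open(os.path.join(directory, "potential.json"), "w") as fh:
        json.dump(header, fh)


def read_with_companion(directory):
    """Read the JSON header, then load the reals it points to."""
    with open(os.path.join(directory, "potential.json")) as fh:
        header = json.load(fh)
    values = array("d")
    with open(os.path.join(directory, header["data_file"]), "rb") as fh:
        values.fromfile(fh, header["count"])
    return header, list(values)


with tempfile.TemporaryDirectory() as tmp:
    write_with_companion(tmp, {"name": "demo"}, [0.1, 0.2, 0.3])
    header, values = read_with_companion(tmp)
```

The header stays human-readable and cheap to parse, while the bulk data bypasses the text parser entirely.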
So far I've stored huge amounts of reals in a separate HDF5 file. So similar to your approach.
What's your view on JSON (XML) compressed as zip as needed?
I think that's where I'm going then. Julia has very nice zip format integration via ZipFile.jl.
I'm going to close this. Zipped JSON files turn out to be easy to manage in Julia and give exactly the level of flexibility we need.
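For readers who want to see the zipped-JSON idea in code, here is a minimal round-trip sketch. It uses Python's standard `json` and `zipfile` modules purely as an illustration; it is not the ZipFile.jl-based implementation referred to above, and the member name `potential.json` and both helper functions are hypothetical.

```python
import io
import json
import zipfile


def save_zipped_json(obj, target, member="potential.json"):
    """Serialise `obj` to JSON and store it as one deflate-compressed zip member."""
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member, json.dumps(obj))


def load_zipped_json(source, member="potential.json"):
    """Read one member from the zip archive and parse it as JSON."""
    with zipfile.ZipFile(source) as zf:
        return json.loads(zf.read(member).decode("utf-8"))


# An in-memory buffer stands in for a file on disk; a path string works the same way.
buffer = io.BytesIO()
params = {"coefficients": [0.001 * i for i in range(1000)]}
save_zipped_json(params, buffer)
restored = load_zipped_json(buffer)
```

Since the archive member is ordinary JSON, the file can still be inspected with any unzip tool and a text editor, which keeps the debugging convenience of a text format.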