Comments (12)
@johnmyleswhite I guess https://github.com/JuliaStats/DataFrames.jl/blob/master/src/RDA.jl gives us a lot of what would be needed to get started.
from feather.
This seems to be becoming a hot topic over in Julia land: see https://groups.google.com/forum/#!topic/julia-dev/qRtkiBAbM9w and https://groups.google.com/forum/#!topic/julia-dev/pFJS4oxKAFE and https://github.com/JuliaStats/Feather.jl from @dmbates
from feather.
Heh, "heavily templated" is not how I would describe libfeather. I guess I should take a little time and write a C wrapper
from feather.
@wesm Douglas' comment, not mine! Anyway, that would be greatly appreciated! Julians love to party also, and to be able to join a party hosted by Hadley and you would be fantastic! 🤓
from feather.
I've been playing with @dmbates's initial implementation and also considering some of the discussion around the desire to keep all the code in this repo. I think we have a few different options given the current capabilities of Julia's Pkg manager:
- It's not currently possible to (easily) have a Julia package live in a directory of another repo; this is more an artifact of Julia's heavy reliance on git than any design decision (i.e. it's non-trivial to git clone only a specific directory)
- Option 1: continue with @dmbates's Feather.jl stand-alone package and create a git submodule in this repo that points to it; advantages: utilizes Julia's current package setup/system; disadvantages: we'd have to pull updates over here
- Option 2: have the "master" code live here in a julia directory, and every so often we could "release" the code over to a Feather.jl mirror repo that Julia users would install from.
I think Option 2 wouldn't be too much of a hassle, as we already manage this "release" process for Julia packages anyway; it would just be one extra step of copying feather/julia/Feather => Feather.jl for the release.
CC: @dmbates: thoughts?
from feather.
@quinnj Having the master code live here is fine with me. Right now the dmbates/Feather.jl repository has one functional branch (dmb/cxx) and a non-functional master branch. I would prefer to follow the Cxx route if we can assume that Keno's Cxx package will be registered in time for v0.5 of Julia. With a functioning Cxx package this is the easiest approach.
Another priority I would have is to have Feather.Reader inherit from AbstractDataFrame and write a ModelFrame method for it. The columns in the Reader are memory-mapped. If you create a DataFrame from it you risk losing the file handle if the original Reader is garbage-collected. To be safe, creation of a DataFrame should do a deep copy of the columns. But then selected columns will be copied again to create a ModelFrame. I would prefer not to copy twice.
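To make the copy concern above concrete, here is a minimal sketch of the difference between a memory-mapped column and an owned copy. It is not Feather.jl code: it assumes a recent Julia (1.x) with the Mmap standard library rather than the v0.5 API discussed in this thread, and the temporary file and the names mapped and owned are purely illustrative.

```julia
using Mmap

# Write three Float64 values to a temporary file to stand in for a column on disk.
path, io = mktemp()
write(io, Float64[1.0, 2.0, 3.0])
close(io)

# Map the file into memory: `mapped` is backed by the file, not by Julia's heap.
mapped = open(path) do f
    Mmap.mmap(f, Vector{Float64}, (3,))
end

# One explicit copy yields an ordinary heap-allocated vector that no longer
# depends on the file or on whatever object created the mapping.
owned = copy(mapped)

println(owned)
```

Each such copy costs a full pass over the column, which is why copying once for a DataFrame and again for a ModelFrame is worth avoiding.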
from feather.
Maybe a link to https://github.com/JuliaStats/Feather.jl should be added to README.md?
from feather.
#234
Pull request for a change to README.md adding a link to Feather.jl
from feather.
Just confirming that it's not possible to have the Feather Julia implementation in this repo? I understand there is some design limitation in the Julia package manager that prevents this?
from feather.
That's correct: currently Julia packages need to live in a .julia/v0.5/Feather path. It is possible for a user to manually specify another path to be checked for modules, but each user would have to add this to their launch script.
A new major revision of Julia's package manager is planned for the 0.6 release cycle, and I've advocated for this functionality, so it may not be too far away. In that case, I'd be happy to bring the Feather.jl code over here. In the meantime, I do try to watch PRs here pretty carefully and track the new release functionality.
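For reference, the manual workaround mentioned above would look roughly like the following in a Julia 0.5-era launch script. This is only a sketch: the checkout location is a hypothetical example, and it assumes the package sits at feather/julia/Feather/src/Feather.jl inside that checkout.

```julia
# ~/.juliarc.jl (the per-user launch script in the Julia 0.5 era)
# The checkout location below is hypothetical; adjust it to wherever
# the feather repository was cloned.
push!(LOAD_PATH, joinpath(homedir(), "src", "feather", "julia"))
```

With that entry in place, `using Feather` would search the added directory as well as the default package directory, but every user has to repeat the step on their own machine.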
from feather.
What's the state of the art in wrapping C++ libraries in Julia? Would be interesting to start Arrow bindings for Julia cc @SylvainCorlay
from feather.
- @wesm For the Julia bindings of xtensor, we currently use the CxxWrap project by @barche. CxxWrap is to Julia what boost.python and pybind are to Python, in that it uses the C API of the Julia interpreter to define classes, methods, etc.
- Regarding the requirement for Julia packages to live in one repo, I wanted to add that most of the time, Julia packages vendoring assets from other projects will in fact fetch them dynamically in the build.jl. For example, Xtensor.jl fetches the headers from xtensor at build time for the package.
So I guess that the Julia bindings Arrow.jl would be a virtually empty repository merely containing a build.jl in /deps fetching resources from other places.
Note that we should add a flag to disable the vendoring of third-party assets for the case where we want to use an independent package for it, such as a system-wide install or a conda package. This will be important to Debian package maintainers when they make a package for Arrow.jl.
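A rough sketch of what such a build script with an opt-out flag could look like follows. Everything specific in it is an assumption made for illustration: the environment variable ARROW_JL_USE_SYSTEM, the placeholder URL, and the file names are hypothetical and do not come from any existing package.

```julia
# deps/build.jl (sketch; the flag name, URL and file names are hypothetical)

# Opt-out switch for packagers who supply the assets themselves
# (system-wide install, conda package, Debian package).
const USE_SYSTEM = get(ENV, "ARROW_JL_USE_SYSTEM", "0") == "1"

# Placeholder location of the upstream asset; not a real download URL.
const ASSET_URL = "https://example.org/arrow/arrow-headers.tar.gz"
const ASSET_DEST = joinpath(@__DIR__, "arrow-headers.tar.gz")

if USE_SYSTEM
    # Vendoring disabled: rely on an independently installed copy.
    println("ARROW_JL_USE_SYSTEM=1: skipping download")
elseif isfile(ASSET_DEST)
    # Nothing to do if an earlier build already fetched the asset.
    println("asset already present at ", ASSET_DEST)
else
    # Default behaviour: fetch the asset at build time.
    download(ASSET_URL, ASSET_DEST)
end
```

Reading the switch from the environment keeps the default behaviour unchanged for ordinary users while letting a distribution build set it once in its packaging recipe.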
from feather.