Invite Julia to the party? (feather, CLOSED)

wesm commented on June 25, 2024
Invite Julia to the party?

from feather.

Comments (12)

wesm commented on June 25, 2024

@johnmyleswhite I guess https://github.com/JuliaStats/DataFrames.jl/blob/master/src/RDA.jl gives us a lot of what would be needed to get started.

ScottPJones commented on June 25, 2024

This seems to be becoming a hot topic over in Julia land: see https://groups.google.com/forum/#!topic/julia-dev/qRtkiBAbM9w and https://groups.google.com/forum/#!topic/julia-dev/pFJS4oxKAFE
and https://github.com/JuliaStats/Feather.jl
from @dmbates

wesm commented on June 25, 2024

Heh, "heavily templated" is not how I would describe libfeather. I guess I should take a little time and write a C wrapper.

ScottPJones commented on June 25, 2024

@wesm Douglas' comment, not mine! Anyway, that would be greatly appreciated! Julians love to party also, and to be able to join a party hosted by Hadley and you would be fantastic! 🤓

quinnj commented on June 25, 2024

I've been playing with @dmbates's initial implementation and also considering some of the discussion around the desire to keep all the code in this repo. I think we have a few different options given the current capabilities of Julia's package manager:

  • It's not currently possible to (easily) have a Julia package live in a directory of another repo; this is more an artifact of Julia's heavy reliance on git than any design decision (i.e. it's non-trivial to git clone only a specific directory)
  • Option 1: continue with @dmbates's Feather.jl stand-alone package and create a git submodule in this repo that points to it; advantages: utilizes Julia's current package setup/system; disadvantages: we'd have to pull updates over here
  • Option 2: Have the "master" code live here in a julia directory, and every so often we could "release" the code over to a Feather.jl mirror repo that would be used for allowing Julia users to install.

I think Option 2 wouldn't be too much of a hassle, since we already manage this "release" process for Julia packages anyway; it would just add one step, a feather/julia/Feather => Feather.jl copy/paste, to each release.

CC: @dmbates: thoughts?

dmbates commented on June 25, 2024

@quinnj Having the master code live here is fine with me. Right now the dmbates/Feather.jl repository has one functional branch (dmb/cxx) and a non-functional master branch. I would prefer to follow the Cxx route if we can assume that Keno's Cxx package will be registered in time for v0.5 of Julia. With a functioning Cxx package this is the easiest approach.

Another priority I would have is to have Feather.Reader inherit from AbstractDataFrame and write a ModelFrame method for it. The columns in the Reader are memory-mapped. If you create a DataFrame from it, you risk losing the file handle if the original Reader is garbage-collected. To be safe, creating a DataFrame should do a deep copy of the columns. But then selected columns will be copied again to create a ModelFrame. I would prefer not to copy twice.
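
The copy-versus-view trade-off described above is not specific to Julia. A minimal Python sketch, using only the standard library `mmap` module, shows why copying out of a mapping is the safe option. This is not Feather.jl or libfeather code, and `read_column_copy` and `demo` are hypothetical names chosen for illustration:

```python
import mmap
import os
import tempfile


def read_column_copy(path: str) -> bytes:
    """Return the file's contents as an independent copy of the mapped data."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing an mmap object returns a new bytes object, i.e. a copy.
            # The copy stays valid after the mapping and file are closed;
            # the mapping itself would not be usable at that point.
            return mapped[:]


def demo() -> bytes:
    """Write a tiny stand-in 'column' to a temp file and read it back."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x01\x02\x03\x04")
        return read_column_copy(path)
    finally:
        os.remove(path)
```

The cost is the one noted above: every consumer that wants the same safety pays for another copy, which is what makes an owner that keeps the mapping alive attractive.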

femtotrader commented on June 25, 2024

Maybe a link to https://github.com/JuliaStats/Feather.jl should be added to README.md?

phillc73 commented on June 25, 2024

#234
Pull request for a change to README.md adding a link to Feather.jl

wesm commented on June 25, 2024

Just confirming that it's not possible to have the Feather Julia implementation in this repo? I understand there is some design limitation in the Julia package manager that prevents this?

quinnj commented on June 25, 2024

That's correct; currently Julia packages need to live in a .julia/v0.5/Feather path. It is possible for a user to manually specify another path to be checked for modules, but each user would have to add this to their launch script.

A new major revision of Julia's package manager is planned for the 0.6 release cycle, and I've advocated for this functionality, so it may not be too far away. In that case, I'd be happy to bring the Feather.jl code over here. In the meantime, I do try to watch PRs here pretty carefully and track the new release functionality.

wesm commented on June 25, 2024

What's the state of the art in wrapping C++ libraries in Julia? It would be interesting to start Arrow bindings for Julia. cc @SylvainCorlay

SylvainCorlay commented on June 25, 2024

  1. @wesm For the julia bindings of xtensor, we currently use the CxxWrap project by @barche

    CxxWrap is to Julia what boost.python and pybind are to Python, in that it uses the C API of the Julia interpreter to define classes, methods, etc.

  2. Regarding the requirement for Julia packages to live in one repo, I wanted to add that most of the time, Julia packages vendoring assets from other projects will in fact fetch them dynamically in the build.jl. For example, Xtensor.jl fetches the headers from xtensor at build time for the package.

    So I guess that the Julia bindings Arrow.jl would be a virtually empty repository merely containing a build.jl in /deps fetching resources from other places.

    Note that we should add a flag to disable the vendoring of third-party assets for the case where we want to use an independent package for them, such as a system-wide install or a conda package. This will be important to Debian package maintainers when they make a package for Arrow.jl.
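
The opt-out flag suggested above could look roughly like the following. This is a hypothetical Python sketch of the decision logic only; a real `deps/build.jl` would be written in Julia. The `ARROW_JL_USE_SYSTEM_LIBS` variable and the function names are assumptions made for illustration, not an existing Arrow.jl or Xtensor.jl option:

```python
import os

# Hypothetical environment variable name; not an existing option anywhere.
FLAG = "ARROW_JL_USE_SYSTEM_LIBS"


def should_vendor(env=None) -> bool:
    """Return True when the build script should download third-party assets.

    Vendoring is the default; setting the flag to a truthy value opts out so
    that a system-wide install or a conda package is used instead.
    """
    env = os.environ if env is None else env
    value = env.get(FLAG, "").strip().lower()
    return value not in {"1", "true", "yes", "on"}


def plan_build(env=None) -> str:
    """Name the build strategy: 'fetch' (vendor assets) or 'system'."""
    return "fetch" if should_vendor(env) else "system"
```

A distribution packager would then export the flag before building, so that no network access happens during the build.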
