Apache Beam Dataframe and SQL support

alxmrs commented on August 21, 2024
Apache Beam Dataframe and SQL support

from fugue.

Comments (3)

goodwanghan commented on August 21, 2024

Thanks for the suggestion. We may consider Flink in the future; however, I am not sure about Beam.

I am curious: have you seen Beam perform well compared to native Flink and Spark? I personally haven't had much positive experience with Beam, and its streaming-first philosophy may be the fundamental problem, in my opinion.

I'd love to hear a different opinion from you.

Thanks!

alxmrs commented on August 21, 2024

tvalentyn commented on August 21, 2024

+1, this feature request would be interesting to explore.

> And the streaming first philosophy may be the fundamental problem in my opinion.

I wouldn't describe Beam as streaming-first. Beam has used a unified model from its inception, one that allows expressing both batch and streaming pipelines. However, Dataframe and SQL support was added to Beam relatively recently, and gaps in these API surfaces might be the main blockers for a successful integration.

> I am curious have you seen beam performing well compared to native Flink and Spark?

It depends on the use case, but here is an example: https://medium.com/google-cloud/yahoo-benchmarks-dataflow-vs-b189c809ff49.

Beam has a Dataframe API available in the Python SDK (beam.apache.org/documentation/dsls/dataframes/overview) and a SqlTransform (https://beam.apache.org/documentation/dsls/sql/overview/) in the Java SDK, which is also available in the Python SDK via the cross-language framework. It sounds like those APIs could be used to integrate with Fugue.

I expect there will be gaps in Beam's Dataframe implementation but we'd have more information if we identified them.

I imagine Fugue has a test suite that validates an execution engine: some minimum set of tests that should pass before an engine can be onboarded. It would be interesting to know which tests from such a suite would pass or fail when using Beam.
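
To make that idea concrete, here is a rough sketch of what such a conformance run could look like. Everything in it is hypothetical: `ListEngine`, the check functions, and `run_conformance` are made-up names for illustration, not Fugue's actual test suite or engine interface. The toy engine deliberately lacks a `join` method, so the report shows how a gap in a backend would surface.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


class ListEngine:
    """Hypothetical toy engine backed by plain Python lists."""

    def select(self, rows: List[Row], columns: List[str]) -> List[Row]:
        return [{c: r[c] for c in columns} for r in rows]

    def filter(self, rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
        return [r for r in rows if predicate(r)]

    # No `join` method on purpose: this stands in for an unsupported operation.


def check_select(engine) -> None:
    assert engine.select([{"a": 1, "b": 2}], ["a"]) == [{"a": 1}]


def check_filter(engine) -> None:
    out = engine.filter([{"a": 1}, {"a": 5}], lambda r: r["a"] > 2)
    assert out == [{"a": 5}]


def check_join(engine) -> None:
    out = engine.join([{"k": 1, "x": 10}], [{"k": 1, "y": 20}], on="k")
    assert out == [{"k": 1, "x": 10, "y": 20}]


# The "minimum set of tests" an engine would need to pass (illustrative only).
CHECKS: Dict[str, Callable] = {
    "select": check_select,
    "filter": check_filter,
    "join": check_join,
}


def run_conformance(engine) -> Dict[str, str]:
    """Run every check and record pass/fail instead of stopping at the first error."""
    report: Dict[str, str] = {}
    for name, check in CHECKS.items():
        try:
            check(engine)
            report[name] = "pass"
        except Exception as exc:  # a missing method or a wrong result both count as a gap
            report[name] = f"fail: {type(exc).__name__}"
    return report


report = run_conformance(ListEngine())
for name, status in report.items():
    print(f"{name}: {status}")
```

Running the same checks against a Beam-backed engine would give a per-operation list of what works and what doesn't, which seems like a better starting point than guessing.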
