Comments (4)

chenzl25 commented on May 26, 2024

In short, if implementing the batch Iceberg source takes too long because of its complexity, a Parquet file source with decent performance is good enough to move the POC forward. The user will consider switching to RW only if RW's Iceberg batch source is fast enough.

Must the file format be Parquet? Is it possible to use a CSV file, which our file source already supports? If it is OK to test with a CSV file first, we can support batch read in the file source first and test its performance. BTW, I tested INSERT SELECT from one RisingWave table into another RisingWave table last week. Can we just compare the streaming load from Kafka into a table with INSERT SELECT from one table into another?

from risingwave.

lmatz commented on May 26, 2024

To provide the complete context: the details of the POC user's request can be found at https://www.notion.so/risingwave-labs/optimize-parquet-source-for-batch-load-dc498a043d504621bf56461690b14bd7?d=84ebdf5d7469412680278059c5898be8

In short, if implementing the batch Iceberg source takes too long because of its complexity, a Parquet file source with decent performance is good enough to move the POC forward. The user will consider switching to RW only if RW's Iceberg batch source is fast enough.

lmatz commented on May 26, 2024

Is it possible to use a CSV file, which our file source already supports? If it is OK to test with a CSV file first,

Good point. I think that is OK, since CSV is a less efficient format than Parquet in terms of both read and write performance.
If we achieve decent performance with CSV files, we should be even faster with Parquet. I think that is a convincing argument for the POC user.

BTW, I tested INSERT SELECT from one RisingWave table into another RisingWave table last week. Can we just compare the streaming load from Kafka into a table with INSERT SELECT from one table into another?

I can try to communicate this first. The closer to the user's real use case, the better, but there is nothing wrong with using what we have at the moment. Could you post the link to last week's results? 🙏
I am also thinking of adding this to the daily performance tests.

chenzl25 commented on May 26, 2024

I can try to communicate this first. The closer to the user's real use case, the better, but there is nothing wrong with using what we have at the moment. Could you post the link to last week's results? 🙏 I am also thinking of adding this to the daily performance tests.

#14630 (comment)
