Comments (9)
We are now working on a file sink, which supports multiple kinds of storage systems, including AWS S3 and HDFS. cc @wcy-fdu: can you help link this issue to that one?
from risingwave.
We have received your request and will support HDFS sink in the next two to three releases.
Hi @zhanglistar, glad to see you're interested in RisingWave. Sinking to the file system is on our roadmap. Could you please elaborate on your requirements for the HDFS sink? For example, the sink file format/type, whether the files need to be batched, etc.
@wcy-fdu @fuyufjh Thanks for your reply. The background is that we want to replace Apache Flink with RisingWave to lower costs. There are several types of jobs running on Flink:
- ETL: sinks to Hive with data on HDFS; the file format is Parquet, and writes need to be batched.
- Java DataStream API: this part is hard. We talked with @yingjunwu, and there is no plan for it.
- Flink SQL: this part is the simplest. We need to find out how many resources RisingWave can save.
Thanks. If you need more information, just tell me. We can also contribute to the community if this is worth doing.
> need to be batched.
What are the criteria for batching: by number of rows, or by seconds?
> > need to be batched.
> What are the criteria for batching: by number of rows, or by seconds?
By seconds.
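To make "batching by seconds" concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not RisingWave's sink implementation: `TimeBatcher`, `interval_s`, `flush_fn`, and `clock` are hypothetical names, and writing the Parquet file is left to whatever callback is passed in.

```python
import time
from typing import Any, Callable, List, Optional


class TimeBatcher:
    """Buffer rows and pass them to `flush_fn` once `interval_s` seconds have elapsed.

    Hypothetical helper for illustration; not part of any RisingWave API.
    """

    def __init__(
        self,
        interval_s: float,
        flush_fn: Callable[[List[Any]], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.interval_s = interval_s
        self.flush_fn = flush_fn  # e.g. a function that writes one file per batch
        self.clock = clock        # injectable so the behavior can be tested
        self.buffer: List[Any] = []
        self.batch_started_at: Optional[float] = None

    def add(self, row: Any) -> None:
        now = self.clock()
        if self.batch_started_at is None:
            # The first row of a new batch starts the timer.
            self.batch_started_at = now
        self.buffer.append(row)
        if now - self.batch_started_at >= self.interval_s:
            self.flush()

    def flush(self) -> None:
        # Emit the current batch (if any) and start a fresh one.
        if self.buffer:
            self.flush_fn(self.buffer)
        self.buffer = []
        self.batch_started_at = None
```

This sketch only checks the clock when a row arrives, so an idle stream would hold its last batch until `flush()` is called explicitly. A real sink would presumably also flush on a timer or at checkpoint boundaries.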
@wcy-fdu Do you plan to support an Apache Hive sink in the next two to three releases?
We have no plans for a Hive sink now, but the HDFS sink will be there. Contributions welcome.
@wcy-fdu Looking forward to the HDFS sink. Thanks a lot. We may add a Hive sink later.