Comments (4)
It is doable. Just to confirm:
- Is your Avro schema definition stored in Confluent Schema Registry (or one compatible with it, like Redpanda or Upstash)?
- Are you going to use (a) a subset of output columns as an Avro-encoded Kafka key, (b) a single simple output column as a string/text Kafka key, or (c) no Kafka key and round-robin partitioning?
from risingwave.
Thank you for your quick reply! I might have misunderstood the UPSERT keyword a bit. It didn't really make sense to me, as Kafka is append-only. So I tried using an UPSERT AVRO sink, and it actually appends the message with the same primary key, which is the behavior I want, but it looks more like a PLAIN AVRO to me.
To answer your questions:
- Yes, that is the case. However, it should not be mandatory in either case, PLAIN or UPSERT, should it?
- I actually use an "id" column from my materialized view as the key. It should not be mandatory either, though, should it?
> I might have misunderstood the UPSERT keyword a bit. It didn't really make sense to me, as Kafka is append-only. So I tried using an UPSERT AVRO sink, and it actually appends the message with the same primary key, which is the behavior I want, but it looks more like a PLAIN AVRO to me.
Yes, that is the tricky part. If the Kafka key is aligned with the downstream's pk, it is the upsert format in RisingWave. If the Kafka key is not the downstream's pk and is only used for partitioning, it is the append-only format with a specified key in RisingWave. To Kafka itself, the two look the same.
Besides, Kafka has a "compaction" feature that runs in the broker: if two messages have the same message key, it deletes the earlier one and keeps only the later one. So we want to make sure whether users want to do upsert.
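To make the compaction point concrete, here is a minimal sketch of what a compacted topic converges to. It is a plain in-memory simulation, not Kafka's actual implementation; the `compact` function and the sample records are hypothetical, and tombstones and segment timing are ignored.

```python
from typing import Dict, Iterable, List, Tuple

# A record is (message key, value). This is only a simulation of the idea.
Record = Tuple[str, str]


def compact(log: Iterable[Record]) -> List[Record]:
    """Keep only the latest record for each key, in original offset order."""
    records = list(log)
    latest_offset: Dict[str, int] = {}
    for offset, (key, _value) in enumerate(records):
        # Later offsets overwrite earlier ones, so the last write wins.
        latest_offset[key] = offset
    return [
        record
        for offset, record in enumerate(records)
        if latest_offset[record[0]] == offset
    ]


# Two messages share the key "id-1"; only the later one survives.
log = [("id-1", "a"), ("id-2", "b"), ("id-1", "c")]
compacted = compact(log)
```

If the key were only a partitioning hint rather than the downstream pk, losing `("id-1", "a")` here would be data loss, which is why the distinction matters.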
> - Yes, that is the case. However, it should not be mandatory in either case, PLAIN or UPSERT, should it?
Use of a schema registry is mandatory for Avro unless the schema never evolves. Unlike JSON or Protobuf, to decode an Avro message encoded with one version of a schema into another compatible version, the definitions (avsc) of both versions must be available during decoding.
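For illustration, this is roughly how a Confluent-compatible registry makes the writer's schema discoverable: each message is prefixed with a zero magic byte and a 4-byte big-endian schema ID, followed by the Avro binary payload. The sketch below only handles that framing. The helper names `frame` and `parse` are hypothetical, and the payload bytes are a placeholder, not the output of a real Avro encoder.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0   # first byte of the Confluent wire format
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an Avro-encoded payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def parse(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema ID, Avro payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message too short for the wire-format header")
    magic, schema_id = struct.unpack(">bI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[HEADER_SIZE:]


# Placeholder payload; a real producer would put Avro binary data here.
framed = frame(42, b"\x02")
schema_id, payload = parse(framed)
```

A consumer would then look up the writer schema by `schema_id` in the registry and resolve it against its own reader schema, which is the step that needs both definitions.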
> - I actually use an "id" column from my materialized view as the key. It should not be mandatory either, though, should it?

It is not mandatory for plain, as I mentioned with option (c), round-robin partitioning. It is mandatory for upsert, as @tabVersion explained above.
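As a rough sketch of the difference between a keyed and a keyless message on the producer side: with a key, the partition is derived from a hash of the key, so the same key always lands in the same partition; without a key, messages are spread across partitions. The class below is hypothetical and simplified. `zlib.crc32` is only a stand-in hash, and real Kafka clients use their own hash functions and keyless strategies.

```python
import itertools
import zlib
from typing import Optional


class SketchPartitioner:
    """Toy partitioner: hash keyed messages, round-robin keyless ones."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key: Optional[bytes]) -> int:
        if key is None:
            # No key: rotate through the partitions in order.
            return next(self._round_robin)
        # Keyed: a stable hash keeps equal keys in the same partition.
        return zlib.crc32(key) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
keyless = [partitioner.partition(None) for _ in range(4)]
keyed_first = partitioner.partition(b"id-1")
keyed_second = partitioner.partition(b"id-1")
```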