Hi @abhisheksingh87 - thanks for reaching out.
- exactly-once semantics: in general, sink connectors cannot provide exactly-once behaviour for arbitrary data and/or configuration. What is possible with this sink connector is to write against the sink (MongoDB collection) with upsert semantics, which, combined with a unique attribute found in the Kafka records, gives you idempotent write behaviour. So if you make sure that your data in Kafka exhibits a unique attribute in the first place, you can achieve exactly-once semantics against the sink with a proper configuration (see the `DocumentIdAdder` strategy options in the README).
- retries: there is currently a very simple retry logic based on two config options: `mongodb.max.num.retries` (default 3) and `mongodb.retries.defer.timeout` (default 5000 ms). This means that if MongoDB is down for more than roughly 15 seconds (the time taken by the 3 retries), the retries are exhausted, the sink connector is killed, and it waits for manual intervention. The connector would of course continue its work if MongoDB comes back up during the retries. You can raise both config settings as you see fit. What's currently missing is a better strategy such as exponential backoff; there is an open feature request (#61), so feel free to help enhance this :)
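A hedged sketch of how these sink settings could be combined; the exact property values (in particular the id strategy class) depend on the connector version, so double-check them against the README for your release. The connection URI, collection name, and retry values below are made up for illustration:

```properties
# connection details are illustrative
mongodb.connection.uri=mongodb://localhost:27017/kafkaconnect
mongodb.collection=orders

# take the document _id from a field already present in the record value,
# so redelivered records upsert the same document (idempotent writes)
mongodb.document.id.strategy=at.grahsl.kafka.connect.mongodb.processor.id.strategy.ProvidedInValueStrategy

# retry tuning: 10 attempts, 30 s apart, survives roughly 5 minutes of downtime
mongodb.max.num.retries=10
mongodb.retries.defer.timeout=30000
```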
I currently cannot comment on the general DLQ feature of the Connect framework itself, since I haven't used it so far. Be aware though that it may not be available to you if you are running a Kafka version where it didn't exist yet.
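For reference, the framework-level dead letter queue settings were added in Apache Kafka 2.0 (KIP-298) and go into the connector configuration itself. A minimal sketch (the topic name is made up):

```properties
# tolerate record-level failures instead of failing the task
errors.tolerance=all
# route failed records to a dead letter queue topic
errors.deadletterqueue.topic.name=dlq-mongodb-sink
# record the failure reason in the DLQ record headers
errors.deadletterqueue.context.headers.enable=true
```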
Please let me know if this helps or if you need anything else. thx.
from kafka-connect-mongodb.
Hi @hpgrahsl,
Thanks for your reply. We need some further clarification regarding the connector, as outlined below.
- Does the connector guarantee at-least-once semantics?
- Is it possible to filter events based on a given parameter? We have use cases where we need to filter the write operations to the MongoDB collection based on an attribute in the event message.
- How is offset management handled by the connector?
- Yes, of course. You get at-least-once semantics basically out of the box, without taking any special care with the sink connector configuration. Be advised though that in my experience this is very rarely what you want: as I said, you can ensure idempotent writes by doing key-based upserts, simply by configuring things accordingly, and thereby get exactly-once semantics.
- Neither the Kafka Connect framework itself nor the sink connector is supposed to do filtering on records. While you could achieve this by implementing a custom write model for the sink connector, I would discourage you from doing that. Most likely the better way to go here is to have a stream processor (Kafka Streams or KSQL) take care of the filtering and then let the sink connector process the already-filtered topic.
- There is no explicit offset management done by the sink connector. It relies on what is configured at the framework level, i.e. Connect commits the offsets at (configurable) regular intervals.
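The interplay between at-least-once delivery and key-based upserts can be illustrated with a small self-contained sketch. This is plain Python simulating an upsert into a keyed store, not the actual connector code; the record fields are made up:

```python
def upsert(store, record):
    """Simulate a key-based upsert: the record's unique 'id' field
    acts as the document _id, so rewrites replace the same document."""
    store[record["id"]] = {k: v for k, v in record.items() if k != "id"}

# at-least-once delivery may hand us the same record more than once
delivered = [
    {"id": 42, "status": "PAID"},
    {"id": 42, "status": "PAID"},  # duplicate redelivery
    {"id": 43, "status": "OPEN"},
]

store = {}
for rec in delivered:
    upsert(store, rec)

# duplicates collapse: the sink state looks as if each record arrived exactly once
print(len(store))            # 2
print(store[42]["status"])   # PAID
```

The same reasoning applies to MongoDB replace-with-upsert writes: redelivering a record rewrites the same document, so the observable end state matches exactly-once processing.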
Hope this clarifies your questions! If you plan to keep using my sink connector, I'd be happy to learn about your concrete use case. Ideally you're willing to share it "publicly" as a user voice/testimonial in the README. Let me know :)
@abhisheksingh87 getting back to you about this. If everything is clarified, I'd be happy if you close the issue. THX!