Because group.rb calls Java::kafka::utils::ZkUtils.maybeDele

Consumer starting where it left off instead at very beginning or very end of topic about jruby-kafka HOT 15 CLOSED

joekiller commented on August 18, 2024

Consumer starting where it left off instead at very beginning or very end of topic

from jruby-kafka.

Comments (15)

graphex commented on August 18, 2024

Should be fixed by #17

joekiller commented on August 18, 2024

Maybe I'm misunderstanding, but that call is only made if reset_beginning is true. The offset deletion only succeeds if no other consumers in that group are processing data. Normally you would leave reset_beginning set to false, and the consumer will resume at its group's last offset, or at the smallest available offset if the topic has since purged that offset (for example, because the data got too old).

joekiller commented on August 18, 2024

Sorry, I was confusing my libraries. You shouldn't include the config option reset_beginning for a consumer group unless you want it to reset. If you don't include the option, your consumer group should resume from the offset it last committed to ZooKeeper. My explanation of the offset selection is otherwise still valid.

joekiller commented on August 18, 2024

This is what Kafka's docs say about the setting:

What to do when there is no initial offset in ZooKeeper or if an offset is out of range:

smallest : automatically reset the offset to the smallest offset
largest : automatically reset the offset to the largest offset
anything else: throw exception to the consumer
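
As a rough illustration of the rule quoted above, here is a minimal sketch in plain Ruby. The method `starting_offset` and its parameters are hypothetical names made up for this example; it is not Kafka or jruby-kafka code, only a model of the documented behaviour under the assumption that a committed, in-range offset always takes priority.

```ruby
# Hypothetical model of the auto.offset.reset rule; not Kafka or jruby-kafka code.
# stored:   offset committed for the group, or nil if none exists
# earliest: smallest offset still retained in the partition
# latest:   offset that the next produced message would receive
# policy:   value of auto.offset.reset
def starting_offset(stored, earliest, latest, policy)
  # A committed, in-range offset always wins; the policy is not consulted.
  return stored if stored && stored.between?(earliest, latest)

  case policy
  when 'smallest' then earliest
  when 'largest'  then latest
  else raise ArgumentError, "no valid offset and auto.offset.reset=#{policy.inspect}"
  end
end

puts starting_offset(42, 0, 100, 'smallest')   # committed offset in range
puts starting_offset(nil, 0, 100, 'smallest')  # nothing committed
puts starting_offset(5, 10, 100, 'largest')    # committed offset already purged
```

The point of the sketch is that 'smallest' only matters in the second and third cases; a group with a valid committed offset resumes from it regardless of the setting.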

graphex commented on August 18, 2024

I guess it isn't clear how ZkUtils.maybeDeletePath works. From the code at https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/utils/ZkUtils.scala#L449 it seems like it would succeed whenever it was called if the path existed. It seems like jruby-kafka would call ZkUtils.maybeDeletePath whenever a group is run with @auto_offset_reset == 'smallest', so wouldn't the offsets be deleted if I set auto_offset_reset to 'smallest' and run? If not, what causes maybeDeletePath to fail so that the offsets in ZK are retained so the consumer can start where it left off?
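
To make the concern concrete, here is a minimal sketch in plain Ruby. A Hash stands in for ZooKeeper, and `maybe_delete_path` and `prepare_group` are hypothetical helpers written for this example, not the real ZkUtils or jruby-kafka code. The keys assume the /consumers/<group>/offsets/<topic>/<partition> layout used by the 0.8 high-level consumer. The sketch shows the behaviour one would want: the group's stored state is wiped only when a reset is explicitly requested through reset_beginning, never as a side effect of the reset policy alone.

```ruby
# Hypothetical stand-in for ZooKeeper: a Hash mapping paths to values.
# Neither method below is the real ZkUtils or jruby-kafka code.
def maybe_delete_path(store, path)
  # Remove the node and everything beneath it; do nothing if it is absent.
  store.delete_if { |key, _| key == path || key.start_with?("#{path}/") }
end

def prepare_group(store, group_id, reset_beginning: false)
  # Only wipe the group's state when a reset was explicitly requested.
  maybe_delete_path(store, "/consumers/#{group_id}") if reset_beginning
  store
end

zk = { '/consumers/g1/offsets/events/0' => 1234 }

prepare_group(zk, 'g1')
puts zk.key?('/consumers/g1/offsets/events/0')   # offset survives a normal start

prepare_group(zk, 'g1', reset_beginning: true)
puts zk.key?('/consumers/g1/offsets/events/0')   # offset is gone after a reset
```

If the deletion ran unconditionally instead, every restart would discard the committed offset and the consumer would fall back to whatever auto_offset_reset says.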

graphex commented on August 18, 2024

This appears to be similar to an issue with the Spark Kafka consumer: http://apache-spark-user-list.1001560.n3.nabble.com/spark-streaming-and-the-spark-shell-td3347.html#a3387

Here is the commit that addresses the issue in Spark's consumer that appears to relate to this issue:
apache/spark@c8850a3

Again, auto_offset_reset="smallest" should only cause the consumer to start at "the topic beginning" if there is no offset stored in ZK or the stored offset is out of range. If a consumer goes offline and comes back online with auto_offset_reset="smallest", it should normally start at its last offset, not have the offset deleted.

I want to reference the next-to-last bullet point in the "Known issues in Spark Streaming" section of this:
http://www.michael-noll.com/blog/2014/10/01/kafka-spark-streaming-integration-example-tutorial/

joekiller commented on August 18, 2024

Can you write a test that reproduces the issue? I'm having a hard time replicating it.
In reply to Sean McKibben's comment of Nov 21, 2014, 3:27 PM.

graphex commented on August 18, 2024

I will try to come up with some way to reproduce it, but it is one of those concurrency issues that is difficult to trigger reliably. It only seems to occur during rebalancing or in other cases where multiple threads/processes are part of the same consumer group. But when the group's ZK record does get deleted, it can be pretty disastrous: accidentally starting from the very beginning of a big Kafka topic during a rebalance (which could itself have been triggered by a worker OOMing) can cause a cascading failure of another worker on the topic, causing more rebalancing, more OOMing, and more ZK metadata deletion.

I think this issue explains things most clearly:
https://issues.apache.org/jira/browse/SPARK-2492
Especially the differences between 0.7 and 0.8 with regard to autooffset.reset vs auto.offset.reset vs the console --from-beginning option.

It seems like a best practice to stick to just wrapping the official High-Level Consumer rather than introducing any additional ZK operations in the Ruby code.

joekiller commented on August 18, 2024

Okay, I've finally seen the light here. I'll accept the PR but am going to change it a bit to match how Kafka fixed https://issues.apache.org/jira/browse/KAFKA-1431 (apache/kafka@b866c55).

Will try to do this soon.

joekiller commented on August 18, 2024

Thanks for raising this, @graphex. It should be fixed now.

graphex commented on August 18, 2024

Glad to hear, will be putting it to the test soon! Thank you.

agarwalpranaya commented on August 18, 2024

Thanks Joe for the fix. I am facing the same issue. When is the next release planned with the latest fix?

joekiller commented on August 18, 2024

I pushed out 1.1.0.beta just now.

agarwalpranaya commented on August 18, 2024

Thanks a lot Joe. Is that release from the latest commit in master? https://github.com/joekiller/jruby-kafka/commits/master

joekiller commented on August 18, 2024

I pushed out 1.1.0 as non-beta this afternoon. https://github.com/joekiller/jruby-kafka/releases/tag/v1.1.0 and https://rubygems.org/gems/jruby-kafka/versions/1.1.0-java
