Comments (8)

UnorderedSigh commented on May 26, 2024

On a whim, I tried:

  • Upgrading Reactor:
        implementation("io.projectreactor:reactor-core:3.6.1")
        implementation("io.projectreactor.netty:reactor-netty:1.1.14")
  • Using Java 17 (instead of 21):
        java 17.0.9 2023-10-17 LTS
        Java(TM) SE Runtime Environment (build 17.0.9+11-LTS-201)
        Java HotSpot(TM) 64-Bit Server VM (build 17.0.9+11-LTS-201, mixed mode, sharing)
  • Reorganizing my .on(whatever) calls (rewrite of withGatewayClient):
    private static void withGatewayClient(GatewayDiscordClient gateway) {
        final RestClient restClient = gateway.getRestClient();
        final ApplicationService applicationService = restClient.getApplicationService();
        final long applicationId = restClient.getApplicationId().block();
        final List<ApplicationCommandRequest> request = new ArrayList<>();
        request.add(ApplicationCommandRequest.builder()
            .name("fail").description("causes bot to freeze").dmPermission(false).build());

        applicationService.bulkOverwriteGlobalApplicationCommand(applicationId, request)
            .doOnNext(cmd -> LOGGER.info("registered /fail"))
            .doOnError(e -> {
                    LOGGER.error("could not register /fail", e);
                    System.exit(1);
                })
            .then()
            .block();

        gateway.on(ChatInputInteractionEvent.class, event -> handleChatCommand(event))
            .onErrorResume(e -> {
                    LOGGER.error("error in chat interaction event", e);
                    return Mono.empty();
                }).subscribe();
        
        gateway.on(ButtonInteractionEvent.class, event -> handleButtonInteraction(event))
            .onErrorResume(e -> {
                    LOGGER.error("error in button interaction event", e);
                    return Mono.empty();
                }).subscribe();

        gateway.onDisconnect().block();
    }
  • Switching the timerTaskScheduler to boundedElastic

  • Switching to another computer on another network, with twice the CPUs and twice the memory.

None of that helped.

from discord4j.

UnorderedSigh commented on May 26, 2024

I've added logs at debug level to the PR description.

quanticc commented on May 26, 2024

Hi, thanks for the detail and the example code.

I can reproduce the issue. It is caused by blocking on an infinite sequence inside:

.doOnNext(ready -> withGatewayClient(client))

I'm still investigating so this explanation might be updated, but that blocking code is creating a backpressure scenario due to how the default EventDispatcher in D4J, which is backed by an EmitterProcessor, works.

An EmitterProcessor sends onNext signals to every subscriber, respecting the current requested value of each one and taking the lowest. Backpressure in Reactor lets operators tune how many onNext signals they can accept; in some cases this is 32, in others 256, and so on. This is the initial demand of an operator and its subscription.

As items are emitted and consumed the demand is eventually replenished, allowing the producer to send more signals. If at any point the consumption takes more time, it is replenished more slowly to avoid overwhelming a slow consumer.
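The demand bookkeeping described above can be sketched with a small plain-Java model. This is only an illustration under simplifying assumptions, not Reactor's actual implementation: the class FanOutDemandModel and its methods are hypothetical names, and the demand values are made up for the example. It shows that the lowest outstanding demand decides how much can be emitted, and that emission stops for everyone once a single subscriber reaches zero and does not request more.

```java
import java.util.Arrays;

// Simplified, hypothetical model of lowest-demand fan-out.
// It is NOT Reactor's EmitterProcessor, only a sketch of the idea.
final class FanOutDemandModel {
    // Outstanding demand (requested but not yet delivered) per subscriber.
    private final long[] demand;

    FanOutDemandModel(long... initialDemand) {
        this.demand = initialDemand.clone();
    }

    // How many signals can be emitted right now: the lowest outstanding demand.
    long emittable() {
        return Arrays.stream(demand).min().orElse(0L);
    }

    // Emits one signal to every subscriber, but only if all of them have demand left.
    boolean tryEmit() {
        if (emittable() == 0) {
            return false; // stalled: at least one subscriber has no demand left
        }
        for (int i = 0; i < demand.length; i++) {
            demand[i]--;
        }
        return true;
    }

    // A subscriber that finished processing replenishes its demand.
    void request(int subscriber, long n) {
        demand[subscriber] += n;
    }

    long demandOf(int subscriber) {
        return demand[subscriber];
    }

    public static void main(String[] args) {
        // Subscriber 0 never finishes its work, so it never calls request().
        FanOutDemandModel model = new FanOutDemandModel(3, 8, 8);

        long emitted = 0;
        while (model.tryEmit()) {
            emitted++;
        }
        System.out.println("emitted before stalling: " + emitted);
        System.out.println("demand left for subscriber 1: " + model.demandOf(1));

        // Only when the slow subscriber replenishes can emission resume.
        model.request(0, 2);
        System.out.println("emittable after replenishing: " + model.emittable());
    }
}
```

In this model the two healthy subscribers still have demand left when emission stops, because the subscriber at zero caps what everyone receives.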

To see this happening internally, you could add a non-suspending breakpoint at EmitterProcessor:474 (reactor-core-3.4.30).
[screenshots not preserved]

You can also enrich each subscriber with a context entry (for example a sub key naming it) to tell the subscribers apart.
[screenshots not preserved]

This would highlight the current demand for each subscriber:

242 requested by Context2{sub=ReadyEvent, discord4j.gateway=35747ce6}
249 requested by Context2{sub=handleChatCommand, discord4j.gateway=35747ce6}
249 requested by Context2{sub=handleButtonInteraction, discord4j.gateway=35747ce6}

In this extreme scenario, the consumption (given by doOnNext completing) never happens because the last block call in withGatewayClient resolves only on disconnect, therefore the initial demand is never replenished. When it hits 0, the processor is stalled, affecting all other subscribers.

0 requested by Context2{sub=ReadyEvent, discord4j.gateway=35747ce6}
199 requested by Context2{sub=handleChatCommand, discord4j.gateway=35747ce6}
199 requested by Context2{sub=handleButtonInteraction, discord4j.gateway=35747ce6}

Btw, I'm not 100% sure why it halts around 122. My guess is that each button press emits 2 events, bringing the count closer to 256, the initial demand I'm seeing.

Pretty sure this can be improved in D4J, either by adding docs, by migrating away from EmitterProcessor, or by preventing this scenario some other way. Updating your code to a non-blocking approach should be a start for now:

public static void main(String[] args) {
	DiscordClient.create(System.getenv("TOKEN"))
		.gateway()
		.withGateway(client -> client.on(ReadyEvent.class)
			.flatMap(__ -> withGatewayClient(client))
			.doOnError(error -> LOGGER.error("gateway error: ", error)))
		.doOnError(error -> {
			LOGGER.error("bot error: ", error);
			System.exit(1);
		})
		.block();
}

private static Mono<Void> withGatewayClient(GatewayDiscordClient gateway) {
	// (...)

	return gateway
		.on(ChatInputInteractionEvent.class, event -> handleChatCommand(event))
		.onErrorResume(e -> {
			LOGGER.error("error in chat interaction event", e);
			return Mono.empty();
		})
		.mergeWith(gateway.on(ButtonInteractionEvent.class, event -> handleButtonInteraction(event))
			.onErrorResume(e -> {
				LOGGER.error("error in button interaction event", e);
				return Mono.empty();
			}))
		.onErrorResume(e -> {
			LOGGER.error("error in merged chat+button interaction flux", e);
			return Mono.empty();
		})
		.then();
}

Also, we typically use Mono.when to combine multiple sources:

Mono.when(
	gateway.on(ChatInputInteractionEvent.class, event -> handleChatCommand(event)),
	gateway.on(ButtonInteractionEvent.class, event -> handleButtonInteraction(event))
) // ...

I tried it and it should work for you too, as the doOnNext is no longer held back by an "infinite" sequence. Hope this also helps with your actual bot code.

UnorderedSigh commented on May 26, 2024

I updated the bot's master branch to include @quanticc's fixes. The https://github.com/UnorderedSigh/FreezeDiscord4J/releases/tag/failing-version tag contains the original version, which broke.

The bot still misses many events if the button is pressed too quickly. You must press the button before the number increments to see it happen.

quanticc commented on May 26, 2024

Interactions in the client work by including the custom_id somewhere in the payload. My guess is that if your code changes the custom_id on each click (for example to fail:N+1), a click sent with the old custom_id no longer matches the edited message, so the interaction fails.

When that happens, I've noticed a "Component validation failed" error from Discord. If you see that error in your client, D4J doesn't get the event.

You could avoid changing the button's custom_id on click. If it stays constant, spam clicking is much more reliable.

Otherwise, if your interaction takes less than 3 seconds, call edit directly, like: return event.edit().withComponents(row).then(); It can speed up the process and reduce the time it takes to update the custom_id.

UnorderedSigh commented on May 26, 2024

The full-sized bot never changes the button IDs. I rarely see any missed events. Your explanation seems to match the evidence.

UnorderedSigh commented on May 26, 2024

My troubles are gone now, so the issue is resolved from my point of view.

Do you want me to close this issue? Or leave it open as a placeholder for "need better documentation"?

quanticc commented on May 26, 2024

Glad to know that. I'm closing this for now and will add a bit of Javadoc when I get the chance:

  • withGateway should clearly mention that onDisconnect is included. A standalone onDisconnect is mostly used when you call login().block() and need something to prevent the JVM from exiting after login.
  • on() should discourage blocking for a long time, because it can create backpressure issues that affect every EventDispatcher subscription.
