Currently, if you attempt to serialize coroutine state via standard Java Serialization

Another use-case: support for ing via coroutines in a game engine that needs to

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Serializability of coroutine classes,about kotlin/kotlinx.coroutines

Comments (74)

joost-de-vries commented on September 23, 2024 9

I think it would be very useful to implement long running business processes.
The tools that are available for this are usually proprietary using some incomplete home brew expression language and flow control. And almost all of them use proprietary storage and run platform. As a result you can't use proper version control, proper debugging, proper module reuse, proper cloud deployment etc.
So I being able to use coroutines to implement this would be incredibly exciting.

from kotlinx.coroutines.

philipguin commented on September 23, 2024 5

Another use-case: support for scripting via coroutines in a game engine that needs to serialize the scripts on save/load, or heaven forbid, in a "rollback" networking system, where suspended scripts are saved each frame and swapped between depending on network-delayed inputs from other clients.

Save/load also encompasses "zoning", where for instance, as the player runs around the game world, parts of the world are loaded and unloaded dynamically. This would include NPCs with "wander" scripts, enemies with "scan then attack" scripts, etc.

from kotlinx.coroutines.

elizarov commented on September 23, 2024 5

@pdvrieze I'm thinking along the lines that if we go with kotlinx.serialzation support in the future, then we'll be explicitly annotating suspending functions and lambda as @Serializable and the compiler will guarantee that they only capture and work with serializable classes.

from kotlinx.coroutines.

caeus commented on September 23, 2024 4

Another use-case: To have code describe human processes. Such as making a delivery, hiring people or any bureaucratic process a company may have implemented internally. Orchestration could be done by a coroutine. Once it needs a task from another server, a human, or anything that cannot be done in the same runtime, it's suspended, and persisted in a DB. Once the task has been completed, coroutine is loaded from DB and resumed with the value of the task.

from kotlinx.coroutines.

Miha-x64 commented on September 23, 2024 3

One more use-case: Telegram bots.
For each user, a bot stores a state of the dialogue, which is a finite state machine easy to express with a coroutine, e. g.

suspend fun talkTo(user: User, firstMessage: Message) {
    when (firstMessage.text) {
        "Buy some donuts" -> sellDonutsTo(user)
        …
    }
}
private suspend fun sellDonutsTo(user: User) {
    sendMessage(user, "How many?")
    val number = receiveNumberFrom(user, onError = "Why don't you send me a whole positive number?")
    …
}

Currently this can be done with sealed classes which requires much more patience.

from kotlinx.coroutines.

elizarov commented on September 23, 2024 3

IMO, a combination of both Kotlin coroutines and kotlin serialization can serve as a great backbone for persistent, long-running workflows. The missing piece is some explicit way to indicate that a given function represents a serializable workflow.

From the user-perspective, this can be implemented simply as a @Serializable suspend function, where a compiler will ensure that function's state carried over each suspension point is serializable, and all the invoked suspending functions are serializable as well.

Just like we have an optional support for @SerialName in class serialization, we might need an optional mechanism for naming suspension points, so that the workflows can evolve while maintaining backwards compatibility with the previously serialized state. However, this is not a hard requirement in all domains, so a compiler can just number suspension points by default.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024 2

@wdroste There are some really nasty bits of internals to take care of. I've managed to make it work (https://github.com/pdvrieze/android-coroutines/tree/master/core/src/main/java/nl/adaptivity/android/kryo) although this is not for the latest coroutines library (and it has internals).

One of the problems is that the coroutines library uses sentinel objects that are just objects (rather than enum instances - for the experimental version. Enums are used in 1.3 nonexperimental) and as such don't handle serialization properly. If course there are also other issues.

from kotlinx.coroutines.

mandolaerik commented on September 23, 2024 2

Another use-case is computer architecture simulators. This use case is in many ways similar to @philipguin 's game engine use case; a simulation is a set of things that interact deterministically in a controlled way. An important feature of simulators like Simics is to accurately save/load simulation state. Furthermore, coroutines are often a useful way to model data communication between devices; many models written in the SystemC framework are implemented in terms of (stackful) coroutines, which makes serialization very hard.
Few simulators are interoperable with JVM, so this use case brings LLVM compatibility as a requirement.

from kotlinx.coroutines.

elizarov commented on September 23, 2024 1

It does, but, unfortunately, making it enum pollutes its namespace with methods like valueOf and values which we don't need here. The correct solution is to provide private fun readResolve. Ideally, this should be done by Kotlin compiler, but it is not there yet: https://youtrack.jetbrains.com/issue/KT-14528

from kotlinx.coroutines.

elizarov commented on September 23, 2024 1

@Miha-x64 This particular issue is about making it serializable with Java Serialization. However, there's a parallel thought process on how we might be able, in the future, support kotlinx.serialization for coroutine state, too.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024 1

@elizarov I would agree with you. I've done the Kryo thing and it is brittle/unstable by design. Nice for a proof of concept, but not suitable for long-term production projects. Actually making the things serializable (either using kotlinx.serialization or otherwise) would be a better approach.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024 1

@faucct

I have stumbled upon this ticket while dreaming about a programming language with a support for durable workflows.

Workflows (or business processes - my research area) are much more heavyweight than coroutines are. When you state that workflows should use explicit schemas you are looking at it with the correct attitude.

While I agree that current languages have poor support for workflows, I would suggest this is partially due to the challenge. A workflow management system needs to deal with the management of the workflow instances in flight. They require a high amount of robustness, may have state associated with the instance, and most of all may fail in unclear ways that may need some sort of manual handling.

Unfortunately a lot of the work in this area is stuck in the enterprise (software) world giving rise to languages such as BPEL4WS, or (better) direct execution of BPMN models. The high requirement approach (which still needs plenty of streamlining) is probably good when you're dealing with high-frequency processes, but for simple processes a more hard-coded workflow is probably better. Serializable coroutines could be a way in which this is done, but not always the best approach. (My context for looking at this originally was to support multi-stage Android steps: download a library, once downloaded install it, once installed use it to get an account (indirectly through the account manager)...., a process that can be done in a few minutes, but involves many resumption points. Writing it as a coroutine is much much cleaner than using some sort of manual way to implement events and resumptions. In this case failure of storage just means it need to start from scratch.

In a situation that requires more robustness (or more explicit synchronization structure such as multi-instance subprocesses) you probably want some sort of library that provides a degree of explicit process support (with schemas for the state). A library is needed as there is significant runtime support needed. Serialization is part of that (look at kotlinx.serialization), but the inputs and outputs of each activity/part should probably be explicit (and marked serializable). I'm not sure about coroutines as the basis for such a library, in any case it would go quite deep into what coroutines are/do, and may very well just use the suspend infrastructure without any of the other coroutine bits (one thing needed is the ability to resume the coroutine halfway through on restart).

from kotlinx.coroutines.

fvasco commented on September 23, 2024

Serializable object should work out of the box, like

enum class CommonPool { INSTANCE }

but I suspect that this is compiler language issue.

from kotlinx.coroutines.

elizarov commented on September 23, 2024

Note, that you can easily serialize coroutine state with 3rd party serialization framework like Kryo. It is just that standard JVM serialization does not currently work due to objects that do not implement Serializable.

from kotlinx.coroutines.

Restioson commented on September 23, 2024

+1. I was able to achieve serialization (with the help of @elizarov, thank you so much), but only with the use of reflection, to change the value of result to COROUTINE_SUSPENDED

from kotlinx.coroutines.

wdroste commented on September 23, 2024

@elizarov totally understand the concept and saw the great example for ES6 generators but is there a sample project that can demonstrate the serialization that we can build upon.. so we make sure we're doing it correctly seems pretty deep.

from kotlinx.coroutines.

elizarov commented on September 23, 2024

@wdroste What is your use-case for serialization? Can you explain it here, please.

from kotlinx.coroutines.

wdroste commented on September 23, 2024

We have a IoT type server application w/ millions of clients. These clients are low powered in most cases and the protocol they're using requires many HTTP based request/responses to a per enterprise customer customizable business logic (scriptable workflow). Now that process can take anywhere from 30 secs to 1 min for an entire HTTP Session.. that process is mostly waiting on I/O from the clients, the actual server processing is only a 1-2 secs total. The memory required to hold the session state is rather large so we like to make a trade off. Incur the CPU cost of serialization and server side I/O persistence so we can free up some memory to continue processing.

Basically the issue is the JVM spends a lot of time looking for memory to free that it can't because the session/workflow state must be maintained, if we were able to persist that state it could free the memory and use it for the other requests, thereby increasing throughput and as a side effect we could be stateless as well since any server could load the current state and continue the workflow making loading balancing much easier.

Our workflows are python scripts built as generators such that we can provide an synchronous programing model to an asynchronous protocol, the issue is python does not support pickling/serialization of the generator. We're looking to move to something that does.

I mentioned ES6 generator because that's the style python uses and we require to convert our workflows to another language. The 'yield' statement must return an object as well as provide one. I would prefer Kotlin excellent IDE support, type safety, and compiler based null checks, but there's others that would recommend we just do this in Javascript Rhino since there's support for both 'yield' and serialization of coroutines.

from kotlinx.coroutines.

Restioson commented on September 23, 2024

I wrote a Kotlin test of it using Kryo with @elizarov's help - https://gist.github.com/Restioson/fb5b92e16eaff3d9267024282cf1ed72 . The only issue is that it uses Kryo, a Java library (and a bit of java reflection which I'm sure can be converted to Kotlin). Theoretically it would work if you subbed that out with another serialization library able to serialize private fields, I imagine, or if you did it yourself with reflection.

from kotlinx.coroutines.

wdroste commented on September 23, 2024

@Restioson appreciate it, I will take a look.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

I'd like to add an additional use case. This is in case of Android. In Android in particular there are cases that require repeated invocations of dialogs or independent activities. The coding of this is very suited for coroutines and it works "well" if you ignore the fact that android activities can be closed (and saved to disk) at any time. Some form of serialization would solve this problem. as a continuation can then just be stored allowing safe resumption of state.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

I've managed to make something work on Android (including serializing across activity recreation). There is one giant hack though. In a coroutine it is deceptively easy to capture an activity. This is not valid. The hack will actually serialize Service, Application and Activity (subtypes) as simple enum constants. On deserialization a context is passed which is used in it's place (with a cast to the "type").

The Kryo part lives in KryoIO.kt. It's use for android (including some stuff around wrapping startActivityForResult is in CoroutineActivity.kt

from kotlinx.coroutines.

wdroste commented on September 23, 2024

I've create a project to illustrate my statements above. The ES6 generator part works great, however I'm unable to serialize the coroutine. If we can serialize the workflow we can use it for long running processes.

@elizarov could really use some help w/ this.
@Restioson tried to follow your gist but it doesn't seem to work in 1.2.10
https://github.com/wdroste/kotlin-generator-serializer

from kotlinx.coroutines.

unicomp21 commented on September 23, 2024

Looking at aws step functions, this would be so much nicer.

from kotlinx.coroutines.

mrussek commented on September 23, 2024

Any update on this? I think this would be super useful!

from kotlinx.coroutines.

elizarov commented on September 23, 2024

@mrussek What's your use-case for this?

from kotlinx.coroutines.

elizarov commented on September 23, 2024

You can do it right now. It takes so hardship to setup, but I don't see how we could make it any more simple. If you are to serialize coroutine it automtically means you have to abide by certain restrictions in your code.

from kotlinx.coroutines.

joost-de-vries commented on September 23, 2024

I can understand that. Tx.
Can you give some pointers? I'd like to make a small playground to see what's involved.

from kotlinx.coroutines.

elizarov commented on September 23, 2024

Read the discussion in this issue. There are links to a bunch of working examples.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@joost-de-vries You may want to look at business process managements systems for this. For now, that is your best bet (although indeed it doesn't support debugging across the process). I'm not sure that coroutines can actually fully do what you want though. A key design criterion is still in ACID properties of the steps.

Btw. serialization would probably be a lot easier if it was possible to have the compiler validate captures (or the lack of captures).

from kotlinx.coroutines.

mrussek commented on September 23, 2024

Yeah, my work is currently looking at Uber's cadence for something like this, although their approach seems a bit more hacky imo.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@mrussek I had a quick look at uber cadence. It seems a sensible system although I'm not clear why you would use it over a bpmn2 based system. In any case, key in those systems is that they are essentially distributed systems where the workflows system doesn't do the work. It may be interesting to use workflows in a single-process system, but this is something that is not quite ready for bleeding edge yet. I have a system that should be able to do it reasonably easily as I already have the testing of workflows implemented without actual actions attached. Linking them with a lambda for the behaviour (rather than workflow messaging) should be reasonably straightforward.

from kotlinx.coroutines.

Miha-x64 commented on September 23, 2024

As an author of an external serialization tool, now I wonder how this is going to be implemented. Will this require kotlinx.serialization? How much stable serialized representation will be?

from kotlinx.coroutines.

mrussek commented on September 23, 2024

For anyone that is curious, I am working on a prototype implementation that can serialize the continuation state to JSON. It is a little hacky, but it seems like it may be a good starting point for further conversations.

from kotlinx.coroutines.

Miha-x64 commented on September 23, 2024

Isn't it “just works” with reflection, e.g. Gson or Moshi?

from kotlinx.coroutines.

mrussek commented on September 23, 2024

My implementation uses jackson. I haven't tried with Gson or Moshi, but part of what makes it hacky is the fields in the continuation object are declared private and there is no way to access them other than via reflection. I'm not sure if Gson or Moshi do this automatically.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

I've done it based on kryo. It works, but is is far from straighforward. And in reality many coroutines capture all kinds of references to surrounding objects (that shouldn't be serialized - in my case Android Activities). There are ways to "catch" this and work around it, but it's not great.

What we could really do with is some way to require a lambda/suspending lambda not to be capturing. That would make serializability much more feasible. In some ways I'd just want a single token to pick up the process where it left off.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@elizarov That sounds like a good solution. Of course having an annotation to prevent capturing may still be worthwhile (especially in the Android case - it is very very easy to create leaks by capturing the wrong things - Making the lambda an extension on Nothing can help a bit ;-)).

from kotlinx.coroutines.

Miha-x64 commented on September 23, 2024

But we typically need the host Activity of Fragment.
For example, in callback-based APIs, I pass functions which require receiver to be Screen (an abstraction on top of Fragment): pureParcelFunction3(RootScreen::photoTaken).

from kotlinx.coroutines.

elizarov commented on September 23, 2024

But we typically need the host Activity of Fragment.
For example, in callback-based APIs, I pass functions which require receiver to be Screen (an abstraction on top of Fragment): pureParcelFunction3(RootScreen::photoTaken).

@Miha-x64 That's a larger serialization-related question. What if I serialize some state, but it also contains a reference to some "context" that I don't want to be serialized, but property replaced with a new context during deserialization. We don't have an answer to that yet, but this falls under the serialization design anyway, not coroutines design.

from kotlinx.coroutines.

mrussek commented on September 23, 2024

Agreed. FWIW, I think that in some circumstances there could be support at the application framework level. Not sure what you would do for Android, but in my prototype I had a hack that serialized all Spring beans in the context using the Spring bean identifier. This allowed several Spring application servers to cooperate on executing a coroutine.

from kotlinx.coroutines.

aa0ndrey commented on September 23, 2024

I think there are a lot of problems to serialize coroutine state. But for long running process/activity it could be enough to have ability to serialize state of iterators created by sequence-scope
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.sequences/-sequence-scope/
Now such iterators are created by coroutine and we stuck in the same problem: "how to serialize coroutine". But if kotlin dev team could change implementation for the sequence on something like C# do https://csharpindepth.com/Articles/IteratorBlockImplementation it would be awesome! We could wrap any continuation logic inside such serializable iterator to persist and restore it. And we could use them inside coroutines without any upgrading for coroutines.

from kotlinx.coroutines.

jgreenyer commented on September 23, 2024

I've posted another use case where the possibility to serialize/clone coroutines or whole coroutine contexts would be helpful: https://discuss.kotlinlang.org/t/clone-coroutines/20009/2

Are there any more recent developments in this direction?

from kotlinx.coroutines.

qwwdfsad commented on September 23, 2024

@jgreenyer nice use-case! There is no new development, unfortunately, this is out of our scope right now.

Alternatively, you can use Kryo (there is a snippet in the discussion with how to use that) or clone not the coroutines, but their state which can be expressed as a regular Kotlin [data] class

from kotlinx.coroutines.

holgerbrandl commented on September 23, 2024

Same here using the latest and greatest of kryo, kotlin and coroutines. See https://github.com/holgerbrandl/kryo-kotlin-sam/blob/master/src/main/kotlin/simpleproc/SimpleProc.kt for the complete example including gradle-project context for reproducibility.

My use case would also point towards simulation and is described in holgerbrandl/kalasim#19

I've carefully reviewed all materials from above, but so far did not find any way to use the continuation after deserialization. All I get is an NPE (included in the source pointer from above). Interestingly, the example from above allows to deserialize the object but fails later when consuming the sequence iterator again.

Any help/point here to overcome/workaround this issue would be greatly appreciated.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@holgerbrandl It is possible to do it, but it requires quite some configuration of Kryo. For the android use case you can have a look at: https://github.com/pdvrieze/android-coroutines (it should work although it was more of an experimental piece of software).

from kotlinx.coroutines.

Miha-x64 commented on September 23, 2024

By the way, when designing serializability, it should be quite easy to implement cloneability, too.
This is especially useful for people from Arrow which are used to rewind an existing coroutine to the previous state.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@Miha-x64 Serialization can be used for cloning - either through a format or "directly" (and for equals and hashcode).

from kotlinx.coroutines.

Miha-x64 commented on September 23, 2024

@pdvrieze, sure. But “direct” cloneability is way easier to design, and it can satisfy some use-cases.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@Miha-x64 Actually, direct cloning is not that easy due to control inversion (both serialization and deserialization are driven from the serializer side, so the format cannot easily shuffle between the two. You'd have to use (actual) threads, or use a buffer.

from kotlinx.coroutines.

holgerbrandl commented on September 23, 2024

Thanks for your feedback and the codepointer @pdvrieze . I'll analyze it to see if I can adapt it for my use case.

from kotlinx.coroutines.

holgerbrandl commented on September 23, 2024

@pdvrieze I've tried to work through your mentioned android-coroutines repo, extracted the non-android bits into https://github.com/holgerbrandl/kryo-kotlin-sam/blob/master/src/main/kotlin/kryo (to work out a more minimalistic POC example) and prepared an example along with it in https://github.com/holgerbrandl/kryo-kotlin-sam/blob/master/src/main/kotlin/kryo/SeqBuilderSerializationExample.kt.

When doing so, I had lifted coroutines, kotlin and kryo dependencies to the current versions.

Unfortunately, the example still throws an NPE when continuing the sequence. I suspect https://github.com/holgerbrandl/kryo-kotlin-sam/blob/eb3dd4ab23e6dd036fde095e6e3065202ac6d995/src/main/kotlin/kryo/AndroidKotlinResolver.kt#L42-L45 to to refer to no longer valid/used classes. Also CommonPool which is somehow used to configure the Kryo instance seems gone. But these aspects may not even relate to the root cause, which is still in the dark for me.

I know it is asking a lot, but is there any chance you could provide me with more guidance to make this - i.e. sequence-builder persistence & continuation using kryo - work?

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@holgerbrandl I didn't look at it, but I suspect you are correct. Basically to serialize coroutines you are going into the guts of how coroutines work (and for Android into replacing old instances with new ones - so you have a valid context rather than the original one - which isn't valid). I haven't really touched that code in a while so it is probably no longer working with the latest library versions that have different internal state.

from kotlinx.coroutines.

holgerbrandl commented on September 23, 2024

Since it seem to have worked once, I'd still have some hope that we could make it work again. I'm happy to help, but I would need some pointer/guidance to get started.

I'd be interested to do so just on the JVM without any Android. And I'd also be primarily interested in kryotizing sequence{}.iterator() instances. Not sure if this makes things simpler or more complicated.

from kotlinx.coroutines.

elizarov commented on September 23, 2024

I've played with it, using the recent version of Kryo (5.2.1) and the recent version of Kotlin (1.6.10) . I'm not sure how it worked in the past, but it does not seem to be capable of serializing classes that do not have no-arg constructors anymore, which includes anonymous classes, lambdas, and suspending functions. Even though you can ask Kotlin to make the lambdas themselves serializable (including suspend lambdas), but the state machines that Kotlin generates for the suspending lambda and for suspending functions are not serializable and do not have a default constructor, so Kryo, as of today, cannot serialize them.

There have been some changes in the way Kotlin JVM generates lambdas, which might somehow contribute, too. See also discussion here KT-45375 Generate all Kotlin lambdas via invokedynamic + LambdaMetafactory by default.

from kotlinx.coroutines.

elizarov commented on September 23, 2024

On a side note, serializing a state of sequence does not work with Java Serialization either, because kotlin.sequences.SequenceBuilderIterator is not a serializable class. This might be easier to fix, than to figure it out with Kryo. See KT-51100 Consider making kotlin.sequences.SequenceBuilderIterator serializable.

from kotlinx.coroutines.

holgerbrandl commented on September 23, 2024

This would be great as well. I guess it could also be helpful when using kryo (which is more applicable if not all types implement Serializable) via https://github.com/EsotericSoftware/kryo#javaserializer-and-externalizableserializer

from kotlinx.coroutines.

holgerbrandl commented on September 23, 2024

Given the somehow slow pace of adopting Serializable in the coroutines/stdlib (timeline in this thread, lack of target release & lack of assignees), I would not mind a "brittle/unstable" kryo setup to get started with a POC. Do you think it's a bug in kryo? If so, I could try to submit a ticket there.

from kotlinx.coroutines.

faucct commented on September 23, 2024

I have stumbled upon this ticket while dreaming about a programming language with a support for durable workflows.

Though I believe that the workflows should have explicit schemas: without those they would be brittle to changes of the codebase.

So I would like to propose an annotations API, which would make it possible to develop an IR-plugin for JVM as an unstable proof of concept, though this would be more stable with the compiler support, like kotlinx.serialization has.

There have to be at least two annotation types: @PersistencePoint and @PersistedField.
@PersistencePoint would be annotating the suspension-points where the coroutine could be persisted: either directly by calling some Persistor-interface element from coroutineContext, responsible for persistence, either indirectly by calling another suspendable function, which could at some point persist the coroutine, also directly or indirectly. The annotation instance should have an explicit label value, unique for a continuation, so it would be stable to the code changes.

@PersistedField would be annotating the persisted parameters/fields. This ones would also have unique label, also unique for a continuation.

Here is an example of an annotated workflow definition:

abstract class Persistor : CoroutineContext.Element, suspend () -> Unit {
    abstract suspend fun invoke()

    companion object Key : CoroutineContext.Key<Persistor>

    override val key: CoroutineContext.Key<*> = Key
}

suspend fun persist() = coroutineContext[Persistor.Key]()

suspend fun bookTrip(@PersistedField("name") name: String) {
    @PersistencePoint("reservingCar") @PersistedField("car") val car = reserveCar()
    @PersistencePoint("reservingHotel") @PersistedField("hotel") val hotel = reserveHotel()
    @PersistencePoint("reservingFlight") @PersistedField("flight") val flight = reserveFlight()
    return Trip(car, hotel, flight)
}

suspend fun reserveCar() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.reserveCar(idempotencyKey)
}

suspend fun reserveHotel() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.reserveHotel(idempotencyKey)
}

suspend fun reserveFlight() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.reserveFlight(idempotencyKey)
}

Those two annotations are enough to implement a simple serialization plugin, but there could be others, like ones kotlinx.serialization has to support polymorphism, custom serializers, etc...

What do you think about this API?

from kotlinx.coroutines.

philipguin commented on September 23, 2024

My initial thought is that a coroutine cannot be persisted without serializing all needed fields anyway - the only benefits of the annotation are explicitness and providing customizability to the serialized data (e.g. XML names), at the cost to legibility imo. (Omitting the name and using (static) reflection to retrieve it could help here.)

Marking certain suspensions as serializable or not is also strange to me - what should the API look like from outside the coroutine? E.g. suppose I’m shutting down the application and there are suspended continuations - should those in a “non-persistable” state be canceled as usual, while others are persisted? Or is persistence at the discretion of the suspended continuation’s “owner,” the annotation merely indicating the possibility? Or perhaps we should inject “Serializers” as we currently do Dispatchers, in a global way? If so, how do owners get a handle back to their suspended continuations? It’s really a question of “who does what” here.

What about migrating data across version changes? Can annotations alone account for this? I think I’d prefer a Ruby on Rails-like approach, or at least the option.

There’s just a lot of open questions, almost certainly worthy of its own KEEP.

from kotlinx.coroutines.

faucct commented on September 23, 2024

My initial thought is that a coroutine cannot be persisted without serializing all needed fields anyway

Continuations may have unserializable fields invisible inbetween persistence points: sockets, processes, intermediate buffers, etc. I think it is reasonable to explicitly select what fields to save and how, while the compiler would check if there are no other fields required by the persistence points.

suspend fun reserveCar() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.newClient().using { grpc -> grpc.reserveCar(idempotencyKey) }
}

Omitting the name and using (static) reflection to retrieve it could help here.

For most variables this will probably be true and @PersistedField annotations may be skipped, but I don't think this approach would be able to also replace @PersistencePoint annotations.

Marking certain suspensions as serializable or not is also strange to me - what should the API look like from outside the coroutine? E.g. suppose I’m shutting down the application and there are suspended continuations - should those in a “non-persistable” state be canceled as usual, while others are persisted? Or is persistence at the discretion of the suspended continuation’s “owner,” the annotation merely indicating the possibility?

Yes, I was thinking of it as a possibility of persistence. Using that the Kotlin compiler user should be able to implement any policy of liveness. The policy may be centralizing in single JVM, so the coroutines would be durable to the JVM restarts, or it may be distributing across the cluster with state persisting into some distributed NoSQL storage, so the coroutines would be also durable to death of single processing nodes.
I imagine the distributed policy as something like akka-cluster sharding: where every shard is responsible for running a subset of persisted coroutines, whose PersistenceIds hashes to specific ranges of values. When someone in a cluster wants to start a new coroutine, he would have to give it a unique PersistenceId and persist the initial fields to it. After that the cluster would be responsible for liveness and consistency of this coroutine. Liveness would be achieved with cluster trying to eventually have a node responsible for the coroutine. Consistency would be achieved by persisting the state using distributed compare-and-swap requests at persistence points, so if there are multiple nodes running the same coroutine they would be able to notice that at persistence points and synchronize who is supposed to be running it before its execution forks unmergeably.
https://doc.akka.io/docs/akka/current/typed/cluster-sharding.html#persistence-example

class HelloWorldService(system: ActorSystem[_]) {
  import system.executionContext

  private val sharding = ClusterSharding(system)

  // registration at startup
  sharding.init(Entity(typeKey = HelloWorld.TypeKey) { entityContext =>
    HelloWorld(entityContext.entityId, PersistenceId(entityContext.entityTypeKey.name, entityContext.entityId))
  })

  private implicit val askTimeout: Timeout = Timeout(5.seconds)

  def greet(worldId: String, whom: String): Future[Int] = {
    val entityRef = sharding.entityRefFor(HelloWorld.TypeKey, worldId)
    val greeting = entityRef ? HelloWorld.Greet(whom)
    greeting.map(_.numberOfPeople)
  }
}

What about migrating data across version changes? Can annotations alone account for this? I think I’d prefer a Ruby on Rails-like approach, or at least the option.

When user is responsible for the location where the coroutines are being stored and understands the format they are being stored in, then he will be able to write any migration code he wants, if that's what you are talking about.
Though I was talking about versioning of code around the persistence points. The annotations would let you add/remove statements between the persistence points as long there are no required fields being added. You would also be able to add persistence points between existing ones. Removing would also be possible, though you would have to be sure that there are no persisted coroutines in those persistence points, or else they won't be able to resume and will halt until you will decide how to migrate them or bring back those persistence points.

from kotlinx.coroutines.

dkhalanskyjb commented on September 23, 2024

If I understood correctly what you mean by "persistence," it's the ability to stop the execution of the program, and then resume it from the same place. Did I get this right?

If so, the problem is completely unrelated to coroutines. Sure, coroutines do represent a reified state of the execution, and it's, in some sense, easier to save the execution in the middle of the coroutine than in some arbitrary code. But!

What about static variables? These are not part of a coroutine.
If you have several fields pointing to the same (potentially mutable) object, how do you preserve this relation when serializing?
What about multithreading? One coroutine could execute in one thread and reach a "persistence point", but the other ones could still execute in parallel.

Etc. The common approach to dealing with this (and there are some precedents!) is to just dump the whole state of RAM related to that program. For example, emacs did do this, using the unexec glibc function that stored the memory state: https://lwn.net/Articles/673724/ This deals with all the problems I listed. However, even this approach has its limitations:

What about open files? If you have log.txt open for writing, then stop the process, and then start again, there may even be no log.txt, and the operating system will certainly not remember that you're supposed to be able to write to it.
Likewise for any per-process state that the operating system stores.

I hope this is what you meant by persistence, or I'd feel stupid for giving a lecture about an irrelevant thing!

from kotlinx.coroutines.

faucct commented on September 23, 2024

If I understood correctly what you mean by "persistence," it's the ability to stop the execution of the program, and then resume it from the same place. Did I get this right?

Not the whole program, the granular persistence of single coroutines one by one would be enough. In this case there should be no problems with multithreading, considering the user avoids shared mutable state, such as shared objects or static variables.

from kotlinx.coroutines.

philipguin commented on September 23, 2024

Continuations may have unserializable fields invisible inbetween persistence points: sockets, processes, intermediate buffers, etc. I think it is reasonable to explicitly select what fields to save and how, while the compiler would check if there are no other fields required by the persistence points.

What does it mean to serialize a variable that Kotlin wouldn't turn into a field by default? Isn't this strictly a variable that has no impact after the suspension point? Hence, what is the purpose?

Conversely, what would it mean to not serialize a variable that Kotlin converts to a field by default? Isn't this strictly necessary? Or were you thinking more like Java's transient, where someone somewhere would be reconstructing the field during deserialization?

I suppose my point is, I don't know that annotations for fields are strictly necessary, even if their existence would be nice for customization. You could enable a linter setting if you'd like to mandate all fields be annotated by the programmer, otherwise IDE underscoring could be sufficient for many, and a 1-to-1 mapping of what must actually be serialized. (Correct me if I'm missing something.)

On your next points, I agree that it's good that annotation only represent the possibility of serializability. I'm not familiar with most of the following terminology however, but it sounds like things that may be important use cases for coroutine serialization, rather than features we'd want Kotlin to implement specifically? As long as we can get serialization into a custom format out of suspended coroutines, I think that would suit most purposes.

In terms of use cases, for a game engine it would be good to support serializability as a side-effect of a suspension point, not it's primary purpose. E.g. delay would be serializable by default, so the entire game or level in a game could be halted, saved to disk, then loaded and resumed later, delayed coroutines in-tact. The only effort necessary would be on the implementer of delay incorporating serialization, with minimal effort (if any) from the coroutine author.

It's problematic though, because designers aren't programmers, and they're not going to want to annotate a bunch of fields to avoid catastrophic serialization issues (nor should anyone want them to.) But annotation isn't the only thing necessary to avoid disaster - adding, removing or converting fields would have to be automatically mapped to schema changes as well, to make the process sane to a non-coroutine-expert.

This is way out of left field, but could the IDE automatically detect changes to @Serializable coroutine code, present modal dialogues if necessary to the programmer/DSL-user (e.g. possible rename detected), then generate or update the schema and migration code accordingly? I'm just trying to think of a way to make this process as human-friendly as possible -- this is in keeping with the purpose of DSLs in my opinion, which are also likely to be suspend routines. (E.g. "move the turtle" DSL, or game DSLs, etc.)

EDIT: example DSL to make this concrete:

// Example game script DSL: NPC hauling items for all eternity
suspend fun behavior(npc: Npc) {
    while (true) {
        val item = npc.pickUpNearestItem()
        npc.walkTo(0, 0, 100) // << suspend function, item needs serializability here
        npc.dropItem(item)
        npc.walkTo(0, 0, 0)
    }
}

In this case, if a designer were to modify this script, I don't want them to even have to think about persistence if I can help it. I'd accept a ridiculous git history of ten-thousand schema transitions, so long as it's correct and designers don't have to squint their eyes too hard.

from kotlinx.coroutines.

faucct commented on September 23, 2024

What does it mean to serialize a variable that Kotlin wouldn't turn into a field by default? Isn't this strictly a variable that has no impact after the suspension point? Hence, what is the purpose?

The first example that comes into mind are function interface parameters, which aren't being used by the implementation at this moment, but the code could be updated to use them later. They also could be only used before the first suspension point, so there would be no need in generating continuation fields for them, though the code using them after the first suspension point may appear later. I believe that at the moment the Kotlin compiler optimizes such unused variables and does not save them into continuation fields.

Conversely, what would it mean to not serialize a variable that Kotlin converts to a field by default? Isn't this strictly necessary?

If a network socket is being opened which has suspendable usage between two near persistence points, then the compiler will generate a continuation field for it, though there could be no use for it in persisted state, because it never escapes out of those two persistence points:

suspend fn request() {
  @PersistedField("idempotencyKey") val idempotencyKey = UUID.randomUUID().toString()
  @PersistencePoint("requesting") val _ = persist()
  @PersistedField("response") val response = Socket("tcp://...").use { socket -> // there is no need to persist this variable
    socket.write(...) // these are suspension points, though not persistence points
    socket.read(...) //
  }
  @PersistencePoint("requested") val _ = persist()
  ...
}

I'm not familiar with most of the following terminology however, but it sounds like things that may be important use cases for coroutine serialization, rather than features we'd want Kotlin to implement specifically? As long as we can get serialization into a custom format out of suspended coroutines, I think that would suit most purposes.

Yes, that is what I have had in mind.

This is way out of left field, but could the IDE automatically detect changes to @serializable coroutine code, present modal dialogues if necessary to the programmer/DSL-user (e.g. possible rename detected), then generate or update the schema and migration code accordingly?

Though extracting schema from suspendable code into a separate source file is probably possible, I believe that making those two files matchable will be far more cryptic than having a single source file with suspendable code annotated.

from kotlinx.coroutines.

philipguin commented on September 23, 2024

The first example that comes into mind are function interface parameters, which aren't being used by the implementation at this moment, but the code could be updated to use them later

That makes sense, persisting fields not currently used but possibly necessary in the future would indeed be valuable.

If a network socket is being opened which has suspendable usage between two near persistence points, then the compiler will generate a continuation field for it, though there could be no use for it in persisted state, because it never escapes out of those two persistence points:

I see - in my mind, every suspension in the coroutine would be persistable, but I suppose this isn't strictly necessary. I suppose non-annotated suspension points would be "unsafe points," rolling back to the last persisted if interrupted, effectively? Outside my domain of experience, so hard to comment, sorry.

Though extracting schema from suspendable code into a separate source file is probably possible, I believe that making those two files matchable will be far more cryptic than having a single source file with suspendable code annotated.

How would annotations cover the case of a field being renamed? What about multiple renames over time? How would data under the old schema be loaded into the new? If the programmer does not update annotations to account for a rename, what should happen?

By "cryptic" did you mean "difficult to implement?" Because if you meant "difficult to understand," I don't think a generated schema and series of migrations (a la Ruby on Rail) would be hard to read - I think it would be clear when looked at, which would basically only be necessary upon commit or in advanced usage scenarios.

It did occur to me that some migration code may be impossible to generate - when a new field is added, for instance, code to create it would have to be hand-written, barring a modal-dialogue-specified default value. And in the case a meaningful value cannot be generated via migration, the coroutine code itself might have to interpret a "does not exist" value and modify its own behavior to handle it - this would violate the principle of coroutines not caring about generated schema/migrations underneath. There may not be an ideal solution to this.

from kotlinx.coroutines.

faucct commented on September 23, 2024

How would annotations cover the case of a field being renamed? What about multiple renames over time? How would data under the old schema be loaded into the new?

If there would no data persisted under the current field, then the deserializer would try to get it from an alias.

val npc;

Renaming npc to npc1:

@PersistedAliases("npc") val npc1;

Renaming to joshNpc:

@PersistedAliases("npc", "npc1") val joshNpc;

If the programmer does not update annotations to account for a rename, what should happen?

I guess, then a runtime deserialization error would happen – "joshNpc" field not found amongst available fields: npc.

By "cryptic" did you mean "difficult to implement?" Because if you meant "difficult to understand," I don't think a generated schema and series of migrations (a la Ruby on Rail) would be hard to read - I think it would be clear when looked at, which would basically only be necessary upon commit or in advanced usage scenarios.

Ruby on Rails migrations only work with explicitly named tables and fields, while the serialization of unannotated coroutines would have to work not only with variables, whose names could be inferred with reflection, but also with unnamed suspension-points. @elizarov has suggested to give those sequential labels, as Kotlin compiler does, so the migrations would have to be defined in terms of suspension points 0, 1, 2, ... I believe that those would be harder to understand than explicit names.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@faucct The way I would see it to work would be (using kotlinx.serialization) that the hidden object holding the function stack/the continuation would be serializable. Then all values would be required to be serializable (and could be annotated). If you want to specify the type for serialization that would be something that maybe would be done as an annotation on the suspend declaration (in line with the context feature), and then local variables not existing, except as provided by that context. (It would need some special rules to deal with setting "val" properties)

from kotlinx.coroutines.

faucct commented on September 23, 2024

I have hacked together a kryo-based schemaless persistence solution. The persist-calls crash the coroutine, but saving it to a file before that, so the subsequent runs would be able to resume from the same point.
https://github.com/faucct/wrapper-persisting-coroutines

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@faucct I did at some point have the similar (without crashes) - focused on the Android context. However it is quite tricky and generally (esp. in the Android context) require some fixups as well. For example objects should not be deserialized, android Context objects should be injected (and not serialized/deserialized). And of course it is extremely dependent upon the specifics of the library. It can "work" but is fragile. In addition, the values captured in the closure are not explicit. This makes for the code using such serialized coroutines to be quite error-prone.

from kotlinx.coroutines.

faucct commented on September 23, 2024

I have tried to think about the serialization in terms of kotlin.serialization API to see that if there is enough of it.
It looks like not only the suspended continuations should be Serializable, which means having a Serializer. They also should be able to somehow alter the serialization context for the child continuations. This way the parent continuation serializer could be able to help resolving the concrete subclass of child continuation.
Probably their serializer should be provided by the intercepted continuation, which means the chain of ContinuationImpl#intercepted references should be inverted, so the deserialization routine would look like this:

fun deserialize(decoder: Decoder): T {
  val continuation = ... // decoder stores the `ContinuationImpl#intercepted`
  if (decoder.hasNext()) {
    return continuation.childSerializer().deserialize(decoder)
  } else {
    return continuation
  }
}

from kotlinx.coroutines.

awto commented on September 23, 2024

@faucct

I have stumbled upon this ticket while dreaming about a programming language with a support for durable workflows.

I have implemented workflows using my JVM instrumenting-based continuations implementation serializable continuations and a JS transpiler. Durability/scalability is offered by Kafka Streams, or, in fact, any stateful stream transformer. However, I'd prefer Kotlin Coroutines if they are serializable.

@pdvrieze

While I agree that current languages have poor support for workflows, I would suggest this is partially due to the challenge. A workflow management system needs to deal with the management of the workflow instances in flight. They require a high amount of robustness, may have a state associated with the instance, and most of all may fail in unclear ways that may need some sort of manual handling.

Do you mean the workflow versioning here? The versioning indeed must be part of the code, but the workflow-as-code is much easier than using declarative workflow definitions, even for versioning.

from kotlinx.coroutines.

pdvrieze commented on September 23, 2024

@awto Part of the problem of workflows is versioning. It is required. I'd say that workflows would probably be better declaratively (although this can be done in code - with the actual activity implementation being directly in code - BPEL is just bad in usability and too tight on an imperative execution model). What I was more concerned about is how you deal with workflows that have "problems" in the middle of their (sometimes weeks/months long) execution.

Part of dealing with errors is to have robust specifications to start with. You want to be explicit in the inputs and outputs of steps. As such coroutines that do this automatically/transparently may not be the best approach (I haven't tried it, so this is not a dismissal, just an "I don't know").

from kotlinx.coroutines.

awto commented on September 23, 2024

@pdvrieze

Part of the problem of workflows is versioning. It is required.

We can be very flexible with the versioning and code. As it's Kafka we can have the whole history. We can travel back to some point where it's easy to update. Or wait for some point in the future. We can simulate already recorded inputs. Or we can throw an exception signalling the update, and it's the workflow which catches it and perform the necessary state handling. For this, the state doesn't need to be exposed at all. Though it can be (for example, using local variable annotations described in this thread above). But I wouldn't do this.

I'd say that workflows would probably be better declaratively (although this can be done in code - with the actual activity implementation being directly in code - BPEL is just bad in usability and too tight on an imperative execution model).

It depends on how we define the word declarative. The workflow languages I know don't look declarative at all. They are usually some visual diagrams or some lists in yaml, or something like this - but they are still instructions about how we do things, not what we do. So they are just yet another programming language, but heavily restricted though.

What I was more concerned about is how you deal with workflows that have "problems" in the middle of their (sometimes weeks/months long) execution.

These "problems" are just exceptions. The workflow developer is responsible to handle them gracefully in both approaches. An advantage of the continuations+Kafka approach is we have the log of the weeks/months etc. We can time travel to any position. No need for any spec or vendors to support this. Everything needed is already available.

Part of dealing with errors is to have robust specifications to start with. You want to be explicit in the inputs and outputs of steps. As such coroutines that do this automatically/transparently may not be the best approach (I haven't tried it, so this is not a dismissal, just an "I don't know").

I'm not sure I understand this. The spec is Kafka topics schema. It's only the input/output spec and the internal state isn't exposed, it can be exposed but it shouldn't be. Everything the outside needs to know must be posted into some dedicated Kafka topic. And everything it needs to know from the outside, the workflow must read from some dedicated Kafka topic. Note, the internal state is still available for debugging (with any JVM debugger), and since we can travel to the past it's available at any point in time. The steps are just the code - it looks explicit enough. We can use any software development practices to make the code better, or we can write something ugly. But it's the same for BPEL, except BPEL restricts using the practices.

from kotlinx.coroutines.

Serializability of coroutine classes about kotlinx.coroutines HOT 74 OPEN

Comments (74)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent