For a client project I've been looking at the feasibility of replacing the use of <cod

Configurable serialisation layer? about acid-state HOT 8 CLOSED

acid-state commented on June 8, 2024 3

Configurable serialisation layer?

from acid-state.

Comments (8)

stepcut commented on June 8, 2024 2

cabal flags are eeeevil. IMO, the correct answer is almost always 'more packages'. I have not looked into this particular situation, but my instinct says:

acid-state-core - provides most of what is currently in acid-state but without the safecopy dependency
acid-state-serialize - provides the serialize bits
acid-state-cborg - provides the cborg bits
acid-state - a compatibility package that pulls in acid-state-core and acid-state-serialize in a way that can hopefully prevent most packages that depend on acid-state from breaking.

I should note that another argument against flags is that it can be reasonable to want to have multiple serializations in the same app for the same database.

The current serialize backend is designed to be compact, which is good for performance. However, it makes it nearly impossible to examine old backups without the code available.

It would be nice if there was a more verbose or human-readable serialization format that could be used for checkpoints that you want to archive for backup purposes. In the days of yore, happs-state had both a binary and an XML backend. The XML output would have been potentially easier to recover data from even if you lost the original code. It is nearly impossible to recover data from a checkpoint file right now unless you have the matching code.

adamgundry commented on June 8, 2024 1

Here's a revised design:

data Serialiser a = Serialiser { encode :: a -> Lazy.ByteString
                               , decode :: Lazy.ByteString -> Either String a
                               }

safeCopySerialiser :: SafeCopy a => Serialiser a
safeCopySerialiser = Serialiser (runPutLazy . safePut) (runGetLazy safeGet)

-- | The basic Method class. Each Method has an indexed result type
--   and a unique tag.
class ( Typeable ev, Typeable (MethodResult ev)) =>
      Method ev where
    type MethodResult ev
    type MethodState ev
    methodTag :: ev -> Tag
    methodTag ev = Lazy.pack (showQualifiedTypeRep (typeOf ev))

    methodSerialiser :: Serialiser ev
    default methodSerialiser :: SafeCopy ev => Serialiser ev
    methodSerialiser = safeCopySerialiser

    resultSerialiser :: proxy ev -> Serialiser (MethodResult ev)
    default resultSerialiser :: SafeCopy (MethodResult ev) => proxy ev -> Serialiser (MethodResult ev)
    resultSerialiser _ = safeCopySerialiser :: Serialiser (MethodResult ev)

class IsAcidic st where
    acidEvents :: [Event st]
      -- ^ List of events capable of updating or querying the state.
    stateSerialiser :: Serialiser st
    default stateSerialiser :: SafeCopy st => Serialiser st
    stateSerialiser = safeCopySerialiser

Putting the serialisers in the Method and IsAcidic classes with default methods means that in the common case, this is backwards-compatible with existing code. Individual instances are free to use alternative serialisers, however.
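As a rough illustration of how the default-method mechanism lets most instances stay unchanged while individual ones opt out, here is a minimal sketch that depends only on base and bytestring. It is not acid-state code: the class HasStateSerialiser, the showReadSerialiser stand-in (used here in place of safeCopySerialiser so the example needs no safecopy dependency), and the Counter and Flag types are all hypothetical names introduced for the example.

```haskell
{-# LANGUAGE DefaultSignatures #-}
module Main where

import qualified Data.ByteString.Lazy.Char8 as Lazy
import Text.Read (readMaybe)

-- Same shape as the Serialiser record in the design above.
data Serialiser a = Serialiser
  { encode :: a -> Lazy.ByteString
  , decode :: Lazy.ByteString -> Either String a
  }

-- Hypothetical stand-in for safeCopySerialiser, based on Show/Read
-- so that the sketch does not need the safecopy package.
showReadSerialiser :: (Show a, Read a) => Serialiser a
showReadSerialiser = Serialiser
  { encode = Lazy.pack . show
  , decode = \bs ->
      maybe (Left "showReadSerialiser: no parse") Right
            (readMaybe (Lazy.unpack bs))
  }

-- Hypothetical, simplified analogue of the stateSerialiser method
-- proposed for IsAcidic.
class HasStateSerialiser st where
  stateSerialiser :: Serialiser st
  default stateSerialiser :: (Show st, Read st) => Serialiser st
  stateSerialiser = showReadSerialiser

-- Common case: an empty instance picks up the default serialiser.
newtype Counter = Counter Int deriving (Eq, Show, Read)
instance HasStateSerialiser Counter

-- Opt-out case: the instance supplies its own serialiser and needs
-- no Read instance at all.
newtype Flag = Flag Bool deriving (Eq, Show)
instance HasStateSerialiser Flag where
  stateSerialiser = Serialiser
    { encode = \(Flag b) -> Lazy.pack (if b then "1" else "0")
    , decode = \bs -> case Lazy.unpack bs of
        "1"   -> Right (Flag True)
        "0"   -> Right (Flag False)
        other -> Left ("Flag: unexpected input " ++ show other)
    }

-- Encode a value and decode it again with the same serialiser.
roundTrip :: Serialiser a -> a -> Either String a
roundTrip s = decode s . encode s

main :: IO ()
main = do
  print (roundTrip stateSerialiser (Counter 3))
  print (roundTrip stateSerialiser (Flag True))
  print (encode stateSerialiser (Flag True))
```

The empty Counter instance corresponds to existing code that would keep compiling, while the Flag instance shows the override path.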

I'm wondering whether the Serialiser record type should instead be unpacked into the classes, i.e. giving an encoder and decoder as separate class methods. Doing so might be slightly more performant, but is a bit messier and leaves open the possibility of a user overriding one default method but not the other.
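The hazard with separate class methods can be made concrete with a small sketch. Everything here is hypothetical (the StateCodec class, its method names, and the Show/Read-based defaults standing in for the safecopy-based ones); it only illustrates how an instance can override the encoder while silently keeping the default decoder.

```haskell
{-# LANGUAGE DefaultSignatures #-}
module Main where

import qualified Data.ByteString.Lazy.Char8 as Lazy
import Text.Read (readMaybe)

-- Hypothetical "unpacked" variant: encoder and decoder are separate
-- class methods, each with its own default.
class StateCodec st where
  encodeState :: st -> Lazy.ByteString
  default encodeState :: Show st => st -> Lazy.ByteString
  encodeState = Lazy.pack . show

  decodeState :: Lazy.ByteString -> Either String st
  default decodeState :: Read st => Lazy.ByteString -> Either String st
  decodeState bs =
    maybe (Left "decodeState: no parse") Right (readMaybe (Lazy.unpack bs))

newtype Counter = Counter Int deriving (Eq, Show, Read)

-- Only the encoder is overridden. This compiles without complaint,
-- but the default decoder still expects the derived Show format
-- ("Counter 3"), so the two halves no longer agree.
instance StateCodec Counter where
  encodeState (Counter n) = Lazy.pack (show n)

main :: IO ()
main =
  -- The round trip fails at run time rather than at compile time.
  print (decodeState (encodeState (Counter 3)) :: Either String Counter)
```

With a single Serialiser record, the encoder and decoder are always replaced together, which rules out this mismatch by construction.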

It's also not obvious to me whether it would be worth permitting the encoding to be swapped out at the level of events/checkpoints. For my present application it probably doesn't matter much; all I really need is to be able to convert a checkpoint to and from an external binary format with a minimum of fuss.

Finally, it's slightly unsatisfying that we always have to define a serialiser for method results, even though this is needed only if the remote functionality is used. Have you ever considered refactoring to avoid this? I suppose this would require a type distinction to identify AcidState components that could be used with the remote code, and perhaps that's too high a price to pay.

adamgundry commented on June 8, 2024 1

@dmjio thanks! Sorry it is taking me a little while to get to this...

I'm not sure that changing the API depending on a cabal flag is a good idea. The problem is that client libraries can't express a dependency on a flag choice (since that is ultimately a matter for the person building an application). However, a library using acid-state might well require a particular serialisation layer (because it will only be providing instances for one or the other). Moreover, two libraries in a single application might want to use acid-state with different serialisation layers.

Really the "right" solution to this kind of problem is Backpack, but I guess that's not a realistic option until it's more widely available.

The approach I'm proposing does mean that acid-state will still have a safecopy dependency, which might in principle be redundant (though only if we also permit changing the serialisers for events/checkpoints, which I'm still investigating).

I suppose we could drop the default methods, and have TH supply the default safecopy-based serialisers. That would mean a little more work for users creating instances manually (rather than via TH), but it might make possible a no-TH version of acid-state that did not depend on safecopy.

dmjio commented on June 8, 2024

cc @stepcut @lemmih

dcoutts commented on June 8, 2024

It'd be of interest to me, for a separate project from the one @adamgundry is working on.

adamgundry commented on June 8, 2024

FWIW, I don't think the design sketched above is quite right, because it forces the use of orphan Serialisable instances in order to satisfy Serialisable (MethodResult ev), e.g. if MethodResult MyMethod = (). I'll explore the details a bit more. But the general idea still stands.

dmjio commented on June 8, 2024

@adamgundry @dcoutts. Since no one has responded, I'll say we're all in favor of these changes and would gladly accept a PR. Although, instead of a default implementation and TH to generate empty instances, do you think it would be best to include cborg / serialise conditionally on a cabal flag, along with a module that contains both the class declaration and the relevant instances (so as to avoid orphans)? This would reduce the reliance on TH and would only pull in the relevant dependencies during the build. Just a suggestion, so please advise. Hypothetical cabal file section below.

if flag(cborg)
  build-depends: cborg
  hs-source-dirs: cborg
  exposed-modules: Data.Acid.Core

if flag(serialise)
  build-depends: serialise
  hs-source-dirs: serialise
  exposed-modules: Data.Acid.Core

^ defaults to safecopy

Implement as you wish, but again, we're all in favor.

adamgundry commented on June 8, 2024

Thanks @stepcut. I agree that the multiple-packages option is plausible, and it should be feasible to refactor the library along the lines you describe (at least provided we drop the default method approach I suggested above in favour of TH-based defaults, i.e. require a bit more boilerplate from non-TH users). My only concern is that splitting into 3/4 packages would be quite a big change. Do you think the benefits of supporting new serialisation backends would justify it?

> It would be nice if there was a more verbose or human-readable serialization format that could be used for checkpoints that you want to archive for backup purposes.

This is pretty much my use case. CBOR fills this role nicely as it can be decoded by standard tools without requiring the original code (e.g. generic conversion to JSON is possible). It's also a relatively compact encoding and performs well.
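For illustration, a CBOR-backed serialiser could plausibly be written against the Serialiser record from the revised design using the serialise package. This is a sketch under assumptions, not the implementation proposed in this thread: cborSerialiser is a hypothetical name, the Serialiser record is repeated here so the example stands alone, and the serialise and bytestring packages are assumed to be available.

```haskell
module Main where

import qualified Data.ByteString.Lazy as Lazy
import Codec.Serialise (Serialise, serialise, deserialiseOrFail)

-- Same shape as the Serialiser record in the revised design.
data Serialiser a = Serialiser
  { encode :: a -> Lazy.ByteString
  , decode :: Lazy.ByteString -> Either String a
  }

-- Hypothetical CBOR-backed serialiser: any type with a Serialise
-- instance is encoded to CBOR bytes, and decoding failures are
-- rendered as a String to fit the record's decode type.
cborSerialiser :: Serialise a => Serialiser a
cborSerialiser = Serialiser
  { encode = serialise
  , decode = either (Left . show) Right . deserialiseOrFail
  }

main :: IO ()
main = do
  let value = [(1 :: Int, "one"), (2, "two")]
      bytes = encode cborSerialiser value
  -- Size of the CBOR encoding in bytes.
  print (Lazy.length bytes)
  -- Decode again with the same serialiser.
  print (decode cborSerialiser bytes :: Either String [(Int, String)])
```

Because the output is plain CBOR, the bytes written this way could in principle be inspected with generic CBOR tooling, which is the archival property described above.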
