Coder Social home page Coder Social logo

Comments (6)

Fatal1ty avatar Fatal1ty commented on June 5, 2024 1

But, the combination of the two, now isn't:

In [12]: WrapperCls(data=ConfigMessage(config={"foo":"bar"})).to_msgpack()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 WrapperCls(data=ConfigMessage(config={"foo":"bar"})).to_msgpack()

File <string>:9, in __mashumaro_to_msgpack__(self, encoder)

File /opt/conda/lib/python3.8/site-packages/mashumaro/mixins/msgpack.py:29, in default_encoder(data)
     28 def default_encoder(data: Any) -> EncodedData:
---> 29     return msgpack.packb(data, use_bin_type=True)

File /opt/conda/lib/python3.8/site-packages/msgpack/__init__.py:36, in packb(o, **kwargs)
     30 def packb(o, **kwargs):
     31     """
     32     Pack object `o` and return packed bytes
     33 
     34     See :class:`Packer` for options.
     35     """
---> 36     return Packer(**kwargs).pack(o)

File msgpack/_packer.pyx:294, in msgpack._cmsgpack.Packer.pack()

File msgpack/_packer.pyx:300, in msgpack._cmsgpack.Packer.pack()

File msgpack/_packer.pyx:297, in msgpack._cmsgpack.Packer.pack()

File msgpack/_packer.pyx:231, in msgpack._cmsgpack.Packer._pack()

File msgpack/_packer.pyx:291, in msgpack._cmsgpack.Packer._pack()

TypeError: can not serialize 'ConfigMessage' object

When you call WrapperCls(…).to_msgpack() you call the method to_msgpack compiled for unbound TypeVar which is interpreted as Any and you will get any data passed as is in the end. There is no type introspection at runtime. Moreover, it’s not possible to deserialize something unknown to WrapperCls instance because T can be anything. You either need to create a specific dataclass model with T specified (as I showed in the previous message for FooWrapperCls), or create an encoder for specific data type:

from mashumaro.codecs.msgpack import MessagePackEncoder

config_message_encoder = MessagePackEncoder(WrapperCls[ConfigMessage])
config_message_encoder.encode(WrapperCls(ConfigMessage(config={"foo":"bar"})))

See my answer in the similar issue: #193 (comment)

You can also play with discriminators, hooks, custom serialization strategies, since only you know better how to distinguish all variety of your data. Just as a silly example, it’s possible to achieve runtime introspection on your side using (de)serialization methods. However, if you care about performance, I encourage you to create specific dataclass models / encoders / decoders instead.

def serialize_t(value):
    if isinstance(value, DataClassDictMixin):
        return value.to_dict()
    else:
        raise NotImplementedError

@dataclass
class WrapperCls(Generic[T], DataClassMessagePackMixin):
    data: T = field(metadata=field_options(serialize=serialize_t))

from mashumaro.

Fatal1ty avatar Fatal1ty commented on June 5, 2024

Hi @stamostz!

This code works properly on mashumaro 3.0.4

It works indeed, but not in the way that can be expected. I installed 3.0.4 (the same will be on 3.8.1) and enabled debug to see what happens in your example. Basically, for field data in dataclass WrappedCls the following method is generated:

__main__.WrapperCls:
@classmethod
def __unpack_union_WrapperCls_data__8f8113bcd6294acd898493f0d316a6f8(cls, value, *, dialect=None):
    try:
        return mashumaro_mixins_msgpack_DataClassMessagePackMixin.from_dict(value)
    except Exception: pass
    try:
        return mashumaro_mixins_json_DataClassJSONMixin.from_dict(value)
    except Exception: pass
    raise InvalidFieldValue('data',typing.Union[mashumaro.mixins.msgpack.DataClassMessagePackMixin, mashumaro.mixins.json.DataClassJSONMixin],value,cls)

The problematic expressions here are mashumaro_mixins_msgpack_DataClassMessagePackMixin.from_dict(value) and mashumaro_mixins_json_DataClassJSONMixin.from_dict(value). They work with no doubts but creating an instance of DataClassMessagePackMixin or DataClassJSONMixin is meaningless. This was addressed and changed in:

I am wrapping any kind of data in a wrapper class. I want it to be serializable and the data I'm wrapping to also be serializable

Good news for you! Starting with version 3.9, you no longer need to inherit mixins for nested dataclasses. Unfortunately, it would be problematic to express any type serializable with mashumaro, so I suggest you just drop bound in your TypeVar:

from datetime import date
from typing import TypeVar, Generic, Optional, List
from mashumaro.mixins.msgpack import DataClassMessagePackMixin
from mashumaro.mixins.json import DataClassJSONMixin
from dataclasses import dataclass

T = TypeVar("T")


@dataclass
class WrapperCls(Generic[T], DataClassMessagePackMixin, DataClassJSONMixin):
    data: Optional[T] = None


@dataclass
class Foo:
    x: date


@dataclass
class FooWrapperCls(WrapperCls[Foo]):
    pass


@dataclass
class DateWrapperCls(WrapperCls[date]):
    pass


@dataclass
class ListOfDatesWrapperCls(WrapperCls[List[date]]):
    pass


assert FooWrapperCls.from_json('{"data": {"x": "2024-03-01"}}').data.x == date(2024, 3, 1)
assert DateWrapperCls.from_json('{"data": "2024-03-01"}').data == date(2024, 3, 1)
assert ListOfDatesWrapperCls.from_json('{"data": ["2024-03-01"]}').data[0] == date(2024, 3, 1)

from mashumaro.

stamostz avatar stamostz commented on June 5, 2024

Thank @Fatal1ty for the elaborate response.

I can indeed drop bound=Union[DataClassMessagePackMixin, DataClassJSONMixin] and the error is resolved. However, is my TypeVar T now type-safe? From your quote and the appended issue, I understand it is type-safe.

Good news for you! Starting with version 3.9, you no longer need to inherit mixins for nested dataclasses.

However, there are cases I need the nested dataclasses to be mixins. Here's an example:

from dataclasses import dataclass
from mashumaro.mixins.msgpack import DataClassMessagePackMixin
from mashumaro.mixins.json import DataClassJSONMixin
from typing import Dict, Any


@dataclass
class ConfigMessage(DataClassMessagePackMixin, DataClassJSONMixin):
    config: Dict[str, Any]

In this scenario, my WrapperCls is sent over some network with the data field (along with others, metadata, etc) which in this case is a ConfigMessage. Then, in another software block, I want to serialize/deserialize my ConfigMessage for other reasons/usecases. So I need to able to receive data and have it be serializable.

I did some digging and here's what is odd:
My WrapperCls is mashumaro serializable for basic examples. I can confirm it with a console:

In [8]: ResultResponse(data="foo").to_msgpack()
Out[8]: b'\x82\xa4data\xa3foo\xa5error\xc0'

In [9]: ResultResponse(data=4).to_msgpack()
Out[9]: b'\x82\xa4data\x04\xa5error\xc0'

In [10]: ResultResponse(data={"foo": "bar"}).to_msgpack()
Out[10]: b'\x82\xa4data\x81\xa3foo\xa3bar\xa5error\xc0'

My ConfigMessage class, also is mashumaro serializable:

In [11]: ConfigMessage(config={"foo": "bar"}).to_msgpack()
Out[11]: b'\x81\xa6config\x81\xa3foo\xa3bar'

But, the combination of the two, now isn't:

In [12]: WrapperCls(data=ConfigMessage(config={"foo":"bar"})).to_msgpack()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 WrapperCls(data=ConfigMessage(config={"foo":"bar"})).to_msgpack()

File <string>:9, in __mashumaro_to_msgpack__(self, encoder)

File /opt/conda/lib/python3.8/site-packages/mashumaro/mixins/msgpack.py:29, in default_encoder(data)
     28 def default_encoder(data: Any) -> EncodedData:
---> 29     return msgpack.packb(data, use_bin_type=True)

File /opt/conda/lib/python3.8/site-packages/msgpack/__init__.py:36, in packb(o, **kwargs)
     30 def packb(o, **kwargs):
     31     """
     32     Pack object `o` and return packed bytes
     33 
     34     See :class:`Packer` for options.
     35     """
---> 36     return Packer(**kwargs).pack(o)

File msgpack/_packer.pyx:294, in msgpack._cmsgpack.Packer.pack()

File msgpack/_packer.pyx:300, in msgpack._cmsgpack.Packer.pack()

File msgpack/_packer.pyx:297, in msgpack._cmsgpack.Packer.pack()

File msgpack/_packer.pyx:231, in msgpack._cmsgpack.Packer._pack()

File msgpack/_packer.pyx:291, in msgpack._cmsgpack.Packer._pack()

TypeError: can not serialize 'ConfigMessage' object

What am I not understanding properly?

from mashumaro.

stamostz avatar stamostz commented on June 5, 2024

Spot on, this worked like a charm.

Finally, I'm overriding from_json and from_msgpack to include the type in my WrapperCls like this:

def serialize_t(value):
    if value is None:
        return None
    if isinstance(value, DataClassDictMixin):
        return value.to_dict()
    else:
        raise NotImplementedError

@dataclass
class WrapperCls(Generic[T], DataClassMessagePackMixin, DataClassJSONMixin):
    data: Optional[T] = field(metadata=field_options(serialize=serialize_t))
    
    @classmethod
    def from_json(
        cls: "WrapperCls",
        data: EncodedData,
        type: Type[T],
        decoder: Decoder = json.loads,
        **from_dict_kwargs,
    ) -> "WrapperCls[T]":
        decoded_data = decoder(data)

        data_part_maybe = decoded_data.get("data")
        if data_part_maybe is not None:
            # give mashumaro type hint
            return WrapperCls(
                data=type.from_dict(data_part_maybe, **from_dict_kwargs)
            )

        return cls.from_dict(decoder(data), **from_dict_kwargs)

    @classmethod
    def from_msgpack(
        cls: "WrapperCls",
        data: EncodedData,
        type: Type[T],
        decoder: Decoder = default_decoder,
        **from_dict_kwargs,
    ) -> "WrapperCls[T]":
        decoded_data = decoder(data)

        data_part_maybe = decoded_data.get("data")
        if data_part_maybe is not None:
            # give mashumaro type hint
            return WrapperCls(
                data=type.from_dict(
                    data_part_maybe,
                    **dict({"dialect": MessagePackDialect}, **from_dict_kwargs),
                )
            )

        return cls.from_dict(
            decoder(data), **dict({"dialect": MessagePackDialect}, **from_dict_kwargs)
        )

See the examples:

In [5]: j = WrapperCls(data=ConfigMessage(config={"foo":"bar"})).to_json()

In [6]: WrapperCls.from_json(j, ConfigMessage)
Out[6]: ResultResponse(data=ConfigMessage(config={'foo': 'bar'}))

Cheers, this works nicely.

However, when I do the same with from_msgpack:

In [7]: m = WrapperCls(data=ConfigMessage(config={"foo":"bar"})).to_msgpack()

In [8]: WrapperCls.from_msgpack(m, ConfigMessage)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 WrapperCls.from_msgpack(m, ConfigMessage)

File <string>:18, in __mashumaro_from_msgpack__(cls, d, decoder, dialect)

ValueError: Argument for WrapperCls.__mashumaro_from_msgpack__ method should be a dict instance

That's weird, inspecting a bit deeper, we can see that the instance is indeed a dict. That makes me think that the method is not really overriden but some default implementation is taken. We can confirm that by renaming our method:

    @classmethod
    def from_msgpack_typed(
        cls: "WrapperCls",
        data: EncodedData,
        type: Type[T],
        decoder: Decoder = default_decoder,
        **from_dict_kwargs,
    ) -> "WrapperCls[T]":
        ...

Now, the same example:

In [3]: m = WrapperCls(data=ConfigMessage(config={"foo":"bar"})).to_msgpack()

In [4]: WrapperCls.from_msgpack_typed(m, ConfigMessage)
Out[4]: WrapperCls(data=ConfigMessage(config={'foo': 'bar'}))

Yay! This works. Of course, it's the lesser evil but I can rename all from_msgpack instances in my codebase to from_msgpack_typed but really the question here is:
Why isn't from_msgpack being overriden?

from mashumaro.

Fatal1ty avatar Fatal1ty commented on June 5, 2024

Yay! This works. Of course, it's the lesser evil but I can rename all from_msgpack instances in my codebase to from_msgpack_typed but really the question here is: Why isn't from_msgpack being overriden?

Because this is how mixins work. There is no point in using mixins that provide these methods if you define them by yourself. Again, if you need some extra work during (de)serialization, you can use pre/post hooks.

Addition: There is implementation difference between DataClassJSONMixin and DataClassMessagePackMixin. You can override from_json because it’s a simple method that calls from_dict. On the other hand, from_msgpack is decorated with @final decorator for a reason — it’s a warning that you shouldn’t override it in subclasses (mypy will yell if you do that). This method looks like a stub because the specific implementation will be compiled and injected into the dataclass. So, if you break the rules and try to override it, you will be surprised. Probably, it would be nice to call warnings.warn(…) in case someone has it overridden in a subclass.

I see you want “typed” from_* methods that have a T replacement as an argument. It’s different from what mixins from mashumaro provide. I guess, if you like mixins, you can stop using ones from mashumaro and create your own that will be providing methods with different signature. Generic version with codecs under the hood can be something like this:

@classmethod
def from_msgpack(
    cls: "WrapperCls",
    data: EncodedData,
    data_type: Type[T],
    msgpack_decoder: Decoder = default_decoder,
) -> "WrapperCls[T]":
    decoded_data = msgpack_decoder(data)
    data_part = decoded_data.get("data")
    dict_decoder = cls.__dict_decoders__.get(data_type)
    if not dict_decoder:
        dict_decoder = BasicDecoder(data_type)
        cls.__dict_decoders__[data_type] = dict_decoder
    return cls(dict_decoder.decode(data_part))

If you want to be sure that T replacement is serializable through bound, you can do it:

T = TypeVar(“T”, DataClassDictMixin)

@classmethod
def from_msgpack(
   cls: "WrapperCls",
   data: EncodedData,
   data_type: Type[T],
   msgpack_decoder: Decoder = default_decoder,
) -> "WrapperCls[T]":
   decoded_data = msgpack_decoder(data)
   data_part = decoded_data.get("data")
   return cls(data_type.from_dict(data_part))

from mashumaro.

Fatal1ty avatar Fatal1ty commented on June 5, 2024

I’m going to close this issue as stale. If you still need help, feel free to answer.

from mashumaro.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.