Coder Social home page Coder Social logo

Comments (15)

bellini666 avatar bellini666 commented on July 21, 2024 1

Oh interesting! Does that mean disabling qs._fetch_all() prevents the query from running async?

It prevents you from returning querysets to be resolved at graphql-core level, as iterating over them without them already being cached will trigger a db query T_T

from strawberry.

fireteam99 avatar fireteam99 commented on July 21, 2024 1

I see, thanks for the explanation! Glad there will be a more permanent fix coming in the future

from strawberry.

patrick91 avatar patrick91 commented on July 21, 2024

@bellini666 how do you handle this use case in django?

from strawberry.

shmoon-kr avatar shmoon-kr commented on July 21, 2024

Another idea is to add a boolean flag to the parameters of strawberry_django.fields() that indicates a counter query.

For example, one can define a counter query like the following:

@strawberry.type
class Query:
    fruit_count: int = strawberry_django.field(count=True)

And in a filter apply resolver,

def apply(
    filters: Optional[object],
    queryset: _QS,
    info: Optional[Info] = None,
    pk: Optional[Any] = None,
    count: Bool = False,
) -> _QS:
    if pk not in (None, strawberry.UNSET):  # noqa: PLR6201
        queryset = queryset.filter(pk=pk)

    if filters in (None, strawberry.UNSET) or not has_django_definition(filters):  # noqa: PLR6201
        return queryset.count() if count else queryset

    ...

    return queryset.count() if count else queryset

Then a user can make counter function easily with the default field resolver.

I'm not familiar with the structure of the library but I think you can do easily with the above idea.

from strawberry.

bellini666 avatar bellini666 commented on July 21, 2024

Is there any way to make the metadata visible in the query?

@shmoon-kr what exactly do you mean by this? I'm not sure if I got it from your example.

from strawberry.

shmoon-kr avatar shmoon-kr commented on July 21, 2024

@bellini666

Please refer to the following link:
https://strawberry.rocks/docs/guides/pagination/overview

In the documentation above, there is an example of using metadata to return a total count when implementing the pagination function.

According to the above article, "It may be useful to add metadata to the result. For example, the metadata may specify how many items there are in total, so that the client knows what the greatest offset value can be." and gives the following query result example.

{
  "users": [
    ...
  ]
  "metadata": {
    "count": 25
  }
}

This is exactly what I need, but I don't have an example of how to add metadata to the results in a query resolver and how to include it in the resolver.

I checked the source code and saw that the StrawberryField class has a member variable called metadata, so I assumed I could assign it to this variable, and since the StrawberryDjangoField class inherits from the StrawberryField class, I thought I could do the same.

As a test, I tried to modify the pagination apply resolver defined in strawberry_django to include the metadata, but that didn't make the metadata appear in the query result like above. So, I'm asking how to get the metadata to appear in the query results, like the results of my query.

Thank you for your interest in this case.

--

from strawberry.

bellini666 avatar bellini666 commented on July 21, 2024

@shmoon-kr oh, I see what you mean

You can probably do something similar to what I do on strawberry-graphql-django for ListConnectionWithTotalCound: https://github.com/strawberry-graphql/strawberry-graphql-django/blob/main/strawberry_django/relay.py#L51

In there I overridden resolve_connection to store nodes at the object, which is basically the django's QuerySet, and then I use it in the total_count resolver, which could instead return a Metadata object.

Of course you would need to change your return value and probably implement your own resolver for that.

Let me know if that points you to the right direction. Otherwise feel free to share more code in here so I can give you more advices

from strawberry.

fireteam99 avatar fireteam99 commented on July 21, 2024

Hi all, I'm also using strawberry-django and have a similar need for pagination metadata without using Relay. My current approach is a manually resolve my list queries using a generic "wrapper" node.

@strawberry.type
class GenericListNode(Generic[T]):
    page_meta: PageMetaNode
    results: List[T]


def get_generic_list(
    root,
    info: Info,
    model: Type[Model],
    node: Type[T],
    pagination: Optional[OffsetPaginationInput] = strawberry.UNSET,
    filters = None,
    order = None,
) -> GenericListNode[T]:
    """
    Provides pagination metadata with support for Strawberry Django's built in pagination, sorting, and ordering.
    """

    # for some reason trying to set pagination as a default argument does not work
    if not pagination:
        pagination = OffsetPaginationInput(offset=0, limit=30)

    qs: List[T] = model.objects.all()

    get_queryset = getattr(node, "get_queryset", None)
    if callable(get_queryset):
        qs = node.get_queryset(qs, info)

    qs: List[T] = strawberry_django.ordering.apply(order, qs)
    qs: List[T] = strawberry_django.filters.apply(filters, qs)

    total_count = qs.count()

    qs: List[T] = strawberry_django.pagination.apply(pagination, qs)

    return GenericListNode(
        page_meta=PageMetaNode(
            total_count=total_count,
        ),
        results=qs,
    )

# Usage
def get_foos(
    root,
    info: Info,
    pagination: Optional[OffsetPaginationInput] = None,
    filters: Optional[FooFilter] = None,
    order: Optional[FooOrder] = None,
) -> GenericListNode[UserNode]:
    return get_generic_list(root, info, FooModel, Foo, pagination, filter, order)

@strawberry.type(name="Query")
class FooQuery:
    foos: GenericListNode[Foo] = strawberry.field(resolver=get_foos)

This approach isn't perfect, since you still have to write a resolver function for each query and can't take advantage of strawberry-django's automatic list resolver.

Therefore, I've been trying encapsulate this logic into a field extension. Unfortunatly, this results in a performance issues because the next_ function in resolve evaluates the entire queryset (i think due to this line). This makes queries on large tables very slow - even when applying aggressive pagination.

Extension:

class PaginationExtension(FieldExtension):
    def apply(self, field: StrawberryField) -> None:
        type_argument = getattr(field.type, "of_type", None)

        if not isinstance(field.type, StrawberryList) or not type_argument:
            raise TypeError("Invalid type for PaginationExtension")

        # check if a pagination argument is already provided - either as a resolver argument or setting `pagination=True` with `strawberry_django.type`
        pagination_argument_exists = False
        for argument in field.arguments:
            if (
                argument.python_name == "pagination"
                and getattr(argument.type, "of_type", None) == OffsetPaginationInput
            ):
                pagination_argument_exists = True
                break

        if not pagination_argument_exists:
            raise ValueError("Missing pagination argument")

        field.type = GenericListNode[type_argument]

    def resolve(
        self,
        next_: Callable[..., Any],
        source: Any,
        info: Info,
        pagination: Optional[OffsetPaginationInput] = None,
        **kwargs,
    ):
        if not pagination:
            pagination = OffsetPaginationInput(offset=0, limit=30)

        # This is really slow because the default Strawberry Django resolver attempts to fetch and evaluate the entire queryset
        # before applying pagination. Source: https://github.com/strawberry-graphql/strawberry-graphql-django/blob/35a85e7d1b9d3cb30a91b167dbfafaaa22453e75/strawberry_django/resolvers.py#L29-L34
        results = next_(source, info, **kwargs)

        paginated_results = strawberry_django.pagination.apply(pagination, results)

        return GenericListNode(
            page_meta=PageMetaNode(
                total_count=results.count(),
            ),
            results=paginated_results,
        )

# Usage
@strawberry.type(name="Query")
class FooQuery:
    foos: list[Foo] = strawberry_django.field(
        extensions=[PaginationExtension()]
    )

@bellini666 would it be possible to prevent or delay the queryset from being evaluated by next_ in a field extension? I'd also love to get your thoughts on my approach in general.

from strawberry.

bellini666 avatar bellini666 commented on July 21, 2024

@fireteam99

would it be possible to prevent or delay the queryset from being evaluated by next_ in a field extension?

Yeah, that's the optimizer doing its thing =P

The relay connection also has a similar issue, and we workaround that internally by not doing that fetch you pointed out in here: https://github.com/strawberry-graphql/strawberry-django/blob/a4cc6a77ad221b6eb968685e9dfa54e9f82c5f51/strawberry_django/fields/field.py#L250

I would say that a very "ugly" workaround would be for you to do field.is_connection = True on your apply method. That might make it return the queryset without fetching it, I'm just not sure if it is going to have any other unwanted side effects...

A better fix would be to add a way to define if the field should do the prefetch or not, which by default uses the is_connection as I mentioned, but can also be overriden by an extension? What do you think?

I'd also love to get your thoughts on my approach in general.

Your approach seems fine :)

from strawberry.

fireteam99 avatar fireteam99 commented on July 21, 2024

@bellini666 thanks for the explanation!

The relay connection also has a similar issue, and we workaround that internally by not doing that fetch you pointed out in here: https://github.com/strawberry-graphql/strawberry-django/blob/a4cc6a77ad221b6eb968685e9dfa54e9f82c5f51/strawberry_django/fields/field.py#L250

Ahh i see. Can confirm setting field.is_connection = True does indeed fix the performance issues! I didn't notice any issues with ordering, filter, prefetching, or any unwanted side affects for that matter. From what I understand, the qs._fetch_all() is used by the internal optimizer? I'm curious why I didn't encounter any n+1 issues even after disabling it.

A better fix would be to add a way to define if the field should do the prefetch or not, which by default uses the is_connection as I mentioned, but can also be overriden by an extension? What do you think?

Agreed, this would be a much better solution! Is this something we can override in an extension right now? Or will it require a code change in the library?

from strawberry.

bellini666 avatar bellini666 commented on July 21, 2024

From what I understand, the qs._fetch_all() is used by the internal optimizer? I'm curious why I didn't encounter any n+1 issues even after disabling it.

Actually the opposite. qs._fetch_all() is what django does internally to cache the results, so after that iterating over the queryset will use the already fetched results instead of querying the db again.

That is done like that because when running async, you can only access the database from the main thread (usually using sync_to_async), otherwise you get something like this:

Screenshot 2024-02-16 at 11 00 38

Agreed, this would be a much better solution! Is this something we can override in an extension right now? Or will it require a code change in the library?

There isn't a way right now, but I'll try to add an api for that during the weekend, or maybe try to remove the _fetch_all() call and solve that in another way. I have an idea that might work...

from strawberry.

fireteam99 avatar fireteam99 commented on July 21, 2024

Actually the opposite. qs._fetch_all() is what django does internally to cache the results, so after that iterating over the queryset will use the already fetched results instead of querying the db again.

That is done like that because when running async, you can only access the database from the main thread (usually using sync_to_async), otherwise you get something like this:

Oh interesting! Does that mean disabling qs._fetch_all() prevents the query from running async?

There isn't a way right now, but I'll try to add an api for that during the weekend, or maybe try to remove the _fetch_all() call and solve that in another way. I have an idea that might work...

Awesome, thank you so much!

from strawberry.

bellini666 avatar bellini666 commented on July 21, 2024

Hey @fireteam99 , I think this should allow you to solve your issue: strawberry-graphql/strawberry-django#481

When it is merged, you just need to do field.disable_fetch_list_results = True in your extension apply method

Let me know what you think.

There isn't a way right now, but I'll try to add an api for that during the weekend, or maybe try to remove the _fetch_all() call and solve that in another way. I have an idea that might work...

Also, regarding this... I couldn't find a way right now, but it seems that the next graphql-core version, once it is released, will allow us to remove that workaround once and for all :)

from strawberry.

fireteam99 avatar fireteam99 commented on July 21, 2024

Hi @bellini666, thanks for adding this feature! This is a lot cleaner than the connection workaround.

Also, regarding this... I couldn't find a way right now, but it seems that the next graphql-core version, once it is released, will allow us to remove that workaround once and for all :)

Gotcha gotcha, so this shouldn't be an issue as long as I'm not using async?

from strawberry.

bellini666 avatar bellini666 commented on July 21, 2024

Gotcha gotcha, so this shouldn't be an issue as long as I'm not using async?

No. Although strawberry-django will do the fetching for both, unless you set the disable_fetch_list_results attribute to True

from strawberry.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.