Comments (15)
Oh interesting! Does that mean disabling qs._fetch_all() prevents the query from running async?
It prevents you from returning querysets to be resolved at graphql-core level, as iterating over them without them already being cached will trigger a db query T_T
from strawberry.
I see, thanks for the explanation! Glad there will be a more permanent fix coming in the future
from strawberry.
@bellini666 how do you handle this use case in django?
from strawberry.
Another idea is to add a boolean flag to the parameters of strawberry_django.fields() that indicates a counter query.
For example, one can define a counter query like the following:
@strawberry.type
class Query:
fruit_count: int = strawberry_django.field(count=True)
And in a filter apply resolver,
def apply(
filters: Optional[object],
queryset: _QS,
info: Optional[Info] = None,
pk: Optional[Any] = None,
count: Bool = False,
) -> _QS:
if pk not in (None, strawberry.UNSET): # noqa: PLR6201
queryset = queryset.filter(pk=pk)
if filters in (None, strawberry.UNSET) or not has_django_definition(filters): # noqa: PLR6201
return queryset.count() if count else queryset
...
return queryset.count() if count else queryset
Then a user can make counter function easily with the default field resolver.
I'm not familiar with the structure of the library but I think you can do easily with the above idea.
from strawberry.
Is there any way to make the metadata visible in the query?
@shmoon-kr what exactly do you mean by this? I'm not sure if I got it from your example.
from strawberry.
Please refer to the following link:
https://strawberry.rocks/docs/guides/pagination/overview
In the documentation above, there is an example of using metadata to return a total count when implementing the pagination function.
According to the above article, "It may be useful to add metadata to the result. For example, the metadata may specify how many items there are in total, so that the client knows what the greatest offset value can be." and gives the following query result example.
{
"users": [
...
]
"metadata": {
"count": 25
}
}
This is exactly what I need, but I don't have an example of how to add metadata to the results in a query resolver and how to include it in the resolver.
I checked the source code and saw that the StrawberryField class has a member variable called metadata, so I assumed I could assign it to this variable, and since the StrawberryDjangoField class inherits from the StrawberryField class, I thought I could do the same.
As a test, I tried to modify the pagination apply resolver defined in strawberry_django to include the metadata, but that didn't make the metadata appear in the query result like above. So, I'm asking how to get the metadata to appear in the query results, like the results of my query.
Thank you for your interest in this case.
--
from strawberry.
@shmoon-kr oh, I see what you mean
You can probably do something similar to what I do on strawberry-graphql-django for ListConnectionWithTotalCound
: https://github.com/strawberry-graphql/strawberry-graphql-django/blob/main/strawberry_django/relay.py#L51
In there I overridden resolve_connection
to store nodes
at the object, which is basically the django's QuerySet
, and then I use it in the total_count
resolver, which could instead return a Metadata
object.
Of course you would need to change your return value and probably implement your own resolver for that.
Let me know if that points you to the right direction. Otherwise feel free to share more code in here so I can give you more advices
from strawberry.
Hi all, I'm also using strawberry-django and have a similar need for pagination metadata without using Relay. My current approach is a manually resolve my list queries using a generic "wrapper" node.
@strawberry.type
class GenericListNode(Generic[T]):
page_meta: PageMetaNode
results: List[T]
def get_generic_list(
root,
info: Info,
model: Type[Model],
node: Type[T],
pagination: Optional[OffsetPaginationInput] = strawberry.UNSET,
filters = None,
order = None,
) -> GenericListNode[T]:
"""
Provides pagination metadata with support for Strawberry Django's built in pagination, sorting, and ordering.
"""
# for some reason trying to set pagination as a default argument does not work
if not pagination:
pagination = OffsetPaginationInput(offset=0, limit=30)
qs: List[T] = model.objects.all()
get_queryset = getattr(node, "get_queryset", None)
if callable(get_queryset):
qs = node.get_queryset(qs, info)
qs: List[T] = strawberry_django.ordering.apply(order, qs)
qs: List[T] = strawberry_django.filters.apply(filters, qs)
total_count = qs.count()
qs: List[T] = strawberry_django.pagination.apply(pagination, qs)
return GenericListNode(
page_meta=PageMetaNode(
total_count=total_count,
),
results=qs,
)
# Usage
def get_foos(
root,
info: Info,
pagination: Optional[OffsetPaginationInput] = None,
filters: Optional[FooFilter] = None,
order: Optional[FooOrder] = None,
) -> GenericListNode[UserNode]:
return get_generic_list(root, info, FooModel, Foo, pagination, filter, order)
@strawberry.type(name="Query")
class FooQuery:
foos: GenericListNode[Foo] = strawberry.field(resolver=get_foos)
This approach isn't perfect, since you still have to write a resolver function for each query and can't take advantage of strawberry-django's automatic list resolver.
Therefore, I've been trying encapsulate this logic into a field extension. Unfortunatly, this results in a performance issues because the next_
function in resolve
evaluates the entire queryset (i think due to this line). This makes queries on large tables very slow - even when applying aggressive pagination.
Extension:
class PaginationExtension(FieldExtension):
def apply(self, field: StrawberryField) -> None:
type_argument = getattr(field.type, "of_type", None)
if not isinstance(field.type, StrawberryList) or not type_argument:
raise TypeError("Invalid type for PaginationExtension")
# check if a pagination argument is already provided - either as a resolver argument or setting `pagination=True` with `strawberry_django.type`
pagination_argument_exists = False
for argument in field.arguments:
if (
argument.python_name == "pagination"
and getattr(argument.type, "of_type", None) == OffsetPaginationInput
):
pagination_argument_exists = True
break
if not pagination_argument_exists:
raise ValueError("Missing pagination argument")
field.type = GenericListNode[type_argument]
def resolve(
self,
next_: Callable[..., Any],
source: Any,
info: Info,
pagination: Optional[OffsetPaginationInput] = None,
**kwargs,
):
if not pagination:
pagination = OffsetPaginationInput(offset=0, limit=30)
# This is really slow because the default Strawberry Django resolver attempts to fetch and evaluate the entire queryset
# before applying pagination. Source: https://github.com/strawberry-graphql/strawberry-graphql-django/blob/35a85e7d1b9d3cb30a91b167dbfafaaa22453e75/strawberry_django/resolvers.py#L29-L34
results = next_(source, info, **kwargs)
paginated_results = strawberry_django.pagination.apply(pagination, results)
return GenericListNode(
page_meta=PageMetaNode(
total_count=results.count(),
),
results=paginated_results,
)
# Usage
@strawberry.type(name="Query")
class FooQuery:
foos: list[Foo] = strawberry_django.field(
extensions=[PaginationExtension()]
)
@bellini666 would it be possible to prevent or delay the queryset from being evaluated by next_
in a field extension? I'd also love to get your thoughts on my approach in general.
from strawberry.
would it be possible to prevent or delay the queryset from being evaluated by next_ in a field extension?
Yeah, that's the optimizer doing its thing =P
The relay connection also has a similar issue, and we workaround that internally by not doing that fetch you pointed out in here: https://github.com/strawberry-graphql/strawberry-django/blob/a4cc6a77ad221b6eb968685e9dfa54e9f82c5f51/strawberry_django/fields/field.py#L250
I would say that a very "ugly" workaround would be for you to do field.is_connection = True
on your apply
method. That might make it return the queryset without fetching it, I'm just not sure if it is going to have any other unwanted side effects...
A better fix would be to add a way to define if the field should do the prefetch or not, which by default uses the is_connection
as I mentioned, but can also be overriden by an extension? What do you think?
I'd also love to get your thoughts on my approach in general.
Your approach seems fine :)
from strawberry.
@bellini666 thanks for the explanation!
The relay connection also has a similar issue, and we workaround that internally by not doing that fetch you pointed out in here: https://github.com/strawberry-graphql/strawberry-django/blob/a4cc6a77ad221b6eb968685e9dfa54e9f82c5f51/strawberry_django/fields/field.py#L250
Ahh i see. Can confirm setting field.is_connection = True
does indeed fix the performance issues! I didn't notice any issues with ordering, filter, prefetching, or any unwanted side affects for that matter. From what I understand, the qs._fetch_all()
is used by the internal optimizer? I'm curious why I didn't encounter any n+1 issues even after disabling it.
A better fix would be to add a way to define if the field should do the prefetch or not, which by default uses the is_connection as I mentioned, but can also be overriden by an extension? What do you think?
Agreed, this would be a much better solution! Is this something we can override in an extension right now? Or will it require a code change in the library?
from strawberry.
From what I understand, the qs._fetch_all() is used by the internal optimizer? I'm curious why I didn't encounter any n+1 issues even after disabling it.
Actually the opposite. qs._fetch_all()
is what django does internally to cache the results, so after that iterating over the queryset will use the already fetched results instead of querying the db again.
That is done like that because when running async, you can only access the database from the main thread (usually using sync_to_async
), otherwise you get something like this:
Agreed, this would be a much better solution! Is this something we can override in an extension right now? Or will it require a code change in the library?
There isn't a way right now, but I'll try to add an api for that during the weekend, or maybe try to remove the _fetch_all()
call and solve that in another way. I have an idea that might work...
from strawberry.
Actually the opposite. qs._fetch_all() is what django does internally to cache the results, so after that iterating over the queryset will use the already fetched results instead of querying the db again.
That is done like that because when running async, you can only access the database from the main thread (usually using sync_to_async), otherwise you get something like this:
Oh interesting! Does that mean disabling qs._fetch_all()
prevents the query from running async?
There isn't a way right now, but I'll try to add an api for that during the weekend, or maybe try to remove the _fetch_all() call and solve that in another way. I have an idea that might work...
Awesome, thank you so much!
from strawberry.
Hey @fireteam99 , I think this should allow you to solve your issue: strawberry-graphql/strawberry-django#481
When it is merged, you just need to do field.disable_fetch_list_results = True
in your extension apply
method
Let me know what you think.
There isn't a way right now, but I'll try to add an api for that during the weekend, or maybe try to remove the _fetch_all() call and solve that in another way. I have an idea that might work...
Also, regarding this... I couldn't find a way right now, but it seems that the next graphql-core version, once it is released, will allow us to remove that workaround once and for all :)
from strawberry.
Hi @bellini666, thanks for adding this feature! This is a lot cleaner than the connection workaround.
Also, regarding this... I couldn't find a way right now, but it seems that the next graphql-core version, once it is released, will allow us to remove that workaround once and for all :)
Gotcha gotcha, so this shouldn't be an issue as long as I'm not using async?
from strawberry.
Gotcha gotcha, so this shouldn't be an issue as long as I'm not using async?
No. Although strawberry-django will do the fetching for both, unless you set the disable_fetch_list_results
attribute to True
from strawberry.
Related Issues (20)
- Resolver error if using Parent type with strawberry.experimental.pydantic.type
- Changing default resolver to dict don't work well with Union types HOT 1
- Docs re-structure
- Support both Pydantic 1 and 2 HOT 1
- Add support for framework's specific upload type
- Schema basics docs HOT 2
- Should we hide fields that starts with `_` by default?
- buggy generic interfaces HOT 3
- Add support of permission_classes for type decorator
- Visual bug in documentation HOT 2
- Allow `strawberry.auto | None` in Pydantic
- strawberry.ext.mypy_plugin PydanticModelField.to_argument error HOT 2
- make to_pydantic function recursive
- Unable to hide field in derived type HOT 1
- Execution Context errors
- Expose common request on extensions' context
- Add support for using FastAPI APIRouter arguments in GraphQLRouter HOT 1
- Feature Request: Info context in scalar serialization HOT 2
- Broken documentation examples in page https://strawberry.rocks/docs/guides/dataloaders HOT 6
- `print_schema` does not prefix the `extend` keyword when Schema uses apollo federation HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from strawberry.