Coder Social home page Coder Social logo

feincms / django-tree-queries Goto Github PK

View Code? Open in Web Editor NEW
393.0 10.0 23.0 218 KB

Adjacency-list trees for Django using recursive common table expressions. Supports PostgreSQL, sqlite, MySQL and MariaDB.

Home Page: https://django-tree-queries.readthedocs.io/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

django-tree-queries's Introduction

django-tree-queries

CI Status

Query Django model trees using adjacency lists and recursive common table expressions. Supports PostgreSQL, sqlite3 (3.8.3 or higher) and MariaDB (10.2.2 or higher) and MySQL (8.0 or higher, if running without ONLY_FULL_GROUP_BY).

Supports Django 3.2 or better, Python 3.8 or better. See the GitHub actions build for more details.

Features and limitations

  • Supports only integer and UUID primary keys (for now).
  • Allows specifying ordering among siblings.
  • Uses the correct definition of depth, where root nodes have a depth of zero.
  • The parent foreign key must be named "parent" at the moment (but why would you want to name it differently?)
  • The fields added by the common table expression always are tree_depth, tree_path and tree_ordering. The names cannot be changed. tree_depth is an integer, tree_path an array of primary keys and tree_ordering an array of values used for ordering nodes within their siblings. Note that the contents of the tree_path and tree_ordering are subject to change. You shouldn't rely on their contents.
  • Besides adding the fields mentioned above the package only adds queryset methods for ordering siblings and filtering ancestors and descendants. Other features may be useful, but will not be added to the package just because it's possible to do so.
  • Little code, and relatively simple when compared to other tree management solutions for Django. No redundant values so the only way to end up with corrupt data is by introducing a loop in the tree structure (making it a graph). The TreeNode abstract model class has some protection against this.
  • Supports only trees with max. 50 levels on MySQL/MariaDB, since those databases do not support arrays and require us to provide a maximum length for the tree_path and tree_ordering upfront.

Here's a blog post offering some additional insight (hopefully) into the reasons for django-tree-queries' existence.

Usage

  • Install django-tree-queries using pip.
  • Extend tree_queries.models.TreeNode or build your own queryset and/or manager using tree_queries.query.TreeQuerySet. The TreeNode abstract model already contains a parent foreign key for your convenience and also uses model validation to protect against loops.
  • Call the with_tree_fields() queryset method if you require the additional fields respectively the CTE.
  • Call the order_siblings_by("field_name") queryset method if you want to order tree siblings by a specific model field. Note that Django's standard order_by() method isn't supported -- nodes are returned according to the depth-first search algorithm.
  • Create a manager using TreeQuerySet.as_manager(with_tree_fields=True) if you want to add tree fields to queries by default.
  • Until documentation is more complete I'll have to refer you to the test suite for additional instructions and usage examples, or check the recipes below.

Recipes

Basic models

The following two examples both extend the TreeNode which offers a few agreeable utilities and a model validation method that prevents loops in the tree structure. The common table expression could be hardened against such loops but this would involve a performance hit which we don't want -- this is a documented limitation (non-goal) of the library after all.

Basic tree node

from tree_queries.models import TreeNode

class Node(TreeNode):
    name = models.CharField(max_length=100)

Tree node with ordering among siblings

Nodes with the same parent may be ordered among themselves. The default is to order siblings by their primary key but that's not always very useful.

from tree_queries.models import TreeNode

class Node(TreeNode):
    name = models.CharField(max_length=100)
    position = models.PositiveIntegerField(default=0)

    class Meta:
        ordering = ["position"]

Add custom methods to queryset

from tree_queries.models import TreeNode
from tree_queries.query import TreeQuerySet

class NodeQuerySet(TreeQuerySet):
    def active(self):
        return self.filter(is_active=True)

class Node(TreeNode):
    is_active = models.BooleanField(default=True)

    objects = NodeQuerySet.as_manager()

Querying the tree

All examples assume the Node class from above.

Basic usage

# Basic usage, disregards the tree structure completely.
nodes = Node.objects.all()

# Fetch nodes in depth-first search order. All nodes will have the
# tree_path, tree_ordering and tree_depth attributes.
nodes = Node.objects.with_tree_fields()

# Fetch any node.
node = Node.objects.order_by("?").first()

# Fetch direct children and include tree fields. (The parent ForeignKey
# specifies related_name="children")
children = node.children.with_tree_fields()

# Fetch all ancestors starting from the root.
ancestors = node.ancestors()

# Fetch all ancestors including self, starting from the root.
ancestors_including_self = node.ancestors(include_self=True)

# Fetch all ancestors starting with the node itself.
ancestry = node.ancestors(include_self=True).reverse()

# Fetch all descendants in depth-first search order, including self.
descendants = node.descendants(include_self=True)

# Temporarily override the ordering by siblings.
nodes = Node.objects.order_siblings_by("id")

Breadth-first search

Nobody wants breadth-first search but if you still want it you can achieve it as follows:

nodes = Node.objects.with_tree_fields().extra(
    order_by=["__tree.tree_depth", "__tree.tree_ordering"]
)

Filter by depth

If you only want nodes from the top two levels:

nodes = Node.objects.with_tree_fields().extra(
    where=["__tree.tree_depth <= %s"],
    params=[1],
)

Form fields

django-tree-queries ships a model field and some form fields which augment the default foreign key field and the choice fields with a version where the tree structure is visualized using dashes etc. Those fields are tree_queries.fields.TreeNodeForeignKey, tree_queries.forms.TreeNodeChoiceField, tree_queries.forms.TreeNodeMultipleChoiceField.

django-tree-queries's People

Contributors

bryanculver avatar glennmatthews avatar jameelhamdan avatar matthiask avatar olivierdalang avatar rhomboss avatar safaalfulaij avatar taobojlen avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

django-tree-queries's Issues

Is returning nodes in breadth-first order possible

I want to generate a nested HTML list structure and the easiest way would be to have the tree nodes sorted in BFS order. Or be able to .order_by('tree_depth') on the resulting query.

Instead I can call .order_by('__tree.tree_depth') but that seems to have no effect.

Descendant `update` queries do not include SQL for the CTE

First, thank you for this terrific library!

When migrating a project from django-mptt, I discovered that with Django 4.2.13 and django-tree-queries 0.19.0, node.descendants().update(...) raises django.db.utils.ProgrammingError: the generated SQL is an UPDATE that references the __tree table, but does not include the SQL for building it. In comparison, node.ancestors().update(...) splits the work into two queries, issuing a SELECT then an UPDATE.

Reproduction

# 
# Minimal setup
#

# Use the model definition from the docs.
from tree_queries.models import TreeNode
class Node(TreeNode):
    name = models.CharField(max_length=100)

# Run migrations, then create a node.
node = Node(name='The Only Node')
node.save()
another_node = Node(name='Another Node', parent=node)
another_node.save()

# Verify that node.descendants() and node.ancestors() work as expected
node.descendants()
node.ancestors()
another_node.descendants()
another_node.ancestors()

#
# Attempt to update descendant queryset, and log the SQL
#

from django.db.utils import ProgrammingError

try:
    node.descendants().update(name='A New Name')
except ProgrammingError as e:
    # ProgrammingError: missing FROM-clause entry for table "__tree"
    # STATEMENT:  UPDATE "sample_node" SET "name" = 'A New Name' WHERE ((1 = ANY(__tree.tree_path)) AND NOT ("sample_node"."id" = 1))
    print(e)

#
# For comparison, update ancestor queryset.
#

assert another_node.ancestors().update(name='A New Name') == 1

# It works, in an unexpected way:
# it first does a SELECT that includes the `WITH RECURSIVE __rank_table` clause,
# and then does an UPDATE with the found ids
# STATEMENT:  UPDATE "sample_node" SET "name" = 'A New Name' WHERE "sample_node"."id" IN (1)

Workaround

We were able to work around this easily in our application by manually taking the same two-query approach as with ancestors:

descendant_ids = list(node.descendants().values_list('id', flat=True))
Node.objects.filter(ids__in=descendant_ids).update(name='A New Name')

PostgreSQL syntax error when using with django-ordered-model

I'm trying to use django-tree-queries in combination with django-ordered-model. That library provides some useful helper methods for ordering, like methods to move a record to a different place in the ordering, or to swap two records. I don't think that there's any reason these two libraries should be incompatible.

The issue I'm running into is that django-tree-queries is generating invalid SQL. My model looks something like this:

class PolicyQuerySet(TreeQuerySet, OrderedModelQuerySet):
    pass

class PolicyManager(TreeManager):
	with_tree_queries = True

    def get_queryset(self):
        # Override get_queryset to return a custom queryset that combines
        # tree queries and ordered model queries
        return PolicyQuerySet()

class Policy(TreeNode, OrderedModel):
	name = models.TextField()

	# the following is actually provided implicitly by OrderedModel
	order = models.PositiveIntegerField()
	class Meta:
		ordering = ("order",)

When I try to execute queries I get a SQL syntax error. The generated SQL looks like, for example

            WITH RECURSIVE __tree (
                "tree_depth",
                "tree_path",
                "tree_ordering",
                "tree_pk"
            ) AS (
                SELECT
                    0 AS tree_depth,
                    array[T.id] AS tree_path,
                    array[order] AS tree_ordering,
                    T."id"
                FROM stratus_policy T
                WHERE T."parent_id" IS NULL

                UNION ALL

                SELECT
                    __tree.tree_depth + 1 AS tree_depth,
                    __tree.tree_path || T.id,
                    __tree.tree_ordering || order,
                    T."id"
                FROM stratus_policy T
                JOIN __tree ON T."parent_id" = __tree.tree_pk
            )
            SELECT (__tree.tree_depth) AS "tree_depth", (__tree.tree_path) AS "tree_path", (__tree.tree_ordering) AS "tree_ordering", "stratus_policy"."id", "stratus_policy"."order", "stratus_policy"."parent_id", "stratus_policy"."name" FROM "stratus_policy" , "__tree" WHERE ("stratus_policy"."parent_id" IS NULL AND (__tree.tree_pk = stratus_policy.id)) ORDER BY ("__tree".tree_ordering) ASC

The problem is with array[order] AS tree_ordering. Postgres throws a syntax error on this part.

Am I doing anything obviously wrong here?

Fetching siblings

Do I need to manually query where tree_depth is the same as the node I want siblings for or is there a helper method to retrieve a node's siblings I haven't found? It's not clear from the docs. Thanks

Docs appear wrong: Children not ordered by ID by default

From the docs:

Nodes with the same parent may be ordered among themselves. The default is to order siblings by their primary key

That's not what I'm seeing with Postgres 16.3. I have to explicitly call mymodel.children.order_siblings_by("id") to order children by ID.

tree_ordering by char fields - not working/supported?

It looks like the tree_ordering implementation (as well as order_siblings_by() assumes the field being ordered by is an integer value, since it left-pads it with zeros to solve the "110" < "20" problem (replacing it with "00000000000000000110" > "00000000000000000020"). While I see there's some code in tree_queries/compiler.py that appears to be intended to handle text differently from integers for PostgreSQL, in MySQL at least there's no such logic and so a tree ordered by a CharField is inappropriately sorted, e.g.:

>>> Location.objects.all()[4].tree_ordering
['0000000000Milky Way', '000000000Sol System', '00000000000000Earth', '0000000000000Africa']
>>> Location.objects.all()[5].tree_ordering
['0000000000Milky Way', '000000000Sol System', '00000000000000Earth', '000000000000Eurasia']
>>> Location.objects.all()[6].tree_ordering
['0000000000Milky Way', '000000000Sol System', '00000000000000Earth', '0000000000Australia']
>>> Location.objects.all()[7].tree_ordering
['0000000000Milky Way', '000000000Sol System', '00000000000000Earth', '000000North America']

(note that "Eurasia" is sorted between "Africa" and "Australia", apparently because its string length falls between them.)

Additionally the tree_ordering is calculated incorrectly if the field being ordered by exceeds 20 characters in length:

>>> Location.objects.get(name="Various Islands in the Pacific Ocean").tree_ordering
['0000000000Milky Way', '000000000Sol System', '00000000000000Earth']

(the tree_ordering is simply omitting the object's name altogether from the array in this case.)

Is this a known limitation of this library?

Unions, intersections and differences with tree fields do not work

Unions, intersections and differences do not seem to work when tree fields are ussed. Since this is not mentioned in the limitations I am not sure if this is deliberate. Specifically, multiple errors are thrown when performing these queries when they include tree fields. I am using django-tree-queries 0.14 and django 4.2.1.

Minimal example:

from tree_queries.models import TreeNode
class TestClass(TreeNode):
    class Meta:
        app_label = "test"

TestClass.objects.with_tree_fields().union(TestClass.objects.with_tree_fields())
# .intersection(...) and .difference(...) throw the same errors

This throws an error:

File ".../venv/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 550, in <listcomp>
    query.get_compiler(self.using, self.connection, self.elide_empty)
TypeError: TreeQuery.get_compiler() takes from 1 to 3 positional arguments but 4 were given

which seems to be related to the elide_empty not being passed as a keyword argument by this django code. Changing the signature of TreeQuery.get_compiler to get_compiler(self, using=None, connection=None, elide_empty = True, **kwargs) fixes this issue but raises another one. Instead, I get:

File ".../venv/lib/python3.11/site-packages/django/db/models/sql/query.py", line 2151, in add_fields
    raise FieldError(
django.core.exceptions.FieldError: Cannot resolve keyword 'tree_depth' into field. Choices are: __orderbycol1, id, parent, parent_id

It seems like a custom implementation of get_combinator_sql(self, combinator, all) would probably be required on TreeQuery to make this work.

How to serialize as nested json ?

Hi,
Do you have any idea or already existing solution to serialize the result of with_tree_fields() using DRF serializers as below ?

{
    "id": 1051,
    "name": "Data",
    "children": [
        {
            "id": 121,
            "name": "DataObject1"
            "children":[]
        },
        {
            "id": 122,
            "name": "DataObject2"
            "children": []
        }
    ]
}

Thanks!

Ordering siblings in descending order causes problems with retrieving ancestors

Ordering siblings in descending order and then calling "ancestors" is raising an error.

Here is my model:

class Post(tree_queries.models.TreeNode):
    created = models.DateTimeField(auto_now_add=True)
    parent = tree_queries.models.TreeNodeForeignKey(
        "self",
        blank=True,
        null=True,
        on_delete=models.CASCADE,
        verbose_name="parent",
        related_name="children",
    )

    class Meta:
        ordering = ["-created"]

In a template, I call:

    {% for ancestor in post.ancestors.all %}
    ( do stuff
    {% endfor %}

Result:

ProgrammingError
column "-created" does not exist
LINE 11:             array["-created"]::text[] AS tree_ordering,
                           ^
HINT:  Perhaps you meant to reference the column "t.created".

Am I doing something wrong? It seems like there must be an issue with how ancestors deals with sibling ordering.

tree_ordering: returns int instead of string values after upgrade from 0.16.1 to 0.18

Hi,

after upgrading to 0.18 my tests are failing at the usages of tree_ordering. I tried different versions, and the failures do not show up with 0.16.1 but with 0.17 and 0.18.

Model:

class TranslationKey(TreeNode):
    label = CharField(
        null=False,
        blank=False,
        max_length=255,
    )
    objects = TreeQuerySet.as_manager(with_tree_fields=True)

    class Meta:
        ordering = ['label']
        unique_together = ['parent', 'label']

    def __str__(self):
        return self.label

    @property
    def full_label(self):
        tree_ordering = getattr(self, 'tree_ordering', None)
        if tree_ordering:
            # this is only present for objects fetched via TranslationKey.objects
            return '.'.join(tree_ordering)
        return '.'.join([tk.label for tk in self.ancestors(include_self=True)])

full_label returns a translation label like global.button.save where global, button, and save are 3 instances of TranslationKey.

This code is running fine up until 0.16.1 but fails after upgrading to 0.17 when running as part of a Django test - which means that the DB is setup with migrations from scratch before each test. In this case, my tests run on SQLite (the actual app runs on Postgres).

I have not found any release notes that indicate I need to change something (which would be completely fine), and I have not found any issues in this github repo about this. Am I the only one facing this issue?

Thanks for this library!

How to get ancestors per node for subquery?

In django-mptt i can subquery all ancestors per node by using the following code snippet:

 def get_ancestors_per_node(self, include_self: bool = False):
        return self.get_queryset().filter(
            tree_id=OuterRef("tree_id"),
            lft__lte=OuterRef("lft") if include_self else OuterRef(
                "lft") - 1,
            rght__gte=OuterRef(
               f"rght") if include_self else OuterRef("rght") + 1
        )
def something(self):
   return self.get_queryset().annotate(
       anchestors=ArraySubquery(self.get_ancestors_per_node())
   )

In django-tree-queries i thought i can do it like:

def ancestors_per_node(self, of, include_self=True):
        return self.get_queryset().ancestors(of=of, include_self=include_self)
def something(self):
  return self.get_queryset().annotate(
       anchestors=ArraySubquery(self.get_ancestors_per_node(of=OuterRef("pk")))
   )

But that results in an ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.

Logically this can't work, cause the ancestors method of the TreeQuerySet does the lookup if the passed of has no tree_path attribute:

if not hasattr(of, "tree_path"):
      of = self.with_tree_fields().get(pk=pk(of))

Do you have any ideas, how i can construct an object from the OuterRef to pass it in the ancestors method; Or any other way to get the ancestors per node for subquerys?

TreeNode model inheriting from concrete class raises Operational Error

When a model subclasses another concrete model, an Operational Error is thrown using django's admin website: no such column: T.position.

I've tried with feincms3's, with concrete Page subclassing AbstractPage, and DiffusionPage subclassing Page. It seems that is because DiffusionPage holds a fk to Page table and do not contains itself the position column.

Here is a copy of the generated SQL (sqlite) expression:

('\n'
 '    WITH RECURSIVE __tree(tree_depth, tree_path, tree_ordering, tree_pk) AS '
 '(\n'
 '        SELECT\n'
 '            0 tree_depth,\n'
 '            printf("\x1f%%s\x1f", page_ptr_id) tree_path,\n'
 '            printf("\x1f%%020s\x1f", position) tree_ordering,\n'
 '            T."page_ptr_id" tree_pk\n'
 '        FROM app_diffusionpage T\n'
 '        WHERE T."parent_id" IS NULL\n'
 '\n'
 '        UNION ALL\n'
 '\n'
 '        SELECT\n'
 '            __tree.tree_depth + 1,\n'
 '            __tree.tree_path || printf("%%s\x1f", T.page_ptr_id),\n'
 '            __tree.tree_ordering || printf("%%020s\x1f", T.position),\n'
 '            T."page_ptr_id"\n'
 '        FROM app_diffusionpage T\n'
 '        JOIN __tree ON T."parent_id" = __tree.tree_pk\n'
 '    )\n'
 '    SELECT (1) AS "a" FROM "app_diffusionpage" , "__tree" WHERE '
 '("app_diffusionpage"."diffusion_id" = %s AND (__tree.tree_pk = '
 'app_diffusionpage.page_ptr_id)) ORDER BY ("__tree".tree_ordering) '
 'ASC  LIMIT 1')

[Request for Info] Difference between path and ordering

Hi!
I got confused between path and ordering fields now that I'm using UUID as primary keys. I don't know how they should be with integer keys, hence this question.
Currently, both fields are exactly equal and contain the uuid of the records. Is this right? Or is it because of my UUID primary keys? And what should each represent?
Thanks and sorry if this looks like a basic database question :\

Concurrency concerns (question not issue)

hey @matthiask - I recently landed on this library (great job, btw) after having multiple issues related to concurrency when using django-mptt.. what initally brought me here was an unresolved issue (and the library becoming unmaintained) about tree_id.

django-mptt uses max(tree_id) + 1 to assign a tree_id to a new node and this doesn't play well with concurrent writes..

i know it's a broad question, but overall, would you recommend using this library considering we're going to have multiple workers doing concurrent writes to nodes that can possibly live in the same tree? i haven't gone too deep in the implementation (although i probably will as you said that sloc is ~280 lines).. but wondering your opinion on that?

cheers

Limit number of results per depth

I'd like to use this library, but i just wonder if it's possible to set a limit of e.g. max 10 childrens per root node and two children per child node?

Admin: adding with_tree_fields breaks the search in the list view

Using version 0.15

Hi there,

first, thanks for this useful library!

It took me a while to realize what a nice feature "with_tree_fields" is and that it already comes with ordering by custom text value. This has the great effect that the list in the admin is already sorted by full paths.

I was greedy and thought that this will also enable the search on the full paths. However, now that I've changed the queryset to always include "with_tree_fields" (for sorting), it's not possible to search in the admin list view anymore.

My model:

class TranslationKey(TreeNode):
    label = CharField(
        null=False,
        blank=False,
        max_length=255,
        db_index=True
    )
    objects = TreeQuerySet.as_manager(with_tree_fields=True)

    class Meta:
        ordering = ['label']
        unique_together = ['parent', 'label']

    @property
    def full_label(self):
        tree_ordering = getattr(self, 'tree_ordering', None)
        if tree_ordering:
            # this is only present for objects fetched via TranslationKey.objects
            return '.'.join(tree_ordering)
        return '.'.join([tk.label for tk in self.ancestors(include_self=True)])
```python

The admin:

```python
class TranslationKeyAdmin(ModelAdmin):
    list_display = ['full_label']
    search_fields = ['label', 'translation__value']
    readonly_fields = ('full_label',)

The error is the following:

[2023-08-17 17:40:36,809] ERROR - basehttp "GET /translations/groupedtranslation/?q=effect HTTP/1.1" 500 211263
[2023-08-17 17:40:43,584] ERROR - log Internal Server Error: /translations/groupedtranslation/
Traceback (most recent call last):
  File "/XXX/venv/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.UndefinedTable: missing FROM-clause entry for table "v0"
LINE 101: ...e".tree_ordering) ASC LIMIT 1) AND (__tree.tree_pk = V0.id))

(Explaining the URL: this model is configured as "GroupedTranslations" (via proxy) in the admin.)


Including the output for queryset.query dumped via overriden get_search_results in the Admin class:

class TranslationKeyAdmin(ModelAdmin):
    # omitted config (see above)

    def get_search_results(self, request, queryset, search_term):
        queryset, may_have_duplicates = super().get_search_results(request, queryset, search_term)
        LOGGER.debug('>>>>>>>>>>>>>>>>>< queryset %s', queryset.query)
        return queryset, may_have_duplicates
[2023-08-17 17:56:04,592] DEBUG - base >>>>>>>>>>>>>>>>>< queryset 
    WITH RECURSIVE __tree (
        "tree_depth",
        "tree_path",
        "tree_ordering",
        "tree_pk"
    ) AS (
        SELECT
            0 AS tree_depth,
            array[T.id] AS tree_path,
            array["label"]::text[] AS tree_ordering,
            T."id"
        FROM translations_translationkey T
        WHERE T."parent_id" IS NULL

        UNION ALL

        SELECT
            __tree.tree_depth + 1 AS tree_depth,
            __tree.tree_path || T.id,
            __tree.tree_ordering || "label"::text,
            T."id"
        FROM translations_translationkey T
        JOIN __tree ON T."parent_id" = __tree.tree_pk
    )
    SELECT (__tree.tree_depth) AS "tree_depth", (__tree.tree_path) AS "tree_path", (__tree.tree_ordering) AS "tree_ordering", "translations_translationkey"."id", "translations_translationkey"."parent_id", "translations_translationkey"."label" FROM "translations_translationkey" LEFT OUTER JOIN "translations_translation" ON ("translations_translationkey"."id" = "translations_translation"."key_id") , "__tree" WHERE (NOT (EXISTS(
    WITH RECURSIVE __tree (
        "tree_depth",
        "tree_path",
        "tree_ordering",
        "tree_pk"
    ) AS (
        SELECT
            0 AS tree_depth,
            array[T.id] AS tree_path,
            array["label"]::text[] AS tree_ordering,
            T."id"
        FROM translations_translationkey T
        WHERE T."parent_id" IS NULL

        UNION ALL

        SELECT
            __tree.tree_depth + 1 AS tree_depth,
            __tree.tree_path || T.id,
            __tree.tree_ordering || "label"::text,
            T."id"
        FROM translations_translationkey T
        JOIN __tree ON T."parent_id" = __tree.tree_pk
    )
    SELECT 1 AS "a" FROM "translations_translationkey" U0 LEFT OUTER JOIN "translations_translation" U1 ON (U0."id" = U1."key_id") , "__tree" WHERE (U1."id" IS NULL AND U0."id" = ("translations_translationkey"."id") AND (__tree.tree_pk = U0.id)) ORDER BY ("__tree".tree_ordering) ASC LIMIT 1)) AND (UPPER("translations_translationkey"."label"::text) LIKE UPPER(%effect%) OR UPPER("translations_translation"."value"::text) LIKE UPPER(%effect%)) AND (__tree.tree_pk = translations_translationkey.id)) ORDER BY ("__tree".tree_ordering) ASC
[2023-08-17 17:56:04,779] ERROR - log Internal Server Error: /translations/groupedtranslation/
Traceback (most recent call last):
  File "/XXX/venv/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.UndefinedTable: missing FROM-clause entry for table "v0"
LINE 101: ...ranslationkey"."id")) LIMIT 1) AND (__tree.tree_pk = V0.id))

Thanks again!

Python 3.11.4
Django 4.2.3
django-tree-queries 0.15.0


EDIT (trying to find a work around):

Search works with the following manager:

objects = TreeQuerySet.as_manager()

but if I change that back to

objects = TreeQuerySet.as_manager(with_tree_fields=True)

and then override get_search_results in the Admin to reset the queryset, it is still broken:

    def get_search_results(self, request, queryset, search_term):
        LOGGER.debug('>>>>>>>>>>>>>> qs type %s', type(queryset))
        queryset, may_have_duplicates = super().get_search_results(request, queryset.without_tree_fields(), search_term)
        LOGGER.debug('>>>>>>>>>>>>>>>>>< queryset %s', queryset.query)
        return queryset, may_have_duplicates

Looking at the code, I really would have thought that TreeQuerySet.as_manager() and TreeQuerySet.as_manager(with_tree_fields=True).without_tree_fields() would be equivalent but it seems not.

Support non-integer keys

Hi.
I was trying this library with a model using UUID on sqlite, and it worked fine, except that it doesn't escape UUID strings all the time.
I had to change this to make it work (for sqlite at least):

def pk(of):
-    return of.pk if hasattr(of, "pk") else of
+    return str(of.pk).replace("-", "") if hasattr(of, "pk") else of

Perhaps other backends will require more, testing is needed.

Tree Fields Not Accurate After Model.objects.create()

Once again, great library, and I'm back to trying to use it again after using mptt but running into a real PITA concurrency issue where it has trouble creating distinct, unrelated nodes concurrently without corrupting the tree_ids. Given the project is unmaintained, potential fixes look like they've gotten stuck in PR limbo.

Anyway, I had a lot of trouble getting the right django-tree-queries "tree field" annotations on a model in a certain situation that I've finally solved. I'd like to confirm I understand the library behavior properly and suggest some documentation updates (which I'm happy to contribute as a PR).

Here's what I was doing.

I have a model that inherits from TreeNode. I also set a custom manager via TreeQuerySet.as_manager(with_tree_fields=True). I want to be able to, given a pk for an instance of this model type, make a copy of the instance with some, slightly different attributes and then make the original instance a child of the new instance. It's essentially a rudimentary VCS.

I ran into two problems trying to do this "version control":

  1. When I create the new object via Model.objects.create(...), I kept getting an error that tree_depth was not an attribute on the new instance. I tried to refresh_from_db() thinking maybe that could help, but no dice. Eventually, I realized that, if I took the pk of the new instance and immediately refetched it via Model.objects.get(id=id), I got the tree field annotations.
  2. One thing that was throwing me a bit was the original instance had the proper parent attribute but the tree_depth was still 0. As with the new instance, refresh_from_db() was no help. Like the new instance, I eventually realized I needed to requery it from the database and my problem was solved.

After discovering working this out and then very closely rereading your documentation, I think this is expected behavior as you state the manager adds the tree fields to queries by default:

Create a manager using TreeQuerySet.as_manager(with_tree_fields=True) if you want to add tree fields to queries by default.

So, I think it's the case that a database refresh will have no effect on these fields as they're not actual database fields but rather calculated fields. Furthermore, create(), I guess, isn't a query?

So, couple questions:

  1. Can you confirm my understanding is correct and that the only way to get the proper, accurate tree fields is to use objects.get() or objects.filter()?
  2. Not having written a Django manager before, is there an easy way to extend the TreeQuerySet code to add these annotations to the model .create() behavior too?
  3. Likewise, do you know if there is a way to override the refresh_from_db() behavior to annotate the tree fields?
  4. Finally, regardless of your answer to 1, 2 or 3, would you welcome some update documentation making this behavior clearer? It took me some time to figure this out, and, while it may be clear to people familiar with the inner workings of Django managers, I'd love to save the less enlightened some time.

order_by method doesn't work

I want to sort the structure of users by username, but if I use the standard order_by method then the sort doesn't work.
Structure.objects.all().order_by('user__username')
If we look at the query for the resulting TreeQuerySet, we will see that the sorting has not changed.
ORDER BY ("__tree".tree_ordering) ASC

But if we sort using the extra method, then everything works as it should:
Structure.objects.all().extra(order_by=['user__username'])

Sorting also works if we call the order_by method twice:

structure = Structure.objects.all()
structure = structure.order_by('user__username')
structure = structure.order_by('user__username')

Support filtered tree query

Currently the CTE query build tree in table scope and this make a huge performance impact when table is big.

In my situation, we seprate the records by user id which means the performance overhead will be reduced if we can build tree at the user level.

Reverse `tree_order`, but with children last

I'm working on a kind-of-a-web-forum implementation, and I'd like to return a list sorted by the published date (newest fist), and additionally with sticky posts at the beginning of the list, though I can work around the "sticky posts first" issue by doing two queries, filtered by sticky=True/False.

The tree query is only using the first Meta.ordering column, and it's not possible to negate it in Meta.ordering:

django.db.utils.OperationalError: near "-": syntax error

Therefore I've set up Meta.ordering = ('published_date', ) to get into the tree_ordering array, and then order the response like this:

ForumPost.objects.with_tree_fields() \
    .filter(is_sticky=False) \
    .extra(order_by=['-__tree.tree_ordering', '__tree.tree_depth'])

However, the reverse ordering on tree_ordering makes the children be returned first, and tree_depth isn't accounted for because shorter arrays are "less" than longer arrays. I can't sort the result by tree_depth nor by published_date because that would separate the children from their parents.

I intend to paginate the query, so separating families is not an option. Is there any other way to reverse-sort on the tree_ordering column, but to keep longer array values after shorter ones?

Recommended replacement for django-mptt's add_related_count() method?

Hi, we're looking at migrating from django-mptt and I was wondering if there's a recommended strategy for replacing its add_related_count() TreeManager method. We rely on this to provide a cumulative count of related objects assigned to a nested model.

For example, suppose we have Region instances representing countries, states, and cities, and we have Site instances which can be assigned to any Region. We'd call something like this to annotate a cumulative count of sites for each region:

Region.objects.add_related_count(
    Region.objects.all(),
    Site,
    'region',
    'site_count',
    cumulative=True
)

Replicating MPTT's approach directly doesn't seem to be feasible, since its querying against concrete database fields: Django's Subquery class doesn't understand tree_path generated by CTE. (It yields a FieldError: Cannot resolve keyword 'tree_path' into field.) However I'll admit this is challenging my own skill with SQL.

Has anyone come up with a similar solution using CTE?

Children only

How can I get only the children instead of all the decsendants?

Question: Descendants function

Hello, did I understand correctly that the "descendants function" only returns descendants in the first line? Is there no way to get all descendants of descendants? In other words the whole tree structure?

Descendants with matching field(s)

Hi,

Just testing this beautiful library. Was wondering if requesting for de descendants() it is also possible to request the descendants with only matching fields. Or should I do that with a list comprehension on every descendant object? Can imagine that would be multiple queries what would be pitty.

Advance queries not supported

Doing something like:
test_field__in=node_field.ancetors() raises an error:

no such column: ....

Check this diff please (contains a "fix"):

diff --git a/tests/testapp/models.py b/tests/testapp/models.py
index 1e5bbb0..f342729 100644
--- a/tests/testapp/models.py
+++ b/tests/testapp/models.py
@@ -30,6 +30,15 @@ class StringOrderedModel(TreeNode):
         return self.name
 
 
+class ReferenceModel(models.Model):
+    my_field = models.ForeignKey(
+        Model,
+        on_delete=models.CASCADE,
+        blank=True,
+        null=True,
+    )
+
+
 class AlwaysTreeQueryModelCategory(models.Model):
     pass

diff --git a/tests/testapp/test_queries.py b/tests/testapp/test_queries.py
index 8d14d41..419247d 100644
--- a/tests/testapp/test_queries.py
+++ b/tests/testapp/test_queries.py
@@ -2,7 +2,7 @@ from __future__ import unicode_literals
 
 from django import forms
 from django.core.exceptions import ValidationError
-from django.db.models import Count, Sum
+from django.db.models import Count, Q, Sum
 from django.test import TestCase
 
 from tree_queries.compiler import TreeQuery
@@ -11,6 +11,7 @@ from .models import (
     AlwaysTreeQueryModel,
     AlwaysTreeQueryModelCategory,
     Model,
+    ReferenceModel,
     StringOrderedModel,
     UnorderedModel,
 )
@@ -272,3 +273,20 @@ class Test(TestCase):
         m4 = c.instances.get()
         self.assertEqual(m1, m4)
         self.assertEqual(m4.tree_depth, 0)
+
+    def test_reference(self):
+        tree = self.create_tree()
+
+        ReferenceModel.objects.create(my_field=tree.child2_2)
+        obj = ReferenceModel.objects.get()
+
+        self.assertEqual(
+            list(
+                ReferenceModel.objects.filter(
+                    Q(my_field__in=tree.root.ancestors(include_self=True))
+                    | Q(my_field__isnull=False)
+                    | Q(my_field__in=tree.root.descendants(include_self=True))
+                )
+            ),
+            [obj],
+        )

diff --git a/tree_queries/compiler.py b/tree_queries/compiler.py
index db5f785..fdd0281 100644
--- a/tree_queries/compiler.py
+++ b/tree_queries/compiler.py
@@ -139,6 +139,10 @@ class TreeCompiler(SQLCompiler):
         }
 
         if "__tree" not in self.query.extra_tables:  # pragma: no branch - unlikely
+            custom_params = params.copy()
+            if "base_table" in self.query.__dict__:
+                custom_params["db_table"] = self.query.__dict__["base_table"] # use aliased table name "UX"
+
             self.query.add_extra(
                 # Do not add extra fields to the select statement when it is a
                 # summary query
@@ -150,7 +154,7 @@ class TreeCompiler(SQLCompiler):
                     "tree_ordering": "__tree.tree_ordering",
                 },
                 select_params=None,
-                where=["__tree.tree_pk = {db_table}.{pk}".format(**params)],
+                where=["__tree.tree_pk = {db_table}.{pk}".format(**custom_params)],
                 params=None,
                 tables=["__tree"],
                 order_by=(

Intersection between TreeQuerySet and QuerySet

I am trying to find an intersection between objects of a TreeNode model and the objects of a normal django model who has a FK to this TreeNode model. Models look like this:

class FolderTree(TreeNode):
    name = models.CharField(max_length=255)
    fullpath = models.CharField(max_length=255, null=False, blank=False, unique=True)
    isfile = models.BooleanField(default=False, null=False, blank=True)
    hasdoc = models.BooleanField(default=False, null=False, blank=True)
    document =  models.ForeignKey(Document, null=True, blank=True, related_name='folder', on_delete=models.PROTECT)
    file_extension = models.CharField(max_length=10, null=True, blank=True)
    deleted = models.BooleanField(default=False, null=False, blank=True)
    descendant_files = models.IntegerField(null=True, blank=True)
    
    class Meta:
        ordering = ("name",)
        unique_together = (("name", "parent"),)

    def __str__(self):
        return self.name

class DossierFile(models.Model):
    user = models.ForeignKey(User, default=1, on_delete=models.PROTECT, null=False, blank=False)
    dossier = models.ForeignKey(Dossier, default=1, on_delete=models.PROTECT, null=False, blank=False)
    file = models.ForeignKey(FolderTree, default=1, on_delete=models.PROTECT, null=False, blank=False, related_name='dossierfiles')
    created_at = models.DateTimeField(auto_now_add=True)

What I want is to have an intersection of all the objects of DossierFile's file fields and the descendants of the FolderTree objects. I do it like this:

f = FolderTree.objects.get(pk=21)
f_desc = f.descendants().values_list('id', flat=True)
all_dossier_files = DossierFile.objects.all().values_list('file', flat=True)
all_dossier_files.intersection(f_desc)

f_desc is an QuerySet and all_dossier is a TreeQuerySet obviousely. The error I get is:

InvalidColumnReference                    Traceback (most recent call last)
~/anaconda3/envs/lblmaker/lib/python3.8/site-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args)
     83             else:
---> 84                 return self.cursor.execute(sql, params)
     85

InvalidColumnReference: ORDER BY position 2 is not in select list
LINE 51: ..."file_id" FROM "casemaker_dossierfile") ORDER BY (2) ASC LIM...
                                                              ^


The above exception was the direct cause of the following exception:

Any clue what is happening or how to solve this?

`select_related`, out of scope?

I was wondering what features will be considered out of scope and too much for this library, and what features isn't.
For example, in the test suite, how can we query ReferenceModel and select the tree model with it's tree fields?

ProgrammingError when querying with exclude on reverse relation with tree fields

First, thank you for this terrific library!

Hope this specific issue isn't too obscure... In our app, we have a TreeModel Location, and a set of models with foreign keys to Location, for example PowerPanel. We're working on upgrading from Django 3.2 to Django 4.2 (better late than never!) and have found that a subset of queries against Location now throw an exception on both PostgreSQL and MySQL:

>>> Location.objects.with_tree_fields().exclude(power_panels__isnull=False).exists()
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django_prometheus/db/common.py", line 69, in execute
    return super().execute(*args, **kwargs)
psycopg2.errors.UndefinedTable: missing FROM-clause entry for table "u0"
LINE 51: ...d" = ("dcim_location"."id") AND (__tree.tree_pk = U0.id)) OR...
                                                              ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/django/db/models/query.py", line 1241, in exists
    return self.query.has_results(using=self.db)
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/query.py", line 598, in has_results
    return compiler.has_results()
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1530, in has_results
    return bool(self.execute_sql(SINGLE))
  File "/usr/local/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1562, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 102, in execute
    return super().execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.8/site-packages/django_prometheus/db/common.py", line 69, in execute
    return super().execute(*args, **kwargs)
django.db.utils.ProgrammingError: missing FROM-clause entry for table "u0"
LINE 51: ...d" = ("dcim_location"."id") AND (__tree.tree_pk = U0.id)) OR...
                                                              ^
Traceback (most recent call last):
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django_prometheus/db/common.py", line 69, in execute
    return super().execute(*args, **kwargs)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/backends/mysql/base.py", line 75, in execute
    return self.cursor.execute(query, args)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/MySQLdb/cursors.py", line 179, in execute
    res = self._query(mogrified_query)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/MySQLdb/cursors.py", line 330, in _query
    db.query(q)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/MySQLdb/connections.py", line 261, in query
    _mysql.connection.query(self, query)
MySQLdb.OperationalError: (1054, "Unknown column 'U0.id' in 'where clause'")

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/unittest/case.py", line 57, in testPartExecutor
    yield
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/unittest/case.py", line 623, in run
    self._callTestMethod(testMethod)
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/unittest/case.py", line 579, in _callTestMethod
    if method() is not None:
  File "/home/runner/work/nautobot/nautobot/nautobot/core/tests/test_filters.py", line 363, in test_filter_combined_name_exclude
    self.assertQuerysetEqualAndNotEmpty(qs, [self.locations[1]])
  File "/home/runner/work/nautobot/nautobot/nautobot/core/testing/mixins.py", line 229, in assertQuerysetEqualAndNotEmpty
    self.assertNotEqual(len(qs), 0, "Queryset cannot be empty")
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/models/query.py", line 380, in __len__
    self._fetch_all()
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/models/query.py", line 1881, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/models/query.py", line 91, in __iter__
    results = compiler.execute_sql(
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1562, in execute_sql
    cursor.execute(sql, params)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django_prometheus/db/common.py", line 69, in execute
    return super().execute(*args, **kwargs)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/django/db/backends/mysql/base.py", line 75, in execute
    return self.cursor.execute(query, args)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/MySQLdb/cursors.py", line 179, in execute
    res = self._query(mogrified_query)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/MySQLdb/cursors.py", line 330, in _query
    db.query(q)
  File "/home/runner/.cache/pypoetry/virtualenvs/nautobot-zO_ZO7c3-py3.11/lib/python3.11/site-packages/MySQLdb/connections.py", line 261, in query
    _mysql.connection.query(self, query)
django.db.utils.OperationalError: (1054, "Unknown column 'U0.id' in 'where clause'")

The simplest query we've been able to reproduce the issue with so far is something along the lines of:

Location.objects.with_tree_fields().exclude(power_panels__isnull=False)

Here's the query on Postgres:

>>> print(Location.objects.with_tree_fields().exclude(power_panels__isnull=False).query)

    WITH RECURSIVE __tree (
        "tree_depth",
        "tree_path",
        "tree_ordering",
        "tree_pk"
    ) AS (
        SELECT
            0 AS tree_depth,
            array[T.id] AS tree_path,
            array["_name"]::text[] AS tree_ordering,
            T."id"
        FROM dcim_location T
        WHERE T."parent_id" IS NULL

        UNION ALL

        SELECT
            __tree.tree_depth + 1 AS tree_depth,
            __tree.tree_path || T.id,
            __tree.tree_ordering || "_name"::text,
            T."id"
        FROM dcim_location T
        JOIN __tree ON T."parent_id" = __tree.tree_pk
    )
    SELECT (__tree.tree_depth) AS "tree_depth", (__tree.tree_path) AS "tree_path", (__tree.tree_ordering) AS "tree_ordering", "dcim_location"."id", "dcim_location"."created", "dcim_location"."last_updated", "dcim_location"."_custom_field_data", "dcim_location"."parent_id", "dcim_location"."name", "dcim_location"."_name", "dcim_location"."location_type_id", "dcim_location"."status_id", "dcim_location"."tenant_id", "dcim_location"."description", "dcim_location"."facility", "dcim_location"."asn", "dcim_location"."time_zone", "dcim_location"."physical_address", "dcim_location"."shipping_address", "dcim_location"."latitude", "dcim_location"."longitude", "dcim_location"."contact_name", "dcim_location"."contact_phone", "dcim_location"."contact_email", "dcim_location"."comments" FROM "dcim_location" , "__tree" WHERE (NOT (EXISTS(
    WITH RECURSIVE __tree (
        "tree_depth",
        "tree_path",
        "tree_ordering",
        "tree_pk"
    ) AS (
        SELECT
            0 AS tree_depth,
            array[T.id] AS tree_path,
            array["_name"]::text[] AS tree_ordering,
            T."id"
        FROM dcim_location T
        WHERE T."parent_id" IS NULL

        UNION ALL

        SELECT
            __tree.tree_depth + 1 AS tree_depth,
            __tree.tree_path || T.id,
            __tree.tree_ordering || "_name"::text,
            T."id"
        FROM dcim_location T
        JOIN __tree ON T."parent_id" = __tree.tree_pk
    )
    SELECT 1 AS "a" FROM "dcim_powerpanel" U1 , "__tree" WHERE (U1."id" IS NOT NULL AND U1."location_id" = ("dcim_location"."id") AND (__tree.tree_pk = U0.id)) ORDER BY ("__tree".tree_ordering) ASC LIMIT 1)) AND (__tree.tree_pk = dcim_location.id)) ORDER BY ("__tree".tree_ordering) ASC

Some potentially relevant data points:

  • Without the tree fields, no error:
    >>> Location.objects.without_tree_fields().exclude(power_panels__isnull=False).exists()
    True
    
  • The issue is so far only seen with .exclude(); .filter() works just fine:
    >>> Location.objects.with_tree_fields().filter(power_panels__isnull=False).exists()
    False
    >>> Location.objects.with_tree_fields().filter(power_panels__isnull=True).exists()
    True
    
  • This specific query did not raise an exception in Django 3.2, but we've seen a few other cases even in 3.2 where a .exclude() reported a similar exception; those were resolved by adding a .without_tree_fields() to the queries in question. That may be our best option in this case as well? (We perhaps foolishly made our manager class default to with_tree_fields=True, but in cases like this we don't actually need the tree annotation...)

Thanks again! We'll keep digging into potential workarounds in our app in the meantime.

Filter by depth

This is more a help request than a bug...

Transitioning from django-mptt, all my tree needs are so far OK using django-tree-queries, except one : i work on pretty deep trees, and i need to filter on max depth. filter(tree_depth__lte=4) is not possible with django-tree-queries.

I use for now a filter(Q(parent__isnull=True) | Q(parent__parent__isnull=True) | (...)__isnull=True) but this is a bit ugly. I could also store the depth into model, but this is a step back from django-tree-queries

What would be the more django-tree-queries-way to do this ?

Thanks

A way to output tree fields on values() call

Hi,
Is there any way to get tree_path and tree_depth on values() call?
for example:

Node.objects.with_tree_fields().values()

or

Node.objects.with_tree_fields().values('tree_depth')

No attribute _with_tree_fields with ManyToManyRelation

If using a ManyToManyRelation with a model that is a TreeNode I get an error if I try to access the related set on the other model if I use version >= 0.4.1.

'ManyRelatedManager' object has no attribute '_with_tree_fields'

Setting the base_manager_name to use a TreeManager doesn't help either, because the Manager is not initialized through TreeQuerySet.as_manager() and thus the field is not there either. One way to fix it would be if the get_queryset Method of the TreeManager would use getattr instead of accessing the attribute directly.

return queryset.with_tree_fields() if getattr(self, '_with_tree_fields', False) else queryset

This way the related-manager would work, but not have the the tree fields by default.

Calling .distinct().count() on TreeQuerySet throwing ProgrammingError

Running into a strange problem here that may be due to the custom manager in django-tree-queries, so I wanted to report it here to you guys and see if you had any thoughts on the root cause.

I have a relatively complex project that I want to add the ability to track hierarchical relationships between an existing model type called Mapping. I reset the database prior to experimenting with your library so I wouldn't run into any migration issues. I'm running Postgres 14.

I updated my Mapping model to inherit from your TreeNode class:

class Mapping(TreeNode):

    objects = TreeQuerySet.as_manager(with_tree_fields=True)

    class Meta:
        base_manager_name = "objects"  # To ensure that objects have tree fields by default
        verbose_name = "Value <--> Entity Mapping"
        verbose_name_plural = "Value <--> Entity Mappings"

        constraints = [
            django.db.models.CheckConstraint(
                check=(Q(global_id_field__isnull=False, datasource_id_field__isnull=True,
                         attribute_field__isnull=True) |
                       Q(global_id_field__isnull=True, datasource_id_field__isnull=False,
                         attribute_field__isnull=True) |
                       Q(global_id_field__isnull=True, datasource_id_field__isnull=True, attribute_field__isnull=False)
                       ), name='only permit one type of field per mapping'),
            django.db.models.CheckConstraint(
                check=(Q(parent__isnull=False, remapped_by__isnull=False) |
                       Q(parent__isnull=True, remapped_by__isnull=True)),
                name='remappings must have parent field provided'),
            django.db.models.UniqueConstraint(fields=['value', 'entity', 'datasource_id_field'],
                                              condition=Q(datasource_id_field__isnull=False) & Q(parent__isnull=True),
                                              name='no duplicate values for datasource for datasource id (if not an old mapping)'),
            django.db.models.UniqueConstraint(fields=['value', 'entity', 'global_id_field'],
                                              condition=Q(global_id_field__isnull=False) & Q(parent__isnull=True),
                                              name='no duplicate values for global id (if not an old mapping)'),
            django.db.models.UniqueConstraint(fields=['value', 'entity', 'attribute_field'],
                                              condition=Q(attribute_field__isnull=False) & Q(parent__isnull=True),
                                              name='no duplicate attributes for global id (if not an old mapping)'),
            django.db.models.CheckConstraint(check=(~Q(approved_by__isnull=False, rejected_by__isnull=False)),
                                             name='cannot approve and reject a mapping')
        ]

    entity = django.db.models.ForeignKey(
        "Entity",
        related_name="mappings",
        default=None,
        on_delete=django.db.models.PROTECT
    )
    value = django.db.models.ForeignKey(
        "Value",
        default=1,
        null=False,
        blank=False,
        on_delete=django.db.models.PROTECT,
        related_name="maps_to",
        related_query_name="maps_to",
    )

    type = django.db.models.CharField(
        max_length=128,
        blank=False,
        null=False,
        choices=MAPPING_TYPES,
        default=MAPPING_TYPE_AUTOMATIC
    )
    global_id_field = django.db.models.ForeignKey(
        Global_Id_Field,
        related_name="mappings",
        default=None,
        null=True,
        blank=True,
        on_delete=django.db.models.PROTECT
    )
    datasource_id_field = django.db.models.ForeignKey(
        DataSource_Id_Field,
        related_name="mappings",
        default=None,
        null=True,
        blank=True,
        on_delete=django.db.models.PROTECT
    )
    attribute_field = django.db.models.ForeignKey(
        Attribute_Field,
        related_name="mappings",
        default=None,
        null=True,
        blank=True,
        on_delete=django.db.models.PROTECT
    )

    rejected_by = django.db.models.ForeignKey(
        get_user_model(),
        on_delete=django.db.models.PROTECT,
        null=True,
        blank=True,
        default=None,
        related_name="rejected_mappings",
        related_query_name="rejected_mapping",
    )
    approved_by = django.db.models.ForeignKey(
        get_user_model(),
        on_delete=django.db.models.PROTECT,
        null=True,
        blank=True,
        default=None,
        related_name="approved_mappings",
        related_query_name="approved_mapping",
    )
    remapped_by = django.db.models.ForeignKey(
        get_user_model(),
        on_delete=django.db.models.PROTECT,
        null=True,
        blank=True,
        default=None,
        related_name="remapped_mappings",
        related_query_name="remapped_mapping",
    )
    created_by = django.db.models.ForeignKey(
        get_user_model(),
        on_delete=django.db.models.PROTECT,
        null=True,
        blank=False,
        default=1,
        related_name="created_mappings",
        related_query_name="created_mapping",
    )

    # Timing variables
    approved = django.db.models.DateTimeField(default=timezone.now, null=True, blank=True)
    rejected = django.db.models.DateTimeField(default=timezone.now, null=True, blank=True)
    created = django.db.models.DateTimeField(default=timezone.now, null=True, blank=True)
    modified = django.db.models.DateTimeField(default=timezone.now, null=True, blank=True)

    # Override save to update modified on save
    def save(self, *args, **kwargs):

        """ On save, update timestamps """
        if not self.pk:
            self.created = timezone.now()
        self.modified = timezone.now()

        return super(Mapping, self).save(*args, **kwargs)

Migrations seem to work okay. So far so good. I loaded some data, and, again no issues. I have a certain resolver where I called a filter and then .distinct() and, finally, .count():

unresolved_count=Mapping.objects.filter(
            Q(approved_by__isnull=True) &
            Q(rejected_by__isnull=True)
        ).distinct().count()

This worked fine before I inherited TreeNode. Now, however, I get this very cryptic, low-level error:

In [13]: mappings.distinct().count()
---------------------------------------------------------------------------
UndefinedColumn                           Traceback (most recent call last)
/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args)
     83             else:
---> 84                 return self.cursor.execute(sql, params)
     85 

UndefinedColumn: column subquery.col20 does not exist
LINE 26: ...."col17", "subquery"."col18", "subquery"."col19", "subquery"...
                                                              ^
HINT:  Perhaps you meant to reference the column "subquery.col2" or the column "subquery.col10".


The above exception was the direct cause of the following exception:

ProgrammingError                          Traceback (most recent call last)
<ipython-input-13-1f6ce208518f> in <module>
----> 1 mappings.distinct().count()

/usr/local/lib/python3.9/site-packages/django/db/models/query.py in count(self)
    410             return len(self._result_cache)
    411 
--> 412         return self.query.get_count(using=self.db)
    413 
    414     def get(self, *args, **kwargs):

/usr/local/lib/python3.9/site-packages/django/db/models/sql/query.py in get_count(self, using)
    524         obj = self.clone()
    525         obj.add_annotation(Count('*'), alias='__count', is_summary=True)
--> 526         number = obj.get_aggregation(using, ['__count'])['__count']
    527         if number is None:
    528             number = 0

/usr/local/lib/python3.9/site-packages/django/db/models/sql/query.py in get_aggregation(self, using, added_aggregate_names)
    509         outer_query.select_related = False
    510         compiler = outer_query.get_compiler(using)
--> 511         result = compiler.execute_sql(SINGLE)
    512         if result is None:
    513             result = [None] * len(outer_query.annotation_select)

/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py in execute_sql(self, result_type, chunked_fetch, chunk_size)
   1167             cursor = self.connection.cursor()
   1168         try:
-> 1169             cursor.execute(sql, params)
   1170         except Exception:
   1171             # Might fail for server-side cursors (e.g. connection closed)

/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py in execute(self, sql, params)
     96     def execute(self, sql, params=None):
     97         with self.debug_sql(sql, params, use_last_executed_query=True):
---> 98             return super().execute(sql, params)
     99 
    100     def executemany(self, sql, param_list):

/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py in execute(self, sql, params)
     64 
     65     def execute(self, sql, params=None):
---> 66         return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
     67 
     68     def executemany(self, sql, param_list):

/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py in _execute_with_wrappers(self, sql, params, many, executor)
     73         for wrapper in reversed(self.db.execute_wrappers):
     74             executor = functools.partial(wrapper, executor)
---> 75         return executor(sql, params, many, context)
     76 
     77     def _execute(self, sql, params, *ignored_wrapper_args):

/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args)
     82                 return self.cursor.execute(sql)
     83             else:
---> 84                 return self.cursor.execute(sql, params)
     85 
     86     def _executemany(self, sql, param_list, *ignored_wrapper_args):

/usr/local/lib/python3.9/site-packages/django/db/utils.py in __exit__(self, exc_type, exc_value, traceback)
     88                 if dj_exc_type not in (DataError, IntegrityError):
     89                     self.wrapper.errors_occurred = True
---> 90                 raise dj_exc_value.with_traceback(traceback) from exc_value
     91 
     92     def __call__(self, func):

/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py in _execute(self, sql, params, *ignored_wrapper_args)
     82                 return self.cursor.execute(sql)
     83             else:
---> 84                 return self.cursor.execute(sql, params)
     85 
     86     def _executemany(self, sql, param_list, *ignored_wrapper_args):

ProgrammingError: column subquery.col20 does not exist
LINE 26: ...."col17", "subquery"."col18", "subquery"."col19", "subquery"...

Oddly, it's the combination of .distinct() and .count() that is causing the problem. I can call .distinct() on the TreeQuerySet without any issues. Likewise, I can call .count() without issue. It's only when I call distinct() and count() on the same TreeQuerySet that this error is thrown. I don't think I need the distinct() call here, but the fact I can't use it as I did before gave me pause. Wanted to get your thoughts on this to see if this is expected behavior? Any thoughts on why I'm seeing this?

EDIT: Looks related to node ordering per this post (and others like it): https://stackoverflow.com/questions/20582966/django-order-by-filter-with-distinct. I removed the ordering in the model Meta, but that still isn't solving my problem here.

If it's helpful, here's the SQL statement for the TreeQuerySet that's causing the issue:

WITH RECURSIVE __tree (
       "tree_depth",
       "tree_path",
       "tree_ordering",
       "tree_pk"
   ) AS (
       SELECT
           0 AS tree_depth,
           array[T.id] AS tree_path,
           array[id] AS tree_ordering,
           T."id"
       FROM mappings_mapping T
       WHERE T."parent_id" IS NULL

       UNION ALL

       SELECT
           __tree.tree_depth + 1 AS tree_depth,
           __tree.tree_path || T.id,
           __tree.tree_ordering || id,
           T."id"
       FROM mappings_mapping T
       JOIN __tree ON T."parent_id" = __tree.tree_pk
   )
   SELECT (__tree.tree_depth) AS "tree_depth", (__tree.tree_path) AS "tree_path", (__tree.tree_ordering) AS "tree_ordering", "mappings_mapping"."id", "mappings_mapping"."parent_id", "mappings_mapping"."entity_id", "mappings_mapping"."value_id", "mappings_mapping"."type", "mappings_mapping"."global_id_field_id", "mappings_mapping"."datasource_id_field_id", "mappings_mapping"."attribute_field_id", "mappings_mapping"."rejected_by_id", "mappings_mapping"."approved_by_id", "mappings_mapping"."remapped_by_id", "mappings_mapping"."created_by_id", "mappings_mapping"."approved", "mappings_mapping"."rejected", "mappings_mapping"."created", "mappings_mapping"."modified", "mappings_mapping"."association_start_date", "mappings_mapping"."association_end_date" FROM "mappings_mapping" , "__tree" WHERE ("mappings_mapping"."approved_by_id" IS NULL AND "mappings_mapping"."rejected_by_id" IS NULL AND (__tree.tree_pk = mappings_mapping.id)) ORDER BY ("__tree".tree_ordering) ASC

`.explain()` on tree-based models throws SQL `SyntaxError`

Steps to Reproduce:

  • Create a TreeNode model (example below uses Location for model name)
  • Perform migrations
  • Try to run Model.objects.all().explain()

Expected Outcome:
SQL explain output:

>>> Device.objects.all().explain()
'Sort  (cost=11.91..12.04 rows=50 width=1508)\n  Sort Key: _name\n  ->  Seq Scan on dcim_device  (cost=0.00..10.50 rows=50 width=1508)'

Actual Outcome:
psycopg2.errors.SyntaxError: syntax error at or near "EXPLAIN" / django.db.utils.ProgrammingError: syntax error at or near "EXPLAIN" error thrown:

>>> Location.objects.all().explain()
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/site-packages/django_prometheus/db/common.py", line 71, in execute
    return super().execute(*args, **kwargs)
psycopg2.errors.SyntaxError: syntax error at or near "EXPLAIN"
LINE 26:     EXPLAIN SELECT (__tree.tree_depth) AS "tree_depth", (__t...
             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/django/db/models/query.py", line 817, in explain
    return self.query.explain(using=self.db, format=format, **options)
  File "/usr/local/lib/python3.7/site-packages/django/db/models/sql/query.py", line 575, in explain
    return '\n'.join(compiler.explain_query())
  File "/usr/local/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1228, in explain_query
    result = list(self.execute_sql())
  File "/usr/local/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1175, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/usr/local/lib/python3.7/site-packages/cacheops/transaction.py", line 97, in execute
    result = self._no_monkey.execute(self, sql, params)
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/site-packages/django_prometheus/db/common.py", line 71, in execute
    return super().execute(*args, **kwargs)
django.db.utils.ProgrammingError: syntax error at or near "EXPLAIN"
LINE 26:     EXPLAIN SELECT (__tree.tree_depth) AS "tree_depth", (__t...

StringOrderedModel performance

I am using the StringOrderedModel to store around 70k objects with max depth around 6-8. When I try to find the ancestors of an object of 2 to maybe 3 ancestors it takes about 500ms. I tried to put an index on the 'name' field but that does not have a an impact. Is this a known behaviour or is there something I could do differently?

Is it feasible to support get queryset descendants or ancestors like features in django-mptt

django-mptt has a feature get_queryset_descendants which query the descendants of tree nodes represented by a queryset. However, TreeQuerySet.descendants method in django-tree-queries only support query descendants of a certain tree node. After doing some experiments, I find a way to implement such feature, like bellow:

def descendants(self, of, include_self=False):
    ...
    extra_where = " or ".join(['instr(__tree.tree_path, "{sep}{pk}{sep}") <> 0'.format(
                        pk=self.model._meta.pk.get_db_prep_value(pk(obj), connection),
                        sep=SEPARATOR,
                    ) for obj in of])
            queryset = self.with_tree_fields().extra(
                # NOTE! The representation of tree_path is NOT part of the API.
                where=[
                    # XXX This *may* be unsafe with some primary key field types.
                    # It is certainly safe with integers.
                    extra_where
                ]
            )

The idea is if of parameter is a queryset rather than a single model instance, changing the where clause to or.

The downside of this approach is we need an extra database query. I also have no idea if there is performance issue for a large queryset.

Another idea I come up with is inject a a subquery in CTE, like bellow:
change:

SELECT
            0 AS tree_depth,
            array[T.{pk}] AS tree_path,
            array[{order_by}] AS tree_ordering,
            T."{pk}"
        FROM {db_table} T
        WHERE T."{parent}" IS NULL

to:

SELECT
            0 AS tree_depth,
            array[T.{pk}] AS tree_path,
            array[{order_by}] AS tree_ordering,
            T."{pk}"
        FROM {db_table} T
        WHERE T."{parent}" in <<pks of queryset>>

but I also have no idea that this is feasible.

Support for Many-to-Many relationship

I am developing a tree structure with "detachable" sub-trees, and for what I have seen the current TreeNode and TreeNodeForeignKey are not suited for this.

Nested trees

To accomplish this kind of structure I need to use a Many-to-Many parent (and also I need to define some extra fields on the M2M relation-table).

I do not need to query the whole graph, just individual trees (which can intersect with each other as shown in figure).

In MPTT I would have used the TreeManyToManyField, but since that project is not maintained I would like to accomplish something similar with this library.

I undestrand that it is not an easy step but I would like to see if it is feasable.

Descending ordering is not handled

Example:

class Model(TreeNode):
    class Meta:
        ordering = ("-created_at",)

generates an SQL containing (sqlite):

 
     WITH RECURSIVE __tree(tree_depth, tree_path, tree_ordering, tree_pk) AS 
 (
         SELECT
             0 tree_depth,
             printf("\x1f%%s\x1f", id) tree_path,
             printf("\x1f%%020s\x1f", -created_at) tree_ordering, # maybe invalid SQL
             T."id" tree_pk
         FROM my_model_table T
         WHERE T."parent_id" IS NULL
 
         UNION ALL
 
         SELECT
             __tree.tree_depth + 1,
             __tree.tree_path || printf("%%s\x1f", T.id),
             __tree.tree_ordering || printf("%%020s\x1f", T.-created_at), # invalid SQL
             T."id"
         FROM my_model_table T
         JOIN __tree ON T."parent_id" = __tree.tree_pk
     )

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.