Coder Social home page Coder Social logo

Comments (6)

simonw avatar simonw commented on June 6, 2024

That's odd, I wrote a test for this just now and it passes already:

def test_transform_preserves_rowids(fresh_db):
    # Create a rowid table
    fresh_db["places"].insert_all(
        (
            {"name": "Paris", "country": "France"},
            {"name": "London", "country": "UK"},
            {"name": "New York", "country": "USA"},
        ),
    )
    assert fresh_db["places"].use_rowid
    previous_rows = list(
        tuple(row) for row in fresh_db.execute("select rowid, name from places")
    )
    # Transform it
    fresh_db["places"].transform(column_order=("country", "name"))
    # Should be the same
    next_rows = list(
        tuple(row) for row in fresh_db.execute("select rowid, name from places")
    )
    assert previous_rows == next_rows

So maybe I'm wrong about the cause of that bug?

from sqlite-utils.

simonw avatar simonw commented on June 6, 2024

I tried bumping that up to 10,000 rows instead of just 3 but the test still passed.

from sqlite-utils.

simonw avatar simonw commented on June 6, 2024

I just noticed that the table where I encountered this bug wasn't actually a rowid table after all - it had an id column that was a text primary key.

The reason the rowid was important is that's how the FTS mechanism in Datasette relates FTS entries to their rows.

But I tried this test and it passed, too:

def test_transform_preserves_rowids(fresh_db):
    fresh_db["places"].insert_all(
        [
            {"id": "1", "name": "Paris", "country": "France"},
            {"id": "2", "name": "London", "country": "UK"},
            {"id": "3", "name": "New York", "country": "USA"},
        ],
        pk="id",
    )
    previous_rows = list(
        tuple(row) for row in fresh_db.execute("select rowid, id, name from places")
    )
    # Transform it
    fresh_db["places"].transform(column_order=("country", "name"))
    # Should be the same
    next_rows = list(
        tuple(row) for row in fresh_db.execute("select rowid, id, name from places")
    )
    assert previous_rows == next_rows

from sqlite-utils.

simonw avatar simonw commented on June 6, 2024

Oh! Maybe the row ID preservation here is a coincidence because the tables are created from scratch and count 1, 2, 3.

If I delete a row from the table and then insert some more - breaking the rowid sequence - it might show the bug.

from sqlite-utils.

simonw avatar simonw commented on June 6, 2024

Yes! That recreated the bug:

>       assert previous_rows == next_rows
E       AssertionError: assert equals failed
E         [                                                                [                                                               
E           (1, '1', 'Paris'),                                               (1, '1', 'Paris'),                                            
E           (3, '3', 'New York'),                                            (2, '3', 'New York'),                                         
E           (4, '4', 'London'),                                              (3, '4', 'London'),                                           
E         ]                                                       ...
E         

from sqlite-utils.

simonw avatar simonw commented on June 6, 2024

In working on this I learned that rowid values in SQLite are way less stable than I had thought - in particular, they are often entirely rewritten on a VACUUM:

https://www.sqlite.org/lang_vacuum.html#how_vacuum_works

The VACUUM command may change the ROWIDs of entries in any tables that do not have an explicit INTEGER PRIMARY KEY.

So this fix wasn't as valuable as I thought. I need to move away from ever assuming that a rowid is a useful foreign key for anything.

from sqlite-utils.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.