Coder Social home page Coder Social logo

Comments (15)

mattrobenolt avatar mattrobenolt commented on May 25, 2024 1

Yeah, I can't explain why MySQL behaves that way either. Or why mysql2 infers this. But they definitely do extra work to try and coerce numbers. Maybe it's something we can explore too? But that seems like it'd have unexpected behaviors too. When you do cross the boundary, now you get a string.

I can easily see the case where you have a bigint primary key, everything is working for a long time. Eventually you cross the threshold and now you get strings back and code is broken. I'm not sure that'd be a good experience either.

I'd personally recommend, like you suggested, a custom cast function or something for you to explicitly handle this how you want to.

I think we as a library are handling this as correctly as we can in JavaScript until we have some evidence otherwise.

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

I suspect it's because your schema contains bigints or some other unsigned 64 bit integer which isn't representable in JavaScript.

We have an example specifically for doing your own casting to a JS BigInt type. https://github.com/planetscale/database-js?tab=readme-ov-file#custom-type-casting-function

The proposal to use BigInts directly was rejected. #90

from database-js.

iheanyi avatar iheanyi commented on May 25, 2024

https://github.com/planetscale/database-js?tab=readme-ov-file#custom-type-casting-function

As linked above, you can use a custom casting function for addressing this. Rather than forcing usage of BigInt, you can opt into it.

from database-js.

twiddler avatar twiddler commented on May 25, 2024

@mattrobenolt There's no schema involved. I am doing

SELECT * FROM
  (
    SELECT 0 AS "num_column"
    UNION ALL
    SELECT 0 AS "num_column"
  ) AS "foo"

(Sorry if that wasn't clear from the code. I originally posted this on kysely-planetscale.)

What puzzles me is that the type changes depending on how many lines the result set has. It'd be fine if it was always BigInts passed as strings. But if it's sometimes strings, sometimes numbers it makes it really hard to trace errors.

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

Oh, I did miss that. Lemme see what MySQL is doing here.

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

Ok, so.

$ curl -s -u $MYSQL_USER:$MYSQL_PWD https://$MYSQL_HOST/psdb.v1alpha1.Database/Execute -H'content-type: application/json' -d'{"query": "SELECT * FROM ( SELECT 0 AS `num_column` ) AS `foo`"}' | jq .result.fields[0]
{
  "name": "num_column",
  "type": "INT32",
  "table": "foo",
  "orgName": "num_column",
  "columnLength": 2,
  "charset": 63,
  "flags": 32769
}
$ curl -s -u $MYSQL_USER:$MYSQL_PWD https://$MYSQL_HOST/psdb.v1alpha1.Database/Execute -H'content-type: application/json' -d'{"query": "SELECT * FROM ( SELECT 0 AS `num_column` UNION ALL SELECT 0 AS `num_column` ) AS `foo`"}' | jq .result.fields[0]
{
  "name": "num_column",
  "type": "INT64",
  "table": "foo",
  "orgName": "num_column",
  "columnLength": 2,
  "charset": 63,
  "flags": 32769
}

When selecting 1 row, the type we get back is INT32, and when 2 rows, it's INT64.

This is... very interesting. I'm sure it's clear to you now at least why we behave this way in the driver, since an INT32 is safely an integer, but INT64, we need to keep it as a string or convert to BigInt, but either way, we're getting a different type.

I'm going to dig in internally, this might be more of a Vitess this? I'm not sure why it'd behave slightly differently.

Just to be explicit, the queries I'm testing are:

SELECT * FROM (
  SELECT 0 AS `num_column`
) AS `foo`

and

SELECT * FROM (
  SELECT 0 AS `num_column`
  UNION ALL
  SELECT 0 AS `num_column`
) AS `foo`

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

Ok, so this is pretty wild. Against a vanilla mysqld 8.0.34, this is what happens:

mysql> SELECT * FROM ( SELECT 0 AS `num_column` ) AS `foo`;
Field   1:  `num_column`
Catalog:    `def`
Database:   ``
Table:      `foo`
Org_table:  ``
Type:       LONG
Collation:  binary (63)
Length:     2
Max_length: 1
Decimals:   0
Flags:      NOT_NULL NUM


+------------+
| num_column |
+------------+
|          0 |
+------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM ( SELECT 0 AS `num_column` UNION ALL SELECT 0 AS `num_column` ) AS `foo`;
Field   1:  `num_column`
Catalog:    `def`
Database:   ``
Table:      `foo`
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     2
Max_length: 1
Decimals:   0
Flags:      NOT_NULL NUM


+------------+
| num_column |
+------------+
|          0 |
|          0 |
+------------+
2 rows in set (0.01 sec)

I can't explain this behavior, but clearly MySQL wants to make the second query LONGLONG and the first just a LONG. Which is 64bit vs 32bit integer.

So both PlanetScale and database-js are handling this query correctly to how MySQL behaves.

I think in this case, the query is just too ambiguous and might benefit from a specific CAST() to yield the correct type, but I sorta suspect this query is just trying to exercise some behavior since this isn't something I think you'd do in practice?

But whatever is happening, we are handling this correctly according to how MySQL behaves.

If anything, the best we'd be able to do here is handle this type as a BigInt with a custom cast function, but then they'd still be different types. One as a normal JS number and other as a BigInt.

from database-js.

twiddler avatar twiddler commented on May 25, 2024

this isn't something I think you'd do in practice?

That rather synthetic example is the essence of the following use case:

I am left joining two tables A and B. Since it's a left join, there might be rows in the result set that have NULL values for B.foo. For those rows, I want to know whether B.foo is NULL because there was 1) no match, or 2) B.foo is actually NULL. I solved it like this:

SELECT
  B.foo,
  B.aId IS NOT NULL as matchFound
FROM
  A LEFT JOIN B
ON
  A.id = B.aId;

The IS NOT NULL evaluates to either 0 or 1, which--contrary to what I expected but just like the query in my first post--gets handed back as a number if there's just one row, and strings when there's more rows.

I think PlanetScale should not do any extra lifting here, but just behave like vanilla MySQL. It didn't for me in my example. I tested against community-mysql-server version 8.0.36 (see https://docs.fedoraproject.org/en-US/quick-docs/installing-mysql-mariadb/).

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

This example is now different though than what you first produced. The first examples yield a "derived table" which is what's behaving oddly.

Can you give me a schema that you're working with for this latter example actually doing a LEFT JOIN? I'm curious to see the data types Vitess is returning vs MySQL. It might be that we're doing something wrong here.

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

I tried to reproduce with the following:

CREATE TABLE A (
    id BIGINT UNSIGNED AUTO_INCREMENT,
    PRIMARY KEY(id)
);

CREATE TABLE B (
    id BIGINT UNSIGNED AUTO_INCREMENT,
    aId BIGINT UNSIGNED,
    foo TEXT,
    PRIMARY KEY(id),
    KEY(aId)
);

INSERT INTO A VALUES (NULL);
INSERT INTO A VALUES (NULL);
INSERT INTO B VALUES (NULL, 1, 'hello');

SELECT
    B.foo,
    B.aId IS NOT NULL AS matchFound
FROM
    A LEFT JOIN B
ON
    A.id = B.aId;

This setup seems to mirror what you're trying to accomplish.

Against PlanetScale, I get:

mysql> SELECT
    ->     B.foo,
    ->     B.aId IS NOT NULL AS matchFound
    -> FROM
    ->     A LEFT JOIN B
    -> ON
    ->     A.id = B.aId;
Field   1:  `foo`
Catalog:    `def`
Database:   `mattdb`
Table:      `B`
Org_table:  `B`
Type:       BLOB
Collation:  utf8mb4_0900_ai_ci (255)
Length:     262140
Max_length: 5
Decimals:   0
Flags:      BLOB

Field   2:  `matchFound`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      NOT_NULL BINARY NUM

And against vanilla MySQL 8.0.34:

mysql> SELECT
    ->     B.foo,
    ->     B.aId IS NOT NULL AS matchFound
    -> FROM
    ->     A LEFT JOIN B
    -> ON
    ->     A.id = B.aId;
Field   1:  `foo`
Catalog:    `def`
Database:   `mysql`
Table:      `B`
Org_table:  `B`
Type:       BLOB
Collation:  utf8mb4_general_ci (45)
Length:     262140
Max_length: 5
Decimals:   0
Flags:      BLOB

Field   2:  `matchFound`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      NOT_NULL BINARY NUM


+-------+------------+
| foo   | matchFound |
+-------+------------+
| hello |          1 |
| NULL  |          0 |
+-------+------------+
2 rows in set (0.00 sec)

If I delete a row, so there's only 1:

--- PlanetScale
mysql> delete from A where id=2;
Query OK, 1 row affected (0.11 sec)

mysql> SELECT
    ->     B.foo,
    ->     B.aId IS NOT NULL AS matchFound
    -> FROM
    ->     A LEFT JOIN B
    -> ON
    ->     A.id = B.aId;
Field   1:  `foo`
Catalog:    `def`
Database:   `mattdb`
Table:      `B`
Org_table:  `B`
Type:       BLOB
Collation:  utf8mb4_0900_ai_ci (255)
Length:     262140
Max_length: 5
Decimals:   0
Flags:      BLOB

Field   2:  `matchFound`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      NOT_NULL BINARY NUM


+-------+------------+
| foo   | matchFound |
+-------+------------+
| hello |          1 |
+-------+------------+
1 row in set (0.10 sec)
--- MySQL 8.0.34
mysql> delete from A where id=2;
Query OK, 1 row affected (0.02 sec)

mysql> SELECT
    ->     B.foo,
    ->     B.aId IS NOT NULL AS matchFound
    -> FROM
    ->     A LEFT JOIN B
    -> ON
    ->     A.id = B.aId;
Field   1:  `foo`
Catalog:    `def`
Database:   `mysql`
Table:      `B`
Org_table:  `B`
Type:       BLOB
Collation:  utf8mb4_general_ci (45)
Length:     262140
Max_length: 5
Decimals:   0
Flags:      BLOB

Field   2:  `matchFound`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      NOT_NULL BINARY NUM


+-------+------------+
| foo   | matchFound |
+-------+------------+
| hello |          1 |
+-------+------------+
1 row in set (0.00 sec)

In all cases, matchFound is a LONGLONG which is a 64 bit integer.

Am I missing something here? This would yield a string in all cases through our driver.

from database-js.

jacobwgillespie avatar jacobwgillespie commented on May 25, 2024

@twiddler one thing to watch for: I think the mysql2 client is going to type-cast values as either numbers or strings based on the length of the string returned, basically if the string value of the number is short enough, it'll automatically parse it into a number:

https://github.com/sidorares/node-mysql2/blob/225a2bdb6e16ee201121d4a8d36508a792ff6e54/lib/packets/packet.js#L433-L487

There's some pool options that control that behavior (e.g. setting bigNumberStrings: true will mean that the LONGLONG always gets returned as a string), but that might be the source of the difference here.

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

Yeah, I had considered something like this where we convert to an integer if we know it's within the safe bounds, but hard to say objectively what's better, because then you just get a string at some point when you've crossed the threshold and your code doesn't anticipate that.

from database-js.

mattrobenolt avatar mattrobenolt commented on May 25, 2024

I think in this case tho, the answer is probably supplying a custom cast function that handles this how you like. But from my understanding of everything here, the types are the same on the protocol level. We just choose to handle LONGLONG types a strings and hands-off on guessing what to do with it, while mysql2 has knobs to tune the behavior.

I think this is my complete understanding at this point.

from database-js.

twiddler avatar twiddler commented on May 25, 2024

Thanks for all your hard work, and sorry for getting back so late. 🙏

the mysql2 client is going to type-cast values as either numbers or strings based on the length of the string returned

That would have explained the behaviour, but even with

      pool: createPool({
        database: "mysql",
        host: "localhost",
        user: "root",
        password: "",
+       bigNumberStrings: true,
      }),

the tests in my opening post still pass:

▶ mysql
  ✔ one row (30.003698ms)
  ✔ two rows (1.971407ms)
▶ mysql (32.916212ms)

Expecting strings fails:

-   assert.deepStrictEqual(q, [{ num_column: 0 }, { num_column: 0 }]);
+   assert.deepStrictEqual(q, [{ num_column: "0" }, { num_column: "0" }]);
▶ mysql
  ✔ one row (33.345367ms)
  ✖ two rows (5.137187ms)
    AssertionError [ERR_ASSERTION]: Expected values to be strictly deep-equal:

Can you give me a schema that you're working with for this latter example actually doing a LEFT JOIN?

CREATE TABLE `A` (
  `id` varchar(191) NOT NULL,
  PRIMARY KEY (`id`),
) ENGINE InnoDB,
  CHARSET utf8mb4,
  COLLATE utf8mb4_0900_ai_ci;

CREATE TABLE `B` (
  `id` varchar(191) NOT NULL,
  `aId` varchar(191) NOT NULL,
  `foo` enum('inherit', 'true', 'false') NOT NULL DEFAULT 'inherit',
  PRIMARY KEY (`id`, `aId`)
) ENGINE InnoDB,
  CHARSET utf8mb4,
  COLLATE utf8mb4_0900_ai_ci;

And the query (for convenience, same as above):

SELECT
  B.foo,
  B.aId IS NOT NULL as matchFound
FROM
  A LEFT JOIN B
ON
  A.id = B.aId;

from database-js.

twiddler avatar twiddler commented on May 25, 2024

So while I am still confused why mysql2 didn't behave as expected when bigNumberStrings: true, I am sure that when it comes to PlanetScale I can solve my use case with a custom cast function. (In my application, I see no reason to just Number(...) any numeric value and then assert it is <= Number.MAX_SAFE_INTEGER--which is ridiculously huge anyway.)

I can't figure out why MySQL treats IS NOT NULL as an INT64, but that question is certainly out of scope here. CAST does not accept any smaller number type, so I'll select B.aId and z.transform null to false and everything else to true.

Thanks for all your input. This was definitely an interesting ride. 🎢

from database-js.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.