Comments (8)

vstinner commented on September 18, 2024

Which parts of the code use interned strings?

from cpython.

neonene commented on September 18, 2024

If you mean _datetimemodule.c, _datetime_exec() is a place where assertion errors occur: it accesses dict keys that are invalid interned strings.

PyObject *d = PyDateTime_DeltaType.tp_dict;
DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));
DATETIME_ADD_MACRO(d, "min", new_delta(-MAX_DELTA_DAYS, 0, 0, 0));
DATETIME_ADD_MACRO(d, "max",
                   new_delta(MAX_DELTA_DAYS, 24*3600-1, 1000000-1, 0));

/* date values */
d = PyDateTime_DateType.tp_dict;
DATETIME_ADD_MACRO(d, "min", new_date(1, 1, 1));
DATETIME_ADD_MACRO(d, "max", new_date(MAXYEAR, 12, 31));
DATETIME_ADD_MACRO(d, "resolution", new_delta(1, 0, 0, 0));

/* time values */
d = PyDateTime_TimeType.tp_dict;
DATETIME_ADD_MACRO(d, "min", new_time(0, 0, 0, 0, Py_None, 0));
DATETIME_ADD_MACRO(d, "max", new_time(23, 59, 59, 999999, Py_None, 0));
DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));

/* datetime values */
d = PyDateTime_DateTimeType.tp_dict;
DATETIME_ADD_MACRO(d, "min",
                   new_datetime(1, 1, 1, 0, 0, 0, 0, Py_None, 0));
DATETIME_ADD_MACRO(d, "max", new_datetime(MAXYEAR, 12, 31, 23, 59, 59,
                                          999999, Py_None, 0));
DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));

datetime_state *st = STATIC_STATE();
if (init_state(st) < 0) {
    goto error;
}

/* timezone values */
d = PyDateTime_TimeZoneType.tp_dict;
if (PyDict_SetItemString(d, "utc", st->utc) < 0) {
    goto error;
}

/* bpo-37642: These attributes are rounded to the nearest minute for backwards
 * compatibility, even though the constructor will accept a wider range of
 * values. This may change in the future.*/

/* -23:59 */
PyObject *min = create_timezone_from_delta(-1, 60, 0, 1);
DATETIME_ADD_MACRO(d, "min", min);

/* +23:59 */
PyObject *max = create_timezone_from_delta(0, (23 * 60 + 59) * 60, 0, 0);
DATETIME_ADD_MACRO(d, "max", max);

vstinner commented on September 18, 2024

With a debug build, I confirm that an assertion fails at the first PyDict_SetItemString() call on PyDateTime_DeltaType.tp_dict, on the second _datetime_exec() call.

PyObject *d = PyDateTime_DeltaType.tp_dict;
DATETIME_ADD_MACRO(d, "resolution", new_delta(0, 0, 1, 0));

For me, the root issue is that datetime doesn't use heap types but static types.

The symptom is that PyDict_SetItemString() calls PyUnicode_InternInPlace() and so stores interned strings as dict keys, but interned strings are cleared at Python exit, in Py_Finalize().

Interned strings are just one symptom, but I expect other side effects of reusing static types between two Python executions.

IMO only isolated extensions are safe to be used with the "sub-interpreter" case:

  • Run multiple interpreters using sub-interpreters
  • Or init/finalize Python multiple times, which is a similar but different case

vstinner commented on September 18, 2024

Initializing/finalizing Python multiple times and importing datetime each time leads to memory corruption

That's one way to see the issue. Another way to see it is that Python doesn't support datetime with "sub-interpreters" (see my previous message). This issue is about supporting sub-interpreters in datetime :-)

neonene commented on September 18, 2024

For me, the root issue is that datetime doesn't use heap types but static types.

Yes, but you prevented me from the heap types conversion at #103092 (comment).

Interned strings are just one symptom, but I expect other side effects of reusing static types between two Python executions.

The main interpreter and non-isolated sub-interpreters share the same module, which avoids redundant initialization. Many objects in a static type are carried over, but interned strings are not. (3.12 needs PR #118618 to carry over static types.)
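
To illustrate the carry-over problem, here is a toy model in plain C. It is a sketch under loud assumptions, not CPython code: every toy_* name is hypothetical, the "static dict" stands in for a static type's tp_dict, and a generation counter stands in for the interned-string table that is cleared at finalization. It only shows why a key stored during one runtime looks stale in the next.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in CPython. */
typedef enum { TOY_OK, TOY_STALE_KEY, TOY_FULL } toy_status;

typedef struct {
    const char *text;      /* must outlive the dict (string literals here) */
    unsigned generation;   /* runtime generation that "interned" this key */
} toy_key;

#define TOY_DICT_CAPACITY 8

/* Process-wide storage: like a static type, it survives finalization. */
static toy_key toy_static_dict[TOY_DICT_CAPACITY];
static size_t toy_static_dict_len = 0;

/* Runtime-scoped state: generation 0 means "no runtime is active". */
static unsigned toy_current_generation = 0;
static unsigned toy_next_generation = 1;

void toy_initialize(void) { toy_current_generation = toy_next_generation++; }

/* Finalization drops the runtime, so keys interned under it become stale. */
void toy_finalize(void) { toy_current_generation = 0; }

/* Set an entry by name. If the key already exists but was interned by a
 * previous runtime, report TOY_STALE_KEY (roughly where a debug build
 * would hit an assertion in this model). */
toy_status toy_dict_set(const char *name)
{
    for (size_t i = 0; i < toy_static_dict_len; i++) {
        if (strcmp(toy_static_dict[i].text, name) == 0) {
            return toy_static_dict[i].generation == toy_current_generation
                       ? TOY_OK
                       : TOY_STALE_KEY;
        }
    }
    if (toy_static_dict_len == TOY_DICT_CAPACITY) {
        return TOY_FULL;
    }
    toy_static_dict[toy_static_dict_len].text = name;
    toy_static_dict[toy_static_dict_len].generation = toy_current_generation;
    toy_static_dict_len++;
    return TOY_OK;
}
```

In this model, setting "resolution" succeeds within one runtime, but after toy_finalize() and a second toy_initialize() the same call reports TOY_STALE_KEY, because the static dict was carried over while the key's runtime was not.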

What case are you expecting?

IMO only isolated extensions are safe to be used with the "sub-interpreter" case:

Isolated sub-interpreters are not allowed to load a single-phase init module.

neonene commented on September 18, 2024

Isolated sub-interpreters are not allowed to load a single-phase init module.

See: PEP 554, C-API

That's also why PEP 687 needs to be completed.

vstinner commented on September 18, 2024

Yes, but you prevented me from the heap types conversion at #103092 (comment).

Right, I am afraid of regressions. We should do the conversion in a specific order, step by step, to limit risk and be able to roll back if anything goes wrong.

What case are you expecting?

Objects must not be shared between interpreters, including types. Each interpreter should have its own types to prevent any race condition / concurrent accesses, especially when each interpreter has its own GIL. I don't recall details.

It doesn't matter much. We both agree to isolate the extension.

My proposed plan: #117398 (comment)

neonene commented on September 18, 2024

Can the isolation be backported to 3.13? Your concern about sub-interpreters is a separate issue, which has not been confirmed on release builds so far.
