Add a `co_lastlineno` in code object (cpython, open issue)

gaogaotiantian commented on August 25, 2024

Add a `co_lastlineno` in code object

from cpython.

Comments (8)

iritkatriel commented on August 25, 2024

The inspect implementation is very old. I think now we can just do max(p[1] for p in code_obj.co_positions()), no?

gaogaotiantian commented on August 25, 2024

We can, but should we have to? co_positions() can be large for a relatively large function, and I think iterating over it is pretty inefficient. If there is some fundamental reason why we can't store the last line, we can of course fall back to that, but it is not obvious to users. If we have a simple and straightforward O(1) solution that takes only one integer of space per code object, why not do it? One reason could be that co_lastlineno will not be used as frequently as co_firstlineno, but I think it's not an uncommon request to get the line range of the code.

What might be the concerns here for co_lastlineno? The unnecessary new interface? Or the extra complication in the compiler? I agree the need might not be pressing, otherwise there would be a lot of complaints, but the cost is pretty small too. If we really don't want to add an extra field to the code object, I can understand. If the concern is the implementation, we have alternatives, for example making it a property that caches the result computed from co_positions().

I just think co_lastlineno is very intuitive to users who need to know it.

iritkatriel commented on August 25, 2024

Adding a field to each code object is a cost too. The tradeoff needs to be justified.

The implementation in inspect can change anyway.

gaogaotiantian commented on August 25, 2024

Unlike objects such as the interpreter frame, the code object is already huge, so an extra integer field is basically nothing space-wise. Of course, adding an extra field would mean more maintenance effort.

co_positions() is very inefficient. For example, it takes about 2ms to get the last line number of a function with 3k lines. For pdb, that means we either need to bear this cost at run time or cache the result somewhere. Caching may work, but it will use more memory.

The introduction of co_lastlineno solves more than "getting the source code of a function", it makes a useful scenario very easy - is this line in this function. In pdb, the most useful case is at the "call" event of a frame, we can immediately know whether there's a breakpoint in it. If not, we can disable events in that frame.

It's not easy to cache this, because it would mean that we probably need to cache all the code objects and their last line numbers. It's not ideal to bear the overhead either: it is paid while the program runs, not just while the user is at the debugger prompt, so it could make debugging extremely slow.

It's possible that we can pinpoint the corresponding code object when the breakpoint is set by searching all the code objects created in that file to avoid the run-time line check, but lastlineno also helps there! How else could you determine whether a breakpoint belongs to a certain code object?

This value is very helpful for pdb and other dev tools. More importantly, I think the cost is minimal (except for introducing a new field).

iritkatriel commented on August 25, 2024

@markshannon

brandtbucher commented on August 25, 2024

> The introduction of co_lastlineno solves more than "getting the source code of a function", it makes a useful scenario very easy - is this line in this function.

Is this true? I'm thinking about the case where you have a nested function definition:

def foo():      # 1
    ...         # 2
    def bar():  # 3
        ...     # 4
    ...         # 5

Is line 4 "in" foo? Does this match what pdb expects? It seems to me that scanning co_positions would be a more reliable way of telling whether a line event will fire for a given code object (and I'd argue that 3k-line functions are not the common case here).

brandtbucher commented on August 25, 2024

> It's not easy to cache this, because it would mean that we probably need to cache all the code objects and their last line numbers.

Code objects are weak-referenceable, so pdb could in theory maintain a weakref.WeakKeyDictionary[types.CodeType, frozenset[int]] mapping code objects to line events they may create. Doesn't seem too nasty.

(Just to be clear, I'm not super opposed to adding this member if we decide that there's a real need for it. But I'm not yet convinced that it's a game-changer for any of the described use-cases.)

gaogaotiantian commented on August 25, 2024

It's okay for pdb to have a false positive. It will still be much better. Also, co_positions() won't help in that case: the inner function will still be listed as part of the outer function. If you want the precise code object, you need to get all the code objects inside the function and build a tree-like data structure to find it. co_lines() might be helpful?

There have been reports about long programs being "undebuggable" (that involves another issue, though), so we should care about that. People sometimes put data in their programs, which could be huge.

I can do this in pdb if an extra field is considered unnecessary. However, from my perspective, the major issue with co_positions is that it's not obvious. You need to be very familiar with the code object to know what co_positions is and how to use it to get the last line number of the function. It's also not listed in the inspect docs, which people often refer to. co_lastlineno is much more obvious.

Now that I have found co_lines(), which is more efficient than co_positions() and actually helps with nested functions, I might try that in pdb.
