Coder Social home page Coder Social logo

Comments (17)

tim-one avatar tim-one commented on May 28, 2024 2

A few lines down we have:

        str_xc = str(xc)

So we could move that up, and change the test to:

        if len(str_sc) > p:

The downside is that if xc is very large, conversion to string is expsnive. But str(2) is pretty fast ;-)

from cpython.

eendebakpt avatar eendebakpt commented on May 28, 2024 2

A few lines down we have:

        str_xc = str(xc)

So we could move that up, and change the test to:

        if len(str_sc) > p:

The downside is that if xc is very large, conversion to string is expsnive. But str(2) is pretty fast ;-)

@tim-one Your test is not completely equivalent. For p=2 and xc=100 we have len(str_xc) > p true, but xc > 10**p false. Maybe this condition can never be reached because of the checks before, but that is not immediately clear to me. I checked the tests in test_decimal and there a no tests that covers the difference between the two tests above.

A pragmatic solution could be to modify the test to len(str(xc-1)) > p, but that is a bit slower because now both str(xc) and str(xc-1) have to be calculated.

from cpython.

nineteendo avatar nineteendo commented on May 28, 2024 1

Maybe we should limit the precision to a reasonable level (not 1 Quintillion) when using the pure python implementation?

from cpython.

nineteendo avatar nineteendo commented on May 28, 2024

Hmm, it already takes too long for ctx.prec = 999_999_999. You don't need to set the value that high.

from cpython.

cfbolz avatar cfbolz commented on May 28, 2024

I disagree. The pure python implementation is plenty fast enough for a lot of purposes (it's in fact the default implementation in PyPy). And the result of 2**117 is not a very large number. So doing this computation should be instant and not defend on whether the precision is big or totally unreasonably big.

All we need to do is fix the logic in _power_exact somehow.

from cpython.

nineteendo avatar nineteendo commented on May 28, 2024

Hmm, can't we use math.log10() instead? No, because it's not precise.

-if xc > 10**p:
+if log10(xc) > p

from cpython.

nineteendo avatar nineteendo commented on May 28, 2024

Someone will have to find another workaround.

from cpython.

eendebakpt avatar eendebakpt commented on May 28, 2024

We could use log10 to filter out the extreme cases:

-if xc > 10**p:
+if (log10(xc) > p - 5) or xc > 10**p:

(the value 5 could probably be picked better, but it should be high enough to make this robust for any rounding errors)

Not a very elegant solution, so I will try to find a better approach

from cpython.

eendebakpt avatar eendebakpt commented on May 28, 2024

We can check how the c implementation handles this case and adopt a similar strategy.

from cpython.

eendebakpt avatar eendebakpt commented on May 28, 2024

I like the suggestion by @tim-one. If performance is a concern, we can create a simple lru cache (str(xc) might be big and expensive, but len(str(xc)) ends up in the cache, so the cache will be very small)

from cpython.

serhiy-storchaka avatar serhiy-storchaka commented on May 28, 2024

A few lines above we have m > p*100//_log10_lb(xc). It is the same condition (after transforming xc), but more exact. We perhaps can remove it or replace it with a post-conversion check, but then we will fall in other problem: #118164 (comment).

from cpython.

tim-one avatar tim-one commented on May 28, 2024

Yes, a positive int is representable with p digits in decimal floating-point iff its decimal string representation has no more than p digits, or the string matches the regular expression r"^10*$". There are many ways to code the latter test, but actually using re for it would still be approximately infinitely 😉 faster than computing 10**p . Alternatives include obscurities like (untested):

(len(sc) > p
 and sc.startswith('1')
 and all(ch == '0' for ch in islice(sc, 1, None)))

EDIT: although that test will pass no matter how many trailing zeroes there are - even quadrillions. But this function isn't given any information about the maximum exponent allowed.

from cpython.

tim-one avatar tim-one commented on May 28, 2024

Although I don't think the given case can happen. Very early on the code divides 10 out of xc as often as possible.

        while xc % 10 == 0:
            xc //= 10
            xe += 1

So an input xc of 100 (or any other power of 10 > 1) will have long ago been reduced to 1. And all such cases are then handled by the earlier:

        if xc == 1:

branch. So I don't think my original suggestion needs any changes.

from cpython.

tim-one avatar tim-one commented on May 28, 2024

But ... later code can change xc again in various ways, so maybe it can become 1 again. This code gives me a headache 😉. My

... has no more than p digits, or the string matches the regular expression r"^10*$"

should be bulletproof regardless. Note too that there's another instance of the problematic 10**p much earlier in the code:

            if xc >= 10**p:
                return None
            xe = -e-xe
            return _dec_from_triple(0, str(xc), xe)

and again shortly followed by a str(xc).

My guess is the when code reaches either of these blocks, xc will almost always be representable, so the "early-out" test isn't actually saving any expense (the early-out test will fail, and we'll go on to compute the string anyway).

In the context of this function's use in CPython's int -> str conversion, there's no question about that: all early-out tests in this function will fail, because int -> str conversion only ever asks for exactly representable powers.

from cpython.

serhiy-storchaka avatar serhiy-storchaka commented on May 28, 2024

There are many ways to check that xc is exactly 10**p. For example, str(xc) == '1' + '0' * p. But this use case is not currently covered by tests, so if we make an error in the check code (for example remove it as a dead code), the tests will still pass. I am not even sure that it is not a dead code, that xc can be exactly 10**p at this point.

from cpython.

tim-one avatar tim-one commented on May 28, 2024

We can check how the c implementation handles this case and adopt a similar strategy.

Heh. Stefan Krah (libmpdec's author) was using _pydecimal.py to prototype an exact-power implementation. I downloaded the current source for libmpdec today, and this is its implementation of "exact power" (from libmpdec\mpdecimal.c):

/*
 * TODO: Implement algorithm for computing exact powers from decimal.py.
 * In order to prevent infinite loops, this has to be called before
 * using Ziv's strategy for correct rounding.
 */
static int
_mpd_qpow_exact(mpd_t *result, const mpd_t *base, const mpd_t *exp,
                const mpd_context_t *ctx, uint32_t *status)
{
    return 0;
}

So "similar strategy" would be to comment out Python's exact-power function and never call it 🤣

Which holds some attraction for me - but carries risk of changing results in some cases.

from cpython.

tim-one avatar tim-one commented on May 28, 2024

Current main branch no longer tries to compute 10**p, so closing this issue.

from cpython.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.