Comments (10)

littledan commented on June 22, 2024

Could you say why (beyond that it's "strange")? I'd like to make that iterator return objects that are not invalidated by future next() calls.

from proposal-intl-segmenter.

gibson042 commented on June 22, 2024

Well, none of the ECMAScript built-in iterators expose any state at all beyond a next method returning ephemeral results, and that pattern should only be broken with good cause. There's a rough analog in the form of lastIndex on RegExp instances (which predates ES iterators), but even that is limited to next start position and includes no information about the last match.
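For readers who want to check the comparison, here is a minimal sketch using only standard ECMAScript. The variable names are illustrative and the sample strings are arbitrary.

```javascript
// The built-in string iterator: its prototype carries `next` (plus a
// Symbol.toStringTag), but no accessor describing the current position.
const stringIterator = "hello"[Symbol.iterator]();
const stringIteratorProto = Object.getPrototypeOf(stringIterator);
const ownKeys = Reflect.ownKeys(stringIteratorProto);
const exposesNext = ownKeys.includes("next");
const exposesPosition = "position" in stringIterator;

// The rough analog: a global RegExp records where the next search starts,
// but keeps nothing about the match it just produced.
const re = /a/g;
const text = "banana";
const firstMatch = re.exec(text);          // matches the "a" at index 1
const lastIndexAfterFirst = re.lastIndex;  // index just past that match
```

Here `exposesPosition` is false, while `re.lastIndex` holds only the start position for the next search.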

What makes this API so special that it demands auto-memoization of next results, and only covering a subset of their data at that?

littledan commented on June 22, 2024

One reason is that we need preceding and following methods; see #9. Another is performance concerns (I am having trouble finding that thread). Please, you can disagree, but don't assume that all of this is just in place by accident.

gibson042 commented on June 22, 2024

%SegmentIteratorPrototype%.breakType was added without explanation in 7f8b345 two years ago, and it's hard to debate unstated reasons. But I don't assume that it was accidental, and I'm sorry if I gave that impression... I'm just suggesting that behavior shouldn't diverge from analogous APIs like %StringIterator% without explicit justification.

I don't dispute that arbitrary-index preceding and following methods can be useful, although to be honest I'd have a hard time coming up with sufficient justification and would love to see a realistic application in the FAQ. But although even next requires internal position tracking, none of the three methods need or even benefit from exposing it on the iterator itself (as opposed to only in the iterator results), let alone taking the further step of memoizing breakType.

As for performance, I don't want to get into that guessing game but will observe that if bypassing object allocations were a strong concern (which I'd argue against anyway), then all the result properties should be mirrored as accessors on the iterator, rather than just two of them (in particular, segment itself is currently absent).

littledan commented on June 22, 2024

If you want to avoid these kinds of impressions, you could start by asking why, rather than filing a bug claiming incorrectness.

Segment is just a convenience property, which you can calculate based on the string, the position before, and the position after. It is omitted to avoid that allocation.
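As a rough illustration of that calculation, the sketch below rebuilds segments from a string and a list of break positions. The helper `segmentsFromBreaks` and the sample positions are hypothetical and not part of the proposal.

```javascript
// Hypothetical helper (not part of the proposal): rebuild each segment from
// the source string, the position before it, and the position after it.
function segmentsFromBreaks(str, breakPositions) {
  const segments = [];
  let previous = 0; // position before the current segment
  for (const position of breakPositions) {
    segments.push(str.slice(previous, position));
    previous = position;
  }
  return segments;
}

// Break positions chosen by hand for this example.
const segments = segmentsFromBreaks("Hello world", [5, 6, 11]);
// segments is ["Hello", " ", "world"]
```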

PRs welcome to improve the documentation to summarize the result of #9.

gibson042 commented on June 22, 2024

> If you want to avoid these kinds of impressions, you could start by asking why, rather than filing a bug claiming incorrectness.

Updated to replace "incorrect" with the more accurate "incomplete". But I try not to phrase issues as questions because the resolution shouldn't be an answer, it should be either an update to the explainer or an update to the spec text—and it's impossible to determine which without an issue to capture discussion. I'm happy to adapt to whatever patterns you prefer, though... where would you like to see such questions?

> PRs welcome to improve the documentation to summarize the result of #9.

You keep asking me to submit PRs explaining decisions, but I can't do that for decisions that didn't come with reasons. #9 requested preceding and following, but there is no example code showing how those methods pay for themselves in realistic situations. And this issue isn't even about that, it's mostly about %SegmentIteratorPrototype%.breakType (which sprang into existence with no GitHub discussion at all).

> Segment is just a convenience property, which you can calculate based on the string, the position before, and the position after. It is omitted to avoid that allocation.

The iterator object has internal slots for position and break type, and everything else is derived from those—the iteration result currently has explicit segment, breakType, and position data properties, and (the topic of this issue) the iterator itself has position and breakType getters. Both of those accessors seem to be convenience properties, but no allocations take place until they are invoked (which seems to be moot anyway, since CreateIterResultObject itself necessitates allocations).
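The shape described above can be sketched with a toy iterator. This is only an illustration under loud assumptions: `ToySegmentIterator` is hypothetical, its whitespace-based breaking is not the proposal's algorithm, and the breakType labels "word" and "space" are invented for the example.

```javascript
// Toy stand-in for a segment iterator; NOT the proposal's algorithm.
// Breaks fall wherever whitespace and non-whitespace meet.
class ToySegmentIterator {
  #string;
  #position = 0;          // plays the role of the internal position slot
  #breakType = undefined; // plays the role of the internal break type slot

  constructor(string) {
    this.#string = string;
  }

  // The accessors under discussion: they mirror state that every
  // iteration result already carries.
  get position() { return this.#position; }
  get breakType() { return this.#breakType; }

  next() {
    const start = this.#position;
    if (start >= this.#string.length) {
      return { value: undefined, done: true };
    }
    const isSpace = (ch) => /\s/.test(ch);
    const startsWithSpace = isSpace(this.#string[start]);
    let end = start + 1;
    while (end < this.#string.length &&
           isSpace(this.#string[end]) === startsWithSpace) {
      end++;
    }
    this.#position = end;
    this.#breakType = startsWithSpace ? "space" : "word";
    const value = {
      segment: this.#string.slice(start, end),
      breakType: this.#breakType,
      position: end,
    };
    return { value, done: false };
  }

  [Symbol.iterator]() { return this; }
}

// Conventional consumption: everything comes from the iteration results.
const fromResults = [...new ToySegmentIterator("Hi there")].map((r) => r.segment);

// Stateful consumption: read the mirrored accessors after each next() call.
const stateful = new ToySegmentIterator("Hi there");
const fromAccessors = [];
while (!stateful.next().done) {
  fromAccessors.push([stateful.position, stateful.breakType]);
}
```

Note that this toy still allocates a result object on every `next()` call, so it shows only the duplicated surface, not any performance effect.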

I'll open some PRs to clarify what I'm talking about.

littledan commented on June 22, 2024

I'm not trying to put the burden on you to make PRs, though I'd really appreciate your help. If no one gets around to it, I hope to eventually come back and do it.

I don't actually understand what's incomplete about #9, or what kind of thing would make them "pay for themselves". What makes them expensive?

About breakType, the rationale is to enable segmentation, with the user checking the breakType (e.g., soft vs hard line break) without the overhead of the iteration protocol and also with the flexibility of preceding and following methods.

gibson042 commented on June 22, 2024

> I don't actually understand what's incomplete about #9

Nothing. This issue is totally unrelated to #9. It is about duplicating information from iteration results on the iterator itself—specifically, breakType and position but not segment.

> or what kind of thing would make them "pay for themselves". What makes them expensive?

They are expensive in terms of the cognitive burden and spec complexity of Intl segment iterators being different from ES string iterators and every other built-in iterator, none of which directly expose state.

> About breakType, the rationale is to enable segmentation, with the user checking the breakType (e.g., soft vs hard line break) without the overhead of the iteration protocol and also with the flexibility of preceding and following methods.

That sounds like premature optimization, which someone once called "the root of all evil (or at least most of it) in programming". The benefit of avoiding object allocations comes at the cost of introducing significant internal inconsistency in the form of properties that have no analogues on otherwise similar iterators and—in the case of position—an entirely different meaning from the common and already-established convention of identifying an index after the last match.

Couldn't we at least try the simple conventional interface first? If this complexity is actually worthwhile, then perhaps it should be added across the board rather than limited to a single built-in iterator.

littledan commented on June 22, 2024

Cc @sebmarkbage who raised the performance issue IIRC

littledan commented on June 22, 2024

I removed segment, so this issue should be fixed.
