Comments (22)

inexorabletash commented on August 12, 2024

Assuming JS strings are UTF-16 (which the API does), encoding to UTF-16 is just:

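// Encode a JS string as UTF-16 bytes, two bytes per 16-bit code unit.
function encode_utf16(s, littleEndian) {
  var a = new Uint8Array(s.length * 2), view = new DataView(a.buffer);
  // Walk code units (not code points), so surrogate pairs pass through unchanged.
  for (var i = 0; i < s.length; i++) {
    view.setUint16(i * 2, s.charCodeAt(i), littleEndian);
  }
  return a;
}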
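
One way to sanity-check an encoder like this is a round trip through TextDecoder, which still accepts the UTF-16 labels. The following is only a sketch: `encodeUtf16` is a standalone copy of the function above so the snippet runs on its own, and the sample string is an arbitrary choice that includes a surrogate pair.

```javascript
// Standalone copy of the DataView-based encoder (name chosen for this sketch).
function encodeUtf16(s, littleEndian) {
  const bytes = new Uint8Array(s.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < s.length; i++) {
    view.setUint16(i * 2, s.charCodeAt(i), littleEndian);
  }
  return bytes;
}

// 'Hi ' plus U+1F622, which occupies two UTF-16 code units (a surrogate pair).
const sample = 'Hi \u{1F622}';
const le = encodeUtf16(sample, true);
const be = encodeUtf16(sample, false);

// Decode the little-endian bytes back into a string and compare.
const roundTripLE = new TextDecoder('utf-16le').decode(le);
console.log(roundTripLE === sample);
```

The big-endian output differs only in the order of the two bytes within each code unit.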

from encoding.

egoroof commented on August 12, 2024

In that case, my case for UTF-16 encoder with ID3 tag is not valid any more.

It is! Windows Explorer and Windows Media Player don't support ID3v2.4, even in Windows 10 (and Windows is by far the most widely used OS, ahead of Mac and Linux). As far as I know, some other players and devices don't support it either. That's why I use ID3v2.3 in https://github.com/egoroof/browser-id3-writer, and I write strings in UTF-16. Now I have to use a polyfill (about +100 KB in size) instead of the native browser API (egoroof/browser-id3-writer#36).
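
For context, a rough sketch of what writing a UTF-16 string for an ID3v2.3 text frame might look like without TextEncoder. This assumes the ID3v2.3 convention of an encoding byte of 0x01 followed by a byte order mark; `encodeId3v23Text` is a hypothetical helper for illustration, not code from browser-id3-writer, and it leaves out the frame header and any terminator.

```javascript
// Hypothetical helper: builds the body of an ID3v2.3 text frame
// (encoding byte + BOM + UTF-16LE code units). No frame header is written.
function encodeId3v23Text(text) {
  const bytes = new Uint8Array(1 + 2 + text.length * 2);
  bytes[0] = 0x01; // text encoding byte: 0x01 means Unicode with a BOM
  bytes[1] = 0xff; // BOM FF FE signals little-endian byte order
  bytes[2] = 0xfe;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i); // one UTF-16 code unit
    bytes[3 + i * 2] = unit & 0xff;  // low byte first (little-endian)
    bytes[4 + i * 2] = unit >> 8;    // then high byte
  }
  return bytes;
}

const payload = encodeId3v23Text('Hi');
console.log(Array.from(payload));
```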

So sad 😢

inexorabletash commented on August 12, 2024

Since the API is live in Chrome, let me try and get numbers on usage before we consider removing. That may take a release cycle.

inexorabletash commented on August 12, 2024

Early stats (Chrome Canary) show a whopping 0.0000005% of page loads use TextEncoder with UTF-16 variants. I'll report in again once the counter hits stable.

peteroupc commented on August 12, 2024

@inexorabletash : Have you gathered stats on use of UTF-8 with TextEncoder? It's possible that use of UTF-8 in TextEncoder (or TextEncoder in general) might be equally rare.

inexorabletash commented on August 12, 2024

@peteroupc : Good question! Not directly but https://www.chromestatus.com/metrics/feature/timeline/popularity/429 shows the usage of the TextEncoder constructor itself at ~0.0004% of page loads. (So 3 orders of magnitude higher.)

That'd be UTF-8 + UTF-16 + any calls into the constructor that throw due to unsupported encodings. I'm assuming the failure numbers due to unsupported encodings are low. https://www.chromestatus.com/metrics/feature/timeline/popularity/430 shows the actual number of calls to encode() at 0.0001% of page loads, which puts a lower bound on the successful construction calls. (4x constructions vs. usage calls is a bit odd.)
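
The ratios quoted in this comment can be checked with a few lines of arithmetic. This is only a worked restatement of the percentages mentioned in the thread; the variable names are made up for the sketch.

```javascript
// Percentages of page loads, as quoted in the thread.
const utf16EncoderShare = 0.0000005; // TextEncoder with UTF-16 variants (early Canary figure)
const constructorShare = 0.0004;     // any TextEncoder construction
const encodeCallShare = 0.0001;      // calls to encode()

// Constructor usage relative to UTF-16 usage: about 800x.
const constructorVsUtf16 = constructorShare / utf16EncoderShare;

// log10 of that ratio is about 2.9, i.e. roughly 3 orders of magnitude.
const ordersOfMagnitude = Math.log10(constructorVsUtf16);

// Constructions relative to encode() calls: about 4x.
const constructionsPerEncode = constructorShare / encodeCallShare;

console.log(constructorVsUtf16, ordersOfMagnitude, constructionsPerEncode);
```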

annevk commented on August 12, 2024

@inexorabletash given https://www.chromestatus.com/metrics/feature/timeline/popularity/1061 can we go ahead with this?

inexorabletash commented on August 12, 2024

I'd like the counter to hit the stable channel first. It only went in in December, so it'll be in the M49 release, which should be at the start of March.

That said, it looks extremely promising, even in the internal (absolute, not %) numbers.

jungshik commented on August 12, 2024

I wonder if web-based OSes such as Chrome/Chromium OS and Firefox OS have any need for a UTF-16 encoder. One (edge) use case I can think of is producing ID3 tags for MP3 files.

annevk commented on August 12, 2024

@jungshik that also applies to "pure ASCII" or "unaltered iso-8859-1", which we have so far easily resisted adding. Might not hurt to ask though.

jungshik commented on August 12, 2024

Oh. I forgot that ID3 v2.4 supports UTF-8 ( https://en.wikipedia.org/wiki/ID3 ). In that case, my case for UTF-16 encoder with ID3 tag is not valid any more.

inexorabletash commented on August 12, 2024

Counter has hit stable. Usage is around 0.00000000001%. Seems safe to remove.

annevk commented on August 12, 2024

Review of #36 appreciated.

inexorabletash commented on August 12, 2024

Tracking bug for Chrome update: https://crbug.com/595351

inexorabletash commented on August 12, 2024

We'll want to update web-platform-tests as well.

Ms2ger commented on August 12, 2024

web-platform-tests/wpt#2706

annevk commented on August 12, 2024

If your input is a JavaScript string, it should be quite easy to compute UTF-16 bytes. Definitely not something requiring a 100KiB polyfill.

egoroof commented on August 12, 2024

https://github.com/inexorabletash/text-encoding/blob/master/lib/encoding.js is about 100 KB.
But yes, I'll try to write a small polyfill with bugs :)

egoroof commented on August 12, 2024

@inexorabletash thanks for this short code! Why don't you use it in your polyfill, since the standard doesn't say anything about encoding steps for UTF-16?

inexorabletash commented on August 12, 2024

Why don't you use it in your polyfill, since the standard doesn't say anything about encoding steps for UTF-16?

I just haven't gotten around to updating it, and previously it attempted to match the spec exactly to look for spec bugs. PRs welcome. :)

domenic commented on August 12, 2024

I'm confused. The polyfill should not contain UTF-16 encoding, since the spec does not.

inexorabletash commented on August 12, 2024

The polyfill doesn't expose non-UTF-8 encoders by default, but there's a way to opt in to nonstandard behavior. Further discussion should probably go over in https://github.com/inexorabletash/text-encoding.
