Comments (22)
Assuming JS strings are UTF-16 (which the API does), encoding to UTF-16 is just:
function encode_utf16(s, littleEndian) {
  // Two bytes per UTF-16 code unit of the string.
  var a = new Uint8Array(s.length * 2), view = new DataView(a.buffer);
  s.split('').forEach(function(c, i) {
    view.setUint16(i * 2, c.charCodeAt(0), littleEndian);
  });
  return a;
}
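As a quick check of the idea above, here is a minimal sketch that restates the same approach and round-trips the bytes through the native decoder. The name `encodeUtf16` and the sample strings are only illustrative, and the sketch assumes an environment where `TextDecoder` accepts the `utf-16le` label.

```javascript
// Same approach as the snippet above: one 16-bit code unit per string index.
function encodeUtf16(s, littleEndian) {
  const bytes = new Uint8Array(s.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < s.length; i++) {
    view.setUint16(i * 2, s.charCodeAt(i), littleEndian);
  }
  return bytes;
}

const le = encodeUtf16('hi', true);   // bytes: 68 00 69 00
const be = encodeUtf16('hi', false);  // bytes: 00 68 00 69

// Round trip through the native decoder, including a surrogate pair.
const roundTrip = new TextDecoder('utf-16le').decode(encodeUtf16('a😢', true));
```

Because the loop copies code units rather than code points, characters outside the BMP come out as surrogate pairs, which is what UTF-16 requires.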
from encoding.
In that case, my case for UTF-16 encoder with ID3 tag is not valid any more.
It is! Windows Explorer and Windows Media Player don't support ID3v2.4, even in Windows 10 (and Windows is the most widely used OS, ahead of Mac and Linux). As far as I know, some other players and devices don't support it either. That's why I use ID3v2.3 in https://github.com/egoroof/browser-id3-writer, and I write strings in UTF-16. Now I have to use a polyfill (about +100 KB in size) instead of the native browser API (egoroof/browser-id3-writer#36).
So sad 😢
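For the ID3 use case, a small hand-written encoder may be enough. The sketch below prefixes little-endian UTF-16 bytes with a byte order mark. It assumes that ID3v2.3 readers expect a BOM on UTF-16 text, so check the ID3v2.3 specification for the exact frame layout. The name `utf16WithBom` is hypothetical and is not part of browser-id3-writer.

```javascript
// Hypothetical helper: UTF-16LE bytes preceded by the BOM (FF FE).
// Assumption: the consumer (e.g. an ID3v2.3 reader) expects a BOM.
function utf16WithBom(s) {
  const bytes = new Uint8Array(2 + s.length * 2);
  const view = new DataView(bytes.buffer);
  view.setUint16(0, 0xFEFF, true); // written little-endian: FF FE
  for (let i = 0; i < s.length; i++) {
    view.setUint16(2 + i * 2, s.charCodeAt(i), true);
  }
  return bytes;
}

const tagText = utf16WithBom('A'); // bytes: FF FE 41 00
```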
Since the API is live in Chrome, let me try and get numbers on usage before we consider removing. That may take a release cycle.
Early stats (Chrome Canary) show a whopping 0.0000005% of page loads use TextEncoder with UTF-16 variants. I'll report in again once the counter hits stable.
@inexorabletash : Have you gathered stats on use of UTF-8 with TextEncoder? It's possible that use of UTF-8 in TextEncoder (or TextEncoder in general) might be equally rare.
@peteroupc : Good question! Not directly but https://www.chromestatus.com/metrics/feature/timeline/popularity/429 shows the usage of the TextEncoder constructor itself at ~0.0004% of page loads. (So 3 orders of magnitude higher.)
That'd be UTF-8 + UTF-16 + any calls into the constructor that throw due to unsupported encodings. I'm assuming the failure #s due to unsupported encodings are low. https://www.chromestatus.com/metrics/feature/timeline/popularity/430 shows the actual number of calls to encode() at 0.0001% of page loads, which puts a lower bound on the successful construction calls. (4x constructions vs. usage calls is a bit odd.)
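The comparisons above can be checked with a little arithmetic. This sketch only plugs in the percentages quoted in this thread, and the variable names are illustrative.

```javascript
// Figures quoted in the thread, as percentages of page loads.
const constructorPct = 0.0004;  // TextEncoder constructor
const utf16Pct = 0.0000005;     // TextEncoder with UTF-16 variants (Canary)
const encodePct = 0.0001;       // encode() calls

const ratio = constructorPct / utf16Pct;                    // roughly 800
const ordersOfMagnitude = Math.log10(ratio);                // roughly 2.9
const constructionsPerEncode = constructorPct / encodePct;  // roughly 4
```

A ratio of about 800 is just under three orders of magnitude, which matches the rough figure given above.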
@inexorabletash given https://www.chromestatus.com/metrics/feature/timeline/popularity/1061 can we go ahead with this?
I'd like the counter to hit the stable channel first. It only went in in December, so it'll be in the M49 release, which should be at the start of March.
That said, it looks extremely promising, even in the internal (absolute, not %) numbers.
I wonder if web-based OSes such as Chrome/Chromium OS and Firefox OS have any need for a UTF-16 encoder. One (edge) use case I can think of is producing ID3 tags for MP3 files.
@jungshik that also applies to "pure ASCII" or "unaltered iso-8859-1", which we have so far easily resisted adding. Might not hurt to ask though.
Oh. I forgot that ID3 v2.4 supports UTF-8 ( https://en.wikipedia.org/wiki/ID3 ). In that case, my case for UTF-16 encoder with ID3 tag is not valid any more.
Counter has hit stable. Usage is around 0.00000000001%. Seems safe to remove.
Review of #36 appreciated.
Tracking bug for Chrome update: https://crbug.com/595351
We'll want to update web-platform-tests as well.
If your input is a JavaScript string, it should be quite easy to compute UTF-16 bytes. Definitely not something requiring a 100KiB polyfill.
https://github.com/inexorabletash/text-encoding/blob/master/lib/encoding.js is about 100 KB.
But yes, I'll try to write a small polyfill with bugs :)
@inexorabletash thanks for this short code! Why don't you use it in your polyfill, since the standard doesn't say anything about encoding steps for UTF-16?
Why don't you use it in your polyfill, since the standard doesn't say anything about encoding steps for UTF-16?
I just haven't gotten around to updating it, and previously it attempted to match the spec exactly to look for spec bugs. PRs welcome. :)
I'm confused. The polyfill should not contain UTF-16 encoding, since the spec does not.
The polyfill doesn't expose non-UTF-8 encoders by default. But there's a way to opt-in to nonstandard behavior. Further discussion should probably go over in https://github.com/inexorabletash/text-encoding