Comments (21)
/cc @cdumez who I noticed recently updated WebKit's encoding labels in WebKit/WebKit@f203fd0
I assume it would be a bad option to rename "replacement" to something like, say, "csiso2022kr"?
from encoding.
@domenic Despite having done the update in WebKit recently, I am actually not familiar with this area enough to comment. I know that we do have the "replacement" label for the replacement encoding in WebKit though.
> I know that we do have the "replacement" label for the replacement encoding in WebKit though.
Are you sure? Testing indicates otherwise.
> I assume it would be a bad option to rename "replacement" to something like, say, "csiso2022kr"?
That would be pretty confusing, so I think that's a bad option.
@hsivonen: I wrote the patch very recently so you would need to try a WebKit nightly build.
@cdumez someone did try a nightly. It appears that WebKit (per spec and other browsers) does not alias "replacement" to the behavior of the replacement encoding.
Oh, my bad: there was some last-minute review feedback on my patch and it appears it killed the replacement alias. Sorry about the bad information.
We added replacement in https://www.w3.org/Bugs/Public/show_bug.cgi?id=21057. I don't think we really discussed adding it as a label before. I'm certainly open to adding it as it does simplify the system a little bit, but also not a whole lot.
@jungshik thoughts?
FWIW, I concur with the OP - frequent special cases in the code and tests, but sad about adding it to the web just to make implementations cleaner. So no vote either way.
I'm going to close this since everyone is on the fence. Feel free to reopen though if you feel strongly since it doesn't seem like it would be a hard sell.
I'm going to reopen this to add "replacement" as a label as the special cases apparently continue to cause problems (at least in Gecko while integrating encoding-rs) for no real gain.
As nobody objected and @hsivonen now favors this approach I hope that is acceptable, but I'll leave some time for feedback just in case.
I'm surprised this is such a big deal in code as there's nothing in the standard (or any standards that use this standard) to my knowledge that trips over this.
Nevertheless, I created the PR, review appreciated. Note that before landing it I should probably:
- Update tests.
- File bugs against Firefox, WebKit, and Chromium. Not sure about Edge as they haven't made an effort to comply thus far I think.
Shouldn't we get another implementation interested before landing?
I considered @inexorabletash's reply above as such, but happy to wait for something more explicit.
It involves deleting code on our side so I'm okay with the change.
FWIW, I put up a Blink change: https://chromium-review.googlesource.com/c/559973/ - I'll wait for test updates to hit WPT and roll into Blink, though.
Sanity check: we would now expect an HTML file with <meta http-equiv="content-type" content="text/html; charset=replacement"> to render as �, yes?
@inexorabletash yeah, that wouldn't be any different from it saying iso-2022-kr or some such. Note that it would have to appear within the first 1024 bytes.
@hsivonen do you want to review the change?
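The sanity check above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not any browser's implementation: `sniff_meta_charset`, `decode_replacement`, and `render_text` are hypothetical names, the regex is a crude stand-in for the much more involved HTML prescan, and the UTF-8 fallback is only a placeholder for whatever default a browser would pick.

```python
import re

REPLACEMENT_CHARACTER = "\ufffd"

# Labels that map to the replacement encoding, including the new one.
REPLACEMENT_LABELS = {
    "csiso2022kr", "hz-gb-2312", "iso-2022-cn",
    "iso-2022-cn-ext", "iso-2022-kr", "replacement",
}

# Crude stand-in for the HTML prescan (assumption: real parsers do far more).
_CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?([^\s"\';>]+)', re.IGNORECASE)


def sniff_meta_charset(data: bytes):
    """Look for a charset declaration, but only in the first 1024 bytes."""
    match = _CHARSET_RE.search(data[:1024])
    if match is None:
        return None
    return match.group(1).decode("ascii", "replace").lower()


def decode_replacement(data: bytes) -> str:
    """Replacement decoder with errors turned into U+FFFD."""
    # Empty input reaches end-of-queue at once, so nothing is emitted.
    if not data:
        return ""
    # The first byte produces a single error (U+FFFD); the decoder then
    # finishes without looking at the remaining bytes.
    return REPLACEMENT_CHARACTER


def render_text(data: bytes) -> str:
    """Decode a page the way the sanity check describes (simplified)."""
    if sniff_meta_charset(data) in REPLACEMENT_LABELS:
        return decode_replacement(data)
    # Placeholder fallback, not what a browser would necessarily choose.
    return data.decode("utf-8", errors="replace")
```

With this sketch, a page whose declaration sits within the first 1024 bytes collapses to a single U+FFFD, while the same declaration pushed past that window is never seen and the page decodes normally.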
Bugs:
- https://bugs.webkit.org/show_bug.cgi?id=174577
- hsivonen/encoding_rs#22
- https://bugs.chromium.org/p/chromium/issues/detail?id=744405
To my knowledge Edge hasn't made an effort yet so I'm not including them for now.
https://developer.microsoft.com/en-us/microsoft-edge/platform/status/encodingstandard/ lists them as "In Development" so they might appreciate a bug.
FYI, Blink change landed.
Worth noting: @annevk's WPT changes didn't trip any failures when rolled into Blink's CI since the tests currently exercise the encoding labels via the API, and replacement encodings already threw just like unknown labels.
We've got Blink-specific tests that use XHR and data: URLs to verify various encodings and specifically that the replacement ones yield U+FFFD. I was lazy and just added "replacement" to the list. We should probably tidy up and upstream those (among other fun cases like UTF-7).
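To illustrate why the API-based tests did not change, here is a minimal Python sketch of label lookup followed by a TextDecoder-style constructor check. The names `get_encoding` and `make_text_decoder` are hypothetical, the label table is deliberately partial, and `ValueError` stands in for the `RangeError` a browser would throw.

```python
import string

# Partial label table, for illustration only.
LABELS = {
    "utf-8": "UTF-8",
    "utf8": "UTF-8",
    "unicode-1-1-utf-8": "UTF-8",
    "csiso2022kr": "replacement",
    "hz-gb-2312": "replacement",
    "iso-2022-cn": "replacement",
    "iso-2022-cn-ext": "replacement",
    "iso-2022-kr": "replacement",
    "replacement": "replacement",  # the label this issue adds
}

# ASCII-only lowercasing, so non-ASCII lookalikes are not folded.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def get_encoding(label: str):
    """Map a label to an encoding name, or None for an unknown label."""
    key = label.strip(" \t\n\f\r").translate(_ASCII_LOWER)
    return LABELS.get(key)


def make_text_decoder(label: str = "utf-8") -> str:
    """Return the encoding name, rejecting unknown and replacement labels."""
    name = get_encoding(label)
    if name is None or name == "replacement":
        # Stand-in for the RangeError thrown by the TextDecoder constructor.
        raise ValueError(f"unsupported encoding label: {label!r}")
    return name
```

Under these assumptions, "replacement" is rejected by the constructor both before the change (as an unknown label) and after it (as a label for the replacement encoding), so only tests that decode actual content, such as the XHR and data: URL ones, can observe the difference.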
@inexorabletash yeah, that would be great.
Filed https://developer.microsoft.com/en-us/microsoft-edge/platform/issues/12808940/ against Edge.