Comments (4)
It only uses index gb18030 ranges if index gb18030 produced no result. So, e.g., I don't see how U+00A4 would reach that point.
from encoding.
I'm sorry for my lack of precision. At the beginning of my post I included some "irreversible" codes as examples of such "irreversibility". I agree that, among all the strange codes, only surrogates currently reach that point. That is not a problem: they are already rejected by an assertion in "process an item".
What I'm really wondering about is some kind of protection against inconsistencies between the two data files (or a promise of consistency). In the pointer => code point direction, potential holes are explicitly eliminated: first by the gb18030 decoder, then by index gb18030 ranges code point, and finally by the description of index-gb18030-ranges.txt.
In the code point => pointer direction (encoding) there is no obvious way to state consistency, especially since the standard says that two files are used [only/primarily] to save space. At first glance we have a 1:1 map, followed by a function for the remainder that is not always reversible. That rings a bell: "what if some irreversible value gets through?" (that bell rings in my head every time I return to the standard ;-).
I know that adding my verification would be too heavy and unnecessary. But please consider a lightweight solution: perhaps your short and concise sentence "It only uses..." as a note after point 7 in the gb18030 encoder, or a comment in the index table description, or something similar. Thank you for your answer and your time.
Apologies for the slow follow-up here. Maybe something like this for the description:
... It therefore only superficially matches the GB18030-2005 standard for code points encoded as four bytes. It complements index gb18030 described above. See also ...
I suppose we could add this step between steps 8 and 9 of the gb18030 encoder, but I'd like @hsivonen or @inexorabletash's input as to whether that seems okay:
Assert: code point is the index gb18030 ranges code point for pointer.
If we make these changes, would you like to be acknowledged as chncc?
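For readers following along, the proposed assertion can be sketched as a round-trip check. This is only an illustration under stated assumptions: `RANGES` is a three-row toy excerpt modeled on what the first rows of index-gb18030-ranges.txt are believed to be (check the actual file), the function names are hypothetical, and the U+E7C7 special case and the null pointer ranges are left out.

```python
from bisect import bisect_right

# Toy excerpt of (pointer, code point) rows, sorted on both columns.
# ASSUMPTION: modeled on the first rows of index-gb18030-ranges.txt;
# verify against the real file before relying on these values.
RANGES = [(0, 0x0080), (36, 0x00A5), (38, 0x00A9)]
POINTERS = [p for p, _ in RANGES]
CODE_POINTS = [c for _, c in RANGES]


def ranges_code_point(pointer):
    """Pointer -> code point, using the last row whose pointer is <= pointer."""
    i = bisect_right(POINTERS, pointer) - 1
    if i < 0:
        raise ValueError("pointer is below the first range")
    offset, code_point_offset = RANGES[i]
    return code_point_offset + pointer - offset


def ranges_pointer(code_point):
    """Code point -> pointer, using the last row whose code point is <= code_point."""
    i = bisect_right(CODE_POINTS, code_point) - 1
    if i < 0:
        raise ValueError("code point is below the first range")
    pointer_offset, offset = RANGES[i]
    return pointer_offset + code_point - offset


def round_trips(code_point):
    """The proposed assertion: decoding the computed pointer returns the input."""
    return ranges_code_point(ranges_pointer(code_point)) == code_point
```

In this toy table `round_trips(0x00A5)` holds, while `round_trips(0x00A4)` does not, because U+00A4 falls into a gap between rows and its computed pointer decodes to U+00A5. Per the first comment, such code points should already have been handled by index gb18030 before the ranges lookup is reached, which is the condition the assertion would make explicit.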
@jungshik may be able to reason about this w/r/t ICU