Comments (7)
Hi! I think this character is in the extended Microsoft code page 932, not in the original Shift_JIS or JIS X 0208.
I took the Shift_JIS table from unicode.org.
Libiconv seems to agree:
$ printf '\x87\x40' | iconv -f shift_jis -t utf-16be | hexdump
iconv: (stdin):1:0: cannot convert
$ printf '\x87\x40' | iconv -f cp932 -t utf-16be | hexdump
0000000 24 60
(U+2460 is ①)
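The same check can be sketched outside of iconv. This is only an illustration using Python's built-in codecs, which are not iconv-lite's tables; the helper name `try_decode` is made up for this example. It assumes Python's `shift_jis` codec is the strict JIS X 0208 mapping and that `cp932` carries the Microsoft extensions.

```python
from typing import Optional

# 0x87 0x40 is the byte pair from the iconv commands above.
SAMPLE = b"\x87\x40"


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Hypothetical helper: return the decoded text, or None if strict decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


strict_result = try_decode(SAMPLE, "shift_jis")  # strict table: no mapping for this pair
cp932_result = try_decode(SAMPLE, "cp932")       # extended table: maps to U+2460

print("shift_jis:", strict_result)
print("cp932:", cp932_result)
```

Encoding the `cp932` result as UTF-16BE should give the same two bytes that `hexdump` printed above.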
from iconv-lite.
http://www8.plala.or.jp/tkubota1/unicode-symbols-map2.html
Although I can probably use the Encoding Standard's variant, which has these characters: http://encoding.spec.whatwg.org/index-jis0208.txt
Hi~(^__^)
I didn't know there was an Encoding Standard for the web. I'm something of a believer in the Unicode religion, and I think Unicode is the encoding standard for the web. End users should be required to produce new information in Unicode if they want to share it globally. So for now I can't see the point of a standard conversion API built into browsers. I think Unicode is the answer, not a new conversion API in every browser.
But we do need some way to convert legacy information into Unicode. By "we" I mean information vendors who host a server. Some information is difficult to rebuild from scratch in Unicode, and we should leave it in its original form where possible, since modern browsers can still display these legacy encodings. If a conversion is ever needed, it should be done only once, so it is better for the server to do the job. What we need is a code mapping that can upgrade most legacy information to Unicode.
As for the upgrade mapping, I vote for Windows CP932 as the source form (I didn't know CP932 is a superset of Shift_JIS). I think Windows has the largest end-user share in Japan, though I haven't conducted a survey. Adding the Mac encoding as well might be fine too.
The mapping table at http://encoding.spec.whatwg.org/#indexes is said to come from formerly proprietary extensions by IBM and NEC. So if even IBM and NEC have abandoned these mappings, I think there is no need to write new code for them. CP932 also appears to be built on IBM's and NEC's extensions, so the two probably share many common parts. If it's really needed, the two could perhaps be merged, but it's easier to just follow CP932: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
But if CP932 has copyright issues, then I guess we have to follow the Encoding Standard mapping.
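The "convert once, on the server" idea can be sketched roughly as follows. This is an assumption-laden illustration, not iconv-lite code: `upgrade_to_utf8` is a hypothetical name, and Python's built-in `cp932` codec stands in for the source mapping. That codec follows Microsoft's table and is not guaranteed to match the Encoding Standard index in every cell.

```python
def upgrade_to_utf8(legacy: bytes, source_encoding: str = "cp932") -> bytes:
    """Hypothetical one-time upgrade: decode legacy bytes, re-encode as UTF-8.

    Decoding is strict, so bytes with no mapping raise UnicodeDecodeError
    instead of being silently replaced.
    """
    text = legacy.decode(source_encoding)
    return text.encode("utf-8")


# Build a small legacy sample that includes the extended character from this issue.
original_text = "日本語①"
legacy_bytes = original_text.encode("cp932")

utf8_bytes = upgrade_to_utf8(legacy_bytes)
print(utf8_bytes.decode("utf-8"))
```

Failing loudly on unmappable bytes is a deliberate choice here: for a one-time migration it is usually better to find gaps in the mapping than to store replacement characters.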
(excuse me for ①)
I moved to the Encoding Standard as the base for the Shift_JIS and CP932 encodings in b5a80d1, so this issue should be fixed. I also added a test for this specific request, see below.
It seems that the Encoding Standard is almost the same as CP932, extended for compatibility. Its authors did a great job of reviewing best practices from browsers.
Great! Thank you for working on this.
Waiting for 0.4 on NPM~ (^__^)