
Comments (7)

ashtuchkin commented on May 29, 2024

Hi! I think this character belongs to Microsoft's extended code page 932 (CP932), not to the original Shift_JIS or JIS X 0208.
I took the Shift_JIS table from unicode.org.

Libiconv seems to agree:

$ printf '\x87\x40' | iconv -f shift_jis -t utf-16be | hexdump
iconv: (stdin):1:0: cannot convert
$ printf '\x87\x40' | iconv -f cp932 -t utf-16be | hexdump
0000000 24 60

(U+2460 is ①)
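
The same check can be reproduced without libiconv. A minimal sketch using Python's built-in `shift_jis` and `cp932` codecs, which here only stand in for the iconv tables above (they are not what iconv-lite uses, and `try_decode` is a hypothetical helper):

```python
# Bytes from the iconv example above: 0x87 0x40.
data = b"\x87\x40"


def try_decode(raw: bytes, codec: str):
    """Return the decoded string, or None if the codec rejects the bytes."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None


strict = try_decode(data, "shift_jis")  # Python's plain Shift_JIS codec
extended = try_decode(data, "cp932")    # Microsoft's extended code page

print("shift_jis:", strict)
print("cp932:", extended, hex(ord(extended)) if extended else None)
```

If the codecs behave like the iconv tables, the first call is rejected and the second yields U+2460.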

from iconv-lite.

ashtuchkin commented on May 29, 2024

http://www8.plala.or.jp/tkubota1/unicode-symbols-map2.html


ashtuchkin commented on May 29, 2024

That said, I can probably use the Encoding Standard's variant, which includes these characters: http://encoding.spec.whatwg.org/index-jis0208.txt
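
For context, the Encoding Standard finds Shift_JIS double-byte sequences in index-jis0208 through a computed pointer. A sketch of that arithmetic as I read the spec's Shift_JIS decoder; the function name is hypothetical, the range checks are simplified, and the table lookup itself is left out:

```python
from typing import Optional


def shift_jis_pointer(lead: int, byte: int) -> Optional[int]:
    """Pointer into index-jis0208 for a Shift_JIS lead/trail byte pair.

    Hypothetical helper modelled on the Encoding Standard's Shift_JIS
    decoder; returns None when the pair is outside the double-byte ranges.
    """
    lead_ok = 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC
    trail_ok = 0x40 <= byte <= 0x7E or 0x80 <= byte <= 0xFC
    if not (lead_ok and trail_ok):
        return None
    offset = 0x40 if byte < 0x7F else 0x41
    lead_offset = 0x81 if lead < 0xA0 else 0xC1
    return (lead - lead_offset) * 188 + byte - offset


pointer = shift_jis_pointer(0x87, 0x40)
print(pointer)  # 1128
```

The pointer for 0x87 0x40 is the first cell of row 13 in a 94-cell-per-row layout, which is where I would expect index-jis0208.txt to list U+2460.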


allfoxwy commented on May 29, 2024

Hi~(^__^)

I didn't know there was an Encoding Standard for the web. I'm something of a believer in the Unicode religion, and I think of Unicode as the encoding standard for the web. End users should be required to produce new information in Unicode if they want to share it globally. So for now I can't see the point of a standard conversion API built into browsers. I think Unicode is the answer, not a new conversion API in every browser.

But we do need some way to convert legacy information into Unicode. By "we" I mean information providers who host servers. Some information is hard to rebuild from scratch in Unicode. We should leave such information in its original form where possible, since modern browsers can display these legacy encodings. If a conversion is ever needed, it should be done only once, so it is better for the server to do the job. What we need is a code mapping that can upgrade most legacy information to Unicode.

As for the upgrade mapping, I vote for Windows CP932 as the source encoding (I didn't know CP932 is a superset of Shift_JIS). I think Windows has the largest share of end users in Japan. Well, I didn't conduct a survey. Adding the Mac encoding might be fine too.

The mapping table at http://encoding.spec.whatwg.org/#indexes is said to come from formerly proprietary extensions by IBM and NEC. Since even IBM and NEC have abandoned these mappings, I think there is no need to write new code for them. CP932 also appears to be built on IBM's and NEC's extensions, so the two probably share many common parts. If it's really needed, the two could be merged, but it's easier to just follow CP932: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT

But if CP932 has some copyright issues, then I guess we have to follow the Encoding Standard mapping.
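
Comparing candidate tables is mostly a matter of parsing them into dictionaries and diffing the keys. A rough sketch, assuming the usual unicode.org mapping layout (hex code, hex Unicode code point, then a `#` comment). The sample lines are a hand-written excerpt in that layout rather than the downloaded file, and `parse_mapping` and `other` are hypothetical names:

```python
from typing import Dict

# Hand-written sample in the assumed CP932.TXT layout (not the real file).
SAMPLE = (
    "# comment line\n"
    "0x41\t0x0041\t# LATIN CAPITAL LETTER A\n"
    "0x80\t      \t# UNDEFINED\n"
    "0x8740\t0x2460\t# CIRCLED DIGIT ONE\n"
)


def parse_mapping(text: str) -> Dict[int, int]:
    """Map each legacy code to its Unicode code point, skipping gaps."""
    table: Dict[int, int] = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # comment, blank line, or undefined code
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


cp932 = parse_mapping(SAMPLE)
other = {0x41: 0x0041}  # stand-in for a second, stricter table
only_in_cp932 = sorted(set(cp932) - set(other))
print([hex(code) for code in only_in_cp932])
```

The Encoding Standard index is keyed by pointer rather than by byte sequence, so it would need converting to byte codes before a diff like this.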

(excuse me for ①)


ashtuchkin commented on May 29, 2024

I moved to the Encoding Standard as the base for the Shift_JIS and CP932 encodings in b5a80d1, so this issue should be fixed. I also added a test for this specific request, see below.


ashtuchkin commented on May 29, 2024

It seems that the Encoding Standard's table is almost the same as CP932, extended for compatibility. The authors did a great job reviewing browsers' best practices there.


allfoxwy commented on May 29, 2024

Great! Thank you for working on this.

Waiting for 0.4 on NPM~ (^__^)

