Unified Hangul Code
Alias(es) | Windows Code Page 949, IBM Code Page 1363 |
---|---|
Standard | WHATWG Encoding Standard (as "EUC-KR")[1] |
Language(s) | Korean |
Unified Hangul Code (UHC),[2][lower-alpha 1] or Extended Wansung,[4][lower-alpha 2] also known under Microsoft Windows as Code Page 949 (Windows-949, MS949 or ambiguously CP949), is the Microsoft Windows code page for the Korean language. It is an extension of Wansung Code (KS C 5601:1987, encoded as EUC-KR) to include all 11172 Hangul syllables present in Johab (KS C 5601:1992 annex 3).[4][2] This corresponds to the pre-composed syllables available in Unicode 2.0 and later.
Wansung Code assigns codes only to the 2350 precomposed Hangul syllables that have their own KS X 1001 (KS C 5601) code points, out of 11172 in total (not counting those using obsolete jamo). The remaining syllables must be written as eight-byte composition sequences, which some partial implementations of the standard do not support.[5] UHC resolves this by assigning a single code to every syllable that can be constructed from modern jamo, placing the additional assignments outside the encoding space used for KS X 1001.
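The syllable counts can be checked with a short Python sketch. It assumes CPython's built-in "euc_kr" and "cp949" codecs as stand-ins for Wansung Code and UHC, and it uses the fact that Unicode composes modern syllables from 19 initial, 21 medial and 28 final (including "no final") jamo positions. The names `two_byte_count`, `HANGUL_BASE` and `HANGUL_END` are illustrative only.

```python
# Precomposed Hangul syllables occupy U+AC00..U+D7A3 in Unicode.
HANGUL_BASE = 0xAC00
HANGUL_END = 0xD7A3


def two_byte_count(codec: str) -> int:
    """Count Hangul syllables that the codec encodes as one two-byte code."""
    count = 0
    for cp in range(HANGUL_BASE, HANGUL_END + 1):
        try:
            encoded = chr(cp).encode(codec)
        except UnicodeEncodeError:
            # Codec cannot represent this syllable at all.
            continue
        # Longer results (e.g. composition sequences) are not counted.
        if len(encoded) == 2:
            count += 1
    return count


# 19 initials x 21 medials x 28 finals (including "no final").
total = 19 * 21 * 28
wansung = two_byte_count("euc_kr")
uhc = two_byte_count("cp949")

print(total, wansung, uhc)
```

Under these assumptions, the first number is the full syllable repertoire, the second the subset with its own KS X 1001 code point, and the third the number of syllables UHC covers with a single code.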
Terminology
Unified Hangul Code is not registered with IANA as a standard for communicating information over the Internet;[6] alternatives such as UTF-8 are available. However, the W3C/WHATWG Encoding Standard used by HTML5 incorporates the Unified Hangul Code extensions into its definition of "EUC-KR".[1]
Microsoft assigns Windows-949 the label "ks_c_5601-1987",[7][8] which properly applies to KS X 1001 itself (KS C 5601 being the original name of KS X 1001). The WHATWG treat the label "ks_c_5601-1987" interchangeably with "EUC-KR" with the intent of being "compatible with deployed content".[9] The Unicode Consortium's "OBSOLETE/EASTASIA" collection of withdrawn mappings included mappings for Unified Hangul Code as "KSC5601.TXT", with the automatically derived mappings for 7-bit KS X 1001 being included as "KSX1001.TXT".[10]
IBM's code page 949 is another, otherwise unrelated, extension of EUC-KR. International Components for Unicode (ICU) uses "cp949", "949" or "ibm-949" to refer to that IBM code page,[11] and "ms949" or "windows-949" (or several variants of "ks_c_5601-1987") to refer to the Windows mapping of UHC.[12] Python, by contrast, recognises "cp949", "949", "ms949" and "uhc" as labels for UHC, and does not include an IBM-949 codec.[13] Out of the labels incorporating the code page number, the WHATWG recognise only "windows-949".[9]
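The Python labels mentioned above can be inspected through the standard `codecs` module. The following is a minimal sketch; `resolve` is a hypothetical helper, not part of any library, and it returns `None` for labels the interpreter does not know.

```python
import codecs
from typing import Optional


def resolve(label: str) -> Optional[str]:
    """Return the canonical codec name for a label, or None if unknown."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None


# Labels that the text above lists as referring to UHC in Python.
uhc_labels = ["cp949", "949", "ms949", "uhc"]
resolved = {label: resolve(label) for label in uhc_labels}

for label, name in resolved.items():
    print(f"{label!r} -> {name!r}")
```

Because the same label can mean different code pages in different libraries, resolving it explicitly like this before decoding data is a cheap sanity check.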
IBM's code page for Unified Hangul Code is called Code page 1363 (IBM-1363), or "Korean MS-Win". It is a combination of Code page 1126 and Code page 1362.[14] It differs from Windows-949 in mapping the single byte 0x5C to the Won sign (U+20A9);[15] Windows maps 0x5C to U+005C (the Unicode code point for the backslash), as in ASCII,[12] although fonts often still render it as a Won sign.[16] The IBM mapping for UHC is available as "ibm-1363" in ICU.[15]
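The 0x5C difference can be illustrated in Python, which provides a "cp949" codec but, as far as this sketch assumes, no IBM-1363 codec. The helper `decode_with_won_at_5c` is hypothetical: it only imitates the single-byte 0x5C mapping described above and does not reproduce any other differences the IBM mapping may have.

```python
WON_SIGN = "\u20a9"  # U+20A9 WON SIGN


def decode_with_won_at_5c(data: bytes) -> str:
    """Hypothetical helper: decode as cp949, then map U+005C to the Won sign.

    Assumption: in UHC the byte 0x5C occurs only as a single-byte character
    (it is not used as a trail byte), so every backslash in the decoded
    text came from a lone 0x5C byte.
    """
    text = data.decode("cp949")
    return text.replace("\\", WON_SIGN)


raw = b"\x5c1000"

# Windows-style mapping: 0x5C stays U+005C (backslash), as in ASCII.
windows_style = raw.decode("cp949")

# IBM-1363-style mapping for this one byte: 0x5C becomes U+20A9.
ibm_style = decode_with_won_at_5c(raw)

print(repr(windows_style), repr(ibm_style))
```

Where the real IBM mapping is needed, a library that ships it (such as ICU's "ibm-1363" converter mentioned above) is the safer choice than this approximation.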
External links
- Microsoft's Reference for Windows-949
- IBM's documentation for IBM-1363
- Mapping of Windows-949 to Unicode
- ICU demonstration for Windows-949 (with ASCII mappings)
- ICU demonstration for IBM-1363 (with 0x5C as Won sign)