UTF-16

1980

The Web Hypertext Application Technology Working Group (WHATWG) considers UTF-8 "the mandatory encoding for all [text]" and states that, for security reasons, browser applications should not use UTF-16.

In the late 1980s, work began on developing a uniform encoding for a "Universal Character Set" (UCS) that would replace earlier language-specific encodings with one coordinated system.

2000

The early 2-byte encoding was originally called "Unicode", but is now called "UCS-2". When it became increasingly clear that 2^16 (65,536) characters would not suffice, UTF-16 was developed to reach code points beyond that limit. It is fully specified in RFC 2781, published in 2000 by the IETF. In the UTF-16 encoding, code points less than 2^16 are encoded with a single 16-bit code unit equal to the numerical value of the code point, as in the older UCS-2.
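The single-code-unit rule can be sketched in Python. This is an illustrative sketch, not a reference implementation: the function name `encode_utf16_units` is hypothetical, and the surrogate-pair branch for code points at or above 0x10000 (65,536) is not described in the text above but follows the scheme given in RFC 2781.

```python
from typing import List


def encode_utf16_units(code_point: int) -> List[int]:
    """Return the 16-bit UTF-16 code units for one code point (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Below 65,536: a single code unit equal to the code point, as in UCS-2.
        return [code_point]
    # At or above 65,536: split the 20-bit offset into a surrogate pair (RFC 2781).
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


print([hex(u) for u in encode_utf16_units(0x20AC)])   # euro sign, one unit
print([hex(u) for u in encode_utf16_units(0x1D11E)])  # two units
```

The result can be cross-checked against Python's built-in codec, for example by comparing with `chr(0x1D11E).encode("utf-16-be")`.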

Older Windows NT systems (prior to Windows 2000) only support UCS-2.

2015

To write a code point above U+FFFF such as U+1D11E in Java 7 regular expressions, ICU, and Perl, the syntax "\x{1D11E}" must be used; similarly, in ECMAScript 2015 (JavaScript), the escape format is "\u{1D11E}".
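For comparison, here is a small Python sketch. Python is not one of the languages named above; its escape for the same character is `\U0001D11E`, with exactly eight hexadecimal digits. The variable names are illustrative. It suggests why a dedicated escape is useful: the character is one code point but occupies two 16-bit code units in UTF-16.

```python
# U+1D11E is MUSICAL SYMBOL G CLEF, a code point above U+FFFF.
clef = "\U0001D11E"

# Python strings count code points, so this is a single character.
assert len(clef) == 1

# Encode as big-endian UTF-16 without a byte order mark.
utf16_bytes = clef.encode("utf-16-be")
num_code_units = len(utf16_bytes) // 2

print(f"U+{ord(clef):04X}", utf16_bytes.hex(), num_code_units)
# prints: U+1D11E d834dd1e 2
```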

2018

Since insider build 17035 and the April 2018 update, Windows has supported UTF-8, and as of May 2019 Microsoft recommends that software use it instead of UTF-16.

2019

As of May 2019, Windows also supports UTF-8 for applications based on 8-bit characters. UTF-16 is the only web encoding incompatible with ASCII, and it never gained popularity on the web, where it is used by under 0.005% (less than one hundredth of 1 percent) of web pages.
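The ASCII incompatibility can be seen in a short Python sketch. The sample string and variable names are illustrative; the little-endian form without a byte order mark is chosen only to keep the output short.

```python
text = "Hi"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no byte order mark

# UTF-8 encodes ASCII text to the same bytes as ASCII itself.
assert utf8_bytes == ascii_bytes

# UTF-16 uses two bytes per character here, so the bytes differ from ASCII.
assert utf16_bytes != ascii_bytes

print(utf8_bytes, utf16_bytes)
# prints: b'Hi' b'H\x00i\x00'
```

The interleaved zero bytes are the reason software that expects ASCII-compatible input cannot read UTF-16 text directly.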




All text is taken from Wikipedia. Text is available under the Creative Commons Attribution-ShareAlike License.

Page generated on 2021-08-05