UTF-16
-
The character is specified as one or two UTF-16 code units in hexadecimal notation.
这个字符指定为十六进制形式的一个或两个UTF-16编码单元。
-
In addition, there is another encoding scheme called UTF-16 that can also be used to represent supplementary characters.
另外,还有一种称为UTF-16的编码方案,它也可以用来表示补充字符。
-
The input file should be encoded in UTF-8 or UTF-16 format.
应该将输入文件编码为UTF-8或UTF-16格式。
-
The DOMString type is explicitly specified to consist of wide UTF-16 characters.
DOMString类型被显式指定包括宽UTF-16字符。
-
Other Unicode formats, such as UTF-16, tend to contain numerous zero bytes.
其他Unicode格式如UTF-16往往包含很多零字节。
-
First, unlike UTF-16, UTF-8 has no endianness issues.
首先,与UTF-16相比,UTF-8没有endianness问题。
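The endianness point can be checked with a short Python sketch. It relies only on the standard `codecs` module; the sample character is an arbitrary choice made for illustration and does not come from the source sentence.

```python
import codecs

ch = "A"  # U+0041, arbitrary sample character (an assumption for this sketch)

# UTF-16 serializes each 16-bit code unit in one of two byte orders.
utf16_le = ch.encode("utf-16-le")  # low byte first
utf16_be = ch.encode("utf-16-be")  # high byte first

# UTF-8 is defined as a byte sequence, so there is only one serialization.
utf8 = ch.encode("utf-8")

# A byte order mark (U+FEFF) at the start tells a decoder which order was used.
decoded_le = (codecs.BOM_UTF16_LE + utf16_le).decode("utf-16")
decoded_be = (codecs.BOM_UTF16_BE + utf16_be).decode("utf-16")

print(utf16_le, utf16_be, utf8, decoded_le, decoded_be)
```

The two UTF-16 byte strings differ only in byte order, while the generic "utf-16" decoder recovers the same character from either one once the BOM is present.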
-
UTF-8 was chosen as the default format for character data columns, with UTF-16 for graphic data columns.
UTF-8被选择为字符数据列的默认格式,其中UTF-16用于图形数据列。
-
This method completely ignores all the encoding information available, and the returned string is always encoded in UTF-16.
这个方法完全忽略所有可用的编码信息,所返回的字符串总是用UTF-16编码的。
-
Several encodings are used for Unicode: the two most popular are UTF-8 and UTF-16.
有几种编码可用于Unicode:最为常用的两个是UTF-8和UTF-16。
-
Compressed UTF-8 will likely be close in size to compressed UTF-16, regardless of the initial size difference.
压缩后,UTF-8和UTF-16的大小差不多,不论原始大小相差多少。
-
UTF-16 is a variable-width character encoding once surrogate pairs are taken into account.
如果考虑到替换对,UTF-16是一种变长字符编码。
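As a rough illustration of why surrogate pairs make UTF-16 variable-width, the following Python sketch counts code units per character and computes a surrogate pair by hand. The helper names `utf16_code_units` and `surrogate_pair` are hypothetical, and the sample characters are arbitrary.

```python
def utf16_code_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode one character."""
    return len(ch.encode("utf-16-le")) // 2  # two bytes per code unit


def surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into a pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


bmp_units = utf16_code_units("A")             # U+0041, Basic Multilingual Plane
supp_units = utf16_code_units("\U0001F600")   # U+1F600, supplementary plane
pair = surrogate_pair(0x1F600)

print(bmp_units, supp_units, [hex(u) for u in pair])
```

Characters inside the Basic Multilingual Plane take one code unit, and supplementary characters take two, which is what makes the encoding variable-width.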
-
For instance, the Java™ language's internal representation of strings is based on UTF-16, which makes indexing into the string much faster.
比如,Java™语言中字符串的内部表示采用UTF-16,因此对字符串的索引更快。
-
But even when you're encoding CJK XML in UTF-8, the actual size gain compared to UTF-16 probably isn't so large.
但即使用UTF-8编码CJK XML,与UTF-16相比,实际的大小差距可能也没有那么大。
-
Xerces-C++ uses this larger character representation to exchange text as UTF-16 as opposed to UTF-8 or ISO-8859.
Xerces-C++使用更大的字符表示UTF-16而不是UTF-8或者ISO-8859交换文本。
-
This paper presents a "Fake UTF-16" coding algorithm, so that all XML parsers can handle GB code in an easy and universal fashion.
本文提出伪UTF16编解码算法,为XML中文数据的解析提供了简单、通用的方法。
-
Thus the encoding attribute of the manually transcoded XML string is still "UTF-16" instead of "Big5".
因此这个以手工方式转码的XML字符串的编码属性仍然是“UTF-16”而不是“Big5”。
-
In UTF-16, you don't always know whether the byte "0x41" is the letter "A".
在UTF-16中,就不能确定字节“0x41”是不是字母“A”。
-
UTF-8 is less likely than UTF-16 or other Unicode encodings to cause problems for systems that are unaware of Unicode and XML.
与UTF-16或其他Unicode编码相比,对于不支持Unicode和XML的系统,UTF-8更不容易造成问题。
-
Google doesn't even allow alternate encodings of Unicode such as UTF-16, much less non-Unicode encodings like ISO-8859-1.
Google甚至不允许其他Unicode编码(如UTF-16),更不用说ISO-8859-1这样的非Unicode编码了。
-
Unicode defines character encodings in three distinct sizes (UTF-8, UTF-16, and UTF-32), while the traditional character type is 8 bits.
Unicode用三种不同的大小定义字符编码UTF-8、UTF-16和UTF-32而传统的字符类型是8位的。
-
The third subclause gives fixed-point pseudo-code for the remaining modules of the coder. GB Support of XML Parser Using "Fake UTF-16" Coding Algorithm
第3小节给出用于编码器的保持系数的定点伪码。一个解决XML解析器对中文数据处理的伪UTF-16编码算法
-
IDENTITY_16BIT collator implements CESU-8 (an 8-bit compatibility encoding scheme for UTF-16).
IDENTITY_16BIT排序器实现CESU-8(一种8位的兼容UTF-16的编码方案)。
-
For instance, if UTF-16 data is naively loaded into a C string, the string may be truncated on the second byte of the first ASCII character.
比方说,如果UTF-16数据原样加载到C字符串中,字符串可能从第一个ASCII字符的第二个字节截断。
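The truncation effect can be imitated in Python without writing any C. In this minimal sketch, the hypothetical helper `c_strlen` mimics how a NUL-terminated C string ends at the first zero byte; the sample text is arbitrary.

```python
def c_strlen(buf: bytes) -> int:
    """Length a NUL-terminated C string would report for this buffer."""
    return buf.index(0) if 0 in buf else len(buf)


text = "Abc"  # arbitrary ASCII sample text

utf16_le = text.encode("utf-16-le")  # bytes: 41 00 62 00 63 00
utf16_be = text.encode("utf-16-be")  # bytes: 00 41 00 62 00 63
utf8 = text.encode("utf-8")          # bytes: 41 62 63

zero_bytes = utf16_le.count(0)       # every ASCII character adds one zero byte

visible_le = c_strlen(utf16_le)      # stops at the second byte of "A"
visible_be = c_strlen(utf16_be)      # stops before anything is read
visible_utf8 = c_strlen(utf8)        # no zero bytes, full length survives

print(zero_bytes, visible_le, visible_be, visible_utf8)
```

With little-endian data only the first byte survives, and with big-endian data the string appears empty, whereas the UTF-8 bytes pass through intact.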
-
The character-based functions may need to convert the input data string to an intermediate Unicode code page, like UTF-16 or UTF-32, before processing can be done.
基于字符的函数可能需要将输入数据字符串转换为一个中间的UNICODE代码页,比如UTF-16或UTF-32,然后才能对它进行处理。
-
Characters in the ASCII range occupy only half the space in UTF-8 that they do in some other encodings of Unicode, particularly UTF-16.
与其他Unicode编码特别是UTF-16相比,在UTF-8中ASCII字符占用的空间只有一半。
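The size relationship is easy to measure. The sketch below, with a hypothetical helper `sizes` and arbitrary sample strings, compares encoded byte counts for ASCII and CJK text; it uses the BOM-free "utf-16-le" codec so that only the character data is counted.

```python
def sizes(text: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for the given text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


ascii_sizes = sizes("hello")  # ASCII: one byte each in UTF-8, two in UTF-16
cjk_sizes = sizes("中文")      # these CJK characters: three bytes in UTF-8, two in UTF-16

print(ascii_sizes, cjk_sizes)
```

For ASCII text UTF-8 is half the size of UTF-16, while for CJK text in the Basic Multilingual Plane the ratio reverses to three bytes against two.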
-
Omit the XML declaration, and use the UTF-8 encoding, or use a UTF-16 Unicode Byte Order Mark (BOM) at the beginning of your document.
在文档的开头部分,省略XML声明,并使用UTF-8编码,或者使用UTF-16Unicode字节顺序标记(ByteOrderMark,BOM)。
-
Note that this only does something if the string has a BOM; otherwise it is assumed that the string isn't UTF-16 and it is returned unmodified.
注意这个函数只在字符串拥有BOM时有效,否则它推测字符串不是UTF-16编码的而返回没有经过修改的原始值。
-
There are others (UTF-16 and UTF-32, for example) defined by the Unicode Consortium, but UTF-8 is the best supported encoding for international character sets.
Unicode协会还规定了其他一些编码方式(如UTF-16,UTF-32),但UTF-8是国际字符集支持得最好的一种。