
UTF16 RFC + UTF8 Default Reading and Writing ...

May 20, 2009 at 3:14 PM

This RFC explains UTF-16:

Somewhere in it you will find that UTF-16 encodes a character as one or two 16-bit units.
So in theory that gives us 32 bits, but that's not the case: reserved ranges of values determine how each 16-bit unit is interpreted.
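To make the "ranges" point concrete, here is a minimal Python sketch of how a single code point maps to one or two 16-bit units. The function name utf16_code_units is made up for this illustration, and this is not how .NET implements it internally; it just follows the published UTF-16 rules (surrogates live in 0xD800-0xDFFF, and anything from U+10000 up needs a pair).

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units (as ints) for one Unicode code point.

    Hypothetical helper for illustration only.
    """
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the Unicode range U+0000..U+10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("the surrogate range is reserved, not a character")

    # Basic Multilingual Plane: one 16-bit unit (2 bytes).
    if code_point < 0x10000:
        return [code_point]

    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


for cp in (0x41, 0x20AC, 0x1D11E):
    units = utf16_code_units(cp)
    # Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
    encoded = chr(cp).encode("utf-16-be")
    print(f"U+{cp:04X}", [hex(u) for u in units], encoded.hex())
```

Because both halves of a pair are confined to the reserved ranges, two units only ever carry 20 bits of payload, which is why you never get the full 32 bits.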

Re: UTF-8 being the default encoding and NOT UTF-16

Easy: create a StreamReader and a StreamWriter with only the file name as the parameter and check the encoding property ("Encoding" on the writer, "CurrentEncoding" on the reader).

You will see that in both cases it's UTF-8 and not UTF-16 as the book mentions.

In that case, can I get a refund?

May 21, 2009 at 7:48 PM

I think I did not convey my original point correctly. I was not talking about reading and writing streams; I was talking about the internal representation of a character.

"UTF-16 is often used natively, as in the Microsoft.Net char type, the Windows WCHAR type, and other common types. Most common Unicode code points take only one UTF-16 code point (2 bytes). Unicode supplementary characters U+10000 and greater still require two UTF-16 surrogate code points."

Also mentioned on page 125 of the book.

May 21, 2009 at 9:09 PM

We are both on the same wavelength; the original post is targeting both issues.

Issue 1: size of a UTF-16 char on disk --> 2 or 4 bytes

Issue 2: default encoding when using a stream --> UTF-8
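For issue 1, a quick way to see the byte counts is to encode a few sample characters and measure the result. This is a rough Python sketch, not .NET code: encoded_sizes is a hypothetical helper, and "utf-16-le" is chosen so that no byte-order mark is counted in the size.

```python
def encoded_sizes(text):
    """Return (utf16_bytes, utf8_bytes) for the given text.

    Hypothetical helper for illustration; utf-16-le is used so the
    byte-order mark is not included in the count.
    """
    return len(text.encode("utf-16-le")), len(text.encode("utf-8"))


samples = [
    "A",           # U+0041, basic Latin
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001d11e",  # U+1D11E, musical G clef (supplementary plane)
]

for ch in samples:
    utf16_size, utf8_size = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-16 = {utf16_size} bytes, UTF-8 = {utf8_size} bytes")
```

Only the last sample, which lies above U+FFFF, takes 4 bytes in UTF-16; the other three take 2 bytes each, while their UTF-8 sizes vary from 1 to 3 bytes.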