UTF8 Data through socket


raja
04-17-2002, 08:56 PM
Hi,
I am using the ISocket_Write() API to send data to my server.
To convert the AECHAR data to a byte stream (a byte type variable) for sending, I use the WSTR_TO_UTF8 helper function. When the data contains only English letters, the conversion from AECHAR to UTF8 works fine.
But when I pass an AECHAR variable holding a Korean string to the helper function (e.g. 7 AECHAR characters), I get extra bytes: I expect 14 bytes, but it gives me 19. Why is this happening? I also tried typecasting the AECHAR variable to a byte pointer instead of using the helper function, but then the data doesn't reach my server.
Or is there any other way to send the data without that conversion?

Eagerly expecting a reply. :-)

thanks for reading
raja

Kevin
05-03-2002, 10:14 AM
UTF8 represents each character of a 16-bit Unicode string as one to three bytes: ASCII text (<=U+007F) remains unchanged as a single byte, U+0080-07FF (including Latin, Greek, Cyrillic, Hebrew, and Arabic) is converted to a 2-byte sequence, and U+0800-FFFF (Chinese, Japanese, Korean, and others) becomes a 3-byte sequence.

The extra bytes you are seeing come from characters that are encoded as 3-byte sequences, so the output is longer than 2 bytes per AECHAR.
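To make the byte counts concrete, here is a minimal sketch in plain C of the three ranges described above. It is not the BREW helper itself: the function names utf8_len_of_unit and utf16_to_utf8 are made up for illustration, uint16_t stands in for AECHAR, and surrogate pairs are not handled. The sample string in the usage note is also only an assumption about what a 7-character Korean string might look like.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of UTF-8 bytes needed for one 16-bit code unit.
   Illustrative helper (not a BREW API); surrogates are not handled. */
static size_t utf8_len_of_unit(uint16_t c)
{
    if (c <= 0x007F) return 1;   /* ASCII stays a single byte */
    if (c <= 0x07FF) return 2;   /* U+0080-07FF: 2-byte sequence */
    return 3;                    /* U+0800-FFFF: 3-byte sequence */
}

/* Encode n 16-bit code units from src into out (capacity cap bytes).
   Returns the number of bytes written, or (size_t)-1 if out is too small. */
static size_t utf16_to_utf8(const uint16_t *src, size_t n,
                            unsigned char *out, size_t cap)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t c = src[i];
        size_t need = utf8_len_of_unit(c);
        if (used + need > cap) return (size_t)-1;
        if (need == 1) {
            out[used++] = (unsigned char)c;
        } else if (need == 2) {
            out[used++] = (unsigned char)(0xC0 | (c >> 6));
            out[used++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            out[used++] = (unsigned char)(0xE0 | (c >> 12));
            out[used++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[used++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return used;
}
```

For example, a 7-character string made of six Hangul syllables and one ASCII space needs 6*3 + 1 = 19 bytes, while seven Hangul syllables with no ASCII would need 21. So a result of 19 is consistent with a string that mixes 3-byte characters with a shorter one; the receiving side should size its buffer from the returned byte count rather than from 2 bytes per character.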

Sorry for the late reply
:)