Unicode + WSTRCOMPRESS + STREXPAND


Anand
08-24-2003, 10:43 PM
Hi all,
I am working on an application that needs to read data from a dynamically updated file, parse it, and display a menu accordingly.
Our file now contains Chinese characters. I have already written a parser that extracts the exact data and populates the menu. The problem is that the file containing the Chinese characters is in Unicode format.
To fit my existing code, whenever I read data from the file I read it as AECHAR and convert it to char using WSTRCOMPRESS. When I expand it again using STREXPAND, the result also contains some junk characters.
I know that WSTRCOMPRESS reduces two bytes to one when the value is less than 127. So how can we get the original data back?
Please share your thoughts.

Regards
V. Anand

ruben
08-24-2003, 11:45 PM
Our file now contains Chinese characters. I have already written a parser that extracts the exact data and populates the menu. The problem is that the file containing the Chinese characters is in Unicode format.

You did not mention which Unicode encoding the file uses (UTF-8, UTF-16, or UCS-4, which is very unlikely).
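If the encoding is not known in advance, one common way to check is to look for a byte order mark (BOM) at the start of the file. The following is only a minimal sketch: the names FileEncoding and detect_bom are hypothetical, it is not part of any BREW API, and it assumes the file actually starts with a BOM, which many files do not.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    ENC_UNKNOWN,   /* no BOM found; the encoding must be known some other way */
    ENC_UTF8,
    ENC_UTF16_LE,
    ENC_UTF16_BE
} FileEncoding;

/* Inspect the first bytes of a buffer for a byte order mark.
 * Note: a UTF-32 little-endian BOM (FF FE 00 00) also starts with FF FE,
 * so this sketch would report it as UTF-16 LE. */
static FileEncoding detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return ENC_UTF16_LE;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return ENC_UTF16_BE;
    return ENC_UNKNOWN;
}
```

A result of ENC_UNKNOWN does not mean the file is not Unicode; it only means no BOM was present.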

It is not clear to me what benefit you get from the compression, because the Basic Latin code points (the first 128) do not consume much memory compared with the Chinese code points.

If your goal is to keep the Basic Latin data in a single-byte character format, you can use UTF-8, which can be handled through an ordinary char pointer.
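To illustrate the UTF-8 suggestion, here is a rough sketch of converting 16-bit code units to UTF-8. It assumes the wide characters are UTF-16 code units stored in uint16_t (standing in for AECHAR); the function name utf16_to_utf8 is hypothetical and is not a BREW call. Basic Latin characters come out as one byte each, while a typical Chinese character takes three bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert n UTF-16 code units to a NUL-terminated UTF-8 string.
 * Returns the number of bytes written (not counting the NUL), or
 * (size_t)-1 if the output buffer is too small or the input contains
 * an unpaired surrogate. Hypothetical helper, for illustration only. */
static size_t utf16_to_utf8(const uint16_t *in, size_t n,
                            unsigned char *out, size_t cap)
{
    size_t i = 0, o = 0;

    if (cap == 0)
        return (size_t)-1;

    while (i < n) {
        uint32_t cp = in[i++];
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i >= n || in[i] < 0xDC00 || in[i] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i++] - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;            /* stray low surrogate */
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= cap)              /* keep room for the NUL */
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

Unlike dropping a byte per character, this conversion is reversible, so the original code points can be recovered by decoding the UTF-8 again.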

To fit my existing code, whenever I read data from the file I read it as AECHAR and convert it to char using WSTRCOMPRESS. When I expand it again using STREXPAND, the result also contains some junk characters.

My guess is that WSTRCOMPRESS is corrupting something (though I may be wrong), unless your code is doing something wrong. Junk characters are nothing but corrupted code points.
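To show how code points can get corrupted in a round trip like this, here is a small sketch. It assumes, purely for illustration, that the compression step keeps only the low byte of each 16-bit character and that expansion zero-extends each byte. That is an assumption and not the documented behavior of WSTRCOMPRESS or STREXPAND; the helper names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMED behavior, not the real WSTRCOMPRESS: keep only the low byte. */
static void narrow_low_byte(const uint16_t *in, size_t n, unsigned char *out)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (unsigned char)(in[i] & 0xFF);
}

/* ASSUMED behavior, not the real STREXPAND: zero-extend each byte. */
static void widen_bytes(const unsigned char *in, size_t n, uint16_t *out)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = in[i];
}

/* Count how many characters differ after a narrow/widen round trip. */
static size_t count_lost(const uint16_t *in, size_t n)
{
    size_t i, lost = 0;
    for (i = 0; i < n; i++) {
        unsigned char b;
        uint16_t back;
        narrow_low_byte(&in[i], 1, &b);
        widen_bytes(&b, 1, &back);
        if (back != in[i])
            lost++;
    }
    return lost;
}
```

Under that assumption, 0x0041 ('A') survives the round trip, but 0x4E2D comes back as 0x002D, which is a different character. Any scheme that stores one byte per character cannot represent Chinese code points without extra information.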

ruben

Anand
08-26-2003, 03:55 AM
Hi Ruben,
Thanks for the information. I was able to solve the problem, and it is working fine now. There was indeed a problem when compressing and expanding the Unicode characters using WSTRCOMPRESS and STREXPAND: to get back the original data, some additional processing is needed on the expanded data. I have done that and it works.
Thanks again for your reply.