Handling Unicode

One very nice thing about the Newton OS is that it is fully Unicode-based. Unfortunately, this raises some issues when communicating with other devices that are not Unicode-aware. On top of that, Unicode comes in different flavors (UTF-8, and UTF-16 in both byte orders), all of which need to be dealt with.

Handling Unicode in IrOBEX transactions requires action both when sending and when receiving data that has a textual representation. This includes plain text, HTML, iCalendar and vCard data. My initial strategy for the sending part is that all text-based routing data (e.g. all the textual representations generated by the standard route scripts) will be sent as Unicode, more specifically UTF-16 Big Endian, which is the native format on the Newton. Other routing formats such as vCard and iCalendar will most likely be sent as UTF-8. Hopefully the receiving end will handle that accordingly…
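
To make the sending strategy concrete, here is a minimal sketch in Python (not NewtonScript, and not the actual implementation). The function name, the use of MIME type strings to identify a routing format, and the choice to send no byte order mark are all assumptions made for illustration.

```python
# Hypothetical helper: pick the wire encoding for outgoing text.
# Assumption: routing formats are identified by MIME type strings.
UTF8_TYPES = {"text/x-vcard", "text/calendar"}  # vCard and iCalendar


def encode_for_sending(text: str, content_type: str) -> bytes:
    """Encode outgoing text according to the strategy described above."""
    if content_type.lower() in UTF8_TYPES:
        # vCard and iCalendar data will most likely go out as UTF-8.
        return text.encode("utf-8")
    # Everything else goes out as UTF-16 Big Endian.
    # Assumption: no byte order mark is prepended ("utf-16-be" adds none).
    return text.encode("utf-16-be")
```

With this sketch, plain text comes out as two bytes per character with the high byte first, while vCard data comes out as ordinary UTF-8.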

Receiving data is probably a bit easier: I have already implemented a function that recognizes and converts ISO 8859, UTF-16LE and UTF-16BE text data. Memory consumption is a concern, however, as it has not been optimized yet.
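
The receiving side could work roughly like the following Python sketch. This is not the function mentioned above, just one plausible way to do it: it looks for a byte order mark first, then falls back to a zero-byte heuristic that only works for mostly-ASCII text, and finally assumes ISO 8859-1. The function name, the heuristic and the choice of ISO 8859-1 as the fallback are assumptions.

```python
import codecs


def decode_received_text(data: bytes) -> str:
    """Guess the encoding of incoming text and convert it (hypothetical)."""
    # 1. An explicit byte order mark settles the question.
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")

    # 2. Heuristic (assumption): in mostly-ASCII UTF-16 text, the zero
    #    bytes sit at even offsets for big endian, odd for little endian.
    if len(data) >= 2 and len(data) % 2 == 0:
        even_zeros = data[0::2].count(0)
        odd_zeros = data[1::2].count(0)
        threshold = (len(data) // 2) // 2
        if even_zeros > threshold and odd_zeros == 0:
            return data.decode("utf-16-be")
        if odd_zeros > threshold and even_zeros == 0:
            return data.decode("utf-16-le")

    # 3. Fallback (assumption): treat everything else as ISO 8859-1,
    #    which maps every byte to a character and so never fails.
    return data.decode("iso-8859-1")
```

Note that this sketch builds a complete second copy of the text, which is exactly the kind of memory cost that matters on a device like the Newton; a streaming conversion would be one way to reduce it.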

2003-04-01