> MTAs shouldn't be interpreting any utf-8.

MTA is actually an overloaded term, seeing as it covers both mail transport agents and mail routing agents. In addition, software that implements mail transport and routing includes spam filters, virus checkers, vacation and other automatic response software, gateways, and so on.

> UCS-4 breaks little assumptions like "A byte with a null in is the end
> of a string".

Any software that deals with "bytes" rather than "characters" is going to have to be rewritten. Right now it's being rewritten to use UTF-8, using libraries that incorporate all kinds of mind-bogglingly heavy l10n and i18n code that has immediate, obvious, and significant performance costs. Even setting the locale to "C" doesn't get you back to the performance you had with software that dealt with characters as atomic objects at the machine level.

Metaphorically changing (sizeof(char)) to 4 is about the only way to get that performance back. The cost of reading and writing 4 times as many bytes per character is swamped by the 10x or greater cost of i18n and l10n code.
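To make the two points above concrete, here is a minimal sketch in Python (not anything from the software under discussion). It shows that UCS-4 bytes trip up a "stop at the first zero byte" routine, and that fixed-width characters can be indexed directly while UTF-8 has to be scanned. The helper names `c_strlen`, `ucs4_char_at` and `utf8_char_at` are made up for illustration, and the sketch says nothing about the relative speed of real i18n libraries.

```python
# Sketch: the same text as UTF-8 bytes and as UCS-4 (UTF-32) bytes.
text = "naïve"

utf8 = text.encode("utf-8")
ucs4 = text.encode("utf-32-be")  # big-endian, no byte-order mark


def c_strlen(buf: bytes) -> int:
    """Length up to the first zero byte, the way a C-style strlen sees it."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


def ucs4_char_at(buf: bytes, i: int) -> str:
    """Fixed width: character i always occupies bytes 4*i .. 4*i+3."""
    return buf[4 * i : 4 * i + 4].decode("utf-32-be")


def utf8_char_at(buf: bytes, i: int) -> str:
    """Variable width: find character starts by skipping continuation
    bytes (bit pattern 10xxxxxx), then slice out character i."""
    starts = [pos for pos, b in enumerate(buf) if (b & 0xC0) != 0x80]
    end = starts[i + 1] if i + 1 < len(starts) else len(buf)
    return buf[starts[i] : end].decode("utf-8")


print(len(utf8), len(ucs4))            # byte counts for the same five characters
print(c_strlen(utf8), c_strlen(ucs4))  # the zero-byte assumption fails on UCS-4
print(ucs4_char_at(ucs4, 2), utf8_char_at(utf8, 2))
```

The UCS-4 lookup is a single multiplication and slice, while the UTF-8 lookup walks the buffer. That is the "characters as atomic objects" property being argued for, bought at the price of four bytes per character and of breaking byte-oriented code that treats zero as a terminator.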
Generated at 20:00 on 17 Oct 2005 by mariachi 0.52