Tags: email charsets Outlook
|It's fairly well accepted that the font WingDings should not be used in webpages. Many modern cross platform browsers don't even support it. One of the reasons is that supporting WingDings means supporting creating symbol fonts which requires adding the SYMBOL_CHARSET define to the call to CreateFont instead of the normal ANSI_CHARSET. Due to their cross platform nature they want as much code as possible to be the same regardless of platform. So instances like this where a Windows specific font needs some special handling are often unimplemented to keep things simple. It may even be a strategic design decision made to deliberately weaken the support for it hoping that it's usage would fade away.
In my email client I have two places that I have to special case code just for WingDings. Firstly the above mentioned CreateFont call needs a special flag added so that when rendering HTML email that uses WingDings the correct glyphs are placed on the screen/page. The second place is during conversion from HTML to (unicode) text. Of course the actual bytes to render in the HTML in the document are not unicode or some easily known charset, so you have to convert from "WingDingsCharset" to Unicode using some hacked together table.
Now the question is "Who is creating all this content that uses the WingDings font?"
That's an excellent question dear reader, and the answer is: Outlook. In 2014 Outlook still happily converts:
Into a WingDings HTML font tag containing a single uppercase 'J'. That will be rendered as a smiley face by Outlook and a capital J by most other software. Especially software on Macs and Linux that don't have access to that font by default. Never mind that Unicode has a perfectly functional smiley face character (☺) which could be rendered correctly by pretty much any software these days.
Outlook's character set issues don't stop there though. Take the case that I ran into recently where Outlook was used to send an email containing a URL that included a Unicode character. Now that in itself is somewhat dubious and indeed the URL referred to a Microsoft content server that should've known better than to allow a unicode character in the URL (especially when a perfectly suitable ascii character was available). Now that Unicode URL was specified used proper HTML entity encoding, however the meta charset of the HTML aaaaand the Content-Type of the MIME segment both stated the document was in us-ascii. And well, that Unicode character got converted to garbage at some point. Instead Outlook should make the charset of the HTML "utf-8" or something so that the character can exist in the specified charset. I've taken the approach of assuming the attributes of tags like 'src' and 'href' are in unicode despite the prevailing HTML charset.
Another aspect of the poor charset implementation in Outlook is the tendency to use undefined characters in the specified charset. Take for instance an Outlook generated email says it's using the ISO-8859-1 charset and then goes on to use characters in the 0x80 to 0x9F address space which is according to ISO-8859-1, undefined. Internally Outlook is using the Windows-1252 charset, which DOES define characters in that range. However instead of marking the email as "Windows-1252" it puts the incorrect ISO-8859-1 charset in the headers (they are essentially the same outside of that range). This commonly manifests as smart quotes (0x91 and 0x92) getting rendered incorrectly. Scribe of course has the ability to let the user override the charset which fixes the problem.
Now I'm in no way saying Scribe is perfect and lets all bash on Outlook. But I feel these bugs have been there for over 10 years and still haven't been addressed and something needs to be said. I doubt they will ever fix their software. Which is sad, because it makes the rest of the email software community look bad when we have to deal with the broken emails coming out of Outlook. I try and take a very literal interpretation of the incoming data. So that the user sees the content warts and all. I'm not a fan of sweeping bugs in other clients under my carpet. That has had a very poor history in the web-browser world. Ending in content bugs lasting far longer than they should have.
And if this ever gets back to someone working on the Outlook team: fix thy bugs. It's not even hard... these are really basic problems and it would most likely take a day or less to address.