We all know our normal Latin characters. We know them worldwide because English is the primary trade language.
We also know that the Internet was built on the Latin character set. Content has been shifting away from this toward UTF-8 encoding, but a larger shift is now under way. We’re approaching the 20th birthday of the Web, and for those first twenty years domain names have been limited to a small set of Latin characters (ASCII letters, digits, and hyphens). This is about to change.
English is still the predominant language on the web, but this will not last for long. The Chinese are quickly becoming the dominant player online despite the Great Firewall of China.
For those of you who follow me for my social media strategy, this is a little break from the norm. I am a nerd, and I used a lot of resources to develop this script, so in an effort to give back to the community I’m gonna nerd out for a post.
The Problem with ISO-8859-1 aka Latin1
I am not an expert in character encoding, so I’ll break it down in human terms. Latin1 is a single-byte encoding, so it has room for at most 256 characters, while UTF8 can represent every character in Unicode. This becomes a problem when text comes from applications such as Microsoft Word (its curly quotes are not in Latin1), or when you need to support different languages. UTF8, on the other hand, can store everything Latin1 can and a whole lot more. As an example, and to learn more about how to make your websites UTF-8 compatible, see Geir Berset’s Mother of all UTF-8 Check lists.
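To make the difference concrete, here is a minimal sketch in Python. It is only an illustration, not the script this post is about, and the helper name fits_in_latin1 is made up for the example. It tries to encode a few sample strings as Latin1 and shows how many bytes the same text takes in UTF8.

```python
def fits_in_latin1(text: str) -> bool:
    """Return True if every character in text can be stored as ISO-8859-1.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no slot in the 256-entry Latin1 table.
        return False


# Sample strings: plain ASCII, an accented letter, Word-style curly quotes,
# and two Chinese characters.
samples = {
    "plain": "cafe",
    "accented": "caf\u00e9",
    "curly quotes": "\u201cquoted\u201d",
    "chinese": "\u4e2d\u6587",
}

for label, text in samples.items():
    utf8_bytes = text.encode("utf-8")  # UTF8 can encode all of these
    print(f"{label}: fits in Latin1? {fits_in_latin1(text)}, "
          f"UTF8 length: {len(utf8_bytes)} bytes")
```

The accented letter fits in Latin1, but the curly quotes and the Chinese characters do not, while UTF8 handles all four by using more than one byte per character where it needs to.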