ICANN set to approve web addresses using non-Latin characters

Up until now, the organization that oversees domain names has only accepted URLs with Latin characters. But this week the Internet Corporation for Assigned Names and Numbers (ICANN) is expected to approve a new rule allowing addresses to be written in different scripts, including Arabic, Greek, Hindi, Japanese, Korean, and Cyrillic (Russian).
While the change might not affect English speakers reading this web site all that much, this is huge news for the 1.6 billion internet users who speak languages that don't use Latin characters. So while we have no plans to change the web address for Download Squad, I did consult with Google Translate today to learn that the site would be called something like загрузка Сборная in Russian. Because, you never know.
The new rule could be adopted as soon as Friday, although we probably wouldn't see the new Internationalized Domain Names (IDNs) until mid-2010.
ICANN has been looking at the change for a few years. But there have been technical kinks to work out. Essentially, under the new system, users will be able to enter URLs in a variety of different scripts and the domain name system will apply some new translation techniques in order to ensure that users are taken to the correct web page.
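The article doesn't name the translation technique involved. One existing mechanism for mapping non-Latin hostnames onto plain ASCII is Punycode, which a commenter brings up further down. As a rough sketch, assuming Python's built-in "idna" codec (which follows the older IDNA 2003 rules) and purely illustrative hostnames, the round trip might look like this:

```python
# Sketch only: round-trip a non-ASCII hostname through Python's built-in
# "idna" codec. The codec follows the older IDNA 2003 rules, and the
# hostnames used here are illustrative, not real registrations.

def to_ascii(hostname: str) -> str:
    """Encode each label of a Unicode hostname to its ASCII ("xn--") form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Decode an ASCII ("xn--") hostname back to Unicode for display."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_name = to_ascii("bücher.example")
    print(ascii_name)              # xn--bcher-kva.example
    print(to_unicode(ascii_name))  # bücher.example

    # A Cyrillic label goes through the same path.
    print(to_ascii("пример.example"))
```

The point of the design is that the DNS itself keeps seeing only ASCII labels; the conversion happens on the user's side before the lookup.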

Comments
13
John Laur Oct 27th 2009 8:59AM
I support this concept, but this is unfortunately a very slippery slope, particularly for phishing attacks. I am not sure if this comment box supports Unicode, but if so, consider the following domains that are likely available:
paypaǀ.com ("l" replaced with U+01C0)
googǀe.com ("l" replaced with U+01C0)
The problem is that when you suddenly support the entire world's alphabets, there is a lot of visual overlap and subtlety. Browsers are going to have to get very smart about this to make the distinction visible to end users. Here is a sample of characters that resemble their ASCII counterparts closely enough to fool people. Any domain name containing one of these letters is potentially vulnerable (and this is just one quick casual pass through a small subset of Unicode):
ɑаʙϲԁеʜјιĸǀחоƨтρυⱱԝ
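To see why these lookalikes matter, it can help to print what each character actually is. The following is a minimal sketch using Python's standard `unicodedata` module; the `describe` helper is a made-up name for illustration, and the sample string is the spoofed "paypal" from the comment above, with U+01C0 written as an escape so it can't be confused with a real "l":

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, str]]:
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


if __name__ == "__main__":
    spoofed = "paypa\u01c0"  # last character is U+01C0, not the letter "l"
    for ch, code_point, name in describe(spoofed):
        print(ch, code_point, name)

    # The two strings can look alike on screen but are not equal.
    print(spoofed == "paypal")
```

The last character comes out as LATIN LETTER DENTAL CLICK rather than LATIN SMALL LETTER L, which is the kind of distinction a browser would need to surface to the user.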
mxxcon Oct 26th 2009 12:29PM
Maybe ICANN will not allow a mixture of alphabets. It's either all Latin or all non-Latin.
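A "no mixed alphabets" rule like the one suggested here could be approximated in a few lines. This is only a heuristic sketch, not ICANN's or any browser's actual policy: Python's standard library has no script property, so the hypothetical `scripts_in_label` helper below treats the first word of each character's Unicode name ("LATIN", "CYRILLIC", "GREEK", and so on) as a stand-in for its script.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Approximate the set of scripts used by the letters in a label.

    Heuristic: the first word of a character's Unicode name is used as
    a proxy for its script. This is an assumption, not a real script lookup.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(label: str) -> bool:
    """True if the label's letters appear to come from more than one script."""
    return len(scripts_in_label(label)) > 1


if __name__ == "__main__":
    print(is_mixed_script("paypal"))         # all Latin letters
    print(is_mixed_script("p\u0430ypal"))    # second letter is Cyrillic U+0430
    print(is_mixed_script("paypa\u01c0"))    # U+01C0 is itself a Latin-script character
```

Note the limits: a label spoofed with U+01C0 would still pass, because that character belongs to the Latin script, and a strict single-script rule would also reject languages such as Japanese that legitimately mix scripts.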
Mike Oct 26th 2009 1:01PM
I think mxxcon is right!
http://cypruscar.org
Jerusalemstyle Oct 26th 2009 3:29PM
It has existed for many years, but the problems came from the browsers.
See here http://bit.ly/7KlXD for examples.
sRc Oct 26th 2009 12:34PM
I had thought this was implemented some time ago. The way I thought the proposed method worked is that the international characters get transliterated back into Latin characters according to the romanization defined for them in the Unicode code page, so the domains are still Latin characters; they can just be typed differently.
Dimitar Panov Oct 26th 2009 2:17PM
@Brad, do you associate the Cyrillic alphabet only with the Russian language?
You know, there are several more countries that use the Cyrillic alphabet, why not put all of them there? ;)
Or maybe you can put "Bulgarian" instead of "Russian", because that's where the alphabet was developed in the first place.
D Nov 6th 2009 2:13PM
The same reason the Latin alphabet is referred to as English.
Muffin_man Oct 26th 2009 2:29PM
Not sure if I like this. Some foreign websites I go to will switch their URLs to native languages, making them harder to access. I guess it only sucks for people like me, but otherwise I suppose it's a good idea.
Sanskrit Oct 26th 2009 3:06PM
I'm sure many of the ones that get enough international interest will keep the Latin versions.
James Oct 27th 2009 8:37AM
This is a terrible idea. The Internet speaks English!
Lee Mathews Oct 27th 2009 8:48AM
Извините, вы можете повторить, что в русском языке? Я не говорю по-английски. (Translation: Sorry, can you repeat that in Russian? I don't speak English.)
Steven Oct 27th 2009 2:50PM
I don't understand... isn't this what Punycode is for? And isn't it already being used? http://tinyarro.ws, for example...