
Archived updates for Monday, July 18, 2005

Internationalized Domain Names Still Lack Formal Rules

According to Wired News on July 16, 2005, the formal expansion of Internet domain names to include non-English characters is likely to be delayed in light of this conclusion from the United Nations' June 2005 Report of the Working Group on Internet Governance:

27. Multilingualism

Insufficient progress has been made towards multilingualization. Unresolved
issues include standards for multilingual TLDs, e-mail addresses and keyword
lookup, as well as insufficient multilingual local content. There is a lack of
international coordination.

"In some of the early tests ... it became clear we had opened up the opportunity for registering very misleading names," Vint Cerf, head of the Internet Corporation for Assigned Names and Numbers, reportedly said in a conference call wrapping up ICANN's meetings last week in Luxembourg. Nonetheless, a number of registrars continue to offer multilingual domain name registration services.

Because IDN allows websites to use full Unicode names, it also makes it much easier to create a spoofed web site that looks exactly like another, including domain name and security certificate, but in fact is controlled by someone attempting to steal private information. These spoofing attacks potentially open users up to phishing attacks.

These attacks are not due to technical deficiencies in either the Unicode or IDNA specifications, but to the fact that different characters in different languages can look the same, depending on the font used. For example, Unicode character U+0430, Cyrillic small letter a ("а"), can look identical to Unicode character U+0061, Latin small letter a ("a"), which is the lowercase "a" used in English. Technically, characters that look alike in this way are known as homographs.
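The distinction between the two letters can be checked directly. The following is a minimal sketch in Python, assuming only the standard `unicodedata` module; the helper name `describe` is illustrative and not part of any specification.

```python
import unicodedata

latin_a = "\u0061"     # Latin small letter a
cyrillic_a = "\u0430"  # Cyrillic small letter a

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe(latin_a))      # U+0061 LATIN SMALL LETTER A
print(describe(cyrillic_a))   # U+0430 CYRILLIC SMALL LETTER A
print(latin_a == cyrillic_a)  # False
```

The two strings may render identically on screen, yet they compare as unequal because they are different code points.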

Although a computer may display visually identical or very similar glyphs for two different characters, the characters remain distinct to the computer (though not to the user) when it locates web sites or validates certificates. Thus, the user's assumption of a one-to-one correspondence between the visual appearance of a name and the named entity breaks down.
For example, someone could register a domain name that appears identical to an existing domain but goes somewhere else.
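
To illustrate how two look-alike names resolve differently, here is a rough sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules. The domain `example.com` and the function name `to_wire_form` are placeholders chosen for illustration, not names taken from the reports above.

```python
def to_wire_form(domain: str) -> bytes:
    """Encode a domain name to the ASCII form used for lookups (IDNA 2003)."""
    return domain.encode("idna")

genuine = "example.com"       # all Latin letters
spoofed = "ex\u0430mple.com"  # U+0430 Cyrillic a in place of Latin a

print(to_wire_form(genuine))
print(to_wire_form(spoofed))
print(to_wire_form(genuine) == to_wire_form(spoofed))  # False
```

The all-Latin name passes through unchanged, while the label containing the Cyrillic letter is converted to an ASCII form beginning with the `xn--` prefix. The two names are therefore different entries as far as name resolution is concerned, even if a browser shows them identically.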

The spoofed domain "pа", for instance, contains a Cyrillic а, not a Latin a. In many ways, this is not a new problem. Even within the old character set of A-Z, 0-9 and hyphen, G00GLE.COM is easily confused with GOOGLE.COM. What is new is that expanding the character repertoire from a few dozen characters in a single alphabet to many thousands of characters in many scripts greatly increases the scope for such attacks. In general, this kind of attack is known as a "homograph spoofing attack."
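
One possible defense is to flag labels that mix letters from more than one script. The sketch below is a simplified heuristic, not a method described in the reports above. It assumes that the first word of a character's Unicode name (such as "LATIN" or "CYRILLIC") is a good enough stand-in for its script, because Python's standard library does not expose the Unicode script property directly. The function names are hypothetical.

```python
import unicodedata

def scripts_in_label(label: str) -> set[str]:
    """Approximate the scripts used by the letters of a label.

    Assumption: the first word of a character's Unicode name
    (e.g. "LATIN", "CYRILLIC") stands in for its script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():  # skip digits, hyphens and dots
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts

def is_mixed_script(label: str) -> bool:
    """Flag labels whose letters come from more than one script."""
    return len(scripts_in_label(label)) > 1

print(is_mixed_script("example"))       # False
print(is_mixed_script("ex\u0430mple"))  # True
print(is_mixed_script("g00gle"))        # False
```

As the last line shows, a check like this does not catch digit-for-letter substitutions such as G00GLE, nor would it catch a name written entirely in a single look-alike script, so it can only be one part of a defense.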

On February 7, 2005, Slashdot reported that this exploit had been disclosed at the hacker conference ShmooCon, with a live example available online. On browsers supporting IDNA, the URL "http://www.pа" (with the Cyrillic а) appears to lead to the PayPal web site but instead leads to a spoofed PayPal web site that says "Meeow." Mozilla Firefox, which supports IDNA, shows the page as being at the legitimate address, with a verified security certificate. Firefox displays no warnings of any sort.



Blogger Johnny Canuck said...

I can relate to that post. Finding the right domain name these days is more challenging and definitely more critical to getting good search engine positioning.

I really enjoy your blog, keep us posted.

October 01, 2005 5:31 PM  
