What are Internationalised Domain Names, and why are they important?
Domain names, the Internet’s addressing system, work because they are interoperable and resolve uniquely. This means that any user connected to the Internet, anywhere in the world, can get to the same destination by typing in a domain name (as part of a web or email address). The plan to internationalise the character sets supported within the Domain Name System is almost as old as the Internet itself. However, technical constraints and the overriding priority of interoperability resulted in a restricted character set within the Domain Name System: the ASCII letters a to z, the digits 0 to 9 and the hyphen.
Technical standards to internationalise domain names were developed from the mid-1990s. The solution retains the Domain Name System’s restricted character set, and encodes every other character into it. Each label containing non-ASCII characters is converted into a string of ASCII characters prefixed with xn--. The xn-- ASCII forms of the domain names are meaningless to humans, but meaningful to the machines that resolve domain names, the name servers. Thus, humans see the meaningful native-script characters when they navigate the Internet, whilst the underlying technical resolution of domain names remains unchanged.
In more technical terms, Punycode is the algorithm used to transform a Unicode label (U-label) into an ASCII string. This ASCII string is prefixed with “xn--” (the ACE prefix) to create an “A-label”, or ACE (ASCII Compatible Encoding) label, that the Domain Name System understands. For more details, see section 2.3 of RFC 5890.
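The conversion between a Unicode label and its A-label can be sketched in a few lines of Python. This is a minimal illustration only, assuming Python's built-in punycode codec; the function names (to_a_label, to_u_label, to_ascii_domain) are hypothetical, and the validation and normalisation steps that real IDNA processing requires (for example case mapping, Unicode normalisation and checks on permitted characters) are deliberately left out.

```python
ACE_PREFIX = "xn--"  # the ACE prefix that marks an A-label


def to_a_label(label: str) -> str:
    """Encode one Unicode label as an A-label (no IDNA validation)."""
    if label.isascii():
        return label  # plain ASCII labels are passed through unchanged
    # Punycode produces the ASCII string; the ACE prefix is added in front.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_u_label(label: str) -> str:
    """Decode one A-label back to its Unicode form."""
    if not label.lower().startswith(ACE_PREFIX):
        return label  # not an A-label, nothing to decode
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


def to_ascii_domain(domain: str) -> str:
    """Apply to_a_label to every dot-separated label of a domain name."""
    return ".".join(to_a_label(part) for part in domain.split("."))


if __name__ == "__main__":
    print(to_a_label("bücher"))            # xn--bcher-kva
    print(to_a_label("рф"))                # xn--p1ai
    print(to_u_label("xn--p1ai"))          # рф
    print(to_ascii_domain("παράδειγμα.eu"))
```

Each label is handled separately, which is why an ASCII label such as eu passes through untouched while the non-ASCII label beside it is encoded. A production system would normally rely on a full IDNA implementation rather than calling Punycode directly.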
Implementation of IDNs began at the second level in 2000 (under .com and .net) and 2001 (under .jp). In the ten years that followed, several ccTLDs deployed IDNs, primarily supporting local language character sets. Some experimented with other strategies for internationalising domain names, but the IDN technology proved the most successful. Following pressure from the ccTLD community, ICANN introduced a fast track process to create IDN ccTLDs in 2007-2008. From 2010, IDNs became available at the top level for those that had completed the specific process set by ICANN (for example, السعودية for Saudi Arabia, рф for the Russian Federation).
IDNs are technically complex to implement. Many challenges remain, including (at a technical level) how to handle variant characters, which are prevalent in Arabic and Chinese scripts. Another challenge is the user experience, e.g. consistent representation in browsers and emails.
Despite the technical challenges, IDNs are viewed by many as a catalyst and a necessary first step to achieving a multilingual Internet. According to UNESCO, in 2008 only 12 languages accounted for 98% of Internet web pages; English, with 72% of web pages, was the dominant language online. Recent reports indicate that other languages are growing rapidly online. For example, by 2010, only 20% of Wikipedia articles were in English. Supporters of IDN believe that enabling users to navigate the Internet in their native language is bound to enhance the linguistic diversity of the online population, and that IDNs are strongly linked to local content.
While this study focuses on the web, it should be noted that other applications, e.g. email and file transfer protocol, also require internationalisation.
For more than a decade, hybrid Internationalised Domain Names have been available at the second level with ASCII Top Level Domains (for example, παράδειγμα.eu in the figure above). This situation was only satisfactory for the Latin-based scripts used by most European languages, where the IDN element would commonly reflect accents or other diacritical marks on Latin characters. For speakers of languages not based on Latin scripts (for example, Chinese, Arabic), the hybrid IDN/ASCII domains were unsatisfactory. Right-to-left scripts, such as Arabic and Hebrew, created bi-directional domain names when combined with left-to-right TLD extensions, requiring users to be familiar with both their own script and Latin script in order to navigate the Internet. As explained in the report IDNs State of Play 2011, bi-directional domain names not only require Internet users to change script when typing in a single web address, but also potentially confuse the strict hierarchy of the Domain Name System.
Internet governance discussions from 2006 onwards highlighted the lack of IDNs in the root domain zone (which would enable full IDN domain names including at the top level) as a key building block towards the goal of a multilingual Internet. From 2005, there was increasing pressure on ICANN, the global coordinator of Internet domain names, to implement IDNs in the root zone.
In the meantime, some countries created their own work-arounds. For example, China and the Republic of Korea developed keyword searches at the domain name servers for .cn and .kr. For those searching for domains within the country, the keyword system resolves the domain without the user having to type the Latin-script domain ending (TLD). In China and Egypt, browser add-ons were developed to translate a domain into another name that would be looked up on national servers, to enable Internet users to enter local character strings into browsers. However, this solution relied on users downloading a plug-in, which was not compatible with every browser. These efforts indicate the importance that policy makers and technologists have placed on internationalising domain names, and that IDNs emerged as the superior technology amongst a number of alternatives.
In 2009, the ICANN Board approved a fast track process for IDN ccTLDs, describing the programme as a “top priority”. By April 2011, 17 IDN ccTLDs had been launched. Since then, there has been a steady expansion of the number of IDN.IDN registries launched, including .한국 (Republic of Korea), .قطر (Qatar), .فلسطين (Palestine), .الجزائر (Algeria), .香港 (Hong Kong), .سورية (Syrian Arab Republic), .қаз (Kazakhstan), .срб (Serbia), .新加坡 and .சிங்கப்பூர் (Singapore).
In mid-2013, ICANN signed its first contracts for new gTLDs: .شبكة (web), .游戏 (games), .сайт (site), and .онлайн (online). The new gTLDs started to launch from the end of 2013 and continued through 2015. By the end of December 2015, more than 400 new gTLDs were offering IDNs, including nearly 80 IDN new gTLDs.