What is universal acceptance?

Fundamentally, universal acceptance means that an IDN can be used anywhere an ASCII domain name is used. At first, this seems a simple goal: anywhere on the Internet, if a domain name can be used, that domain name can be an IDN. Yet, as previous reports in this series have shown, making IDNs usable anywhere a traditional ASCII domain is used is difficult in practice.

As we have seen elsewhere in this report, in the earliest days of the domain name system, domains were limited to a very small subset of ASCII characters.[1] Although IDNs at the second level were offered on the market from 2000, the first technical standards relating to IDNs were not published by the Internet Engineering Task Force until 2003. Part of the early standards was the ability to store strings in the DNS that represented an encoded, ASCII-only version of non-ASCII, internationalised domain names. IDNs first appeared in the root zone of the DNS in 2010.
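The encoded form stored in the DNS can be illustrated with a minimal sketch. It assumes Python's built-in "idna" codec, which follows the 2003 standards (RFC 3490) rather than later revisions; the name "bücher.example" and the variable names are chosen purely for illustration.

```python
# Sketch: converting between the Unicode form of an IDN and the
# ASCII-only form that is stored in the DNS.
# Assumption: Python's built-in "idna" codec (IDNA 2003, RFC 3490).

unicode_name = "bücher.example"  # hypothetical example name

# Encode each label to its ASCII-compatible form (prefixed with "xn--").
ascii_form = unicode_name.encode("idna").decode("ascii")
print(ascii_form)  # xn--bcher-kva.example

# Decode the stored ASCII form back to the Unicode form.
round_trip = ascii_form.encode("ascii").decode("idna")
print(round_trip)  # bücher.example
```

Only the ASCII form travels through the DNS; the Unicode form is what a user expects to type and see.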

However, the availability of IDNs in the root zone does not make them “usable.” Usability, or “acceptance,” is a step beyond “availability.” Acceptance means more than being able to register a name and resolve it in the DNS: it means that the names can be used, validated, and displayed in exactly the same contexts as traditional ASCII domain names. At first glance this should be straightforward: enter, display or process the IDN just as you would its ASCII counterpart.

There are three key barriers to universal acceptance of IDNs by software, web applications and other Internet technology:

  • the application fails to display the IDN correctly;
  • the application fails to accept that the IDN is a legitimate domain name; and,
  • the application is unable to process or resolve the IDN as a domain name.

In 2016 there has been significant progress in addressing the issue of display. Many browsers, web applications, social media tools and other Internet services can now display an IDN correctly. In a moment we shall see that this success is not the result of dedicated labor by legions of application developers. Instead, it is the result of key, shared software libraries being gradually updated to support Unicode and UTF-8[2]. Almost all modern browsers are able to display IDNs correctly when a Unicode[3] encoding is specified in the content type in the communication from the server to the browser.
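Why the declared encoding matters for display can be shown with a small sketch. It is an illustration only, assuming the server sends the name as UTF-8 bytes; the example name and the choice of Latin-1 as the mistaken encoding are assumptions, not taken from any particular application.

```python
# Sketch: the same bytes display correctly or incorrectly depending on
# which character encoding the receiving software assumes.

u_label = "bücher.example"            # hypothetical example name
wire_bytes = u_label.encode("utf-8")  # what the server sends

# Receiver honours the declared UTF-8 encoding: the IDN displays correctly.
correct = wire_bytes.decode("utf-8")
print(correct)  # bücher.example

# Receiver wrongly assumes Latin-1: the non-ASCII letter is garbled.
garbled = wire_bytes.decode("latin-1")
print(garbled)  # bÃ¼cher.example
```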

In the other two areas – the ability to accept or process the IDN – there is much less progress to report in 2016. Many applications and services on the Internet attempt to validate or inspect domain names as they process those strings. This validation leads those applications to erroneously reject IDNs, especially in their native format. Since so many of those applications are custom-built for the purpose at hand, they do not profit from the development of shared software libraries. This is also part of the landscape for internationalised email: use of IDNs as part of email addresses is complicated by the fact that the underlying servers are unable to recognise and accept that the IDN is a legitimate part of an email address.
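The kind of validation failure described above can be sketched as follows. This is a hedged illustration, not a description of any specific application: the function names `naive_is_valid` and `idn_aware_is_valid` are hypothetical, and the IDN-aware variant assumes Python's built-in "idna" codec (IDNA 2003) for the conversion step.

```python
import re

# A single DNS label restricted to letters, digits and hyphens,
# not starting or ending with a hyphen, at most 63 characters.
_LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def naive_is_valid(domain: str) -> bool:
    """ASCII-only check of the sort many custom-built validators use."""
    return all(_LDH_LABEL.fullmatch(label) for label in domain.split("."))


def idn_aware_is_valid(domain: str) -> bool:
    """Convert to the ASCII-compatible form first, then apply the same check."""
    try:
        ascii_form = domain.encode("idna").decode("ascii")
    except UnicodeError:
        # The name could not be converted (for example, an empty label).
        return False
    return len(ascii_form) <= 253 and naive_is_valid(ascii_form)


print(naive_is_valid("example.com"))         # True
print(naive_is_valid("bücher.example"))      # False: rejected in native form
print(idn_aware_is_valid("bücher.example"))  # True
```

The two functions share the same label rule; the only difference is the conversion step, which is the piece a shared library would normally supply.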

The ability of the underlying system to process and resolve the IDN should be helped by operating system updates that make DNS clients more aware of IDNs and how to handle them. In major operating systems, we see significant progress in processing and resolution of IDNs. Unfortunately, when you move beyond traditional operating systems, the news is not good: processing and resolution of IDNs in non-traditional Internet applications and services is often unavailable or unsuccessful.

[1] The letters A-Z, a-z, the digits 0-9 and the hyphen character “-”.

[2] Universal Coded Character Set Transformation Format, 8-bit.

[3] Sometimes known as the Universal Coded Character Set.