There is international consensus on the need to promote linguistic diversity, in cyberspace as well as offline. This is reflected in the World Summit on the Information Society (WSIS) action line C8 (Cultural diversity and identity, linguistic diversity and local content) and UNESCO’s Recommendation concerning the Promotion and Use of Multilingualism and Universal Access to Cyberspace (2003).
In previous reports, we have explored the status of multilingual content online, and noted the gap between the rich diversity of languages spoken offline and the languages of cyberspace: English is the language of more than half of web content.
Popular web platforms and applications are increasing their support for multilingualism. For example, Facebook and Instagram each support more than 110 languages; Google Translate is available for more than 100 languages; and Twitter supports 34 languages. The world's most popular apps are also increasing the number of languages they support: WhatsApp is available in up to 60 languages.
Meanwhile, as our annual reports have noted, there is a wide gap between the broad language support offered by popular web applications and the limited progress on universal acceptance of internationalised domain names (IDNs), which remains a continuing challenge.
Nevertheless, where IDNs are in use, the language of web content is more diverse than it is with traditional ASCII domains. While there is a long way to go before we see the same linguistic diversity online as there is offline, IDNs seem to help redress the balance, at least as far as the most-spoken languages are concerned.
As a result of our analysis of the language of content associated with IDNs, we can state that:
- IDNs help to enhance linguistic diversity in cyberspace
- The IDN market is more balanced in favour of emerging economies
- IDNs are accurate predictors of the language of web content.
In previous reports we have noted that the language of web content tends to follow the script of the IDN. In other words, IDNs accurately signal which languages will be found on the associated websites.
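To illustrate how the script of an IDN can be read from the domain name itself, the following is a minimal sketch, not the research team's method. It assumes a single label as input, decodes the `xn--` form with Python's standard punycode codec, and guesses the script from the first word of each letter's Unicode character name, which is only a rough heuristic. The function names `decode_label` and `dominant_script` are hypothetical.

```python
import unicodedata
from collections import Counter


def decode_label(label: str) -> str:
    """Return the Unicode form of one domain label (A-label or plain text)."""
    label = label.lower()
    if label.startswith("xn--"):
        # Strip the ACE prefix and decode the rest with the stdlib punycode codec.
        return label[4:].encode("ascii").decode("punycode")
    return label


def dominant_script(label: str) -> str:
    """Heuristic: treat the first word of a letter's Unicode name as its script."""
    counts = Counter(
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in decode_label(label)
        if ch.isalpha()  # ignore digits and hyphens
    )
    # Return the most frequent script name, or UNKNOWN if the label has no letters.
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"
```

Under these assumptions, the A-label of a Cyrillic word should be classified as `CYRILLIC`, while a plain ASCII label is classified as `LATIN`. Determining the language of the website content itself is a separate step that this sketch does not cover.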
The research team has measured the language of websites associated with .eu IDNs. Through access to open gTLD zone files, we have also measured languages associated with IDNs in gTLDs. Based on what we discovered from the open zone files and analysis of the .eu IDNs (and associated variants in other scripts), we have extended the analysis to include content of ccTLD IDNs, which form the majority of IDN registrations. The results remain broadly consistent each year.
In 2019, the research team also performed a study on ccTLDs and linguistic diversity for the CENTR community. Based on the data published in the report, we have been able to update our assumptions on the language of websites associated with IDNs in the ccTLDs identified in that study.