IDN script accurately signals website language


There is a close relationship between the script of an IDN and the language of the web content associated with it.

The chart above plots the language of web content (x axis) against the script of the IDN (y axis). For each content language (say, Persian), it shows the percentage of websites in that language that are associated with each IDN script. The analysis shows that the association between website language and IDN script is not random, but follows the patterns one would expect from linguistic diversity.
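As an illustration of how such percentages can be derived, the following is a minimal Python sketch of a row-normalised cross-tabulation. It assumes each website has already been labelled with a content language and an IDN script. The function name `script_share_by_language` and the sample records are hypothetical and are not taken from the data set discussed here.

```python
from collections import Counter, defaultdict


def script_share_by_language(records):
    """Return, for each language, the percentage of sites per IDN script.

    `records` is an iterable of (language, script) pairs, one per website.
    """
    # Count how many sites fall into each (language, script) cell.
    counts = defaultdict(Counter)
    for language, script in records:
        counts[language][script] += 1

    # Normalise each language's counts so they sum to 100%.
    shares = {}
    for language, script_counts in counts.items():
        total = sum(script_counts.values())
        shares[language] = {
            script: 100.0 * n / total for script, n in script_counts.items()
        }
    return shares


# Made-up sample records, for illustration only.
sample = [
    ("Persian", "Arabic"),
    ("Persian", "Arabic"),
    ("Korean", "Hangul"),
    ("Korean", "Hangul"),
    ("Korean", "Hangul"),
    ("Korean", "Han"),
]
shares = script_share_by_language(sample)
```

Each inner dictionary corresponds to one column of the chart: the shares for a single content language, split across IDN scripts.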

As the chart above shows, 98% of Arabic-language websites and 100% of Persian-language websites in our data set were associated with Arabic script IDNs. Likewise, 98% of Korean-language sites were associated with Hangul script IDNs, 100% of Russian- and Ukrainian-language websites were associated with Cyrillic script IDNs, 99% of Thai-language sites were associated with Thai script IDNs, and 96% of Greek-language sites were associated with Greek script IDNs.

English-language web content is associated with IDNs in many different scripts, which is consistent with the popularity of the English language in the online environment.