Sometimes, it's easy to get inundated with articles from Wired, Ars Technica, and other news outlets and forget that the cyber world's growing population of netizens speaks many languages, not just English. If you think making sense of Internet health, governance, and activism is already a tricky feat, try doing it when your native tongue is Mazanderani, Chuvash, or Sakha.
How do we pinpoint the Internet policies, laws, and governing bodies that matter to Americans? Some would turn to the Pew Research Center for an answer. But what about Georgians? Kurds? Sardinians? What do we do when there isn’t a research institute studying how citizens of Country X or speakers of Language Y regard Internet Law Z? One way to make sense of how different language groups engage with the Internet is to look at how they use Wikipedia.
In an era of increasingly rigid paywalls, Wikipedia seems to be one of the few forums that can help us - the us in Kazakhstan, the us in Ethiopia, the us in America, that squishy, virtual us - keep up with what's happening in the digital world and what matters to the people inhabiting it.
This summer, as a research assistant at the Berkman Center, part of my struggle has been trying to figure out what issues and individuals matter most in the context of Internet policy. There are people who have re-envisioned and re-imagined how the Internet can transform information access: Julian Assange, Edward Snowden, and Aaron Swartz. There are "Internet" concepts that have ricocheted their way into a powerful place in the world of geopolitics: cyberwarfare and Internet censorship. There are hackers. Then, there are the hackers hacking the hackers. This Wikipedia chart looks at seven terms, a mix of the aforementioned subjects. It can't possibly be comprehensive, nor does it attempt to be topically systematic. Rather, it aims to serve as an alternative method of looking at Internet activism, health, and governance.
A few disclaimers: I've relied upon Wikipedia's own language classification system. So, if a page is available for "Julian Assange" in Azərbaycanca, I've charted it in the "Azerbaijani" row. Yes, Mingrelian, Zazaki, and Sinhala are all languages. While there are multiple English pages for "hacker," I've selected the one most closely associated with the Hacking Team and cyberwarfare. A red X indicates that no Wikipedia page is available in the designated language, while a green checkmark indicates that one is. Totals are at the bottom. Lastly, this chart does not include any pages for which there is no English version.
The findings are fascinating. Only the French and English versions of Wikipedia have pages for all seven phrases: hacker, Hacking Team, Edward Snowden, Julian Assange, Aaron Swartz, cyberwarfare, and Internet censorship. As of July 27, 2015, Albanian, Afrikaans, Bosnian, Chuvash, Georgian, Irish, Javanese, Kurdish, Low German, Marathi, Mingrelian, Pashto, Western Punjabi, Scots, Sardinian, Old English, Sinhala, and Venetian have pages only for Julian Assange. Unsurprisingly, "Internet censorship" doesn't have pages in Arabic, Chinese, or Azerbaijani. Julian Assange has pages in 19 more languages than Edward Snowden, and Edward Snowden has pages in 20 more languages than Aaron Swartz.
This chart poses more questions than answers. We can’t know, for example, why Pashto only has a page for Julian Assange (and not the six other items included in the mix). We can’t determine precisely what motivated speakers of Sakha to write about Edward Snowden. For each non-English page, we can’t immediately deduce what content Wikipedians translated and what content they created from scratch. These problems aside, it’s my hope that this chart will tempt other people (with more time and perhaps more HTML experience!) to consider constructing versions of their own.
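For anyone tempted to build a version of their own, the chart's logic can be sketched in a few lines of Python. This is only a rough illustration, not the method used to produce the chart: the function names (`build_chart`, `column_totals`, `render`) are made up for the example, and the sample data covers just English (which has all seven pages) and two other languages. A real version would first need to collect, for each English page, the list of languages that Wikipedia links it to.

```python
from typing import Dict, List, Set

# The seven terms charted in this post, in column order.
TERMS: List[str] = [
    "Hacker", "Hacking Team", "Edward Snowden", "Julian Assange",
    "Aaron Swartz", "Cyberwarfare", "Internet censorship",
]

# Illustrative sample only: for each term, the languages that have a page.
# A full chart would need one entry per language found on Wikipedia.
PAGES: Dict[str, Set[str]] = {
    "Hacker": {"English", "Chinese (Min Nan)"},
    "Hacking Team": {"English"},
    "Edward Snowden": {
        "English",
        "Chinese (Min Nan)",
        "Belarusian [Taraškievica/Classical Orthography]",
    },
    "Julian Assange": {
        "English",
        "Belarusian [Taraškievica/Classical Orthography]",
    },
    "Aaron Swartz": {"English"},
    "Cyberwarfare": {"English"},
    "Internet censorship": {"English"},
}


def build_chart(terms: List[str], pages: Dict[str, Set[str]]) -> Dict[str, List[bool]]:
    """Return one row per language: True where a page exists for that term."""
    languages = sorted(set().union(*(pages.get(t, set()) for t in terms)))
    return {
        lang: [lang in pages.get(t, set()) for t in terms]
        for lang in languages
    }


def column_totals(chart: Dict[str, List[bool]], terms: List[str]) -> Dict[str, int]:
    """Count, for each term, how many languages have a page."""
    return {t: sum(row[i] for row in chart.values()) for i, t in enumerate(terms)}


def render(chart: Dict[str, List[bool]], terms: List[str]) -> str:
    """Render the chart as text rows with checkmarks and Xs."""
    lines = ["|" + "||".join(["Language"] + list(terms)) + "|"]
    for lang, row in chart.items():
        marks = ["✅" if has_page else "❌" for has_page in row]
        lines.append("|" + "||".join([lang] + marks) + "|")
    return "\n".join(lines)


if __name__ == "__main__":
    chart = build_chart(TERMS, PAGES)
    print(render(chart, TERMS))
    print(column_totals(chart, TERMS))
```

Keeping the data as "term to set of languages" mirrors how the chart was assembled: start from the English page, then note which other languages have a counterpart.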
|Language||Hacker||Hacking Team||Edward Snowden||Julian Assange||Aaron Swartz||Cyberwarfare||Internet censorship|
|Belarusian [Taraškievica/Classical Orthography]||❌||❌||✅||✅||❌||❌||❌|
|Chinese (Min Nan)||✅||❌||✅||❌||❌||❌||❌|