When we translate from a language with gender-neutral pronouns, such as Basque or Hungarian, Google assumes a masculine or feminine pronoun
Algorithmic bias is a well-studied phenomenon. We have seen it, for example, in image searches for "cook", where the results are mainly women. Now, via Reddit, we see another striking case in which Google's algorithm decides to assume a gender.
When we go to Google Translate and translate a phrase from a language with gender-neutral pronouns, such as Basque, Hungarian or Czech, Google inserts "he" or "she" in our language depending on the context. The same happens with English and every other language that has distinct masculine and feminine pronouns.
What is the explanation behind Google Translate's bias
In the Hungarian example seen on Reddit, a paragraph is built around the pronoun "ő", which refers generically to both sexes. However, both the Spanish and English translations render "she is beautiful" and "he is clever", just as they render "she cooks" and "he is a teacher". A distinction that does not necessarily exist in the original language.
Hungarian has no gendered pronouns, so Google Translate makes some assumptions pic.twitter.com/oRo0nJfnMc
– Marcos Besteiro (@MarcosBL) March 22, 2021
The translation is just one example, and any user can open Google Translate and test their own phrases. The Google Translate algorithm will choose masculine or feminine, assuming whichever it deems most appropriate in each case.
The case is striking, but it should come as no surprise: it is simply a direct example of how algorithms are a reflection of society.
The artificial intelligence behind Google Translate, like that of most such systems, is based on correlation. In this case, it simply outputs the gender it has most commonly found on the net. If Google places the masculine pronoun next to "teacher", it is because it detects that this is the most common pairing in the target language.
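The idea of "choosing by correlation" can be sketched in a few lines. The toy corpus and the majority-vote rule below are invented for illustration; a real system learns far subtler statistics, but the mechanism (and the bias) is the same: whichever gender co-occurs more often with a phrase in the training data wins.

```python
from collections import Counter

# A tiny toy corpus standing in for web-scale training data.
# All sentences and their frequencies here are invented examples.
CORPUS = [
    "he is a teacher", "he is a teacher", "he is a teacher",
    "she is a teacher",
    "she cooks", "she cooks", "she cooks", "she cooks",
    "he cooks",
]

def pick_pronoun(predicate: str) -> str:
    """Choose 'he' or 'she' for a gender-neutral source pronoun by
    majority vote over co-occurrence counts in the corpus -- the kind
    of correlation a statistical translator effectively encodes."""
    counts = Counter()
    for sentence in CORPUS:
        pronoun, _, rest = sentence.partition(" ")
        if rest == predicate and pronoun in ("he", "she"):
            counts[pronoun] += 1
    # Tie or no data falls back to 'he', mirroring a fixed default.
    return "she" if counts["she"] > counts["he"] else "he"

print(pick_pronoun("is a teacher"))  # -> he
print(pick_pronoun("cooks"))         # -> she
```

Nothing in the rule is "sexist" by design: it faithfully reproduces whatever imbalance the corpus contains, which is exactly the point the article makes.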
Google's position on bias in its algorithms is that its "search results are a reflection of content across the web, including how often these types of terms appear and how they are described on the web. This means that sometimes unpleasant descriptions of sensitive topics can affect the image search results that appear for a certain query", although the company points out that "these results do not reflect Google's own opinions or beliefs".