Exploring gender bias in Google Translate

Whenever I come across a black-box ML model, I love poking and prodding it to see the world from its point of view. Here’s a fun little experiment I did yesterday.

My native language, Bengali, does not have gendered pronouns, but English does. So I was interested to find out how Google Translate would assign genders when translating from Bengali to English.

For my first test, I chose sentences of the form “S/he is very X” where X is a personality / character trait. The results were quite revealing. Characteristics historically considered masculine (e.g. bravery, strength, responsibility) were assigned male pronouns. Conversely, characteristics historically considered feminine (e.g. sensitivity, empathy, affection) were assigned female pronouns.

The second set of sentences was of the form “S/he is a X” where X is an occupation. As with the first set, historically male-dominated occupations were assigned male pronouns. The only occupations assigned female pronouns were nurse, health worker, dancer, tailor, and domestic worker.
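
If you want to repeat this kind of test on a longer list of words, the tallying step is easy to script. Below is a minimal sketch, assuming you have already collected the English translations yourself (for example by pasting them out of the Translate page). The function names `pronoun_gender` and `tally` are hypothetical helpers made up for this illustration, and the sketch does not call Google Translate at all.

```python
import re
from collections import Counter

# Pronoun sets used for the classification (an assumption of this sketch;
# extend them if your sentences use other forms).
MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}


def pronoun_gender(sentence):
    """Return 'male', 'female', or 'other' for the first gendered pronoun found."""
    # Lowercase and split into alphabetic words, so "He's" yields "he" and "s".
    for word in re.findall(r"[a-z]+", sentence.lower()):
        if word in MALE:
            return "male"
        if word in FEMALE:
            return "female"
    return "other"


def tally(translations):
    """Count pronoun genders over a dict mapping each test word to its English translation."""
    return Counter(pronoun_gender(text) for text in translations.values())
```

You would call it as `tally({"brave": "He is very brave.", "nurse": "She is a nurse."})`. Those two strings are placeholders that follow the sentence templates above, not recorded Translate output.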