To study whether the words used to describe men and women in literature differ, a computer scientist at the University of Copenhagen (Denmark), together with US researchers, used machine learning to analyze millions of books published in English between 1900 and 2008. “Beautiful” and “sexy” are two of the adjectives most commonly used to describe women; “just”, “rational” and “brave” are the most used for men. The researchers extracted adjectives and verbs associated with gender-specific nouns, such as “prima” or “waiter.” They then categorized those words according to whether they expressed something negative, positive, or neutral. The analysis showed that negative verbs associated with the body and appearance were used five times more frequently for women than for men. In addition, adjectives describing physical appearance were used twice as often for women, while men were described according to their behavior and personal qualities. “We can clearly see that the words used for women refer much more to their appearances than those used to describe men. Therefore, we were able to confirm a widespread perception, only now on a statistical level,” said Isabelle Augenstein, a computer scientist and assistant professor in the Department of Computer Science at the University of Copenhagen.
List of the 11 most frequent adjectives, divided into categories.
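The counting step behind such a comparison can be illustrated with a small Python sketch. It is only a toy illustration under stated assumptions, not the researchers' actual model: the word lists, the category labels, the example pairs, and the function names `count_descriptions` and `topic_share` are all hypothetical, and the sketch assumes the (noun, adjective) pairs have already been extracted from the text.

```python
from collections import Counter

# Hypothetical toy lexicons; the study's real word lists are not given in the article.
NOUN_GENDER = {
    "waitress": "female", "queen": "female",
    "waiter": "male", "king": "male",
}
ADJECTIVE_INFO = {
    # adjective: (topic, sentiment)
    "beautiful": ("appearance", "positive"),
    "sexy": ("appearance", "positive"),
    "brave": ("behavior", "positive"),
    "rational": ("behavior", "positive"),
    "just": ("behavior", "positive"),
}


def count_descriptions(pairs):
    """Count (gender, topic, sentiment) triples for known noun/adjective pairs."""
    counts = Counter()
    for noun, adjective in pairs:
        gender = NOUN_GENDER.get(noun)
        info = ADJECTIVE_INFO.get(adjective)
        if gender is None or info is None:
            continue  # skip words missing from the toy lexicons
        topic, sentiment = info
        counts[(gender, topic, sentiment)] += 1
    return counts


def topic_share(counts, gender, topic):
    """Fraction of a gender's counted adjectives that belong to one topic."""
    total = sum(n for (g, _, _), n in counts.items() if g == gender)
    matching = sum(
        n for (g, t, _), n in counts.items() if g == gender and t == topic
    )
    return matching / total if total else 0.0


# Made-up example pairs, standing in for pairs extracted from real books.
EXAMPLE_PAIRS = [
    ("waitress", "beautiful"), ("queen", "sexy"), ("queen", "brave"),
    ("waiter", "rational"), ("king", "brave"), ("king", "beautiful"),
    ("king", "just"), ("waiter", "tall"),
]

counts = count_descriptions(EXAMPLE_PAIRS)
female_appearance = topic_share(counts, "female", "appearance")
male_appearance = topic_share(counts, "male", "appearance")
```

Comparing `female_appearance` with `male_appearance` gives the kind of ratio reported in the article, although the numbers produced here come from invented data and say nothing about the study's results.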
Augenstein notes that although many of the books were published several decades ago, they still play an active role, because many of today’s language-understanding applications, such as predictive text, take information from material available online. That is, they adopt the language that we humans use, which is then reflected in prejudices and gender stereotypes. As artificial intelligence becomes more relevant in society, it is therefore important to develop machine learning models that use less biased texts, or to try to have those biases ignored or countered.