Artificially intelligent, inherently racist

A new study of online language prediction models finds that they discriminate against young, non-white men

Lena Hunter

Dec 30, 2021

Talk of artificial intelligence tends to fall into two camps: a vision of interconnectedness that streamlines every aspect of human life, or a dystopian HAL 9000-type technological singularity in which “I’m sorry, Dave. I’m afraid I can’t do that” is the last thing we hear before the machines take over and turn us into fleshy slaves.

Right now, we’re in the grey zone of prototypes, so some of our forays into AI are less than perfect. Take language prediction models. They’re used in everything from Google searches to legal cases, but a new study by researchers at China’s National University of Defense Technology and the University of Copenhagen shows they have a systemic racial bias.

Deeply ingrained tech
The language models under the microscope were ELECTRA, GPT-2, BERT, DistilBERT, ALBERT and RoBERTa. If you’re wondering why so many have ‘BERT’ in the name, DistilBERT, ALBERT and RoBERTa are all offshoots of the original BERT (‘Bidirectional Encoder Representations from Transformers’), a language model introduced by Google in 2018.

To give an idea of how prevalent these models are: by the end of 2019, BERT had been adopted by Google’s search engine in 70 languages. By 2020 the model was used in almost every English-language search query. This is the technology that fills in the gap in the search bar when you type “Why am I so ___?”

The study in detail
The study measured the models’ performance differences across demographics in so-called English-language ‘cloze tests’ (fill-in-the-gap tests). Since the cloze task is how BERT systems are trained, researchers were able to evaluate the models directly.
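To make the cloze setup concrete, here is a minimal sketch in Python of how a single fill-in-the-gap item might be scored. It is not the researchers' code: the function name `cloze_hit`, the ranked list of model guesses and the human answers are all made up for illustration, and it assumes a model's output can be reduced to a ranked list of candidate words.

```python
def cloze_hit(model_ranking, human_answer, k=1):
    """Return True if the human's answer is among the model's top-k guesses."""
    top_k = [word.lower() for word in model_ranking[:k]]
    return human_answer.strip().lower() in top_k


# Hypothetical data: one sentence with a gap, an invented ranked list of
# model guesses, and invented answers from two human participants.
sentence = "Why am I so ___?"
model_ranking = ["tired", "hungry", "sad", "cold"]
human_answers = ["tired", "cold"]

# Does the model's single best guess match each participant?
hits_top1 = [cloze_hit(model_ranking, answer, k=1) for answer in human_answers]

# Does any of the model's three best guesses match each participant?
hits_top3 = [cloze_hit(model_ranking, answer, k=3) for answer in human_answers]

print(sentence, hits_top1, hits_top3)
```

In this toy case the model "speaks like" the first participant but not the second, whose answer sits fourth in the ranking and so falls outside both the top one and the top three.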

Some 3,085 sentences were completed by 307 human subjects, who were asked to fill in the most likely word based on their experience. The subjects were sorted into 16 demographic groups according to age, gender, education and race. The ‘fairness’ of the language models’ responses was measured by whether the risk of error was roughly equal across any two demographic groups.
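The fairness idea described here, that no two groups should face very different error rates, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's actual metric: the group labels, the hit/miss data and the 0.1 tolerance are all hypothetical.

```python
from itertools import combinations


def error_rate(hits):
    """Share of cloze items where the model failed to match the human answer."""
    return 1 - sum(hits) / len(hits)


def max_error_gap(hits_by_group):
    """Largest difference in error rate between any two groups, plus the rates."""
    rates = {group: error_rate(hits) for group, hits in hits_by_group.items()}
    gap = max(abs(rates[a] - rates[b]) for a, b in combinations(rates, 2))
    return gap, rates


# Hypothetical results: True means the model matched that participant's answer.
hits_by_group = {
    "group_a": [True, True, True, False],
    "group_b": [True, False, False, False],
    "group_c": [True, True, False, False],
}

gap, rates = max_error_gap(hits_by_group)

# Assumed tolerance for "roughly equal"; the article does not give a threshold.
TOLERANCE = 0.1
is_fair = gap <= TOLERANCE

print(rates, gap, is_fair)
```

With these invented numbers the model errs three times as often for one group as for another, so it would count as unfair under the assumed tolerance.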

The results showed a systemic bias against young non-white male speakers. Older, white speakers were also poorly aligned. Not only do the models learn stereotypical associations, they also learn to speak more like some groups than others, in this case white men under the age of 40.

Why is it important?
BERT is already integral to how we navigate online, so users whose language does not align with the models risk receiving unequal results and opportunities.

When GPT-2 was announced in February 2019 by San Francisco technology company OpenAI, James Vincent of The Verge described its writing as “one of the most exciting examples yet” of language generation programs.

“Give it a fake headline, and it’ll write the rest of the article, complete with fake quotations and statistics. Feed it the first line of a short story, and it’ll tell you what happens to your character next,” he said.

The Guardian called it “plausible newspaper prose”, while journalists at Vox mused that GPT-2 may be the technology that kicks them out of their jobs. A study by the University of Amsterdam even found that some participants were unable to distinguish poems generated by GPT-2 from those written by humans.

The upshot should be better training, argue the researchers at the University of Copenhagen, so the models more accurately represent the diversity of users.

The Copenhagen Post /
The Post ApS

Ryesgade 106, 2. th

2100 København Ø

CVR: 43916181

Email: support@cphpost.dk

Phone: +45 7174 3199

We are responsible for the content and are registered with The Danish Press Council.
