Researchers asked the models whether they would sentence a person convicted of first-degree murder to life in prison or to death. The individual's dialect was the only information provided to the models in the experiment.
In total, they examined 12 models.
Interestingly, researchers found that the LLMs were not openly racist.
When asked directly, they associated African Americans with extremely positive attributes, such as "brilliant."
As explained by the researchers, these language models have learned to hide their racism.
They also found that covert prejudice was higher in LLMs trained with human feedback than in those without it.