Amazon Abandoned AI Recruiting Tool After It Learned to Discriminate Against Women

Amazon canceled a multi-year project to develop an experimental automated recruiting engine after the e-commerce giant’s machine learning team discovered that the system was biased against women, Reuters reports. The engine, which the team began building in 2014, used artificial intelligence to filter résumés and score candidates on a scale from one to five stars. Within a year of starting the project, however, it became clear that the algorithm was not rating candidates for technical roles in a gender-neutral way.

Because the AI was taught to evaluate candidates based on patterns it found in ten years of résumés submitted to Amazon, most of which came from men, the system “taught itself that male candidates were preferable,” according to Reuters:

It penalized resumes that included the word “women’s,” as in “women’s chess club captain.” And it downgraded graduates of two all-women’s colleges, according to people familiar with the matter. They did not specify the names of the schools. Amazon edited the programs to make them neutral to these particular terms. But that was no guarantee that the machines would not devise other ways of sorting candidates that could prove discriminatory, the people said.
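The dynamic Reuters describes is easy to reproduce in miniature. The Python sketch below uses entirely synthetic data, and the feature names and numbers are invented for illustration: a classifier trained on historically skewed hiring outcomes learns to penalize an explicit term like “women’s,” and because a second feature correlated with gender remains, zeroing out the explicit term does not make its scores neutral.

```python
# Synthetic illustration, not Amazon's system: a model trained on biased
# historical hiring labels penalizes an explicit "women's" term, and still
# disadvantages women via a correlated proxy feature after the term is removed.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
gender = rng.integers(0, 2, n)                           # 0 = male, 1 = female (not a model input)
womens_term = (gender == 1) & (rng.random(n) < 0.3)      # e.g. "women's chess club captain"
womens_college = (gender == 1) & (rng.random(n) < 0.2)   # proxy: attended an all-women's college
skill = rng.random(n)                                    # genuinely job-relevant signal

# Historical labels reflect a biased process: men were hired more often.
hired = (skill + 0.4 * (gender == 0) + rng.normal(0, 0.2, n)) > 0.7

X = np.column_stack([skill, womens_term, womens_college]).astype(float)
model = LogisticRegression().fit(X, hired)
print("weights [skill, women's term, women's college]:", model.coef_[0])

# "Neutralizing" the explicit term by zeroing the feature does not help:
# the college feature still encodes gender, so scores stay skewed.
X_neutral = X.copy()
X_neutral[:, 1] = 0.0
scores = model.predict_proba(X_neutral)[:, 1]
print("mean score, men:  ", scores[gender == 0].mean())
print("mean score, women:", scores[gender == 1].mean())
```

The mechanism, not the invented numbers, is the point: once the training labels encode a biased outcome, any feature correlated with gender becomes a channel for the same bias, which is why editing out particular terms was no guarantee of neutrality.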

The company scuttled the project by the start of 2017 after executives lost faith in it. By that time, however, it may already have helped perpetuate gender bias in Amazon’s own hiring practices. The company told Reuters its recruiters never used the engine to evaluate candidates, but did not dispute claims from people familiar with the project that recruiters had looked at the recommendations it generated.

Amazon’s failed experiment with AI-powered candidate assessment offers an object lesson in why employers cannot rely on machines to solve their bias problems. Reuters reports that the AI “taught itself” to prefer male candidates, but machine learning systems do not teach themselves; their creators choose what data the system learns from, and the system draws its inferences from whatever patterns already exist in that data. Gartner research shows that many organizations are planning to adopt AI, or are already in the process of doing so, in HR as well as in other business functions. The danger of using this technology to make hiring decisions is that it will replicate the conscious and unconscious biases that already exist within the organization and embed them in an automated system, where they may become more difficult to identify or correct.
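Identifying that kind of embedded bias is possible if employers audit the outputs of their tools. The short Python sketch below, with hypothetical numbers, applies the “four-fifths rule” from the EEOC’s Uniform Guidelines on Employee Selection Procedures, which flags a selection process when any group’s selection rate falls below 80 percent of the highest group’s rate:

```python
# A simple selection-rate audit of an automated screen (hypothetical data).
def selection_rates(decisions):
    """decisions: dict mapping group name -> list of 0/1 screening outcomes."""
    return {g: sum(d) / len(d) for g, d in decisions.items()}

def four_fifths_check(decisions, threshold=0.8):
    """Flag groups whose selection rate falls below `threshold` of the top rate."""
    rates = selection_rates(decisions)
    top = max(rates.values())
    return {g: (r / top >= threshold, round(r / top, 2)) for g, r in rates.items()}

# Hypothetical outcomes from an automated resume screen.
outcomes = {
    "men":   [1] * 62 + [0] * 38,   # 62% pass the screen
    "women": [1] * 41 + [0] * 59,   # 41% pass the screen
}
print(four_fifths_check(outcomes))
# women: 0.41 / 0.62 = 0.66 < 0.8 -> the screen warrants investigation
```

An audit like this only detects disparities; correcting them still requires understanding which features are driving the scores.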

The company says it learned from the experience and now uses only a “watered-down” version of the engine for rudimentary recruiting tasks, such as culling duplicate candidate profiles from databases, while a new team has been formed to try to build another automated screening system, this time focusing on diversity from the outset, Reuters adds.
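Culling duplicates is a far simpler task than scoring candidates, which is presumably why it survived. A minimal sketch of what profile deduplication can look like, with hypothetical field names, is below:

```python
# Minimal sketch of candidate-profile deduplication (hypothetical fields):
# key each profile on normalized identifying data before comparison.
import re

def profile_key(profile):
    """Key on lowercased email, or normalized name plus phone digits if email is missing."""
    email = profile.get("email", "").strip().lower()
    if email:
        return ("email", email)
    name = re.sub(r"\s+", " ", profile.get("name", "").strip().lower())
    phone = re.sub(r"\D", "", profile.get("phone", ""))
    return ("name_phone", name, phone)

def dedupe(profiles):
    seen, unique = set(), []
    for p in profiles:
        key = profile_key(p)
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique

profiles = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "ADA  LOVELACE", "email": " Ada@Example.com "},  # same person, messy entry
]
print(len(dedupe(profiles)))  # -> 1
```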

Amazon is by no means the first company to run into this problem when trying to automate recruiting. As our own Brian Kropp, group vice president for Gartner’s human resources practice, explained to Washington Post columnist Jena McGregor, Gartner has seen many organizations try something similar and have it backfire in the same way.

“I could tell you 10 to 20 other stories where companies have tried to create algorithms,” Kropp said, with leaders telling themselves “they’ve eliminated bias in the hiring process and all they’ve done is institutionalized biases that existed before or created new ones. The idea that you can eliminate bias in your hiring process via algorithm is highly suspect.”

He shared the story of one company that noticed people from a certain ZIP code quit more often, probably because of longer commute times, and decided to stop interviewing candidates from that ZIP code. “What they didn’t take into account was there’s a demographic distribution across Zip codes. Their mix of employees and candidates became much less diverse,” he said. The company inadvertently hired fewer people of color before it caught and corrected the mistake.
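The arithmetic behind that mistake is straightforward. In the toy Python sketch below, with invented numbers, dropping one ZIP code shifts the demographic mix of the candidate pool even though the filtering rule never mentions demographics:

```python
# Invented numbers illustrating the ZIP-code mistake: demographics are
# unevenly distributed across ZIP codes, so a facially neutral geographic
# filter changes the demographic mix of the pool.
candidates = (
    [{"zip": "10001", "group": "A"}] * 60 + [{"zip": "10001", "group": "B"}] * 40 +
    [{"zip": "10099", "group": "A"}] * 10 + [{"zip": "10099", "group": "B"}] * 90
)

def share_of_group_b(pool):
    return sum(c["group"] == "B" for c in pool) / len(pool)

print("before filter:", share_of_group_b(candidates))      # 0.65
filtered = [c for c in candidates if c["zip"] != "10099"]   # drop the "high-attrition" ZIP
print("after filter: ", share_of_group_b(filtered))         # 0.40
```

Dropping the “problem” ZIP code cuts one group’s share of the pool from 65 percent to 40 percent without the rule ever referencing demographics.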

This is why the expert consensus remains that machine learning is not yet advanced enough to make hiring choices on its own, and should be used only as one of several inputs for hiring managers to weigh when making their decisions. Recruiting is a process with human consequences, and as such it should remain a human process, at least until AI researchers and developers find a reliable solution to the problem of algorithmic bias.