Artificial intelligence systems are at work in many areas where we might not realize it: making decisions about credit, choosing which ads to show us, and selecting which job applicants to hire. While these systems are very good at systematically combing through large amounts of data to detect patterns and optimize decisions, the biases held by humans can be transmitted to these systems through the training data.

A team of researchers from Carnegie Mellon University, including CyLab’s Anupam Datta, professor of electrical and computer engineering at CMU Silicon Valley, Matt Fredrikson, assistant professor of computer science, and Ph.D. student Samuel Yeom, is detecting which factors directly or indirectly affect decision outcomes and correcting them when they are used inappropriately.

Bias often appears in AI systems through factors like race or gender that aren’t directly input to the system, but still have a strong influence on its decisions. Discrimination can happen when one of these attributes is strongly correlated with information that is directly used by the system. For example, suppose a system that makes decisions about credit uses zip code as a factor. Direct information about race is not given to the system, but zip code is strongly correlated with race since many neighborhoods are still segregated. By using zip code, the system would be indirectly making decisions based on race. In this case, zip code is a proxy for race.

“If zip code is encoding race and is being used to make decisions about credit, then it’s not a defensible proxy,” said Datta. “That’s what our method can uncover. It can look inside these machine learning models and discover proxies that are influential in the decisions of the model.”

To detect bias and repair algorithms that may be making inappropriate decisions, the researchers have developed detection algorithms that identify the variables in a system that may be exhibiting proxy use in an unfair way. The algorithm combs through the model to detect the variables that are correlated with a protected feature (like race, age, or gender) and heavily influence the decision outcome.
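The two conditions described above, association with a protected feature and influence on the outcome, can be illustrated with a small screening sketch. This is a simplified illustration under stated assumptions, not the researchers' algorithm: the data is synthetic, the names `flag_proxy_candidates`, `zip_code_score`, and `income` are hypothetical, association is measured with plain Pearson correlation, influence is approximated as each feature's share of weighted spread in a linear score, and both thresholds are arbitrary.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def flag_proxy_candidates(columns, protected, weights,
                          assoc_min=0.5, infl_min=0.2):
    """Return feature names that are both associated with the protected
    attribute and influential in a linear score (hypothetical thresholds)."""
    # Influence proxy: |weight| * standard deviation, normalised to sum to 1.
    contrib = {name: abs(weights[name]) * statistics.pstdev(col)
               for name, col in columns.items()}
    total = sum(contrib.values())
    flagged = []
    for name, col in columns.items():
        assoc = abs(pearson(col, protected))
        influence = contrib[name] / total
        if assoc >= assoc_min and influence >= infl_min:
            flagged.append(name)
    return flagged

# Synthetic data: one feature tracks the protected attribute, one does not.
random.seed(0)
n = 2000
protected = [float(random.randint(0, 1)) for _ in range(n)]
columns = {
    "zip_code_score": [p + random.gauss(0, 0.3) for p in protected],
    "income": [random.gauss(0, 1) for _ in range(n)],
}
weights = {"zip_code_score": 1.0, "income": 1.0}  # assumed model weights

flagged = flag_proxy_candidates(columns, protected, weights)
print(flagged)
```

A feature has to pass both checks to be flagged: one that is correlated with the protected attribute but barely affects the score is not flagged, and neither is one that drives the score but is unrelated to the protected attribute.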

The concept of proxy use in machine learning models was formally studied in an earlier paper by a Carnegie Mellon team including Datta, Fredrikson, Ko, Mardziel, and Sen. The first proxy detection algorithm they created was a slow, brute-force algorithm that works in the context of simple decision tree and random forest models, two classes of machine learning models. Most recently, Yeom, Datta, and Fredrikson developed an algorithm that works on linear regression models and scales to numerous applications where these kinds of models are used in the real world, detailed in a paper presented at NeurIPS 2018.

“Our recent results show that, in the case of linear regression, we can simply treat the input attributes as vectors in a low-dimensional space,” said Yeom, “and this allows us to use an existing convex optimization technique to identify a proxy quickly.”
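One way to picture the vector view Yeom describes is to center each input column and treat it as a vector, so that the correlation between two attributes is the cosine of the angle between them. The sketch below is an illustration under assumptions, not the optimization from the paper: it uses synthetic data, two hypothetical inputs `x1` and `x2`, and ordinary least squares (a convex problem) to find the combination of inputs most aligned with a protected attribute `z`. It ignores influence on the model output entirely.

```python
import math
import random

def center(v):
    """Subtract the mean so the vector has zero average."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine of the angle between two centered vectors (their correlation)."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def best_two_feature_combo(x1, x2, z):
    """Least-squares weights (a1, a2) making a1*x1 + a2*x2 closest to z.

    Solves the 2x2 normal equations with Cramer's rule; inputs are centered.
    """
    g11, g12, g22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    b1, b2 = dot(x1, z), dot(x2, z)
    det = g11 * g22 - g12 * g12
    return (b1 * g22 - b2 * g12) / det, (g11 * b2 - g12 * b1) / det

random.seed(1)
n = 2000
z = center([float(random.randint(0, 1)) for _ in range(n)])  # protected attribute
noise = center([random.gauss(0, 1) for _ in range(n)])
# Two hypothetical inputs that each blend the protected signal with noise.
x1 = [zi + e for zi, e in zip(z, noise)]
x2 = [zi - e for zi, e in zip(z, noise)]

a1, a2 = best_two_feature_combo(x1, x2, z)
combo = [a1 * u + a2 * v for u, v in zip(x1, x2)]

cos_x1 = cosine(x1, z)        # each input alone is only moderately aligned
cos_combo = cosine(combo, z)  # the combination is much more closely aligned
print(round(cos_x1, 2), round(cos_combo, 2))
```

In this toy setup neither input alone is strongly aligned with the protected attribute, but a weighted sum of the two is. That is why a detection method for linear models has to consider combinations of inputs, not just individual columns.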

[Image: brain animation. Source: College of Engineering] The researchers’ proxy detection algorithm scales to linear regression models, a type of machine learning model that is widely used to make decisions in high-stakes applications.

Once the algorithm has detected the influential variables, it shares them with a human domain expert who decides whether the proxy is used in a way that is unjustified. To demonstrate that the algorithm works in practice, the researchers ran it on a model used by a large police department to predict who is likely to be involved in a shooting incident. This model did not have any strong proxies, but gang membership was found to be a weak proxy for race and gender. A domain expert would then consult this information and decide whether it is a justified proxy. Their method is broadly applicable to machine learning models that are widely used in such high-stakes applications.

Not all instances of proxy use are negative, either. For example, debt-to-income ratio is also strongly associated with race. But if debt-to-income ratio can separately be justified as a strong predictor of creditworthiness, then it is legitimate to use. That is why it is important for a human domain expert to make the final decision once the algorithm has flagged the proxy use.

Detecting and correcting biases in AI systems is just the beginning. As AI systems continue to make important decisions for us, we need to make sure they are fair. Hiring tools, criminal justice systems, and insurance companies all use AI to make decisions, and many other domains are continuing to incorporate artificial intelligence in new ways.

“Being able to explain certain aspects of a model's predictions helps not only with identifying sources of bias, but also with recognizing decisions that may at first appear biased, but are ultimately justified,” said Fredrikson. “We believe that this ability is essential when applying the approach to real applications, where the distinction between fair and unfair use of information is not always clear-cut.”