Fairness Metrics & Confusion Matrices

Overview

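Fairness metrics describe how a machine learning model's prediction errors are distributed across groups, not just how accurate the model is overall. They are built from the confusion matrix, which keeps track of true and false positives and negatives. Under Equal Odds, for example, the false positive and false negative rates must be equalized across groups to increase fairness.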
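As a minimal sketch of the bookkeeping described above, the snippet below counts the four confusion-matrix cells separately for each group. The function name `confusion_by_group`, the group labels, and the toy data are all hypothetical and chosen only for illustration; the positive class (1) is assumed to mean "predicted to repay".

```python
from collections import Counter

def confusion_by_group(y_true, y_pred, groups):
    """Count TP, FP, TN, FN separately for each group (1 = positive class)."""
    counts = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        # "T"/"F" says whether the prediction was right; "P"/"N" is the prediction.
        cell = ("T" if truth == pred else "F") + ("P" if pred == 1 else "N")
        counts.setdefault(group, Counter())[cell] += 1
    return counts

# Hypothetical toy data: 1 = repaid / approved, 0 = defaulted / denied.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

matrices = confusion_by_group(y_true, y_pred, groups)
```

Every fairness metric discussed in this article can be derived from per-group counts of this kind.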

It’s important to know that despite efforts to define fairness, no complete definition of it has been developed. In fact, it has been proven that not all fairness criteria can be met at the same time.

Calibration and Equal Odds, for example, generally cannot both be satisfied at the same time.

Perhaps nowhere is this more important to study than in the lending industry. Why is that? Lenders favor higher rates of true positives (approved borrowers who repay). On the flip side, borrowers benefit from false positives (approvals for loans that end in default). So where does bias come into play?

False positives might be increased when steps are taken to mitigate bias against minorities. The following five perspectives on fairness will together provide a more comprehensive understanding of its facets than any single definition:

Disparate Treatment

Disparate Treatment is a primary form of biased behavior engineered into fairness methods. In the profit-driven world of credit decisions, lenders often find it most lucrative to identify, for each racial group, the credit score at which only 18% of applicants are expected to default on their mortgages. Securing the 82% of borrowers who will repay in this way requires separating applicants along racial lines and treating them disparately by requiring a different credit score from each group. Minority groups are typically at a disadvantage when the 82% threshold rule is enforced, even though the method maximizes lenders' profits. Thankfully, these biases have led to R&D into fairer methods.
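To make the mechanism concrete, here is a rough sketch of how a per-group cutoff could fall out of the 18% rule. The default rates below are invented for illustration, the group labels are placeholders, and the helper `group_threshold` is hypothetical; the sketch also assumes that expected default rates fall as scores rise.

```python
def group_threshold(default_rate_by_score, max_default_rate=0.18):
    """Return the lowest score whose expected default rate is at most the cap."""
    eligible = [score for score, rate in default_rate_by_score.items()
                if rate <= max_default_rate]
    return min(eligible) if eligible else None

# Hypothetical default rates per score band, made up for this example.
default_rates = {
    "a": {500: 0.40, 600: 0.25, 700: 0.15, 800: 0.05},
    "b": {500: 0.30, 600: 0.17, 700: 0.08, 800: 0.02},
}

# Each group ends up with its own cutoff, which is the disparate treatment.
thresholds = {group: group_threshold(rates) for group, rates in default_rates.items()}
```

With these made-up numbers, applicants in group "a" need a higher score than applicants in group "b" to be approved, even though the same 18% rule is applied to both.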

Fairness through Unawareness

Fairness through Unawareness seeks to remedy disparate treatment by ensuring risk assessments ignore protected attributes. As such, the approach works to attain fairness by omitting protected or sensitive attributes like gender, ethnicity and race. The approach is often applied in cases where a danger exists for the disparate treatment of persons based on these attributes – and tries to avoid that bias by ensuring the system predicts outcomes without knowledge of them.

Advantage: it makes the system easier to use from a legal standpoint.

Disadvantage: the data may contain redundant encodings, that is, features (such as neighborhood) whose high correlation with other features (such as race) can make them proxies for the sensitive data researchers are trying to withhold from the algorithm.
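One simple way to look for such proxies is to check how strongly each remaining feature correlates with the withheld attribute. The sketch below uses a plain Pearson correlation; the feature names, the toy values, and the 0.5 cutoff are assumptions made for illustration, and a real audit would likely need more than a pairwise linear check.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical applicants: the protected attribute is withheld from the model,
# but kept aside here so that the remaining features can be audited.
protected = [1, 1, 1, 0, 0, 0]
features = {
    "neighborhood": [1, 1, 0, 0, 0, 0],
    "income": [3, 5, 4, 5, 3, 4],
}

PROXY_CUTOFF = 0.5  # arbitrary cutoff chosen for this sketch
proxies = [name for name, values in features.items()
           if abs(pearson(values, protected)) > PROXY_CUTOFF]
```

In this toy data only "neighborhood" is flagged, which mirrors the redundant-encoding problem: dropping the protected column does not remove the information it carried.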

Equal Odds

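Equal Odds seeks to achieve fairness by matching both the true positive rate and the false positive rate across all groups. In lending terms, the fraction of non-defaulters who receive the favorable prediction, and the fraction of defaulters who receive it, should each be equal across all groups.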

The objective: each group, protected or not, should have equal odds of receiving the favorable prediction.
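A minimal sketch of an Equal Odds check might compare true positive and false positive rates between groups and report the largest gap in each. The function names and toy data are hypothetical, and the code assumes every group contains at least one actual positive and one actual negative.

```python
def rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return tp / positives, fp / negatives

def equal_odds_gaps(y_true, y_pred, groups):
    """Return the largest between-group gap in TPR and in FPR."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, x in enumerate(groups) if x == g]
        per_group[g] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    tprs = [r[0] for r in per_group.values()]
    fprs = [r[1] for r in per_group.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)

# Hypothetical toy data: 1 = repaid / approved, 0 = defaulted / denied.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

tpr_gap, fpr_gap = equal_odds_gaps(y_true, y_pred, groups)
```

Equal Odds holds only when both gaps are zero, or in practice close to zero. Here the false positive rates match, but the true positive rates differ, so the criterion is not met.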

Equal Opportunity

The Equal Opportunity measure identifies whether the true positive rates of the protected and unprotected groups are equal. In other words, the groups have equal opportunities for a positive outcome. A good example is how equal opportunity is applied to FICO scores for mortgage loans: it is a partially race-blind method that reserves non-discrimination for the proportion of people likely to repay their loans. Here's the goal: people who would pay back their loan are given an equal opportunity of getting the loan in the first place. On the other hand, those in the potential default group get no guarantee of equal opportunity when applying for a loan.
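In code, Equal Opportunity reduces to a comparison restricted to applicants who actually repaid. The sketch below is illustrative only: the records are invented, and `approval_rate_among_repayers` is a hypothetical helper name.

```python
def approval_rate_among_repayers(repaid, approved):
    """True positive rate: share of actual repayers who were approved."""
    outcomes = [a for r, a in zip(repaid, approved) if r == 1]
    return sum(outcomes) / len(outcomes)

# Hypothetical per-group records of repayment and approval flags.
records = {
    "protected":   {"repaid": [1, 1, 1, 0, 0], "approved": [1, 1, 0, 0, 1]},
    "unprotected": {"repaid": [1, 1, 1, 1, 0], "approved": [1, 1, 1, 0, 0]},
}

tpr = {group: approval_rate_among_repayers(d["repaid"], d["approved"])
       for group, d in records.items()}

# Applicants who defaulted are ignored entirely, so nothing is guaranteed for them.
opportunity_gap = abs(tpr["protected"] - tpr["unprotected"])
```

A gap of zero would mean that repayers in both groups were approved at the same rate, which is the condition Equal Opportunity asks for.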

Demographic Parity

Demographic Parity presents another useful interpretation of fairness, and the definition is used extensively in the algorithmic fairness literature. Demographic parity occurs when a particular measurement, computed for each of the groups defined by a particular variable, returns the same value in all cases, or returns values that remain within a specified distance of one another. Applied more broadly, the same idea requires common measures of error and accuracy to remain uniform across all protected features, so that the quantity being compared, whether statistical parity, accuracy, false positive and negative rates, or positive and negative predictive values, is distributed equally.
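Under one common reading, the measurement being compared is simply the share of favorable predictions each group receives. The sketch below computes that share per group and checks whether the groups stay within a tolerance of one another; the tolerance of 0.1, the function names, and the toy data are assumptions made for this example.

```python
def positive_rate(predictions):
    """Share of predictions that are favorable (1)."""
    return sum(predictions) / len(predictions)

def demographic_parity_gap(y_pred, groups):
    """Return the largest gap in favorable-prediction rate, plus per-group rates."""
    by_group = {}
    for pred, group in zip(y_pred, groups):
        by_group.setdefault(group, []).append(pred)
    group_rates = {g: positive_rate(p) for g, p in by_group.items()}
    return max(group_rates.values()) - min(group_rates.values()), group_rates

# Hypothetical predictions: 1 = approved, 0 = denied.
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, group_rates = demographic_parity_gap(y_pred, groups)

TOLERANCE = 0.1  # hypothetical "specified distance" for this sketch
satisfies_parity = gap <= TOLERANCE
```

Note that this check never looks at the true outcomes, which is what distinguishes it from Equal Odds and Equal Opportunity.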

Why are fairness metrics and confusion matrices important?

Ethical AI seeks to treat all individuals and groups equally and fairly, whether in the lending industry or elsewhere. As developers and ML professionals, we can play an active part in maximizing fairness in machine learning for the greater good of society.
