As AI- and ML-driven decision-making draws transparency concerns, the need for explainability grows, especially when machine learning models are deployed in high-risk environments.

David Utassy, Data Scientist, SEON

April 28, 2022

4 Min Read

Artificial intelligence has evolved rapidly over the last few years and is being applied across industries as a powerful and innovative tool for countless use cases. However, with great power comes great responsibility. Thanks to AI and machine learning (ML), fraud prevention is now more accurate and evolving faster than ever. Real-time scoring technology allows business leaders to detect fraud instantly; however, AI- and ML-driven decision-making has also drawn transparency concerns. Moreover, the need for explainability grows when ML models operate in high-risk environments.

Explainability and interpretability are becoming more important as the number of crucial decisions made by machines increases. "Interpretability is the degree to which a human can understand the cause of a decision," said tech researcher Tim Miller. Improving the interpretability of ML models is therefore crucial and leads to well-trusted automated solutions.

Developers, consumers, and leaders alike should understand how fraud prevention decisions are made and what they mean. Any ML model with more than a handful of parameters is too complex for most people to understand. However, the explainable AI research community has repeatedly shown that, thanks to the development of interpretation tools, black-box models are no longer truly black boxes. With the help of such tools, users can understand, and ultimately trust, the ML models that make important decisions.

The SHAP of Things
SHAP (SHapley Additive exPlanations) is one of the most widely used model-agnostic explanation tools today. It computes Shapley values, a concept from coalitional game theory that fairly distributes a prediction among the features that produced it. When we fight fraud with tree ensemble methods on tabular data, SHAP's TreeExplainer algorithm lets us compute exact local explanations in polynomial time. This is a vast improvement over explanations for neural networks, where only approximations are feasible.
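
To make this concrete, here is a minimal sketch in Python using synthetic data and made-up feature names (transaction_amount, account_age_days, ip_risk_score). It illustrates the TreeExplainer workflow in general, not any specific production fraud pipeline.

```python
# Minimal sketch: exact SHAP explanations for a tree-ensemble fraud model.
# The data and feature names below are synthetic, for illustration only.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic "transactions" stand in for real fraud data.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "transaction_amount": rng.exponential(100, size=1000),
    "account_age_days": rng.integers(1, 2000, size=1000),
    "ip_risk_score": rng.random(1000),
})
y = (X["ip_risk_score"] + rng.normal(0, 0.2, 1000) > 0.8).astype(int)

model = GradientBoostingClassifier().fit(X, y)

# TreeExplainer computes exact Shapley values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global view: which features drive the model's output overall.
shap.summary_plot(shap_values, X)

# Local view: how each feature pushed the very first prediction up or down.
print("Base value (expected model output):", explainer.expected_value)
print(dict(zip(X.columns, shap_values[0])))
```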

With the term "white box," we are referring to the rule engine that calculates the fraud score. By their nature, black-box and white-box models will not give the same results: the black box scores according to what the machine learned from the data, while the white box scores according to predefined rules. We can use such discrepancies to develop both sides. For example, we can tune the rules according to the fraud rings spotted with the black-box model.
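
As a hypothetical sketch of that feedback loop, the snippet below surfaces cases where the black-box model flags fraud but the rule engine does not; the rule_score_fn callable, the threshold, and the transaction format are all assumptions for illustration, not an actual implementation.

```python
# Hypothetical sketch: find transactions the ML model flags as fraud
# but the white-box rule engine misses, so analysts can tune the rules.
def find_discrepancies(transactions, rule_score_fn, model, threshold=0.5):
    """Return (features, rule_score, model_score) for rule-engine misses."""
    flagged = []
    for features in transactions:
        rule_score = rule_score_fn(features)                 # predefined rules
        model_score = model.predict_proba([features])[0][1]  # learned fraud probability
        if model_score >= threshold and rule_score < threshold:
            flagged.append((features, rule_score, model_score))
    return flagged

# Reviewing the flagged cases (e.g., a fraud ring the model spotted)
# suggests which new rules to add to the white-box engine.
```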

Combining black-box models with SHAP lets us understand the model's global behavior and reveals the main features the model uses to detect fraudulent activity. It can also reveal undesirable bias in the model, for example, a model that discriminates against specific demographics. Global model interpretation makes it possible to detect such cases and prevent unfair predictions.
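
Continuing the earlier sketch, one simple global check is to rank features by their mean absolute SHAP value; if a sensitive attribute (a hypothetical customer_region column, say) ranked near the top, that would be a signal to investigate for bias.

```python
# Hypothetical sketch: rank features globally by mean absolute SHAP value.
# A sensitive attribute ranking highly would warrant a bias review.
import numpy as np

mean_abs_shap = np.abs(shap_values).mean(axis=0)
ranking = sorted(zip(X.columns, mean_abs_shap), key=lambda item: item[1], reverse=True)

for rank, (feature, importance) in enumerate(ranking, start=1):
    # If a column like the assumed "customer_region" appeared near rank 1,
    # that would be a red flag for unwanted demographic bias.
    print(f"{rank}. {feature}: {importance:.3f}")
```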

SHAP also helps us understand individual predictions made by the model. When debugging ML models, data scientists can examine each prediction independently and interpret it from there. The per-feature contributions give great intuition about what the model is doing, and we can act on these insights for further development. With SHAP, end users don't just see the model's most important features; they also see how, and in which direction, each feature contributes to the model's output, the fraud probability.
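
Continuing the same sketch, a local explanation for a single transaction might be reported as follows; the sign of each SHAP value gives the direction of the contribution.

```python
# Continuing the earlier sketch: report one prediction's signed contributions.
# Positive SHAP values push the output toward "fraud", negative toward "legit".
row = 0
contributions = sorted(
    zip(X.columns, shap_values[row]),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for feature, value in contributions:
    direction = "raises" if value > 0 else "lowers"
    print(f"{feature} {direction} the fraud score by {abs(value):.3f}")
```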

The Confidence Factor
Finally, SHAP helps earn customers' confidence by building trust in a well-performing model. In general, faith in a product is higher when we understand what it is doing; people don't trust things they don't understand. With the help of explanation tools, we can look inside the black box, understand it better, and start trusting it. And by understanding the model, we can keep improving it.

An alternative to gradient-boosted ML models with SHAP is the Explainable Boosting Machine (EBM), the flagship of InterpretML (Microsoft's interpretability framework) and a so-called "glass box" model. The name comes from the fact that its structure makes it interpretable by nature. According to the original documentation, "EBMs are often as accurate as state-of-the-art black box models while remaining completely interpretable. Although EBMs are often slower to train than other modern algorithms, EBMs are extremely compact and fast at prediction time." Local Interpretable Model-Agnostic Explanations (LIME) is another strong tool for black-box explainability; however, it is more commonly used with models that operate on unstructured data.
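
For comparison, here is a minimal sketch of fitting an EBM with the interpret package, assuming it is installed and reusing the synthetic X and y from the earlier sketch; the glass-box model then comes with its own global and local explanations.

```python
# Minimal sketch: a glass-box EBM trained on the same synthetic data as above.
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show

ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)

# Global explanation: the per-feature shape functions the EBM learned.
show(ebm.explain_global())

# Local explanations for a few individual predictions.
show(ebm.explain_local(X[:5], y[:5]))
```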

With these tools and transparent data points, organizations can make decisions with confidence. To get the best results, all stakeholders must know how their tools work. Understanding black-box ML and the techniques that pair with it helps organizations see how they arrive at the results that serve their business goals.

About the Author(s)

David Utassy

Data Scientist, SEON

Open-minded, determined data scientist, highly interested in machine learning and explainable AI. Machine Learning Guild Leader at the world's fastest-growing fraud prevention company.
