Fraud Analytics – Financial Industry
Analytics can be used to flag credit card transactions as fraudulent. Here we give an overview of how that is done.
First of all, know that such tools are not perfect, which means card processors and retailers still have to manually check lots of transactions. Consider this: a statistical model that is 99% or even 99.9% accurate is still not 100% accurate. This means the credit card processor has to manually check a certain number of transactions to verify whether they are fraudulent. There is a cost to doing this. So such systems weigh the cost of checking against the financial cost of fraud to establish a tolerable threshold for fraud. In other words, it is not cost-effective to build a model that has no flaws.
Below we give a brief survey of statistical techniques used for credit card fraud detection and then look briefly at one product, Falcon, that uses neural networks to identify fraudulent transactions.
When someone writes software for analytics, they usually start with ideas developed by academics. Those are the people who discover the techniques and write the algorithms that make such tools possible. Here we draw upon a paper, “Statistical Fraud Detection: A Review,” written by two academics, Richard J. Bolton and David J. Hand, to give background information on how analytics is applied to credit card fraud detection.
How much money are we talking about here? This document gives some information about that:
- In 2010, 33% of credit and debit card customers in the world reported fraud on their account in the past 5 years.
- There was $5.5 billion in credit card fraud in 2012 of which $3.56 billion was in the USA.
- There were $955 million in losses due to debit card fraud in the USA in 2010.
- There were $2 trillion in total purchases in the USA in 2012 using Amex, Discover, MasterCard and Visa cards.
Bolton and Hand explain that there are two types of analytic techniques: supervised and unsupervised.
Supervised fraud analysis means using data from known attacks. It:
- Uses samples of both fraudulent and non-fraudulent records to construct and train models
- Assigns new observations into one of the two classes (likely fraud, likely legitimate)
- Only works if this type of fraud has previously occurred
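The supervised approach can be sketched in a few lines of Python. This is only an illustration, not how any production system works: the labeled transactions are made up, the two features (amount in thousands of dollars and a card-not-present flag) are hypothetical, and the model is a small logistic classifier trained by plain gradient descent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias with stochastic gradient descent on log loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y  # gradient of log loss with respect to the linear score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def classify(x, w, b, threshold=0.5):
    """Assign a new observation to one of the two classes."""
    p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
    return "likely fraud" if p >= threshold else "likely legitimate"

# Made-up training data: [amount in $1000s, card-not-present flag]
rows = [
    [0.02, 0], [0.05, 0], [0.03, 0], [0.08, 1], [0.04, 0], [0.06, 0],  # legitimate
    [0.90, 1], [1.20, 1], [0.75, 1], [1.50, 0],                        # fraudulent
]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(rows, labels)
print(classify([1.00, 1], w, b))
print(classify([0.03, 0], w, b))
```

Note that the model can only recognize patterns that resemble the fraudulent samples it was trained on, which is the limitation described in the last bullet above.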
Unsupervised fraud analysis looks for variations in account transactions and customer data from observed norms. So it would look for outliers or other events that are statistically significant. The goal is to calculate what is called a “suspicion score.”
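As a rough illustration of a suspicion score, the sketch below measures how many standard deviations a new transaction amount lies from an account's own history. The z-score is just one simple choice assumed here for illustration, and the amounts are made up.

```python
from statistics import mean, stdev

def suspicion_scores(history, new_amounts):
    """Score each new amount by its distance from the account's norm,
    measured in standard deviations of the historical amounts."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: any different amount is an outlier.
        return [0.0 if a == mu else float("inf") for a in new_amounts]
    return [abs(a - mu) / sigma for a in new_amounts]

# Made-up purchase history for one account, in dollars
history = [23.5, 41.0, 18.2, 36.9, 29.4, 33.1, 27.8, 44.0]
new_amounts = [31.0, 950.0]

scores = suspicion_scores(history, new_amounts)
for amount, score in zip(new_amounts, scores):
    print(f"${amount:.2f}: suspicion score {score:.1f}")
```

The $31.00 purchase scores close to zero, while the $950.00 purchase scores far above any reasonable cutoff. No labeled fraud is needed, only a picture of what is normal for the account.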
Problems with the Models
Bolton and Hand explain that these systems are not 100% accurate. To explain what that means in practical terms, they give an example: suppose a credit card risk analysis system can:
- Correctly identify 99% of legit transactions as legit
- Correctly identify 99% of fraudulent transactions as fraudulent
Now suppose that in actuality 1 in 1,000, or 0.1%, of transactions are fraudulent.
Out of 1,000 transactions, then, about 999 are legit and 1 is fraud. The model will almost certainly flag the 1 fraudulent transaction, but it will also wrongly flag about 1% of the 999 legit ones, which is roughly 10 transactions. So about 11 transactions are flagged and only 1 of them, roughly 9%, is actually fraudulent. The other 10 or so are false alarms, and the only way to tell them apart is to call each customer or do some other kind of manual investigation. That takes time and costs money. Can the models do any better than that?
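The arithmetic of this example can be checked directly. The short sketch below takes the three figures from the example (99% of legit transactions identified correctly, 99% of fraud identified correctly, and a 0.1% fraud rate) and works out how many flagged transactions are real fraud. The function and variable names are ours, for illustration only.

```python
def flagged_breakdown(n, fraud_rate, fraud_hit_rate, legit_hit_rate):
    """Return (true alarms, false alarms, share of alarms that are real fraud)."""
    fraud = n * fraud_rate
    legit = n - fraud
    true_alarms = fraud * fraud_hit_rate          # fraud correctly flagged
    false_alarms = legit * (1 - legit_hit_rate)   # legit wrongly flagged
    share_real = true_alarms / (true_alarms + false_alarms)
    return true_alarms, false_alarms, share_real

true_alarms, false_alarms, share_real = flagged_breakdown(
    n=1000, fraud_rate=0.001, fraud_hit_rate=0.99, legit_hit_rate=0.99
)
print(f"true alarms:  {true_alarms:.2f}")
print(f"false alarms: {false_alarms:.2f}")
print(f"share of flagged transactions that are fraud: {share_real:.1%}")
```

Per 1,000 transactions the model raises about 11 alarms, of which about 1 is real fraud, so roughly 9% of flagged transactions are fraudulent. The fraud is so rare that even a 1% error rate on legit transactions swamps the true alarms.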
Among supervised models, a classification model can be tuned to flag only 0.1% of transactions as fraudulent, which in the example above would match the actual fraud rate exactly. But suppose that only 0.04% of transactions are actually fraudulent. Then of the 10 in 10,000 flagged by the model, at most 4 are actually fraudulent and at least 6 are legit. There is also a cost to checking all 10 of these. So a cost-weighted acceptable-loss threshold is set.
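One simple way to picture such a threshold, offered here as an assumption rather than as the method any particular processor uses, is to review a transaction only when the expected fraud loss exceeds the cost of the manual check. The $5 review cost below is a made-up figure.

```python
def should_review(p_fraud, amount, review_cost):
    """Review only when expected loss (probability x amount) exceeds the check cost."""
    return p_fraud * amount > review_cost

def break_even_probability(amount, review_cost):
    """Fraud probability above which a manual check pays for itself."""
    return min(1.0, review_cost / amount)

REVIEW_COST = 5.00  # hypothetical cost of one manual check, in dollars

for amount in (40.00, 500.00):
    decision = should_review(p_fraud=0.09, amount=amount, review_cost=REVIEW_COST)
    cutoff = break_even_probability(amount, REVIEW_COST)
    print(f"${amount:.2f}: review={decision}, break-even probability={cutoff:.3f}")
```

With a 9% chance of fraud, a $40 purchase is not worth a $5 check, while a $500 purchase clearly is. Under this rule, some small fraudulent transactions are knowingly let through because catching them costs more than losing them.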
Supervised fraud detection tools
Here are some of the supervised fraud classification techniques. We just cite those here and do not go into detail about how they work. You are encouraged to do further investigation into those if you want to dig deeper.
- linear discriminant analysis
- logistic discrimination
- neural networks
And then there are the rule-based techniques, which flag transactions using explicit if-then rules.
Link analysis is another technique. It uses the techniques of mining social networks (or any other kind of graph, i.e., vertices and edges) to sort out, for example, whether someone linked to someone else is phoning in credit card transactions using the same fraud technique.
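A minimal sketch of the idea, with made-up data: cards and the phone numbers or devices they were used with are the vertices of a graph, and an edge means "this card was used with this phone or device." Starting from one card known to be fraudulent, a breadth-first search finds every other card connected to it. The identifiers and the choice of breadth-first search are assumptions for illustration.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Build an undirected graph as an adjacency map."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def linked_to(graph, start):
    """Return every vertex reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {start}

# Made-up links between cards and the phones/devices used with them
edges = [
    ("card:A", "phone:555-0101"),
    ("card:B", "phone:555-0101"),
    ("card:B", "device:x1"),
    ("card:C", "device:x1"),
    ("card:D", "phone:555-0199"),
]

graph = build_graph(edges)
suspects = sorted(n for n in linked_to(graph, "card:A") if n.startswith("card:"))
print(suspects)
```

If card A is known to be fraudulent, cards B and C come back as linked to it through a shared phone and a shared device, while card D does not.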
Unsupervised fraud detection tools
Unsupervised fraud detection tools are used when there are no prior legitimate or fraudulent observations available upon which to make decisions.
In this case statistics are used to profile transactions and detect outliers. Some techniques used here are approaches similar to those used in text analysis.
One technique used to detect fraud is an application of the rather esoteric and not-at-all-intuitive Benford’s law. That law says that the leading digits of numbers in many naturally occurring data sets, including financial transactions, occur at certain known frequencies. For example, you would think that the digit 1 would be the first digit of a dollar amount about 1 time in 9, since there are 9 possible leading digits. But on average it occurs about 30% of the time. (You would have to read on your own to try and understand that.) So if a batch of transactions varies markedly from this pattern, it is suspicious and deserves a closer look.
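Benford’s law gives the expected share of leading digit d as log10(1 + 1/d), which is about 30.1% for the digit 1 and about 4.6% for the digit 9. The sketch below compares a batch of amounts against those shares using a chi-square statistic. The two batches are made up: powers of 2 stand in for genuine amounts because they are known to follow Benford’s law closely, and the second batch has every leading digit equally often. The 5% cutoff is our assumption, not a rule from the paper.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at the 5% level
CHI2_CRITICAL_DF8_5PCT = 15.507

def leading_digit(amount):
    """First nonzero digit of an amount, working in whole cents."""
    digits = str(round(abs(amount) * 100)).lstrip("0")
    return int(digits[0]) if digits else None

def benford_chi_square(amounts):
    """Chi-square distance between observed leading digits and Benford's law."""
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    counts = Counter(digits)
    n = len(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        chi2 += (counts[d] - expected) ** 2 / expected
    return chi2

benford_like = [2 ** k for k in range(100)]               # stand-in for genuine amounts
uniform_like = [d * 100 + 50 for d in range(1, 10)] * 11  # every leading digit equally often

for name, batch in (("benford_like", benford_like), ("uniform_like", uniform_like)):
    chi2 = benford_chi_square(batch)
    flag = "suspicious" if chi2 > CHI2_CRITICAL_DF8_5PCT else "consistent with Benford"
    print(f"{name}: chi-square {chi2:.2f} -> {flag}")
```

A large statistic only says that the batch does not look like typical financial data. It is a reason to investigate, not proof of fraud.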
Anyone in the USA who has obtained a mortgage or other loan has come to hate the organization FICO. They assign a risk-based number to loan applicants called a credit score. Since they are in the business of detecting risk, FICO has also acquired Falcon software, which uses neural networks to detect fraud.
Here is a graphic from FICO giving a view of the Falcon analytics platform.
You can tell by reading their product literature that this is a supervised learning classification system using neural networks. Since it plugs into a merchant’s POS cashier terminals, it can be used to detect fraudulent transactions right in the store. But I am not sure what a sales clerk is supposed to do when someone standing at the register is flagged as a criminal. Anyway, such a system could also be plugged into the merchant’s ecommerce web system.
The product literature also says that FICO is keeping credit profiles on cardholders. We already knew about that. That helps them do classification. They also say that their software includes adaptive analytics, which means it responds to up-to-the-minute fraudulent activity to update the model. This, they say, improves the model by 10% as it learns in real time.
The FICO system can be deployed as a cloud solution. Or they provide their APIs and framework so that a company can build their own fraud detection into their own platform.
So that is a basic overview of how analytics is used to detect credit card fraud. As you can see, these techniques would have applications to assessing any kind of financial risk.