This article is taken from our eBook, Understanding machine learning in fraud prevention.
Before we discuss why machine learning (ML) is an essential component of your arsenal against zero-day fraud, we will first discount non-ML fraud mitigation in the context of zero-day ad fraud.
Blacklists are an important part of any ad fraud defence because they quickly identify sources that categorically don’t have human traffic, such as servers. They are a very basic first step that removes the lowest-hanging fruit, but fraudsters can easily circumvent them by changing the IP addresses of their traffic. And because a blacklist blocks whole IPs that can also carry human traffic rather than isolating the fraud itself, it can actually result in high volumes of false positives.
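To make the weakness concrete, here is a minimal sketch of an IP blacklist check. The network ranges are hypothetical (they are address blocks reserved for documentation), and `is_blacklisted` is an illustrative helper rather than any vendor's implementation.

```python
import ipaddress

# Hypothetical blacklist of server ranges (documentation-only address blocks).
BLACKLISTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

def is_blacklisted(ip: str) -> bool:
    """Return True if the address falls inside any blacklisted network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in BLACKLISTED_NETWORKS)

# The same fraudulent source before and after rotating its IP address.
print(is_blacklisted("192.0.2.15"))   # True: blocked
print(is_blacklisted("203.0.113.7"))  # False: same fraudster, new address
```

The check is fast and cheap, which is why it makes a good first step, but it only knows about addresses that someone has already listed.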
Rule-based detection and mitigation
Rule-based mitigation involves identifying characteristics and thresholds that, when exceeded, block traffic or a traffic source. It works well when you know the characteristics that define a particular fraud tactic, such as an impossibly short click-to-install time. But it is near impossible to formulate rules for fraud tactics that you have never encountered before.
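As an illustration, a click-to-install-time rule might look something like the sketch below. The 10-second threshold, the `Install` record and the function name are all assumptions made for the example, not values taken from a real system.

```python
from dataclasses import dataclass

# Assumed threshold: installs completing faster than this are treated as impossible.
MIN_CLICK_TO_INSTALL_SECONDS = 10.0

@dataclass
class Install:
    click_ts: float    # Unix time of the ad click
    install_ts: float  # Unix time of the first app open

def violates_ctit_rule(install: Install) -> bool:
    """Flag installs whose click-to-install time is below the threshold."""
    return (install.install_ts - install.click_ts) < MIN_CLICK_TO_INSTALL_SECONDS

print(violates_ctit_rule(Install(click_ts=1000.0, install_ts=1003.0)))  # True
print(violates_ctit_rule(Install(click_ts=1000.0, install_ts=1090.0)))  # False
```

The rule is only as good as the threshold someone chose for it, and it says nothing about a tactic that produces plausible install times.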
For a rule to be created, a new fraud type must first be observed at scale, which means it has been eating into your budget for some time before it is even recognised. Analysts then need to find the characteristics that confidently separate it from valid traffic and distil them into rules. This process takes time. If you rely solely on rules, you are exposed, and fraud is taking your ad spend until that rule exists to stop it. Rules are reactive: a new fraud type exists, then a rule is created. It is always fraud first, then rule.
To stop new fraud tactics, or zero-day fraud, and stop the flow of money to fraudsters, a more proactive approach is required. That is where machine learning comes in.
What is machine learning?
Machine learning is a subset of artificial intelligence that extracts patterns and relationships from data and expresses them as a formula that can be applied to new data sets. Over time, as the data changes, new patterns are learned by the model without the need to explicitly program them.
What makes machine learning suited to combating new fraud types?
Because of the scale of data processed, insights can be more valuable and derived much faster using machine learning than by using human analysis alone.
- More thorough analysis
Multi-dimensionality allows for a much deeper understanding of data, enabling machine learning to detect fraud in traffic that might initially appear valid to an analyst. Humans are normally limited to analysing data in between 2 and 4 dimensions, where data can be visualised and patterns easily recognised. Selecting which dimensions to explore and then running manual analysis is a time-consuming process. Machine learning explores all the potential dimensions of a data set to decipher relationships that are not evident in low-dimensional analysis.
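A small worked example shows why looking at dimensions together matters. The sketch below uses invented data with just two features, app size and click-to-install time, and compares per-feature z-scores with the Mahalanobis distance, which accounts for how the features move together. Both the data and the choice of Mahalanobis distance are assumptions made for illustration; a production model would use many more dimensions and a different technique.

```python
import math
from statistics import mean

# Hypothetical valid installs: (app size in MB, click-to-install time in seconds).
VALID = [(20, 30), (40, 52), (60, 68), (80, 90),
         (100, 110), (120, 128), (140, 152), (160, 170)]

xs = [x for x, _ in VALID]
ys = [y for _, y in VALID]
mx, my = mean(xs), mean(ys)
n = len(VALID)

# Sample variances and covariance of the two features.
sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
syy = sum((y - my) ** 2 for y in ys) / (n - 1)
sxy = sum((x - mx) * (y - my) for x, y in VALID) / (n - 1)

def z_scores(point):
    """Distance from the mean in each dimension separately."""
    x, y = point
    return (x - mx) / math.sqrt(sxx), (y - my) / math.sqrt(syy)

def mahalanobis(point):
    """Distance from the mean, taking the correlation between features into account."""
    dx, dy = point[0] - mx, point[1] - my
    det = sxx * syy - sxy ** 2
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

suspect = (140, 50)  # a large app installed unusually fast
print(z_scores(suspect))     # each feature is roughly one standard deviation from the mean
print(mahalanobis(suspect))  # very large: the combination is unlike any valid install
```

Viewed one feature at a time, the suspect install looks ordinary. Only when both features are considered together does it stand out.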
- Speed of analysis
Machine learning (supported by the right data architecture) can make predictions quickly, enabling fraud prevention to run in near real time and at scale. Human analysis is just too slow. In order to stop fraud before fraudsters get paid, fraud needs to be mitigated in real time. Analysis also needs to scale as traffic fluctuates, and you can’t just keep employing an endless string of analysts.
Machine learning is contextual: every validation is based on the circumstances and characteristics of that exact transaction. Where rules are static and based on a generalisation of normal behaviour, machine learning is much more sympathetic to the conditions of each traffic transaction.
Continuous training, verification and self-learning ensure that machine learning models evolve in line with changes in the norms associated with valid traffic.
Human behaviour changes all the time and can vary greatly between demographic or geographic audiences. For fraud prevention to be effective, it needs to adapt to these changes and audience nuances, ensuring legitimate traffic isn’t removed in the fraud mitigation process.
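One simple way to picture a baseline that moves with valid traffic is a running mean and variance updated one observation at a time. The sketch below uses Welford's online algorithm on a single feature. It is a stand-in chosen for brevity, not a description of how any particular fraud prevention product retrains its models, and the `RunningBaseline` class and sample values are invented for the example.

```python
import math

class RunningBaseline:
    """Running mean and variance of one feature, updated one observation at a time."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        return math.sqrt(self._m2 / (self.count - 1)) if self.count > 1 else 0.0

baseline = RunningBaseline()
for seconds in [40, 45, 50, 55, 60]:  # earlier valid click-to-install times
    baseline.update(seconds)
print(baseline.mean)                  # 50.0

for seconds in [70, 75, 80, 85, 90]:  # valid behaviour shifts over time
    baseline.update(seconds)
print(baseline.mean)                  # approximately 65
```

A fixed threshold written for the first batch would start misjudging the second, whereas the baseline follows the change without anyone rewriting it.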
Proactive by nature, machine learning that is effectively trained to understand valid traffic will identify invalid traffic regardless of the tactic employed (known or unknown).
With an accurate picture of what legitimate traffic looks like, it is much easier to identify invalid traffic. In order to future-proof fraud mitigation, we need to stop reacting to fraud tactics and start only permitting valid traffic.
Juniper Research forecasts that by 2022, machine learning could save advertisers over $10 billion a year in ad spend that would have been wasted on fraud.
“ML tools, such as those used by TrafficGuard, will enable the fight against ad fraud to move from detection to proactive mitigation in real-time.”
– Juniper Research
Fraud is a constantly moving target. Sophisticated ad fraud doesn’t look like bots. On the surface, it looks like human engagement. Fraudsters adapt their processes to circumvent rules-based fraud detection and their profitability hangs on their ability to evolve.
Instead of reacting with new rules each time fraud evolves, machine learning can be part of a proactive defence that is tactic-agnostic, more accurate and able to stop fraud before the fraudster gets paid.