A detailed look at Pocket Media’s advertising fraud-detection strategy. In this entry, the focus is on click fraud, the second of the three types of fraud most common in mobile advertising.
Welcome to the second of three blogs about mobile ad fraud, a practice that costs advertisers more than seven billion dollars annually. In this entry, I focus on click fraud, the second of the three types of fraud most common in mobile advertising. The three types are:
● Install fraud: fake click, fake user
● Click fraud: fake click, genuine user
● Compliance fraud: genuine click, genuine user
If you missed the first entry, you can see it here. In it, I discuss spotting and avoiding install fraud.
So, what’s mobile click fraud?
In theory, before an install occurs, a user needs to click an ad. But click fraudsters simulate clicks on real devices without the ad ever displaying. These clicks can be generated by hidden malware, or even bots. Easy, right?
Some important industry terms:
• Click injection: a technique mostly used on Android devices, in which a malicious app takes advantage of Android’s broadcast ability to notice that a new app is being installed, then claims credit for the install with a fake click before the new app is first opened. See here to find out more. iOS devices are harder to abuse in this manner.
• Ad-stacking or pixel stuffing: a simpler method. Ads are crammed into stacked layers or into tiny, pixel-sized frames, so that a single click is counted for several ads at once, most of which the user never saw.
• Impressions labeled as clicks: here the ad has been loaded, perhaps even shown to the user, but the impression is registered as a click.
• Click stuffing and click spamming: this happens when a user goes to a website that generates loads of clicks and places a cookie on their device. When that user goes to an app store later, attribution is claimed by that cookie. More in this video.
How to identify click fraud?
There are three main metrics we can use to identify click fraud.
1. Time to install (TTI)
As mentioned in my first post, TTI depends on:
• the internet speed of the user
• the size of the app to install
These two data points allow you to calculate an average TTI, and find outliers, such as how many installs occurred within 10 seconds after the click. Plugging your data into a dashboard allows you to spot these more quickly, as seen below:
Fig. 1 Example of Time to Install for ‘good’ traffic (top) and ‘bad’ (bottom).
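To make the outlier check concrete, here is a minimal sketch that computes TTI per install and, for each source, the median TTI and the share of installs that happened within 10 seconds of the click. The record layout, the publisher names and the sample rows are assumptions made up for illustration; real data would come from your own attribution logs.

```python
from datetime import datetime
from statistics import median

FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical rows: (source, click_time, install_time).
installs = [
    ("pub_a", "2024-01-01 10:00:00", "2024-01-01 10:01:30"),
    ("pub_a", "2024-01-01 11:00:00", "2024-01-01 11:02:30"),
    ("pub_a", "2024-01-01 12:00:00", "2024-01-01 12:00:45"),
    ("pub_a", "2024-01-01 13:00:00", "2024-01-01 13:05:00"),
    ("pub_b", "2024-01-01 10:00:00", "2024-01-01 10:00:03"),
    ("pub_b", "2024-01-01 10:00:10", "2024-01-01 10:00:15"),
    ("pub_b", "2024-01-01 10:00:20", "2024-01-01 10:00:24"),
    ("pub_b", "2024-01-01 10:05:00", "2024-01-01 10:08:20"),
]


def tti_seconds(click_time, install_time):
    """Time to install: seconds between the click and the install."""
    click = datetime.strptime(click_time, FMT)
    install = datetime.strptime(install_time, FMT)
    return (install - click).total_seconds()


def tti_report(records, fast_threshold=10):
    """Per source: median TTI and share of installs within `fast_threshold` seconds."""
    by_source = {}
    for source, click_time, install_time in records:
        by_source.setdefault(source, []).append(tti_seconds(click_time, install_time))
    return {
        source: {
            "median_tti": median(ttis),
            "fast_share": sum(t <= fast_threshold for t in ttis) / len(ttis),
        }
        for source, ttis in by_source.items()
    }


report = tti_report(installs)
for source, stats in sorted(report.items()):
    print(source, stats)
```

In this made-up sample, three of the four installs from pub_b land within 10 seconds of the click, which is the kind of outlier a dashboard would surface. The 10-second cut-off is taken from the text above; what counts as "too fast" for your own apps depends on app size and connection speed.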
But don’t forget to look at the time dimension itself! The local timestamp of a click should make sense: a burst of clicks at 3 a.m. local time, for example, is unlikely to come from real users. Convert click times to the time zone of each campaign’s target market, so you can run this check for every campaign.
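As a rough sketch of the time zone calculation, the snippet below converts UTC click timestamps to a campaign's local time and reports the share of clicks that fall in the small hours. The campaign names, their UTC offsets and the 01:00 to 05:00 "night" window are all assumptions for illustration. Fixed offsets ignore daylight saving time, so a production version would more likely use named time zones (for example via Python's `zoneinfo` module).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping of campaign -> UTC offset (hours) of its target market.
CAMPAIGN_UTC_OFFSET = {"id_game": 7, "br_shop": -3}

# Hypothetical click log: (campaign, click time in UTC, ISO format).
clicks = [
    ("id_game", "2024-01-01T19:30:00"),
    ("id_game", "2024-01-01T20:10:00"),
    ("id_game", "2024-01-01T05:00:00"),
    ("br_shop", "2024-01-01T15:00:00"),
    ("br_shop", "2024-01-01T23:00:00"),
]


def local_hour(utc_iso, campaign):
    """Hour of day (0-23) of a UTC click time in the campaign's local time."""
    utc_time = datetime.fromisoformat(utc_iso).replace(tzinfo=timezone.utc)
    local_tz = timezone(timedelta(hours=CAMPAIGN_UTC_OFFSET[campaign]))
    return utc_time.astimezone(local_tz).hour


def night_share(click_log, start=1, end=5):
    """Per campaign: share of clicks whose local hour is in [start, end)."""
    totals, night = {}, {}
    for campaign, utc_iso in click_log:
        totals[campaign] = totals.get(campaign, 0) + 1
        if start <= local_hour(utc_iso, campaign) < end:
            night[campaign] = night.get(campaign, 0) + 1
    return {c: night.get(c, 0) / totals[c] for c in totals}


shares = night_share(clicks)
for campaign, share in sorted(shares.items()):
    print(f"{campaign}: {share:.0%} of clicks between 01:00 and 05:00 local time")
```

A high night-time share is not proof of fraud on its own, but it is a cheap signal to compare across sources for the same campaign.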
2. Conversion Rate (CR) or Clicks to Install (CTI)
While TTI is a key metric for differentiating good traffic from bad, the data becomes much more useful when combined with conversion rates or Clicks to Install. Understanding how many installs occur within a timeframe can help us identify how many are genuine. Below is an example of how TTI and CR can be combined:
Fig. 2 Conversion rates organized into Time to Install ‘buckets’. Highlighted: 1) high percentage of installs after 1 hour, 2) too low CR, 3) low CR and high percentage of installs within 10 seconds
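The combination of CR and TTI buckets can be sketched as follows. The bucket edges (10 seconds and 1 hour) follow the figure caption; the thresholds used to raise a flag, the publisher names and the sample numbers are hypothetical and would need tuning against your own traffic.

```python
BUCKETS = ("0-10s", "10s-1h", ">1h")


def bucket(tti):
    """Assign a TTI in seconds to one of three buckets."""
    if tti <= 10:
        return "0-10s"
    if tti <= 3600:
        return "10s-1h"
    return ">1h"


def summarize(click_count, ttis):
    """Conversion rate plus the share of installs in each TTI bucket."""
    counts = {b: 0 for b in BUCKETS}
    for tti in ttis:
        counts[bucket(tti)] += 1
    installs = len(ttis)
    return {
        "cr": installs / click_count,
        "shares": {b: counts[b] / installs for b in BUCKETS},
    }


def flags(summary, min_cr=0.005, max_fast=0.2, max_late=0.5):
    """Hypothetical thresholds: tune them per app and per market."""
    found = []
    if summary["cr"] < min_cr:
        found.append("low CR")
    if summary["shares"]["0-10s"] > max_fast:
        found.append("many installs within 10s")
    if summary["shares"][">1h"] > max_late:
        found.append("many installs after 1h")
    return found


# Hypothetical data: source -> (number of clicks, TTI in seconds for each install).
sources = {
    "pub_a": (1000, [40, 95, 120, 300, 20, 600, 75, 180, 50, 4000]),
    "pub_b": (50000, [3, 5, 2, 8, 400]),
    "pub_c": (20000, [5000, 7200, 9000, 60]),
}

results = {name: flags(summarize(n, ttis)) for name, (n, ttis) in sources.items()}
for name, found in results.items():
    print(name, found or "looks fine")
```

Looking at both numbers together is the point: a low CR alone can simply mean a weak ad, while a low CR combined with a pile of installs in the first 10 seconds or after the first hour matches the highlighted patterns in the figure.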
3. Counting IP + User Agent
IP addresses can be spoofed to hide their source, but we don’t really care if they are. The combination of the IP and the device’s User Agent serves as a better fingerprint. So, when fraudsters use click spamming, stuffing or ad-stacking, it shows in the raw click logs as the same fingerprint clicking far more often than a real user would. However, this does require some heavy lifting: database work.
The graph below illustrates the number of clicks by a publisher from an IP address on a given day. The colors represent the affiliate/publisher.
Fig. 3 Number of clicks per IP address by affiliate for one day
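The counting itself is simple; the heavy lifting is in the volume of raw click logs. As a small in-memory sketch, the snippet below counts clicks per publisher and IP + User Agent fingerprint for one day and lists the fingerprints above a threshold. The log layout, the sample rows and the threshold of 3 clicks are assumptions for illustration; on real volumes the same grouping would normally be done in a database.

```python
from collections import Counter

# Hypothetical raw click log for a single day: (publisher, ip, user_agent).
click_log = [
    ("pub_x", "10.1.2.3", "Android 7.0; SM-G930F"),
    ("pub_x", "10.1.2.3", "Android 7.0; SM-G930F"),
    ("pub_x", "10.1.2.3", "Android 7.0; SM-G930F"),
    ("pub_x", "10.1.2.3", "Android 7.0; SM-G930F"),
    ("pub_x", "10.1.2.3", "Android 6.0; Nexus 5"),
    ("pub_y", "10.9.9.9", "iPhone; iOS 10.3"),
    ("pub_y", "10.8.8.8", "iPhone; iOS 10.3"),
]


def clicks_per_fingerprint(log):
    """Count clicks for each (publisher, ip, user_agent) combination."""
    return Counter(log)


def suspicious(log, max_clicks=3):
    """Fingerprints with more than `max_clicks` clicks (hypothetical threshold)."""
    counts = clicks_per_fingerprint(log)
    return {key: n for key, n in counts.items() if n > max_clicks}


flagged = suspicious(click_log)
for (publisher, ip, user_agent), n in flagged.items():
    print(f"{publisher}: {n} clicks from {ip} / {user_agent}")
```

Note that the second User Agent on the same IP is counted separately, which is why the combination is a more useful fingerprint than the IP alone: several real devices can share one IP address.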
You may notice that the same IP address was used for different campaigns within the same timeframe (here a few minutes). This is not something a real human could do. A more advanced fraudster could use a proxy to diversify the IP data, so in order to capture IPs coming from a range, we split up the IP addresses and analyze the clicks in the same way we do to detect install fraud. For ad-stacking or pixel stuffing (and some other types of fraud), we’ll see clicks from the same source with more than one campaign at the same time.
Fig. 4 Clicks by IP for one source/publisher at the same time, with campaigns in colors.
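Both checks can be sketched in a few lines. The first function groups clicks by IP range; treating "splitting up the IP address" as grouping on the /24 prefix is my assumption here, not necessarily the exact split used for the graphs above. The second function looks for the same source and IP clicking on more than one campaign within a short window. The log layout, the sample rows and the 60-second window are hypothetical.

```python
from collections import Counter, defaultdict
import ipaddress

# Hypothetical click log: (publisher, ip, campaign, unix timestamp in seconds).
click_log = [
    ("pub_x", "10.1.2.3", "camp_1", 1000),
    ("pub_x", "10.1.2.3", "camp_2", 1002),
    ("pub_x", "10.1.2.3", "camp_3", 1005),
    ("pub_x", "10.1.2.77", "camp_1", 1010),
    ("pub_y", "10.9.9.9", "camp_1", 1000),
    ("pub_y", "10.9.9.9", "camp_1", 5000),
]


def subnet(ip, prefix=24):
    """Network an IP belongs to, e.g. 10.1.2.3 -> 10.1.2.0/24."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def clicks_per_subnet(log):
    """Count clicks per (publisher, subnet) to catch IPs rotated within a range."""
    return Counter((pub, subnet(ip)) for pub, ip, _campaign, _ts in log)


def concurrent_campaigns(log, window=60):
    """(publisher, ip) pairs that clicked on more than one campaign within `window` seconds."""
    grouped = defaultdict(list)
    for pub, ip, campaign, ts in log:
        grouped[(pub, ip)].append((ts, campaign))
    flagged = {}
    for key, events in grouped.items():
        events.sort()
        for i, (start, _campaign) in enumerate(events):
            # Campaigns clicked from this event up to `window` seconds later.
            nearby = {c for t, c in events[i:] if t - start <= window}
            if len(nearby) > 1:
                flagged[key] = sorted(nearby)
                break
    return flagged


subnet_counts = clicks_per_subnet(click_log)
stacked = concurrent_campaigns(click_log)
print(subnet_counts)
print(stacked)
```

In the sample, pub_x sends three different campaigns from one IP within five seconds, which is the ad-stacking pattern described above, while pub_y's two clicks on the same campaign more than an hour apart are left alone.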
Instead of waiting for the installs, you can flag fraud by checking it at the click level, but you need some bigger machines working for you. Here’s a summary of how to identify click fraud:
- Use the local timestamp of the click
- Compare distributions of TTI by source/publisher
- Check low CR (click spamming, ad-stacking, etc.)
- Combine CR and TTI
- Count clicks per IP + User Agent in the raw click logs
As an industry, it is our responsibility to tackle fraud. By using your business knowledge, logic, and a few fancy tools, together we can make it happen. Stay tuned for the next entry, in which I will cover compliance fraud and how to tackle it.
About the writer
Ignas van den Einde is a business intelligence analyst and fraud specialist for Pocket Media. He’s passionate about creating value through innovation and efficiency, with a creative, data-driven and realistic perspective. In his day-to-day practice, he answers questions about business continuity. He also loves indoor soccer and golf, and is fanatical about squash.