How to: Four ways to identify Mobile Install Fraud
Would you allow your marketing budget to be drained by companies that simply pretend to use your apps? Of course not.
Fraud, and install fraud in particular, is a hot topic in the mobile ad industry. Why? Because it’s a widespread issue, and mobile advertising is a booming $100 billion-plus industry. The value of filling ad slots seen by mobile users is only growing. Wouldn’t your advertising dollars be better spent on attracting real users than handed out to companies that had no actual share in acquiring them? In a series of posts, I will outline how to identify different types of fraud:
- Install fraud: fake click, fake user
- Click fraud: fake click, genuine user
- Compliance fraud: genuine click, genuine user
In this first post, I’ll dive into what install fraud is. Then I’ll discuss how to identify it.
What is install fraud?
The install is where the money is, so companies tend to minimize install fraud first. Install fraud is a tactic used to impersonate human behavior. It is designed to quickly scale a user base and drive up costs for the marketer. These ‘users’ are not genuinely interested in the app. They’re either bots or real humans with malicious intent.
From a technological perspective, the volume of data makes installs easier to analyze than clicks. With a tool like Excel, you can still analyze the installs (a worksheet holds up to 1,048,576 rows, roughly 1 million), whereas the clicks easily reach 10 million or more per day. To analyze clicks, you will need to invest in other BI tools. And when applying algorithms to automate fraud alerts (machine learning), you need to invest in the knowledge and tools to gain control of your traffic. So let’s dig a little deeper.
How to identify fake installs?
When analyzing the data, I always start by asking myself how a real human would respond to this ad. Not all users click the ad, then download the app and keep using it day in, day out. Still, looking at standard/average behavior makes sense. Ask yourself: how many times have I clicked on an ad I was interested in but never installed and used the app? The answer will give you a baseline conversion rate (CR), or click-to-install (CTI) rate: how many clicks does it take on average to produce one install? Some companies have already pointed out two types of fake installs:
- App install farms: operations that simulate real traffic to an app without any intention of using the app for its purpose. Here’s an example of what one looks like.
- Bots and botnets: devices infected with malicious software that installs apps and performs in-app activity.
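Returning to the baseline conversion rate mentioned above, here is a minimal sketch of how a click-to-install rate per affiliate could be compared with the overall baseline. The affiliate names and the click and install counts are made up for illustration; they are not taken from our data.

```python
def click_to_install_rate(clicks, installs):
    """Installs divided by clicks; returns 0.0 when there are no clicks."""
    if clicks == 0:
        return 0.0
    return installs / clicks

# Hypothetical traffic per affiliate: (clicks, installs).
traffic = {
    "aff_1": (1000, 20),
    "aff_2": (1000, 300),
}

rates = {aff: click_to_install_rate(c, i) for aff, (c, i) in traffic.items()}

# The baseline is the rate over all traffic combined.
total_clicks = sum(c for c, _ in traffic.values())
total_installs = sum(i for _, i in traffic.values())
baseline = click_to_install_rate(total_clicks, total_installs)

for aff, rate in rates.items():
    print(f"{aff}: CTI rate {rate:.1%} (baseline {baseline:.1%})")
```

An affiliate whose rate sits far above or below the baseline is not proof of fraud, but it is a reasonable place to start looking.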
Four ways to identify install fraud
1. IP address
IP addresses can be spoofed to hide where the traffic is actually coming from. To see if an IP address is blacklisted or known in other databases, we run it through external software that in turn gives us more insight into the IP address used. Counting the installs per unique IP address can show us whether the same IP address was used for other campaigns. How probable is it that a real human clicks on several offers on the same day, from the same IP address, all within a few minutes? Not so probable.
The graph below illustrates the number of installs per IP address, by affiliate, on a given day. The colors represent the affiliate/publisher.
Fig 1. Number of installs per IP address by affiliate for one day
When checking the data behind the graph, we see that the same IP address was used for different campaigns within the same time window (a few minutes). Definitely not something a real human could achieve.
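A check like this can be sketched in a few lines. The example below is an illustration under assumptions, not our production logic: the log layout, the 5-minute window and the threshold of three campaigns are all hypothetical choices, and the IP addresses come from documentation ranges.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical install log: (ip, campaign, affiliate, install time).
installs = [
    ("203.0.113.7", "campaign_a", "aff_1", datetime(2024, 3, 1, 10, 0, 5)),
    ("203.0.113.7", "campaign_b", "aff_1", datetime(2024, 3, 1, 10, 1, 40)),
    ("203.0.113.7", "campaign_c", "aff_1", datetime(2024, 3, 1, 10, 3, 10)),
    ("198.51.100.20", "campaign_a", "aff_2", datetime(2024, 3, 1, 11, 0, 0)),
    ("198.51.100.20", "campaign_b", "aff_2", datetime(2024, 3, 1, 18, 30, 0)),
]

def suspicious_ips(rows, window=timedelta(minutes=5), min_campaigns=3):
    """Map each flagged IP to the most distinct campaigns seen inside one window."""
    by_ip = defaultdict(list)
    for ip, campaign, _affiliate, ts in rows:
        by_ip[ip].append((ts, campaign))

    flagged = {}
    for ip, events in by_ip.items():
        events.sort()
        start = 0
        for end in range(len(events)):
            # Shrink the window from the left until it spans at most `window`.
            while events[end][0] - events[start][0] > window:
                start += 1
            campaigns = {c for _, c in events[start:end + 1]}
            if len(campaigns) >= min_campaigns:
                flagged[ip] = max(flagged.get(ip, 0), len(campaigns))
    return flagged

print(suspicious_ips(installs))
```

Here only the first address is flagged: it installs three different campaigns in about three minutes, while the second address installs two campaigns many hours apart.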
As already implied, fraudsters are innovative. A more advanced system wouldn’t use the same IP address but would use a proxy to create diversity in the data.
In order to capture IPs coming from a range, we split the IP address into its four parts and do another count on the first three parts.
A report for this IP range is detailed in the graph below.
Fig 2. Installs per IP range by affiliate within 1 minute
While the number of installs per IP range might not be alarming, the number of splits might be. We also should not forget that the number of unique IP addresses depends on the country’s policy and on how carriers handle them. The combination of both can lead us to detect proxy-like traffic.
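The count on the first three parts of the address can be sketched as follows. This is a simplified illustration that assumes IPv4 addresses only; the row layout, the one-minute bucket and the sample addresses are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def ip_range(ip):
    """Return the first three parts of an IPv4 address, e.g. '203.0.113'."""
    return ".".join(ip.split(".")[:3])

def installs_per_range(rows):
    """Map (affiliate, ip range, minute) to (install count, distinct address count)."""
    addresses = defaultdict(list)
    for ip, affiliate, ts in rows:
        minute = ts.replace(second=0, microsecond=0)
        addresses[(affiliate, ip_range(ip), minute)].append(ip)
    return {key: (len(ips), len(set(ips))) for key, ips in addresses.items()}

# Hypothetical install log: (ip, affiliate, install time).
rows = [
    ("203.0.113.5", "aff_1", datetime(2024, 3, 1, 10, 0, 5)),
    ("203.0.113.9", "aff_1", datetime(2024, 3, 1, 10, 0, 20)),
    ("203.0.113.77", "aff_1", datetime(2024, 3, 1, 10, 0, 48)),
    ("198.51.100.4", "aff_2", datetime(2024, 3, 1, 10, 0, 30)),
]

for key, (count, distinct) in installs_per_range(rows).items():
    print(key, "installs:", count, "distinct addresses:", distinct)
```

Several distinct addresses from one range, for one affiliate, within the same minute is the proxy-like pattern to look for, keeping in mind that carriers in some countries legitimately share a small pool of addresses.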
Although I have seen some companies using the user agent (explained here) in combination with the IP, I couldn’t use this as a major key to identify suspicious traffic. It’s clear that it should provide a (better) fingerprint, but our data doesn’t back this up (or is mostly free from this fraudulent traffic). If you have discovered fraud using this technique, don’t leave it out of your analysis.
The uniqueness of the user agent is becoming less important because of device and software standardisation. As Apple devices are less varied, the same IP and user agent combination is more likely to recur in genuine traffic. Some good information about that can be found here.
2. Device characteristics
From a device perspective, we can tell a lot about the traffic. Just looking at the share of the device operating system and model (or even version) generates useful insights. We would be able to discover, for instance, whether some networks were using a device model called AndyWin Emulator. Such a device model clearly intends to fake a device and install an app from a desktop.
When installs are faked on Android, the fraudsters try to obscure them by using different device models, so no single device model has a leading share. When I had another look at the OS version, it also seemed to be spoofed per device model. In the market, we do notice a trend of users staying on older versions, so compare this to the stats per country to double-check. For Apple this is harder to track, because all devices are stored as iPhone or iPad.
Here’s a graph that shows an unusual spread by device model (on an Android campaign). All three campaigns were already flagged for an abnormally long Time to Install.
Fig 3. Installs in % of Total by Device Model for three different affiliates running different campaigns
Compare these shares of device model with what bots built to look unique would produce. We see fraud happening on Apple, too. I hope Apple is aware and is trying to close its OS to bots and the like. But we cannot assume Apple devices are free from fraud.
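The device-model shares discussed in this section can be computed per affiliate with a simple count. The sketch below is illustrative only: the affiliate names and model names are invented (apart from the emulator name mentioned above), and matching on the word "emulator" is a hypothetical heuristic, not a complete emulator detector.

```python
from collections import Counter

# Hypothetical keyword list for spotting emulator-like model names.
EMULATOR_HINTS = ("emulator",)

def model_shares(models):
    """Return each device model's share of installs, largest share first."""
    counts = Counter(models)
    total = sum(counts.values())
    return {model: count / total for model, count in counts.most_common()}

def emulator_models(models):
    """Return the distinct model names that contain an emulator hint."""
    return sorted({m for m in models if any(h in m.lower() for h in EMULATOR_HINTS)})

# Hypothetical device models reported per affiliate.
installs_by_affiliate = {
    "aff_1": ["model_a", "model_a", "model_b", "AndyWin Emulator"],
    "aff_2": ["model_c", "model_d", "model_e", "model_f"],
}

for affiliate, models in installs_by_affiliate.items():
    shares = model_shares(models)
    print(affiliate, "top share:", max(shares.values()), "emulators:", emulator_models(models))
```

An affiliate with no leading model at all (like the second one here) deserves a second look once the sample is large enough, since genuine traffic in a given country usually concentrates on a handful of popular devices.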
3. Time to Install (TTI)
A lot has already been said and written about the time to install. It’s one of the key metrics to identify fraudulent installs, which might come in too fast or too slow. Additionally, you can store the app ID and link it to the app store URL to look up the size of the app. In combination with the average download speed per country, we’ve created a way to highlight statistical outliers per country, per campaign. Although the TTI is an important factor for analysis, it will not identify install fraud as such. (In my next post, I will go into more detail about click fraud and how you can use TTI to identify it.)
On the other hand, the exact time of the install is an indicator. Use the time zone settings to trace back the local timestamp of the install. Normal install behavior would peak in the morning around 8-9h, in the afternoon around 15h and in the evening around 19-21h. If you see installs kicking in at 2-5 in the morning, the server might be delaying your installs, or the installs were faked.
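Both checks in this section can be sketched briefly. This is a minimal illustration under assumptions: a z-score on the mean is only one possible way to define a statistical outlier, the threshold of 2 and the sample values are arbitrary, the time zone is reduced to a fixed UTC offset, and "2-5 in the morning" is read here as local hours 02:00 up to 05:00.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev

def tti_outliers(tti_seconds, z_threshold=2.0):
    """Return TTI values more than z_threshold standard deviations from the mean."""
    mu = mean(tti_seconds)
    sigma = stdev(tti_seconds)
    if sigma == 0:
        return []
    return [t for t in tti_seconds if abs(t - mu) / sigma > z_threshold]

def local_hour(install_utc, utc_offset_hours):
    """Convert a UTC install timestamp to the local hour of day (0-23)."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return install_utc.astimezone(local_tz).hour

def is_night_install(install_utc, utc_offset_hours):
    """True when the local install time falls between 02:00 and 05:00."""
    return 2 <= local_hour(install_utc, utc_offset_hours) < 5

# Hypothetical TTI values (seconds) for one campaign in one country.
campaign_tti = [60, 75, 80, 90, 70, 85, 65, 3]
print(tti_outliers(campaign_tti))

# Hypothetical install logged at 20:30 UTC by a device at UTC+7.
install_utc = datetime(2024, 3, 1, 20, 30, tzinfo=timezone.utc)
print(local_hour(install_utc, 7), is_night_install(install_utc, 7))
```

With a larger dataset you would run the outlier check per country and per campaign, and a median-based measure may be more robust than the mean when many installs are fake.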
4. In-app activity
To identify install fraud coming from human farms or bots/botnets, you should have a closer look at the in-app activity. It’s quite easy to set up goals for reaching a certain activity that you can trace and evaluate. Either in-app purchases or reaching a certain game level would be easy to start with. These farms will simulate the install (that’s where the money is in the short term) but will not bother keeping the app active. If bots were to automate in-app traffic, this would lead to an almost perfect, stable behavior for a particular network/site compared to traffic from other sites.
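As a rough sketch of how such a goal could be evaluated, the example below computes the share of installs per affiliate that reached a tracked goal. The affiliate names and outcomes are made up, and what counts as "the goal" (a purchase, a game level) is whatever you choose to track.

```python
from collections import defaultdict

def goal_rates(rows):
    """Map each affiliate to the share of its installs that reached the goal."""
    installs = defaultdict(int)
    reached = defaultdict(int)
    for affiliate, reached_goal in rows:
        installs[affiliate] += 1
        if reached_goal:
            reached[affiliate] += 1
    return {aff: reached[aff] / installs[aff] for aff in installs}

# Hypothetical installs: (affiliate, whether the install reached the goal).
rows = [
    ("aff_1", True), ("aff_1", False), ("aff_1", True), ("aff_1", False),
    ("aff_2", False), ("aff_2", False), ("aff_2", False), ("aff_2", False),
]

for affiliate, rate in goal_rates(rows).items():
    print(f"{affiliate}: {rate:.0%} of installs reached the goal")
```

A source whose installs never reach the goal fits the farm pattern described above, while a source with suspiciously uniform goal behavior compared to other sites may point to automated in-app activity.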
Conclusion regarding mobile install fraud
So, this is just the beginning. Fraud is here and we all want to minimize it to improve ROI for our mobile ad campaigns. In this first post, I’ve presented the metrics you can use to start identifying install fraud:
- Count the IP addresses that are used for several offers across your site/network to check for farm behavior.
- Use device characteristics to detect unusual shares in app usage.
- Generate an average Time to Install per campaign and highlight statistical deviations.
- Use the local timestamp of the install.
- Use trackable goals to measure in-app activity and define KPIs for reaching those goals.
As an industry, we have the responsibility to tackle fraud. Together we can make that happen! Stay tuned for the next post, where I will cover click fraud and what to do about it.
About the writer
Ignas van den Einde is a business intelligence analyst and fraud specialist for Pocket Media. His passion is to create business value through innovation and efficiency with a creative, data-driven and realistic perspective. In his day-to-day practice, he gives answers to business continuity. He is also an indoor soccer lover, a squash fanatic, and an avid golfer.