Ads and Fraud: A Comprehensive Survey of Fraud in Online Advertising
Abstract
1. Introduction
- First, we describe the key elements of an online advertising system/platform, including their roles and interactions. We then systematically review different forms of digital advertising platforms.
- We introduce various schemes of advertisement placement as well as different revenue models in online ad platforms.
- We outline different approaches which cyber criminals are known to deploy in order to abuse online advertising models and conduct fraud. Subsequently, we introduce a new classification of all types of fraud in online advertising. This classification is structured around a set of ad fraud W3H questions: Who does What to Whom and How.
- To date, several cutting-edge solutions have been proposed to address the problem of fraud in online ad platforms. This article provides a thorough overview of the proposed methods and technologies in detecting and preventing fraudulent practices from the scientific as well as the practical perspective.
- Finally, we conclude the article by highlighting some open challenges and future research directions in this field while putting a special emphasis on machine learning techniques for the detection and prevention of fraud in online ad systems.
2. Online Advertising Ecosystem—Key Components
3. Different Forms of Digital Advertising
3.1. Contextual Advertising
3.2. Search Advertising
3.3. Behavioral Advertising
3.4. Display Advertising
4. Different Types of Publisher-Advertiser Contract in Digital Advertising
4.1. Manual Media Buying
4.2. Programmatic Media Buying
4.2.1. Real-Time Bidding (RTB)
- RTB offers a flexible per-impression buying process: brands bid on individual impressions in real time, rather than buying publishers’ inventory in advance at a predetermined fixed price, as is the case in manual media buying.
- Advertisers and publishers use a single consolidated dashboard on their DSP or SSP platforms to deal with multiple partners, which results in much more straightforward and easier-to-manage campaigns and facilitates a more effective gathering of impression-level data/statistics.
- Effective gathering of impression-level data/statistics can generally lead to a more adaptable, and ultimately a more profitable, advertising strategy.
- RTB gives publishers the ability to sell ad space that was previously unwanted or unsold by making the pricing of this space flexible (i.e., self-adjusting) [13].
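As a rough illustration of the per-impression buying described above, the following sketch resolves a single auction for one impression. The names `Bid`, `run_auction`, and `floor_cpm`, as well as the rule that the highest bid at or above a floor price wins, are assumptions made purely for illustration; real RTB exchanges apply their own auction rules and protocols.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    advertiser: str  # hypothetical advertiser/DSP name
    cpm: float       # bid price per thousand impressions


def run_auction(bids: List[Bid], floor_cpm: float = 0.0) -> Optional[Bid]:
    """Return the highest bid at or above the floor, or None if none qualifies."""
    eligible = [b for b in bids if b.cpm >= floor_cpm]
    if not eligible:
        return None  # the impression stays unsold at this floor price
    return max(eligible, key=lambda b: b.cpm)


# One impression, three hypothetical bidders.
bids = [Bid("brand_a", 1.20), Bid("brand_b", 2.50), Bid("brand_c", 0.80)]
winner = run_auction(bids, floor_cpm=1.00)
```

Lowering `floor_cpm` in this sketch is one way to picture how flexible pricing lets previously unsold inventory find a buyer.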
4.2.2. Private Marketplace (PMP)
4.2.3. Programmatic Direct
5. Revenue Models in Digital Advertising Ecosystem
5.1. Impression-Based Model (CPM)
5.2. Click-Based Model (CPC)
5.3. Action-Based Model (CPA)
6. User Tracking and Profiling Techniques
- Session-based Tracking. A web session is defined as a sequence of user actions on an individual website within a given time frame. The purpose of session-based tracking is to facilitate the monitoring of user actions across multiple visited websites and/or over a longer interval of time. One common form of session-based tracking, called Session Identifiers Stored in Hidden Fields [25,26], involves passing a single identifier from one website to another through a URL. Another form, called Explicit Web-Form Authentication [27,28], applies when a website requires a user to register and log in before accessing its resources. In the third form, the window.name Document Object Model (DOM) property [29] is used to store data that can then be shared and used by the different parties/sites visited. The W3C Document Object Model (DOM) [30] is a cross-platform, language-independent interface for accessing and interacting with the content, structure, and style of web documents; it organizes all the objects of a document in a tree structure. The window.name property persists across page reloads and is accessible from other domains as well, which allows third-party content to exchange information with the first party or with other third-party content [24].
- Storage-based Tracking. Cookies, also known as HTTP cookies, are the most common method in this category. A cookie is a small piece of data (i.e., an identifier) generated by a web server the first time a user visits the website hosted by this server. The identifier is sent to the client, stored there, and retrieved each time the user visits the same website. It should be noted that a single website can incorporate content from multiple servers; thus, two different types of cookies are defined: first-party and third-party cookies [31]. First-party cookies are set by first-party servers directly associated with the URL of the page that the user has explicitly requested. Third-party cookies are widely used by advertising companies and are received when the browser implicitly fetches third-party content (such as video ads) from a third-party server. Cookies can also be classified in terms of their life span into temporary and permanent cookies. Temporary cookies are stored in the browser cache and expire as soon as the browser is closed [32]. In contrast, permanent cookies have an expiration date and remain in the browser memory until that particular point in time (i.e., they can survive across multiple sessions). (For more information on other less common forms of storage-based tracking techniques, see [25].)
- Cache-based Tracking. The cache-based tracking mechanisms identify a browser instance and its respective user by deploying various types of caches such as Web cache [24,33], DNS cache [34], and Operational caches [24,35]. Namely, by relying on some distinctive items stored in these caches (e.g., previously acquired images or DNS records), it is possible to determine whether this specific user has already visited a given website or not.
- Fingerprinting. This approach to user tracking relies on different methods that facilitate the extraction of unique identifiers associated with a user’s device (e.g., IP address, operating system, browser version, system and user languages deployed, etc.) [36,37,38]. These identifiers can then assist in tracking the user across multiple websites. For example, JavaScript and Flash can be used to distinguish between different versions and architectures of operating systems, a technique referred to as Operating System Instance Fingerprinting [39,40]. Furthermore, JavaScript can also be used to acquire information about the user’s local time zone and the local date in milliseconds.
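The fingerprinting idea above can be sketched as combining several device attributes into one stable identifier. This is only a minimal sketch: the function name `device_fingerprint`, the attribute names, and the choice of SHA-256 are assumptions made for illustration, not the method of any particular tracker.

```python
import hashlib
from typing import Dict


def device_fingerprint(attributes: Dict[str, str]) -> str:
    """Hash a set of device attributes into a single identifier."""
    # Sort the keys so the same attributes always produce the same string,
    # regardless of the order in which they were collected.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, as a script on a page might collect them.
visit_1 = {"os": "Windows 10", "browser": "Firefox 94", "language": "en-CA", "timezone": "UTC-5"}
visit_2 = {"timezone": "UTC-5", "language": "en-CA", "browser": "Firefox 94", "os": "Windows 10"}
other_device = {"os": "Windows 10", "browser": "Firefox 95", "language": "en-CA", "timezone": "UTC-5"}

same_device = device_fingerprint(visit_1) == device_fingerprint(visit_2)
different_device = device_fingerprint(visit_1) != device_fingerprint(other_device)
```

In this sketch two visits from the same configuration map to the same identifier without anything being stored on the client, which is what distinguishes fingerprinting from the storage-based techniques above.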
7. Fraud in Ad Ecosystem
7.1. Categories of Ad Fraud
7.1.1. Placement Fraud
- Malvertising: Malvertising is a form of placement fraud that is carried out by utilizing advertising malware that gets injected into a publisher’s website, and that ends up displaying unwanted ads or pop-ups on the computers of users visiting this site. Sood et al. [48] have pointed out that the common defects in the design of some website widgets pose a high risk of malvertising. Moreover, they have shown that Content Delivery Networks (CDNs)—as third-party ad servers which provide content to different domains on the Internet—are the primary means of spreading malvertising malware. By exploiting the servers of a particular CDN, attackers can inject malicious code in the form of malvertisement and achieve a broad distribution.
- Stuffing and Stacking: Ad fraud techniques that make use of components which are placed inside a web page but cannot be seen by the naked eye are called ‘stuffing’ and ‘stacking’. Stuffing fraud includes two primary forms: keyword stuffing and placement stuffing. Keyword stuffing occurs when specific keywords are hidden in the HTML tags of a fraudster’s web page with the intention of increasing the value/ranking of this page and its respective ads. Placement stuffing, on the other hand, is the act of hiding non-textual (i.e., multimedia) components inside a web page, such as an ad in a small 1 × 1 pixel iframe (refer to Figure 4). Placement stacking is a fraud technique where two or more ads are placed/stacked on top of one another, with only the top ad being actually visible to the user [49]. A single impression or click on such stacked ads would enable the fraudster to bill multiple advertisers.
- Fake Sites: Placement of ads on fake websites is another strategy commonly deployed by online fraudsters. Fake websites used for the purposes of ad fraud typically have one or more of the following features:
- they use legitimate domain names but have no legitimate content except for the ad slots;
- they contain legitimate looking content, which is simply copied from other well-known websites;
- they deploy domain names that are look-alikes of some highly popular domains (this is also known as domain-name spoofing and is discussed next).
- Domain-name Spoofing: Domain-name spoofing is a general type of online fraud in which the fraudster deploys a (fake) domain-name that appears as a legitimate/whitelisted domain-name. There are several different ways of how domain-name spoofing can be utilized specifically for the purpose of ad fraud: (i) A low-quality publisher disguises its domain-name as the domain-name of another premium publisher in order to sell its inventory to advertisers at higher prices. (ii) In the RTB systems, whenever a user/browser visits a web page, an ad-request containing the web page’s URL (i.e., domain name) is sent, launching a bidding war among different advertisers for the right to display an ad on this page. In some cases, publishers are allowed to explicitly declare/supply their domain name in these ad-requests. Fraudulent publishers can use this opportunity to misrepresent their inventory—i.e., supply the domain name of a known premium publisher—thus attracting higher bids by advertisers [50,51,52].
- Ad injection and Malware: This type of fraud can also take multiple forms. For example, malicious adware may be run on a user’s computer to display unintended advertisement. Additionally, Internet service providers (ISPs) or Wi-Fi service providers may tamper with in-transit HTTP content to surreptitiously insert ads. In another situation, attackers can replace the legitimate websites’ ads with their malicious ads or attempt to put them on top of the other ads to modify the web pages for their malevolent intent. As a result of this, both advertisers and publishers will lose their reputation. For example, in 2014, Comcast, one of the leading ISPs, started to serve ads to their customers through its accessible Wi-Fi hotspots by injecting data into retrieved websites [53]. In particular, Comcast injected its JavaScript snippets into the packets/pages being returned by another real server. This decision raised several security concerns. Mediagazer was one of the victims. A small red advertisement appeared at the bottom of the Mediagazer page saying: “XFINITY Wi-Fi Peppy”, where Mediagazer did not sell this placement to XFINITY, but rather it happened as a result of ad injection. Even though Comcast did not have any malicious intent, the interaction of the JavaScript with the user’s browser and/or the host website could have created security vulnerabilities.
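To make the placement stuffing case from the list above more concrete (an ad hidden in a 1 × 1 pixel iframe), the following sketch scans a page's HTML for iframes whose declared size is at most one pixel. The class name `TinyIframeFinder` and the sample URLs are hypothetical, and the check only reads the `width` and `height` attributes; an iframe hidden through CSS would not be caught by this simplified version.

```python
from html.parser import HTMLParser


class TinyIframeFinder(HTMLParser):
    """Collect the src of every iframe declared as max_px x max_px or smaller."""

    def __init__(self, max_px: int = 1):
        super().__init__()
        self.max_px = max_px
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        attributes = dict(attrs)
        try:
            width = int(attributes.get("width", ""))
            height = int(attributes.get("height", ""))
        except (TypeError, ValueError):
            return  # size missing or not a plain number: not judged here
        if width <= self.max_px and height <= self.max_px:
            self.suspicious.append(attributes.get("src"))


# Hypothetical page with one hidden ad iframe and one normal video iframe.
page = (
    '<html><body>'
    '<iframe src="https://ads.example/a" width="1" height="1"></iframe>'
    '<iframe src="https://video.example/v" width="640" height="360"></iframe>'
    '</body></html>'
)
finder = TinyIframeFinder()
finder.feed(page)
hidden_iframes = finder.suspicious
```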
7.1.2. Traffic Fraud
- Impression Fraud: Impression fraud involves the fraudulent generation of visitor traffic to increase the number of impressions on a web page [54]. Impression fraud can be carried out by bots, by human labour hired to view web pages intentionally, or through the use of expired domains to divert visitors to fraudsters’ websites. Some publishers combine two or more of these approaches to increase the number of generated impressions in the auctions of RTB systems. A study presented in [55] shows that sourced traffic (i.e., unknown users from unknown places that can be purchased for low CPMs) and bots made up twenty percent of network traffic in 2010, and that this share increased dramatically to sixty percent in 2015. This increase was directly related to the evolution of advertising platforms (from direct sale in 1995 to RTB in 2015) and the level of fraud in the ad ecosystem (the level of impression fraud changed from low to very high in the same time interval).
- Click Fraud: Pay-per-click (PPC), also known as cost-per-click (CPC) marketing, is an essential marketing strategy for businesses in the digital advertising environment. A viewer’s click on an advertisement explicitly indicates an interest in the ad that may result in a purchase. Advertisers can assess an ad’s performance by calculating its click-through rate (CTR). CTR is defined as the number of clicks an ad has received, divided by the number of times the ad was shown (clicks/impressions). The goal of click fraud is to increase the CTR on an ad. It is important to note that both publishers and advertisers may be motivated to conduct click fraud. Namely, publishers may have an interest in committing click fraud as they are rewarded based on the number of executed clicks on the ads they display to their audiences. This type of click fraud is called Publisher Click Inflation; Figure 5 illustrates this attack. On the other side, most advertising campaigns have limited budgets, and each fake click consumes a small portion of that budget. Thus, one way that an advertiser could financially hurt its competitors is by generating a large number of fake clicks on the competitors’ ads. This type of click fraud is called Advertiser Competition Clicks. Similar to impression fraud, click fraud is conducted by means of bots and manual/human click farms [4].
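The CTR definition and the budget-draining effect of fake clicks described above can be worked through with a small example. The function names and all numbers below (impressions, clicks, budget, and price per click) are invented for illustration only.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR = clicks / impressions, as defined in the text."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def remaining_budget(budget: float, cost_per_click: float, clicks: int) -> float:
    """Budget left after paying for a number of clicks (never below zero)."""
    return max(budget - cost_per_click * clicks, 0.0)


# Hypothetical campaign: 1000 impressions, 30 genuine clicks.
genuine_ctr = click_through_rate(30, 1000)

# Publisher click inflation: 70 fake clicks raise the measured CTR.
inflated_ctr = click_through_rate(30 + 70, 1000)

# Advertiser competition clicks: 150 fake clicks at 0.50 per click
# consume most of a competitor's budget of 100.
left_after_attack = remaining_budget(100.0, 0.50, 150)
```

With these made-up figures the measured CTR rises from 3% to 10%, and the attacked campaign is left with a quarter of its budget without having reached a single additional customer.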
7.1.3. Action Fraud
- Conversion Fraud: Conversion is any interaction with an advertisement that ultimately generates value (e.g., online purchase). Consequently, conversion fraud occurs when a conversion is artificially generated by a non-human entity (e.g., a bot) or by a human with malicious intent. Typically, upon a click on an advertisement, the user is redirected to the branding site (or landing page), which shows summarized information about the advertised services or product. For any purchase through such a landing page, the user is typically required to fill out a form by providing personal information such as name, address, and credit card number. One way of conducting conversion fraud is by filling these forms with fake or stolen customer information. Conversion fraud is generally committed either utilizing lead bots or by lead farms. Lead bots are automated computer programs that can fill out thousands of forms in a blink of an eye with either random or correct information. They are also able to click a link or download files automatically. On the other hand, in the lead farm method, genuine human labourers manually perform clicks to conversion with a malicious intent [1].
- Re-targeting Fraud: Identification and targeting of valuable customers based on their previous online behaviour patterns are common practices in ad-serving platforms. This process is also known as re-targeting [56] or re-marketing. As mentioned earlier, different tracking techniques (e.g., use of cookies) can be deployed to facilitate user re-targeting. The fundamental goal of re-targeting fraud is to give a false impression about previous customers’ behaviour and pretend they have an actual interest in a specific product or a service. In other words, through re-targeting fraud, fraudsters attempt to mislead advertisers into believing that fake users (e.g., a group of bots) are prospective purchasers and encourage them to put a higher bid price on impressions generated by these bots.
- Affiliate Fraud: In affiliate marketing [57], a business entity (also called an affiliate) is rewarded for every visitor that is ultimately brought to the advertiser’s site. Affiliate fraud encompasses a range of fraudulent activities that aim to fool the ad system into giving revenue to an affiliate who does not actually qualify for it. Malware or adware, cookie stuffing, and URL hijacking are three types of affiliate fraud. Malware or adware (unwanted software designed to generate advertisements on the screen [58]) installed on a user’s device can redirect the user to the advertiser’s site via a fake affiliate marketing link. As a result, the affiliate is in a position to claim a commission from the advertiser. In cookie-stuffing-based affiliate fraud (refer to Figure 6, which shows the different actors and revenue flow in the affiliate marketing system), an affiliate designs a web page to attract audiences potentially interested in a particular brand/product and then stuffs cookies into the visitors’ computers. If any of those visitors later decides to purchase from the advertiser’s web page, the affiliate will be entitled to claim the commission. The last technique, URL hijacking (Zhu et al. [1]), also called typosquatting, exploits the mistakes users make when typing a website address. The malicious affiliate typically registers domains with misspelled names of popular websites; typosquatting then leads users to the wrong sites, allowing the affiliate to later claim commissions for actions that these users might take in the future [59,60]. Table 2 summarizes the different categories of fraud in online digital advertising systems, and it specifies the actual conductors and victims of each listed type of fraud.
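Since typosquatting relies on domains that are misspellings of popular ones, one simple way to illustrate how such domains could be spotted is to compare a candidate domain against a list of popular domains using edit distance. The sketch below is an assumption-laden illustration rather than a method taken from the cited works: the function names, the threshold of one edit, and the sample domain list are all hypothetical.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def possible_typosquat_targets(candidate: str, popular: List[str],
                               max_distance: int = 1) -> List[str]:
    """Popular domains that the candidate nearly, but not exactly, matches."""
    return [
        domain for domain in popular
        if domain != candidate and edit_distance(candidate, domain) <= max_distance
    ]


popular_domains = ["google.com", "example.com"]  # hypothetical whitelist
targets = possible_typosquat_targets("gogle.com", popular_domains)
```

A legitimate domain that appears in the list itself yields no targets, so only near-misses are reported.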
8. Taxonomy of Ad Fraud Prevention and Detection Methods
8.1. Detection of Digital Advertising Fraud by Commercial Companies
8.2. Detection of Digital Advertising Fraud in Academia
8.2.1. Placement Fraud Detection
8.2.2. Traffic Fraud Detection
8.2.3. Action Fraud Detection
8.3. Prevention of Digital Advertising Fraud
- Honeypot-based prevention approach (i.e., bluff ads) is a mechanism that allows advertisers to serve some small, unrecognizable bluff or honeypot ads in order to detect fraudulent activities in the ad system. The different conversion rates between the bluff-ads and the legitimate-ads can be used as an indicator to detect ongoing fraudulent activities.
- Signature-based method refers to identifying and preventing malicious traffic and bogus impressions by hunting for specific patterns or features in the traffic. This mechanism uses a predefined pattern (signature) to decide whether traffic is valid or not. For example, a typical signature is the click count on published ads, which can be used to detect duplicate clicks.
- Anomaly-based prevention technique applies statistical analysis and historical data to identify patterns of fraudulent behavior, and then uses those patterns to detect suspicious ad placements and/or abnormal traffic.
- Credential-based prevention mechanism (also known as Website Popularity or Page Ranking) refers to the task of assessing the credibility of a publisher or an advertiser in order to verify the authenticity of their web page contents or the number of impressions they generate. Reverse crawling and trusted website ranking are the most common approaches in that endeavor. Reverse crawling (or reverse engineering) is the process of understanding the functioning and structure of a website and its information [100]. For example, to evaluate a publisher’s credentials, DSPs and advertisers can use reverse engineering/crawling to find the content of web pages and verify that the content matches the tags associated with the impression when bidding.
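Two of the approaches listed above lend themselves to a short sketch: a signature-based rule that flags duplicate clicks, and an anomaly-based check that compares current traffic with historical data. Everything specific in the code is an assumption made for illustration, including the function names, the 60-second duplicate window, the tuple layout of a click record, and the z-score threshold of 3.

```python
from statistics import mean, pstdev
from typing import Iterable, List, Tuple

Click = Tuple[float, str, str]  # (timestamp in seconds, user id, ad id)


def find_duplicate_clicks(clicks: Iterable[Click],
                          window_seconds: float = 60.0) -> List[Click]:
    """Signature rule: same user clicks the same ad again within the window."""
    last_seen = {}
    duplicates = []
    for timestamp, user, ad in sorted(clicks):
        key = (user, ad)
        if key in last_seen and timestamp - last_seen[key] <= window_seconds:
            duplicates.append((timestamp, user, ad))
        last_seen[key] = timestamp
    return duplicates


def is_anomalous(history: List[float], today: float,
                 z_threshold: float = 3.0) -> bool:
    """Anomaly rule: today's count lies far from the historical mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold


# Hypothetical click log and daily click counts.
click_log = [
    (0.0, "u1", "ad1"),
    (10.0, "u1", "ad1"),   # repeat within 60 s: flagged
    (20.0, "u2", "ad1"),
    (200.0, "u1", "ad1"),  # repeat after the window: not flagged
]
duplicates = find_duplicate_clicks(click_log)
spike = is_anomalous([100, 110, 90, 105, 95], today=500)
normal = is_anomalous([100, 110, 90, 105, 95], today=104)
```

The signature rule needs no history but only catches the pattern it was written for, whereas the anomaly rule adapts to each ad's past traffic at the cost of needing enough historical data to be meaningful.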
9. Conclusions and Future Trends
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhu, X.; Tao, H.; Wu, Z.; Cao, J.; Kalish, K.; Kayne, J. Fraud Prevention in Online Digital Advertising; Springer: Berlin/Heidelberg, Germany, 2017.
- Digital Ad Industry Will Gain $8.2 Billion By Eliminating Fraud and Flaws in Internet Supply Chain, IAB & EY Study Shows. Available online: https://www.iab.com/news/digital-ad-industry-will-gain-8-2-billion-by-eliminating-fraud-and-flaws-in-internet-supply-chain-iab-ey-study-shows (accessed on 4 November 2021).
- Ad Fraud Stats. 2021. Available online: https://www.businessofapps.com/research/ad-fraud-statistics/ (accessed on 4 November 2021).
- Pooranian, Z.; Conti, M.; Haddadi, H.; Tafazolli, R. Online advertising security: Issues, taxonomy, and future directions. IEEE Commun. Surv. Tutor. 2021, 23, 2494–2524.
- Cai, Y.; Yee, G.O.; Gu, Y.X.; Lung, C.-H. Threats to online advertising and countermeasures: A technical survey. Digit. Threats Res. Pract. 2020, 1, 1–27.
- Estrada-Jiménez, J.; Parra-Arnau, J.; Rodríguez-Hoyos, A.; Forné, J. Online advertising: Analysis of privacy threats and protection approaches. Comput. Commun. 2017, 100, 32–51.
- What Is an Ad Network and How Does It Work?—Clearcode Blog. 2021. Available online: https://clearcode.cc/blog/what-is-an-ad-network-and-how-does-it-work/ (accessed on 4 November 2021).
- Dave, K.; Varma, V. Computational advertising: Techniques for targeting relevant ads. Found. Trends Inf. Retr. 2014, 8, 263–418.
- Cook, K. A Brief History of Online Advertising. 2021. Available online: https://blog.hubspot.com/marketing/history-of-online-advertising (accessed on 4 November 2021).
- Panwar, A.; Onut, I.-V.; Miller, J. Towards real time contextual advertising. In International Conference on Web Information Systems Engineering; Springer: Berlin/Heidelberg, Germany, 2014; pp. 445–459.
- Boerman, S.C.; Kruikemeier, S.; Zuiderveen Borgesius, F.J. Online behavioral advertising: A literature review and research agenda. J. Advert. 2017, 46, 363–376.
- Types Of Online Advertising. 2020. Available online: https://www.adskills.com/blog/7-types-of-online-advertising/ (accessed on 4 November 2021).
- Understanding RTB, Programmatic Direct, and PMP—Clearcode Blog. 2021. Available online: https://clearcode.cc/blog/rtb-programmatic-direct-pmp/ (accessed on 4 November 2021).
- Wang, J.; Zhang, W.; Yuan, S. Display advertising with Real-Time Bidding (RTB) and behavioural targeting. Found. Trends® Inf. Retr. 2017, 11, 297–435.
- Ultimate Guide to the Private Marketplace for Publishers|Publift. 2020. Available online: https://www.publift.com//adteach/ultimate-guide-to-the-private-marketplace-for-publishers (accessed on 4 November 2021).
- DeBlasio, J.; Guha, S.; Voelker, G.M.; Snoeren, A.C. Exploring the dynamics of search advertiser fraud. In Proceedings of the 2017 Internet Measurement Conference, London, UK, 1–3 November 2017; pp. 157–170.
- Wilbur, K.C.; Zhu, Y. Click Fraud. Market. Sci. 2009, 28, 293–308.
- Cufoglu, A. User Profiling-a Short Review. Int. J. Comput. Appl. 2014, 108, 3.
- Haveliwala, T.H.; Jeh, G.M.; Kamvar, S.D. Targeted Advertisements Based on User Profiles and Page Profile. 27 November 2012. Available online: https://patents.google.com/patent/US8321278B2/en (accessed on 4 December 2021).
- Fleuren, M.C.W. User Profiling Techniques: A Comparative Study in the Context of e-Commerce Websites. Bachelor’s Thesis, Utrecht University, Utrecht, The Netherlands, 2012.
- Degeling, M.; Herrmann, T. Your interests according to google-a profile-centered analysis for obfuscation of online tracking profiles. arXiv 2016, arXiv:1601.06371.
- Google Data Collection Research. 2018. Available online: https://digitalcontentnext.org/blog/2018/08/21/google-data-collection-research/ (accessed on 4 November 2021).
- Dennis, W.L.; Erwin, A.; Galinium, M. Data mining approach for user profile generation on advertisement serving. In Proceedings of the 2016 8th International Conference on Information Technology and Electrical Engineering (ICITEE), Yogyakarta, Indonesia, 5–6 October 2016; pp. 1–6.
- Bujlow, T.; Carela-Espanol, V.; Lee, B.-R.; Barlet-Ros, P. A Survey on web tracking: Mechanisms, implications, and defenses. Proc. IEEE 2017, 105, 1476–1510.
- Schmucker, N. Web Tracking. In SNET2 Seminar Paper-Summer Term. Citeseer. 2011. Available online: https://www.semanticscholar.org/paper/Web-Tracking-SNET-2-Seminar-Paper-Summer-Term-2011-Schm%C3%BCcker/304bb388a1e4e74a2109f39ff8ae0b6f66f0dd02 (accessed on 4 December 2021).
- What Is a Session ID? Available online: https://www.ionos.ca/digitalguide/hosting/technical-matters/what-is-a-session-id/ (accessed on 4 December 2021).
- Alaca, F. Strengthening Password-Based Web Authentication through Multiple Supplementary Mechanisms. Ph.D. Thesis, Carleton University, Ottawa, ON, Canada, 2018.
- Session Management—OWASP Cheat Sheet Series. Available online: https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html (accessed on 4 December 2021).
- HTTP Cookie. Wikipedia. 2021. Available online: https://en.wikipedia.org/wiki/HTTP_cookie#window.name (accessed on 4 December 2021).
- DOM Standard. Available online: https://dom.spec.whatwg.org/ (accessed on 4 December 2021).
- Tracking Cookies—How to Limit Third-Party Data Collection. Comparitech. 2021. Available online: https://www.comparitech.com/blog/information-security/tracking-cookies/ (accessed on 4 December 2021).
- Nasir, M. Tracking and Identifying Individual Users in a Web Surfing Session; Computer and Network Security, Middlesex University: London, UK, 2014.
- Web Caching Basics: Terminology, HTTP Headers, and Caching Strategies. Available online: https://www.digitalocean.com/community/tutorials/web-caching-basics-terminology-http-headers-and-caching-strategies (accessed on 4 December 2021).
- Klein, A.; Pinkas, B. DNS Cache-Based User Tracking. In Proceedings of the Network and Distributed Systems Security (NDSS) Symposium 2019, San Diego, CA, USA, 24–27 February 2019.
- HTTP Caching—HTTP|MDN. Available online: https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching (accessed on 4 December 2021).
- Sánchez, P.M.S.; Valero, J.M.J.; Celdrán, A.H.; Bovet, G.; Pérez, M.G.; Pérez, G.M. A survey on device behavior fingerprinting: Data sources, techniques, application scenarios, and datasets. IEEE Commun. Surv. Tutor. 2021, 23, 1048–1077.
- Kaur, N.; Azam, S.; Kannoorpatti, K.; Yeo, K.C.; Shanmugam, B. Browser fingerprinting as user tracking technology. In Proceedings of the 2017 11th International Conference on Intelligent Systems and Control (ISCO), Coimbatore, India, 5–6 January 2017; pp. 103–111.
- Iqbal, U.; Englehardt, S.; Shafiq, Z. Fingerprinting the Fingerprinters: Learning to Detect Browser Fingerprinting Behaviors. In Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27 May 2021; pp. 1143–1161.
- OS and Application Fingerprinting Techniques|SANS Institute. Available online: https://www.sans.org/white-papers/32923/ (accessed on 4 December 2021).
- Al-Shehari, T.; Shahzad, F. Improving operating system fingerprinting using machine learning techniques. Int. J. Comput. Theory Eng. 2014, 57–62.
- Geradin, D.; Katsifis, D.; Karanikioti, T. Google as a de Facto Privacy Regulator: Analyzing Chrome’s Removal of Third-Party Cookies from an Antitrust Perspective. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3738107# (accessed on 4 December 2021).
- Thomas, I. Planning for a cookie-less future: How browser and mobile privacy changes will impact marketing, targeting and analytics. Appl. Market. Anal. 2021, 7, 6–16.
- Wilander, J. Intelligent Tracking Prevention. WebKit. 2017. Available online: https://webkit.org/blog/7675/intelligent-tracking-prevention/ (accessed on 4 December 2021).
- Today’s Firefox Blocks Third-Party Tracking Cookies and Cryptomining by Default|The Mozilla Blog. Available online: https://blog.mozilla.org/en/products/firefox/todays-firefox-blocks-third-party-tracking-cookies-and-cryptomining-by-default/ (accessed on 4 December 2021).
- The Privacy Sandbox—The Chromium Projects. Available online: https://www.chromium.org/Home/chromium-privacy/privacy-sandbox (accessed on 4 December 2021).
- Bogna, J. What Is Google’s FLoC, and How Will It Track You Online? Available online: https://www.howtogeek.com/724441/what-is-googles-floc-and-how-will-it-track-you-online/ (accessed on 4 December 2021).
- Chrome is Removing Third-Party Data. What’s Next? Available online: https://www.match2one.com/blog/how-removal-of-third-party-cookies-affects-digital-marketers/ (accessed on 4 December 2021).
- Sood, A.K.; Enbody, R.J. Malvertising–exploiting web advertising. Comput. Fraud Secur. 2011, 2011, 11–16.
- Edelman, B. Accountable? The problems and solutions of online ad optimization. IEEE Secur. Priv. Mag. 2014, 12, 102–107.
- What Is Ad Fraud and How to Prevent It?|CLICKTRUST. We Are CLICKTRUST. 2020. Available online: https://clicktrust.be/en/blog/ppc/what-is-ad-fraud-and-how-to-counter-it/ (accessed on 4 December 2021).
- Ads.txt: A White Ops Perspective. 2018. Available online: https://www.humansecurity.com/blog/ads.txt-a-white-ops-perspective-1 (accessed on 4 November 2021).
- Vidakovic, R. The Beginner’s Guide to Digital Ad Fraud. AdProfs. Available online: https://adprofs.co/beginners-guide-to-digital-ad-fraud/ (accessed on 4 December 2021).
- Comcast Wi-Fi Serving Self-Promotional Ads via JavaScript Injection|Ars Technica. Available online: https://arstechnica.com/tech-policy/2014/09/why-comcasts-javascript-ad-injections-threaten-security-net-neutrality/ (accessed on 4 November 2021).
- Springborn, K.; Barford, P. Impression fraud in on-line advertising via pay-per-view networks. In Proceedings of the 22nd USENIX Security Symposium (USENIX Security 13), Washington, DC, USA, 14–16 August 2013; pp. 211–226.
- Dr. Augustine Fou—Independent Ad Fraud Researcher. Ad Fraud Ecosystem 2017 Update, 11:52:29 UTC. Available online: https://www.slideshare.net/augustinefou/ad-fraud-ecosystem-2017-update (accessed on 4 December 2021).
- What Is Retargeting and Which Problems Might Be Damaging Your Campaign? 2020. Available online: https://www.cheq.ai/retargeting (accessed on 4 November 2021).
- Dwivedi, Y.K.; Rana, N.P.; Alryalat, M.A.A. Affiliate marketing: An overview and analysis of emerging literature. Mark. Rev. 2017, 17, 33–50.
- Adware—What Is It & How to Remove It? Available online: https://www.malwarebytes.com/adware (accessed on 4 November 2021).
- Dam, T.; Klausner, L.D.; Schrittwieser, S. Typosquatting for fun and profit: Cross-country analysis of pop-up scam. J. Cyber Secur. Mobil. 2020, 265–300. [Google Scholar] [CrossRef]
- Szurdi, J.; Kocso, B.; Cseh, G.; Spring, J.; Felegyhazi, M.; Kanich, C. The long “taile” of typosquatting domain names. In Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, 20–22 August 2014; pp. 191–206. [Google Scholar]
- Chachra, N.; Savage, S.; Voelker, G.M. Affiliate crookies: Characterizing affiliate marketing abuse. In Proceedings of the 2015 Internet Measurement Conference, IMC’15, Association for Computing Machinery, New York, NY, USA, 28–30 October 2015; pp. 41–47. [Google Scholar]
- Daswani, N.; Mysen, C.; Rao, V.; Weis, S.; Gharachorloo, K.; Ghosemajumder, S. Online Advertising Fraud. In Crimeware: Understanding New Attacks and Defenses; Addison-Wesley Professional: Boston, MA, USA, 2008; Volume 40, pp. 1–28. [Google Scholar]
- Zheng, Y.; Jeon, B.; Xu, D.; Wu, Q.M.; Zhang, H. Image segmentation by generalized hierarchical fuzzy C-means algorithm. J. Intel. Fuzzy Syst. 2015, 28, 961–973. [Google Scholar] [CrossRef]
- Zhang, Y.; Egelman, S.; Cranor, L.; Hong, J. Phinding Phish: Evaluating Anti-Phishing Tools; Carnegie Mellon University: Pittsburgh, PA, USA, 2007.
- Abbasi, A.; Chen, H. A Comparison of tools for detecting fake websites. Computer 2009, 42, 78–86.
- Thomas, K.; Bursztein, E.; Grier, C.; Ho, G.; Jagpal, N.; Kapravelos, A.; McCoy, D.; Nappa, A.; Paxson, V.; Pearce, P.; et al. Ad injection at scale: Assessing deceptive advertisement modifications. In Proceedings of the 2015 IEEE Symposium on Security and Privacy, San Jose, CA, USA, 17–21 May 2015; pp. 151–167.
- Almahmoud, S.; Hammo, B.; Al-Shboul, B.; Obeid, N. A Hybrid Approach for Identifying Non-Human Traffic in Online Digital Advertising. Multimed. Tools Appl. 2021, 1–34.
- Neal, A.; Kouwenhoven, S.; Sa, O. Quantifying Online Advertising Fraud: Ad-Click Bots vs. Humans; Oxford Bio Chronometrics: London, UK, 2015.
- Zhang, L.; Guan, Y. Detecting click fraud in pay-per-click streams of online advertising networks. In Proceedings of the 2008 28th International Conference on Distributed Computing Systems, Washington, DC, USA, 17–20 June 2008; pp. 77–84.
- Stitelman, O.; Perlich, C.; Dalessandro, B.; Hook, R.; Raeder, T.; Provost, F. Using Co-Visitation networks for detecting large scale online display advertising exchange fraud. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery, New York, NY, USA, 11–14 August 2013; pp. 1240–1248.
- Tian, T.; Zhu, J.; Xia, F.; Zhuang, X.; Zhang, T. Crowd fraud detection in internet advertising. In Proceedings of the 24th International Conference on World Wide Web, WWW’15, International World Wide Web Conferences Steering Committee, Geneva, Switzerland, 18–22 May 2015; pp. 1100–1110.
- Shekhter, H. System and Method for Detecting Fraudulent Affiliate Marketing in an Online Environment. U.S. Patent 20110251869A1, 13 October 2011.
- Budak, C.; Goel, S.; Rao, J.; Zervas, G. Understanding Emerging Threats to Online Advertising. In Proceedings of the 2016 ACM Conference on Economics and Computation, Maastricht, The Netherlands, 24–28 July 2016; pp. 561–578.
- Fight Ad Fraud with SecureAd. Fight Digital Fraud with Oxford BioChronometrics. Available online: https://oxford-biochron.com/fight-ad-fraud-with-securead/ (accessed on 4 November 2021).
- DoubleVerify—DoubleVerify Authenticates the Quality of Digital Media for the World’s Largest Brands Ensuring Viewable, Fraud-Free, Brand-Safe Ads. Available online: https://doubleverify.com/company/ (accessed on 4 November 2021).
- HUMAN. HUMAN|Bot Mitigation|Know Who’s Real. Available online: https://www.humansecurity.com (accessed on 4 November 2021).
- Integral Ad Science|Digital ad Tech & Verification. Available online: https://integralads.com/uk/ (accessed on 4 November 2021).
- Pixalate: Ad Fraud Protection, Privacy, and Compliance Platform (CTV). Available online: https://www.pixalate.com (accessed on 4 November 2021).
- Ad Fraud Protect & Monitor: Stop Affiliate, Influencer Fraud. Available online: https://impact.com/protect-monitor/ (accessed on 4 November 2021).
- ClickGUARD™|Leading Click Fraud Protection Software. Available online: https://www.clickguard.com/ (accessed on 12 November 2021).
- Measurement, Analytics, & Brand Safety|Moat by Oracle Data Cloud. Available online: https://www.moat.com/ (accessed on 4 November 2021).
- Li, Z.; Zhang, K.; Xie, Y.; Yu, F.; Wang, X. Knowing your enemy: Understanding and detecting malicious web advertising. In Proceedings of the 2012 ACM Conference on Computer and Communications Security, CCS ’12, Association for Computing Machinery, New York, NY, USA, 16–18 October 2012; pp. 674–686.
- Kantardzic, M.; Walgampaya, C.; Yampolskiy, R.; Woo, R.J. Click Fraud Prevention via multimodal evidence fusion by Dempster-Shafer theory. In Proceedings of the 2010 IEEE Conference on Multisensor Fusion and Integration, Salt Lake City, UT, USA, 5–7 September 2010; pp. 26–31.
- Ge, L.; King, D.; Kantardzic, M. Collaborative Click Fraud Detection and Prevention System (CCFDP) Improves Monitoring of Software-Based Click Fraud. E-Commerce 2005, 34. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.420.8672&rep=rep1&type=pdf#page=53 (accessed on 12 November 2021).
- Haddadi, H. Fighting online click-fraud using bluff ads. ACM SIGCOMM Comput. Commun. Rev. 2010, 40, 21–25.
- Dave, V.; Guha, S.; Zhang, Y. Measuring and fingerprinting click-spam in ad networks. In Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, Association for Computing Machinery, New York, NY, USA, 13–17 August 2012; pp. 175–186.
- Nagaraja, S.; Shah, R. Clicktok: Click Fraud Detection Using Traffic Analysis. In Proceedings of the 12th Conference on Security and Privacy in Wireless and Mobile Networks, WiSec ’19, Association for Computing Machinery, New York, NY, USA, 15–17 May 2019; pp. 105–116.
- Basit, A.; Zafar, M.; Javed, A.R.; Jalil, Z. A Novel Ensemble Machine Learning Method to Detect Phishing Attack. In Proceedings of the 2020 IEEE 23rd International Multitopic Conference (INMIC), Bahawalpur, Pakistan, 5–7 November 2020; pp. 1–5.
- Jain, A.K.; Gupta, B.B. Towards detection of phishing websites on client-side using machine learning based approach. Telecommun. Syst. 2017, 68, 687–700.
- Thejas, G.S.; Dheeshjith, S.; Iyengar, S.S.; Sunitha, N.R.; Badrinath, P. A hybrid and effective learning approach for Click Fraud detection. Mach. Learn. Appl. 2020, 3, 100016.
- Thejas, G.S.; Soni, J.; Boroojeni, K.G.; Iyengar, S.S.; Srivastava, K.; Badrinath, P.; Sunitha, N.R.; Prabakar, N.; Upadhyay, H. A multi-time-scale time series analysis for click fraud forecasting using binary labeled imbalanced dataset. In Proceedings of the 2019 4th International Conference on Computational Systems and Information Technology for Sustainable Solution (CSITSS), Bengaluru, India, 20–21 December 2019; Volume 4, pp. 1–8.
- Haider, C.M.R.; Iqbal, A.; Rahman, A.H.; Rahman, M.S. An ensemble learning based approach for impression fraud detection in mobile advertising. J. Netw. Comput. Appl. 2018, 112, 126–141.
- Snyder, P.; Kanich, C. No Please, After You: Detecting Fraud in Affiliate Marketing Networks. In WEIS; 2015; Available online: https://www2.cs.uic.edu/~ckanich/papers/snyder2015noplease.pdf (accessed on 12 November 2021).
- Best Practices for Ad Placement—Google AdSense Help. Available online: https://support.google.com/adsense/answer/1282097?hl=en (accessed on 4 November 2021).
- About Confirmed Click—Google AdMob Help. Available online: https://support.google.com/admob/answer/10094971?hl=en#zippy=%2Chow-can-i-fix-accidental-clicks-on-my-ad-units (accessed on 4 November 2021).
- Jakobsson, M.; Ramzan, Z. Crimeware: Understanding New Attacks and Defenses; Addison-Wesley Professional: Boston, MA, USA, 2008.
- A Digital Publisher’s Guide to Measuring and Mitigating Non-Human Traffic. Available online: https://businessdocbox.com/Advertising/74441712-A-digital-publisher-s-guide-to-measuring-and-mitigating-non-human-traffic.html (accessed on 4 November 2021).
- Bot Baseline: Fraud in Digital Advertising. Available online: https://www.ana.net/miccontent/show/id/rr-2019-bot-baseline (accessed on 4 November 2021).
- Stone-Gross, B.; Stevens, R.; Zarras, A.; Kemmerer, R.; Kruegel, C.; Vigna, G. Understanding fraudulent activities in online ad exchanges. In Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, Association for Computing Machinery, New York, NY, USA, 15–19 August 2011; pp. 279–294.
- Kienle, H.M.; German, D.; Muller, H. Legal concerns of web site reverse engineering. In Proceedings of the Sixth IEEE International Workshop on Web Site Evolution Proceedings, Chicago, IL, USA, 11 September 2004; pp. 41–50.
- Chen, G.; Cox, J.H.; Uluagac, A.S.; Copeland, J.A. In-depth survey of digital advertising technologies. IEEE Commun. Surv. Tutorials 2016, 18, 2124–2148.
- Dörnyei, K.R. Marketing professionals’ views on online advertising fraud. J. Curr. Issues Res. Advert. 2020, 42, 156–174.
- Fulgoni, G.M. Fraud in digital advertising: A multibillion-dollar black hole: How marketers can minimize losses caused by bogus web traffic. J. Advert. Res. 2016, 56, 122.
- Kshetri, N.; Voas, J. Online advertising fraud. Computer 2019, 52, 58–61.
- Gordon, B.R.; Jerath, K.; Katona, Z.; Narayanan, S.; Shin, J.; Wilbur, K.C. Inefficiencies in digital advertising markets. J. Mark. 2020, 85, 7–25.
- Kanei, F.; Chiba, D.; Hato, K.; Yoshioka, K.; Matsumoto, T.; Akiyama, M. Detecting and understanding online advertising fraud in the wild. IEICE Trans. Inf. Syst. 2020, E103, 1512–1523.
- Lee, H.; Cho, C.-H. Digital advertising: Present and future prospects. Int. J. Advert. 2019, 39, 332–341.
- Kietzmann, J.; Paschen, J.; Treen, E. Artificial intelligence in advertising: How marketers can leverage artificial intelligence along the consumer journey. J. Advert. Res. 2018, 58, 263–267.
- Qin, X.; Jiang, Z. The impact of AI on the advertising process: The Chinese experience. J. Advert. 2019, 48, 338–346.
- Chen, G.; Xie, P.; Dong, J.; Wang, T. Understanding programmatic creative: The role of AI. J. Advert. 2019, 48, 347–355.
- Deng, S.; Tan, C.-W.; Wang, W.; Pan, Y. Smart Generation system of personalized advertising copy and its application to advertising practice and research. J. Advert. 2019, 48, 356–365.
- Malthouse, E.C.; Hessary, Y.K.; Vakeel, K.A.; Burke, R.; Fudurić, M. An algorithm for allocating sponsored recommendations and content: Unifying programmatic advertising and recommender systems. J. Advert. 2019, 48, 366–379.
- Alcantara, C.; Schaul, K.; Vynck, G.D.; Albergotti, R. How Big Tech Got So Big: Hundreds of Acquisitions. Available online: https://www.washingtonpost.com/technology/interactive/2021/amazon-apple-facebook-google-acquisitions/ (accessed on 4 November 2021).
- Lai, Z. Research on advertising core business reformation driven by artificial intelligence. J. Phys. Conf. Ser. 2021, 1757, 012018.
- Li, H. Special section introduction: Artificial intelligence and advertising. J. Advert. 2019, 48, 333–337.
- Manheim, K.; Kaplan, L. Artificial intelligence: Risks to privacy and democracy. Yale JL Tech. 2019, 21, 106.
- Vlačić, B.; Corbo, L.; Costa e Silva, S.; Dabić, M. The evolving role of artificial intelligence in marketing: A review and research agenda. J. Bus. Res. 2021, 128, 187–203.
- Juniper Research: Advertising Fraud Losses to Reach $42 Billion in 2019, Driven by Evolving Tactics by Fraudsters. 2019. Available online: https://www.businesswire.com/news/home/20190520005650/en/Juniper-Research-Advertising-Fraud-Losses-to-Reach-42-Billion-in-2019-Driven-by-Evolving-Tactics-by-Fraudsters (accessed on 4 November 2021).
- Li, Y. Deep reinforcement learning: An overview. arXiv 2017, arXiv:1701.07274.

Category | Subcategory |
---|---|
Placement Fraud | Malvertising |
Placement Fraud | Stuffing and Stacking |
Placement Fraud | Fake Sites |
Placement Fraud | Domain Spoofing |
Placement Fraud | Ad Injection and Malware |
Traffic Fraud | Impression Fraud |
Traffic Fraud | Click Fraud |
Action Fraud | Conversion Fraud |
Action Fraud | Re-targeting Fraud |
Action Fraud | Affiliate Fraud |


Type of Fraud | Does What (Sub-Type of Fraud) | Who (Fraudster) | To Whom (Victim) | How (Objective) | Ref. |
---|---|---|---|---|---|
Placement Fraud (ads and ad-related content are placed on a legitimate publisher’s website or on a site set up by a fraudster, with the goal of inflating the number of ad clicks and/or impressions) | Stuffing: Keyword Stuffing | Dishonest Publisher | Advertiser | | [62,63] |
Placement Fraud | Stuffing: Placement Stuffing | Dishonest Publisher | Advertiser | | [62,63] |
Placement Fraud | Stacking | Dishonest Publisher | Advertiser | | [62,63] |
Placement Fraud | Domain Spoofing/Fake Sites | Fraudster | Advertiser/User | | [64,65] |
Placement Fraud | Malicious Toolbar/Malicious Adware | Fraudster/Dishonest Publisher | Advertiser/User/Premium Publisher | | |
Placement Fraud | Ad/Content Injection | Dishonest Publisher/Deceitful ISP/Malicious Competitor Publisher | Advertiser/User/Publisher | | [66] |
Traffic Fraud (techniques that deploy non-genuine web visitors to inflate the number of clicks and/or impressions on a website) | Impression Fraud | Dishonest Publisher/Malicious Competitor Advertiser | Advertiser | | [67,68,69,70,71] |
Traffic Fraud | Click Fraud: Publisher Click Inflation | Dishonest Publisher/Malicious Competitor Advertiser | Advertiser | | [67,68,69,70,71] |
Traffic Fraud | Click Fraud: Advertiser Competition Clicks | Dishonest Publisher/Malicious Competitor Advertiser | Advertiser | | [67,68,69,70,71] |
Action Fraud (techniques that falsify user actions or mislead users into performing certain actions so as to generate revenue for the fraudster) | Conversion Fraud | Dishonest Publisher/Malicious Advertiser/Fraudster | Advertiser/User | | [5,72,73] |
Action Fraud | Re-targeting Fraud | Fraudster/Dishonest Publisher | Advertiser | | |
Action Fraud | Affiliate Fraud: Malware and Adware | Nefarious Affiliate | User/Advertiser | | [2,5,72,73] |
Action Fraud | Affiliate Fraud: Cookie Stuffing | Nefarious Affiliate | User/Advertiser | | [2,5,72,73] |
Action Fraud | Affiliate Fraud: URL Hijacking | Nefarious Affiliate | User/Advertiser/Impersonated Publisher | | [2,5,72,73] |


Threat Category | Threat | Mitigation Strategy | Key Points | Ref. |
---|---|---|---|---|
Placement Fraud | Malvertising | MadTracer: topology-based detection model | | [83] |
Placement Fraud | Ad Injection | A client-side DOM scanner | | [66] |
Placement Fraud | Fake Website/Phishing Attack | Ensemble model of Artificial Neural Network, K-Nearest Neighbours, and C4.5 with Random Forest Classifier (RFC) | | [88] |
Placement Fraud | Phishing Website | Random Forest, Support Vector Machine, Neural Networks, Logistic Regression, Naïve Bayes | | [89] |
Traffic Fraud | Click Fraud | CFXGB (Cascaded Forest and XGBoost), feature transformation and classification | | [90] |
Traffic Fraud | Click Fraud | Traffic analysis | | [87] |
Traffic Fraud | Click Fraud | Multi-time-scale Time Series Analysis | | [91] |
Traffic Fraud | Impression Fraud | Ensemble Learning, Decision Tree classifier and Support Vector Machine | | [92] (This study introduces a novel impression fraud detection model in mobile advertising. There is a lack of prior research studies on the topic.) |
Action Fraud | Cookie Stuffing | Tracker: a custom-built Chrome extension | | [61] |
Action Fraud | Cookie Stuffing | Decision-tree-based technique | | [93] |


Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sadeghpour, S.; Vlajic, N. Ads and Fraud: A Comprehensive Survey of Fraud in Online Advertising. J. Cybersecur. Priv. 2021, 1, 804-832. https://doi.org/10.3390/jcp1040039