1. Introduction
The introduction of the General Data Protection Regulation (GDPR) has had a significant impact on the online advertising industry, giving users more control over their personal data [
1]. A key feature of the GDPR is its broad applicability, extending beyond European companies to all global companies that process the data of EU citizens. The GDPR strengthens consumers’ rights to control their data, but its implementation, particularly in relation to cookie banners, presents challenges for companies and requires significant investments in legal, privacy and web development resources [
2].
The IAB Europe’s Transparency and Consent Framework (TCF:
https://iabeurope.eu/transparency-consent-framework/, accessed on 18 December 2023) was developed to address GDPR complexities, particularly in managing user data processing permissions. The TCF provides standardization guidelines, but lacks specific directions for cookie banner implementation, leading users through a complex consent process. Companies commonly employ tactics like deceptive dark patterns and cookie paywalls to encourage cookie acceptance. These include making the “Accept all” option more prominent in cookie banners and implementing complex procedures for users opting out. Cookie paywalls pose another indirect influence, where access to content is conditional on data processing consent or subscription payment [
3,
4].
Despite the GDPR’s efforts, users often face tedious decision-making processes with cookie banners. Providing comprehensive information can lead to unappealing banners, conflicting with users’ desire for quick website access. The time-consuming nature of understanding and responding to these banners contributes to the privacy paradox, where privacy-concerned users often consent to all tracking cookies [
5].
Personal Information Management Systems (PIMSs) (
https://edps.europa.eu/data-protection/our-work/subjects/personal-information-management-system_en, accessed on 18 December 2023), also referred to as Personal Information Management Services, have emerged as a solution to these challenges, providing centralized tools for organizing personal information and facilitating informed decision making [
1]. A PIMS ideally automates the updating and communication of user preferences to data controllers and processors, but faces technical and regulatory obstacles, including establishing connections and consensus on a common technical language. Currently, PIMS implementations are very basic, mainly as browser extensions for Google Chrome and Mozilla Firefox. These extensions manage consent decisions by blocking cookie banners, opting out of data processing activities, or automatically responding to data processing permission requests. However, the use of extensions that block all cookie banners may prevent consent for all data processing, potentially affecting website functionality.
In our preceding work [
6], we presented an initial proof of concept illustrating how a PIMS, augmented by a Large Language Model (LLM), can analyze the contents of cookie banners and formulate interaction rules. Building upon this foundation, the current study advances the concept by delineating the specifications for a User Privacy Policy Management System (UPPMS). The UPPMS, a specialized derivative of a PIMS, is designed to aid users in crafting, generating, and implementing their personal privacy policies.
We introduce a standardized methodology for users to define their cookie policy preferences. This has enabled the development of a framework for rule generation that incorporates customized LLMs. We delineate a systematic process for the application of these rules. This approach streamlines the application of cookie policy preferences and helps to avoid deceptive patterns. Additionally, we conducted an experiment to test user behavior when visiting a website with cookie banners for the first time. We compared the time it takes for humans and the UPPMS to express preferences and close the banner.
The remainder of this article is structured as follows.
Section 2 introduces Large Language Models and examines the context related to user privacy signals and the privacy paradox.
Section 3 discusses related works, providing insights into existing research and developments in the field.
Section 4 describes the requirements and how the UPPMS, leveraging a Large Language Model, enables an end-to-end process from defining the user’s policy to its application during navigation. The experiments are described in
Section 5, and the results are presented in
Section 6.
Section 7 discusses the implications of the study. Finally,
Section 8 draws conclusions and proposes future research directions.
2. Background
2.1. Large Language Models
Large Language Models (LLMs) represent an advanced class of machine learning models, particularly in the field of Natural Language Processing (NLP). They are built upon deep neural architectures, such as Transformers, which have demonstrated remarkable capabilities in understanding, generating, and manipulating natural language [
7].
The concept of Transformers, forming the foundation of LLMs, was introduced by Vaswani et al. [
8]. This work laid the groundwork for the development of large-scale language models such as GPT (Generative Pre-trained Transformer), which undergoes extensive unsupervised pre-training followed by supervised fine-tuning on specific tasks [
9]. Training these models involves extensive exposure to vast text corpora, allowing them to acquire a profound grasp of language subtleties and a broad general knowledge. These models excel in generating coherent and contextually relevant texts, addressing queries, translating languages, and executing text comprehension tasks. For instance, in [
10], it has been demonstrated that LLMs effectively comprehend HTML, significantly enhancing accuracy and efficiency in tasks such as semantic classification of HTML elements, description generation for HTML inputs, and autonomous web navigation.
In our work, we employ customized LLMs to perform tasks that necessitate a syntactic and semantic understanding of a web page. We fine-tune LLM models to identify cookie banners within a web page and, through text comprehension, discern the buttons on the cookie banners and the types of actions they enable for accepting or rejecting cookies.
2.2. Cookies
Cookies are small pieces of data that are stored in a user’s web browser. They are also referred to as web cookies, browser cookies, Internet cookies, or HTTP cookies. While different taxonomies for cookies exist, especially from a technical perspective, a detailed discussion of cookie categorization is beyond the scope of this article. What is relevant to our discussion, however, is the differentiation between technical cookies and profiling cookies established by applicable laws regarding cookies. Technical cookies primarily ensure the appropriate functioning of a website and are also referred to as “strictly necessary”. In contrast, profiling cookies are cookies that gather non-anonymized information that enables the tracking of a person’s browsing activity even across various devices. They are not treated equally from a regulatory standpoint. According to the EU’s GDPR, strictly necessary cookies can be stored without the user’s explicit consent as long as a notice is displayed. In practice, the level of detail in the categories of cookies presented to users can vary.
Table 1 shows cookies categorized by the most common purposes. With “strictly necessary” cookies, we refer to all cookies that are essential for the basic functioning of the website. Without these cookies, requests cannot be properly delivered. For instance, they enable mechanisms such as authentication and load balancing of requests. Functional cookies, on the other hand, involve other website functionalities that are not essential but can improve the user experience. For example, they enable the personalization of user interactions by remembering language preferences or location to customize content delivery.
The other main types of non-essential cookies are performance, analytics, and targeting cookies. Performance cookies anonymously collect user interaction data, aggregating it for performance analysis. They contribute to optimizing aspects such as website visit duration and loading speeds, thereby aiding in performance enhancement. Analytics cookies delve into user behavior, providing website owners with insights to comprehend and improve website interactions. Primarily first-party, these cookies focus on enhancing user experience by analyzing and optimizing user engagement with the website. Targeting cookies, predominantly third-party, are designed to deliver relevant advertisements based on user profiles. They follow users across different websites to enable targeted advertising. This classification also recognizes the existence of cookies that may defy these categories or have yet to be classified. Such cookies are often labeled as “Other” or “Unclassified” in consent banners.
2.3. Cookie Banners
To store types of cookiesther than strictly necessary, publishers must explicitly request users’ consent for their use. In most cases, this explicit consent is implemented by displaying a cookie consent banner, informing users of the types of cookies that may be stored, and giving them the possibility to accept or reject them according to their preferences. It is well known that cookie banners are not always impartial [
11]. Often, these banners incorporate design elements that subtly encourage users to accept all cookies through deceptive patterns [
12]. These nudges are also known as dark patterns [
13].
Deceptive patterns highlight the buttons that grant permission for all purposes specified by the publisher, making them more visible to users. This design may include prominently presenting an “accept all” option on the first layer of the cookie banner, which is the part that the user sees immediately. In contrast, denying permission for data processing is often less straightforward, as many cookie banners do not provide a “deny all” button, making the process more complex for users who wish to opt out. This discrepancy in design between accepting and denying cookies serves as a nudge towards easier acceptance of all cookies, potentially influencing user behavior towards less privacy-conscious decisions [
14,
15].
Cookie paywalls represent another notable behavior by companies, perceived as a subtle means to influence user decisions. This practice has been adopted by some publishers as a response to the limitations imposed by the GDPR. Cookie paywalls operate on the principle that users can access the publisher’s content if they either provide permission for their personal data to be processed for online advertising purposes, which often includes profiling and targeting or pay a subscription fee to the publisher, often including an ad-free experience.
Even though there are no explicit patterns attempting to influence user behavior, the decision-making process could become tedious [
16]. The banner must provide comprehensive information, but this requirement can lead to boring banners that conflict with the user’s desire for quick access to the website’s content. It has been calculated in [
1] that a user visits an average of 2.49 new publishers per day. In a worst-case scenario, it would take approximately 79.13 min per day to make all possible decisions regarding permission to process data. Furthermore, opting out is much more complicated than opting in, and in particular, changing or revoking previously accepted settings is very difficult [
17].
The time and effort required to read and understand the information on cookie banners, coupled with the need to make multiple decisions, can be burdensome for users. This may explain why even users who claim to be concerned about their privacy accept all tracking cookies at the same time—this is the so-called privacy paradox.
2.4. Privacy Paradox
The privacy paradox, as discussed in the literature, refers to the contradiction between users’ expressed concerns regarding online privacy and their actual behavior, which often does not align with these concerns [
18]. This phenomenon is observable in various contexts. For instance, Iacono et al. [
19] highlight this trend in recent years, where there is a growing concern about cybercrime but a reduced inclination to take measures to protect against such risks. The cause of the privacy paradox is a subject of much debate in the academic literature. Most researchers identify decision biases, lack of experience, and the illusion of control as possible explanations for the privacy paradox. Users may have a distorted perception of their ability to control personal information online, leading to behavior inconsistent with their privacy concerns [
5]. A recent empirical study also suggests that user behavior may vary depending on age and medical conditions [
20].
On the other hand, some works argue that the privacy paradox is a myth created by flawed logic. They contend that people’s everyday behavior is not an accurate indicator of their preferences because it is distorted by biases, heuristics, manipulations, and other factors [
21]. This perspective suggests that regulations focused on how information is used, retained, and transferred could have a more significant impact on this issue. Additionally, in [
22], it has been highlighted that consumers fundamentally care about online privacy and provide evidence of numerous actions they take to protect it. However, in some cases, achieving desired levels of privacy is prohibitive or even undesirable through individual actions, necessitating political intervention for privacy protection.
We think that the privacy paradox may be rooted in the way cookie banner interfaces are designed and the process required to express preferences. Possible reasons behind the privacy paradox in this case could include decision fatigue, information overload, and the influence of dark patterns. Under these assumptions, even users who are attentive to their online privacy may passively accept cookies.
A company adopting a standard like the Transparency and Consent Framework (TCF), which is GDPR-compliant, is not obliged to explicitly follow any cookie banner template. Currently, the TCF is one of the most widely used standards as it is built by the industry for the industry (
https://iabeurope.eu/transparency-consent-framework/, accessed on 18 December 2023).
2.5. Transparency and Consent Framework
Privacy preference signals are digital representations of user choices for processing personal data online. These signals have evolved in response to changing data privacy concerns and regulations. Hils et al. [
23] discern two waves of privacy preference signals. The initial wave featured early standardization attempts, exemplified by platforms like the Platform for Privacy Preferences (P3P) and initiatives like Do Not Track (DNT) and the Network Advertising Initiative’s (NAI) opt-out standards. These were primarily browser-centric and aimed at establishing universal settings across web interactions. The subsequent wave, influenced by legal frameworks such as the California Consumer Privacy Act (CCPA) and the General Data Protection Regulation (GDPR), introduced signals like the Global Privacy Control (GPC) and the Transparency and Consent Framework (TCF). Unlike their predecessors, GPC and TCF are more industry-driven, with a specific focus on user consent within the realms of online advertising and data processing. The TCF, aligning with GDPR principles, facilitates the communication of users’ consent choices to publishers and advertisers, marking a notable departure from the earlier challenges faced by the browser-centric signals of the first wave. TCF is currently the most widely adopted specification, and projections suggest that this trend will continue in the future [
23]. The TCF includes several actors including Publishers, Vendors, Consent Management Platforms (CMPs), Global Vendor List (GVL), and the Interactive Advertising Bureau Europe (IAB Europe) itself (
https://iabeurope.eu/iab-europe-transparency-consent-framework-policies/, accessed on 18 December 2023). Publishers, the digital property operators, bear the responsibility of presenting the framework’s user interface to consumers and establishing legal bases, including user consent, for vendors processing personal data derived from user visits to their content. Vendors, in this context, refer to companies involved in digital advertising delivery or related online activities within a publisher’s digital domain. These entities may access end-user devices or process the personal data of visiting users, with their roles under GDPR varying as either controllers, processors, or both, depending on specific scenarios.
Consent Management Platforms (CMPs) play a crucial role as mediators, centralizing and overseeing end-user transparency, consent, and objections. These platforms, which can be either privately operated by publishers for their purposes or commercially available for others, manage the legal basis statuses of vendors listed on the Global Vendor List (GVL). They facilitate the establishment of legal bases for processing, acquire necessary user consent, manage user objections, and communicate these statuses within the digital ecosystem.
Overseeing the TCF is the Interactive Advertising Bureau Europe (IAB Europe), which administers and governs the Framework, including its policies, specifications, and the GVL. IAB Europe’s role is dynamic, involving periodic policy updates to ensure the Framework’s continued efficacy. The Global Vendor List, maintained by IAB Europe, catalogs vendors registered to participate in the framework. This list is pivotal for CMPs, publishers, and individual vendors, its structure and content being defined by the framework’s specifications. Collectively, these actors form the backbone of the TCF, ensuring compliance and facilitating a transparent digital advertising environment.
To understand what happens under the TCF, we describe a concrete scenario in which User X visits the website of Publisher A for the first time. This process can be abstractly described through the following sequence of actions.
User X arrives at Publisher A’s website.
Publisher A contacts its Consent Management Platform (CMP), CMP A, using a CMP tag—a JavaScript tag added to the website.
CMP A’s code runs on the page and checks if there is a Transparency Consent (TC) string in User X’s local storage corresponding to Publisher A.
Since it is User X’s first visit, no TC string is found. Then, CMP A displays a cookie banner.
User X makes consent choices for each purpose listed on the cookie banner.
CMP A creates a TC string for User X and Publisher A by encoding the consent information according to TCF standards.
Publisher A stores the TC string locally on User X’s device.
Publisher A uses its CMP API to decode the TC string, understanding which purposes User X allows it to pursue.
Subsequently, when User X revisits Publisher A, steps 1–3 are repeated to check the existing consent preferences, but steps 4–7 will be skipped, proceeding directly to step 8.
Our proposal integrates into the Transparency and Consent Framework (TCF), taking the place of the user in the interaction with cookie banners, as illustrated in
Figure 1. Once the user has defined their privacy policy within the UPPMS, the system intercepts the CMP request and subsequently responds on behalf of the user according to the established policy.
3. Related Work
Most of the systems presented in the literature that aim to assist users with privacy issues using automated techniques focus on privacy policies rather than privacy preference signals. In [
24], automatic assessment of privacy policies using machine learning, natural language processing, and manual annotation is proposed. The use of these technologies for the same purpose is explored in [
25]. Deficiencies of policies in terms of readability and ambiguity are evaluated in [
26], while [
27] investigates how language technologies can support users in better understanding these policies. In [
28], machine learning-based strategies for automating GDPR compliance checks in data processing agreements are introduced.
The Platform for Privacy Preferences (P3P:
https://www.w3.org/TR/P3P/, accessed on 18 December 2023) was the first attempt to standardize privacy preferences in a machine-readable format. The P3P facilitated the development of the first user agents for managing privacy preference signals. One example was Privacy Bird (
http://www.privacybird.org/, accessed on 18 December 2023), a browser add-on that analyzes the privacy policies of websites and compares them to the user’s personal preferences. Other more rigid approaches include the Mozilla Firefox add-on Targeted Advertising Cookie Opt-Out (TACO), which provided an updated opt-out cookie list but deleted all current cookies. However, the P3P protocol became obsolete because of limited adoption by AdTech vendors, making P3P-based solutions outdated. Additionally, privacy advocates criticized P3P for failing to impose consequences for false privacy practice reports (
https://archive.epic.org/reports/prettypoorprivacy.html, accessed on 18 December 2023).
Following P3P’s obsolescence, browsers like Mozilla Firefox incorporated alternative measures like Do Not Track (DNT). However, DNT faced similar challenges as P3P, particularly when major AdTech vendors chose to ignore DNT signals. This decision was influenced by Microsoft’s default activation of DNT in Internet Explorer (
https://www.iab.com/news/do-not-track-set-to-on-by-default-in-internet-explorer-10iab-response/, accessed on 18 December 2023). Over time, even pro-privacy browsers like Firefox ceased supporting DNT (
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/DNT, accessed on 18 December 2023). New browsers and add-ons, such as Brave (
https://brave.com/web-standards-at-brave/4-global-privacy-control/, accessed on 18 December 2023) and Privacy Badger (
https://www.eff.org/gpc-privacy-badger, accessed on 18 December 2023), have adopted the Global Privacy Control (GPC) signal, although skepticism remains about its widespread adoption by AdTech vendors [
23].
The General Data Protection Regulation (GDPR) necessitated compliance strategies for websites, leading to the emergence of Consent Management Platforms (CMPs) and “Consent as a Service” solutions. Quantcast (
https://www.quantcast.com/, accessed on 18 December 2023) and similar companies have facilitated TCF adoption, paralleling the impact of Let’s Encrypt (
https://letsencrypt.org/, accessed on 18 December 2023) on HTTPS adoption [
23]. In this landscape, PIMSs have emerged, offering centralized management of personal data [
1]. However, current PIMSs offer limited functionality, primarily focused on managing consent decisions. These tools, generally available as browser extensions, face challenges in addressing the variability of consent requirements across different websites.
Two of the most popular browser extensions are “I Don’t Care About Cookies” (
https://www.i-dont-care-about-cookies.eu/, accessed on 18 December 2023), which simplifies the cookie consent process by granting permission for all or only necessary cookies, and “Never-Consent” (
https://www.ghostery.com/blog/never-consent-by-ghostery, accessed on 18 December 2023) by Ghostery, which removes cookie pop-ups and expresses disagreement with online tracking. The technical specification for Advanced Data Protection Control (ADPC:
https://www.dataprotectioncontrol.org/, accessed on 18 December 2023) provides an alternative approach that permits users to establish general or specific consent indications, while authorizing publishers to solicit consent via TCF or specific requests. While ADPC is still in its developmental stages, a prototype is accessible for Firefox and Chromium-based browsers.
The history of privacy preference signals indicates that websites and AdTech vendors have been hesitant to adopt a unified signal. Indeed, it is expected that investment in TCF signals will continue [
23]. Within this future perspective, our proposed User Privacy Policy Management System stands out as a valuable tool in supporting users. This system automatically handles consent requests using a user’s policy within the TCF framework, reducing the time users spend declaring their preferences and minimizing their exposure to unfair practices by publishers.
4. User Privacy Policy Management System
A User Privacy Policy Management System (UPPMS) must provide users with three fundamental functionalities:
Offer the user a direct and understandable means to define their privacy policy;
Generate rules based on the user’s preferences to implement the specified policy;
Automatically enforce the user’s policy during navigation.
Figure 2 provides an abstract representation of the logical elements interacting with the UPPMS.
The UPPMS generates various rules corresponding to different sequences of actions required to apply diverse types of policies on a specific website, utilizing an LLM. These rules are stored in a distributed database. Users declare their policies within the UPPMS. Subsequently, as the user visits each page, the UPPMS retrieves and applies the necessary rules to express the user’s preferences.
Given a list of websites, there is a finite number of policies that may be applied to a particular website. Rules for implementing these policies could be generated at any time. If a user intends to apply a policy for which there are no generated rules, these rules could be generated lazily and added later to the distributed database accessible to all users. The list of policies that are not applicable is also stored in the distributed rule database. In the event that a user with a particular policy accesses a website where the policy cannot be applied (for example, there is no sequence of actions that enables policy application), the user is informed that the website does not support policy application.
The remainder of this section describes how users declare their policies, how rules are generated, and how rules are applied.
4.1. User Policy Definition
User privacy policies should be standardized by establishing a default setting system that reflects different levels of privacy and data usage. This will provide users with a simplified decision-making process. A list of user privacy policies based on the categories of cookies described above is presented in
Table 2.
The REJECT ALL policy is the most restrictive, permitting only strictly necessary cookies. Conversely, the ACCEPT ALL policy is the most permissive and allows for any type of cookie. Between these policies are varying combinations of acceptance regarding analytics, functional, performance, and targeting cookies. For example, the policy labeled ACCEPT P provides authorization solely for performance cookies, whereas the policy labeled ACCEPT F-P permits functional and performance cookies but denies analytics and advertising cookies.
A standardized schema for defining user privacy policies would allow users to easily select a privacy level that matches their personal preferences. However, not all users have the computer literacy to directly define their policies. To bring all user types to common ground, a step-by-step approach must be used to guide them through the process of defining privacy policies.
Figure 3 illustrates a policy definition procedure that assists the user.
This process offers two options: a quick route for experienced users to select the policy directly and a step-by-step wizard to guide other users. See
Appendix A for an example of how to present the wizard to users.
4.2. Rules Generation
To apply a specific user policy, it is necessary to execute a sequence of actions (the rules) that must be applied to a banner. The rule generation process is thus an iterative process aimed at identifying the sequence of actions necessary to close a cookie banner while expressing the user’s preferences.
The process for obtaining the list of actions, as depicted in
Figure 4, involves several steps. Initially, the source code of the targeted web page is extracted (step 1). This code is then input into an LLM to identify the selector for locating the cookie banner (step 2). Once the source code of the cookie banner is extracted (step 3), it is fed into another LLM, along with the user’s policy. This specialized LLM interprets the text and the semantic meaning of the banner elements, determining the action required to progress toward the goal of applying the policy. It returns the selector for the element that needs to be clicked to execute the correct action (step 4). This selector is added to a list of actions that will be provided as output upon completion of the process (step 5). The web page is then revisited, executing all the actions (step 6). The state of the page after performing all the actions is captured, and the iterative process continues with the next iteration (step 7).
This iterative procedure allows for step-by-step navigation through various banner screens until the objective is achieved, if possible. In the case of success, the list of actions required to apply the specific policy to the particular website is returned. This information is stored in a distributed database and made available to users upon request.
The inability to achieve the goal primarily occurs in two cases, as identified by the LLMs. The first scenario involves the absence of the cookie banner. The second is related to the fact that the cookie banner does not allow the application of the user’s policy. For example, there might be no buttons for making choices, or it could be a cookie paywall. In both cases, this information will be stored in the database, and the user will be notified of the scenario when visiting the specific website.
4.3. Enforcement of the Rules
The process of applying a policy is less complex than the process of generating one. The UPPMS, after guiding the user through generating their policy as previously described, will track the user’s policy through an identifier called the policy ID. For instance, in Listing 1, an example is provided of how a user’s policy might be represented, where the user intends to reject all cookies.
Listing 1. Example of a user policy rejecting all cookies.
|
{ "version": "1.0.0", "user_policy_id": "REJECT_ALL" } |
When a user visits a web page, the UPPMS will conduct a direct search within the rules database using the website URL and the user’s policy as keys. At this point, the sequence of rules to apply to enforce the user’s policy will be retrieved.
Two particular cases need consideration during the policy application, concerning the inapplicability of the policy and the absence of a rule in the database. The first case occurs when the policy is identified in the database but is not applicable to the specific website due to the website’s limitations. In this scenario, based on the user’s preferences, the most restrictive policy, the most permissive policy, or no choice at all might be applied. It could be left to the user to interact with the banner.
In the second case, if the rule being attempted for application is no longer valid (e.g., due to changes in the website’s banners or the presence of a cookie paywall) or if the rule has never been generated for that specific website, a notification will be shown to the user. The user will be informed that UPPMS cannot be used for a specific reason, and a request will be sent to a remote server responsible for generating the rule. Even though the current user may not benefit from UPPMS support in this instance, the generated rule will be made available for future users.
5. Experiments
The main goals of the experimental phase were to observe the behavior of users during their first visit to a website that implements a cookie banner compliant with the TCF, and to measure the average time users take to implement a policy (i.e., a set of actions indicating a preference) on the cookie banner. The time taken by users to express their policy is then compared to the time taken by UPPMS.
The Quantcast website was chosen as the target website for this experiment for the following reasons:
Users were asked to identify which products were available on the Quantcast website. They were given no information about the experiment, with the justification that this was a preliminary phase. In reality, the purpose of this request was to focus their attention on a specific task and observe their initial interaction with the banner upon entering an unknown website.
To ascertain if users read the banner before taking action, it is assumed that users can read at a rate of 238 words per minute in silent reading. Given that the Quantcast website’s initial screen displays 152 words, it is considered that users did not read it if they performed the first action in less than 38 s. Once the users closed the banner, the purpose of the experiment was clarified, and the second phase of measuring the timing of expressing specific choices was initiated.
Users were then asked to apply the previously defined ACCEPT ALL, REJECT ALL, and ACCEPT P policies to the Quantcast website banner. In this specific case, ACCEPT P is equivalent to accepting cookies only for measuring content performance.
We evaluated the time required to implement the same policies using the proposed UPPMS. The UPPMS was developed as a browser extension that enforces the policy during a visit to a website. To assess UPPMS performance, we monitored the execution time of the web extension code and computed statistics. Page visits were conducted over a broadband internet connection, with a delay time of 100 ms between actions.
For generating rules outlined in Listing 2, we employed two GPTs (
https://openai.com/blog/introducing-gpts, accessed on 18 December 2023)—custom versions of ChatGPT 4.0 designed to incorporate instructions, additional knowledge, and various skills. The first model specialized in identifying cookie banners, and locating the container ID that houses the banner on a given web page. The second model focused on interpreting banners to determine the subsequent rule for policy enforcement. Refer to
Appendix B for the instructions used during model fine-tuning.
Listing 2. JSON file with the rules used in the experiments on the Quantcast website. |
{ "www.quantcast.com": { "ACCEPT_P": [ ".css-1hy2vtq", "#Purposes-id\\:8 > button", "#Purposes-id\\:8 .qc-cmp2-toggle", ".qc-cmp2-footer .css-47sehv" ], "ACCEPT_ALL": [ ".css-47sehv" ], "REJECT_ALL": [ ".css-1hy2vtq", ".qc-cmp2-header-links > button.css-8rroe4:first-child", ".css-47sehv" ] } }
|
6. Results
The experimentation phase involved a sample of 16 individuals aged between 23 and 58 years. All participants were experienced with computers, as they use them on a daily basis for work. Among the participants, 73% had a bachelor’s or master’s degree, while the remaining percentage had a lower degree.
The experiments were conducted by navigating a website using a laptop browser and a broadband connection. This section presents the results of the analysis of user interaction with the Quantcast cookie banner and a comparison with the application times of the policies implemented by UPPMS.
6.1. Users’ Behavior
Figure 6 depicts the flow of user behavior during the experiment and their self-reported behavior in general. The first part distinguishes whether users read the initial information on the cookie banner and their first action on the banner. The subsequent section outlines the actions they reported taking and their privacy concerns.
Some 94% of the participants took their first action on the banner in less than 4 s. Only one participant read the banner carefully, taking more than 38 s, before making the initial choice on the banner. In the observed behavior of participants regarding their first interaction with the cookie banner, it was found that the majority (81%) initially clicked the “Agree” button. However, this action does not necessarily reflect their usual preferences. Only 23% of these individuals report regularly accepting cookies. Some 13% said that they have no fixed habits, but choose one of the fastest alternatives to remove the cookie banner, while a minority said that they would only reject cookies if the reject button was immediately visible. Another minority stated that they would leave the website if they could not locate the reject button. The majority of users who clicked the “Agree” button (46%) said that during the experiment they were focused on completing the task and therefore tried to close the banner as quickly as possible. Only 27% of the participants who said that they always refused all cookies really did so consistently over the course of the experiment.
6.2. Time Required to Apply Policies
The boxplot in
Figure 7 illustrates the average time taken to perform the required actions for the three different policies under investigation in the experiment. The execution of the actions required for the “ACCEPT ALL” policy generally demonstrates quicker and less variable times. In contrast, the times required to perform actions for the “REJECT ALL” and “ACCEPT P” policies exhibit greater variability and higher average durations. Particularly, “ACCEPT P” highlights a case with a significantly longer duration than the others.
Figure 8 shows the average time required to execute three policies, arranged in ascending order of complexity. Policy complexity is determined by the number of clicks needed from the user and the concentration required to understand the meaning of the clauses. On average, users spend several seconds performing a click, and this time increases with the number of required actions. In contrast, the UPPMS completes tasks within milliseconds. In detail, the UPPMS averages 0.011 s for the “ACCEPT ALL” policy, 0.203 s for the “REJECT ALL” policy, and 0.302 s for the “ACCEPT P” policy.
7. Discussion
The results from the experiment show that half of the participants behaved inconsistently with their stated usual practices. The main reason given was that the need to complete a specific task led them to opt for the quickest way to remove the banner, which in this case was the “Agree” button that allowed for the acceptance of all cookies. This tendency suggests a predominance of operational efficiency over individual privacy concerns.
Most users did not read the privacy policy on the initial page of the banner despite valuing their own privacy. This can be attributed to the information overload present in these banners. For instance, it was estimated that reading all the information on the first page of Quantcast’s cookie banner would take about 38 s.
Another factor contributing to this discrepancy between preference and actual choice is decision fatigue. Users find themselves overwhelmed by the abundance of choices, which pushes them towards the option perceived as less demanding.
Generally, we observed that users tended to adopt a binary strategy: they either accepted all cookies or rejected all. This behavior appears to be influenced by the design and usability of the banner itself, as often these inline options allow for quicker decision-making [
29].
In the Quantcast cookie banner, the “Reject All” button was located in a secondary interface. Some participants reported that it was counterintuitive to use this “Reject All” button, as they expected the banner to close automatically after selecting it. However, an additional click on the “Save and Exit” button was required. This particular difficulty was not experienced by users who typically always reject all cookies. These latter users were also generally faster at implementing the ACCEPT P policy.
Our findings suggest that decision fatigue and information overload may contribute to the privacy paradox in user interaction with cookie banners. This paradox refers to users choosing the quickest option to close banners, usually by accepting all cookies, despite potentially having personal preferences for greater privacy. Although our experimental results align with the findings in the literature, our sample has limitations that prevent it from being representative of all user types. Our sample consisted entirely of users with demonstrated computer experience, which may have also provided an optimistic estimate of the time required for users to implement the policies. It is important to note that a user with limited computer experience may face additional challenges when carrying out the actions required by a policy.
The interview method may also have contributed to the optimistic estimates. This is because users were asked to implement the REJECT ALL policy first and then the ACCEPT P policy. Some users went through all the clauses thoroughly to make sure they had rejected everything, thus familiarizing themselves with the banner. This behavior may have given them a temporary advantage when performing actions for the following policy. However, the average time to execute the ACCEPT P policy was still the longest overall. The adoption of a tool like UPPMS eliminates the need for users to interact directly with cookie banners. This avoids the complexity and deceptive patterns commonly found in cookie banners and has the potential to address the privacy paradox in this context.
We highlight, however, that the decision-making process associated with cookie banners can be burdensome, even without deceptive patterns. The introduction of structured processes for standardizing cookie policies can increase user awareness and automate interactions with consent requests. In this regard, we have introduced standardized policies and a process for acquiring user preferences that feature a few categories with descriptions comprehensible to the average user.
8. Conclusions
In the absence of clear cookie banner implementation guidelines, users go through a time-consuming and intricate preference expression process that ultimately directs them toward accepting cookies. Our experiment highlighted the privacy paradox in which most users, due to decision making fatigue and information overload, choose the quickest option to dismiss cookie banners (the “Reject All” button), contradicting their privacy preferences. The proposed system guides users in formulating their cookie policy and generates rules to enforce user preferences on all visited websites. As a result, users can enjoy a smooth browsing experience as intricate banners with deceptive patterns disappear automatically in less than a second.
The UPPMS is currently a simple browser extension. However, in the future, it is intended to become a cross-platform service usable on both mobile and desktop applications. The UPPMS can suggest products and services that align with the user’s privacy needs. For example, if a website’s terms do not align with user privacy concerns, the UPPMS may suggest alternative websites that better meet their privacy needs. A tailored version of the UPPMS could also be used in other contexts, such as user experience, to identify deceptive patterns within interfaces or assess the level of difficulty for users in expressing their preferences.