1. Introduction
Vishing, a portmanteau of “Voice” and “Phishing”, has evolved with advancements in technology, allowing attackers to employ additional phishing channels such as web, Short Message Service (SMS), and email for more sophisticated scams [1]. These techniques have continued to result in annual losses amounting to millions of dollars for both individuals and organizations [2,3]. The emergence of attacks utilizing vishing applications has significantly escalated the fraud landscape; leveraging these applications has been shown to incur losses up to ten times greater than those caused by traditional phishing campaigns that do not utilize such apps [4].
With the proliferation of vishing applications, the most commonly observed attack methods have been the Call Redirection Attack and the Display Overlay Attack [4]. The Call Redirection Attack redirects the number dialed by the user to the attacker’s phone number. However, since the attacker’s phone number would appear on the user’s outgoing call screen, this method is often combined with a Display Overlay Attack for added deception. The Display Overlay Attack conceals the attacker’s phone number by overlaying a fake waiting screen during incoming or outgoing calls [5]. Because the real phone number is masked by a counterfeit screen, users are less likely to notice discrepancies and more likely to divulge personal information without suspicion. Additionally, we define a new attack, the Duplicated Contacts Attack. This attack involves registering the attacker’s phone number under a contact name that matches a target contact’s name or editing an existing contact on the target’s device to insert the attacker’s phone number.
In previous research, due to the lack of clearly defined vishing attack techniques, researchers explored various dimensions of vishing and investigated user behavior in response to these threats. Based on these studies, research has been conducted to enhance users’ security awareness, focusing on enabling individuals to recognize and respond to phishing attempts. These include user studies analyzing why individuals respond (or do not respond) to vishing attempts [6], research aimed at increasing awareness of mobile scams through education [7], and detailed crime reports explaining how attackers succeeded in their fraudulent schemes [8]. However, the emergence of vishing applications has introduced new attack vectors, such as Call Redirection and Display Overlay Attacks, leaving gaps in the existing body of research. To fill these gaps, an approach involving modifications to the Android Operating System (OS) was proposed, which monitors specific Application Programming Interfaces (APIs) to detect malicious behaviors [4]. However, this method requires users to directly alter the Android OS, which poses a significant barrier to adoption. Additionally, it is limited to monitoring specific APIs used in malicious techniques. If Google introduces new APIs with similar functionalities, this approach would fail to detect them. There has also been research utilizing Large Language Models (LLMs) to assist victims in responding to vishing attempts. However, these approaches face challenges due to limited training data, which introduces biases and reduces the reliability of the results [9].
In this work, Ventinel focuses on monitoring the user’s screen to defend against Display Overlay and Call Redirection Attacks. When an incoming call is detected, Ventinel uses Optical Character Recognition (OCR) to compare the number displayed on the user’s screen with the incoming number to detect the Display Overlay Attack. Additionally, when an outgoing call is initiated, Ventinel compares the number dialed by the user with the number that successfully connects, which allows it to detect the Call Redirection Attack. Then, through OCR, Ventinel checks for the Display Overlay Attack that could be used alongside Call Redirection Attack by comparing the number displayed on the screen with the number dialed by the user. Finally, Ventinel examines the user’s contacts and cross-references the phone numbers stored in the recent call log and SMS records to identify unused contacts, thereby defending against Duplicated Contacts Attacks. Ventinel stores the identified phone numbers in the blacklist and, after confirming the user’s intention to delete them, proceeds with deleting the contacts. In the process, Ventinel does not collect or store unnecessary data and only accesses the minimum data needed to monitor duplicate contacts by asking for explicit permission from the user.
This method emphasizes pre-incident detection and operates entirely within an Android application, ensuring user convenience. Furthermore, even if the APIs used to execute the same malicious actions change, Ventinel remains capable of detection. Ventinel demonstrated a 100% detection rate for Display Overlay, Call Redirection, and Duplicated Contacts Attacks, outperforming the commercial apps available on the Google Play Store. Additionally, we conducted a user study using a benchmark that we developed, which revealed that, when using Ventinel, only approximately 8.9% of users responded to the vishing attempts on average.
We present the following contributions:
We analyze APIs provided in Android level 29 and above that exhibit behaviors similar to malicious activities.
We identify and highlight the limitations of currently available commercial vishing defense applications.
We examine existing attack techniques used by vishing applications and propose novel attack methods.
We implement a defense mechanism within an application to counteract vishing attacks, specifically focusing on Display Overlay Attacks, Call Redirection Attacks, and the newly defined Duplicated Contacts Attack, all without requiring OS modifications.
3. Background
In this section, we introduce the malicious behaviors found in vishing applications and the Android APIs that enable the creation of these behaviors. We also explain how vishing applications use Android APIs differently depending on the Android API level. Furthermore, we discuss known vishing defense approaches and their limitations.
3.1. Malicious Behavior in Vishing Apps
Vishing, a combination of “Phishing” and “Voice”, refers to a type of mobile cyber crime that uses voice calls to financially exploit victims [20,21]. Attackers lure victims into installing vishing applications and subsequently impersonate public institutions or acquaintances through these applications. Consequently, victims are misled by insidious fraud and inadvertently provide sensitive information, potentially leading to financial harm [22]. The malicious behaviors identified in the vishing applications we examined are as follows:
3.1.1. Call Redirection
Android provides APIs for Call Redirection, enabling developers to create applications that cancel outgoing calls and automatically connect to a different phone number. Attackers therefore develop vishing applications that cancel a victim’s requested outgoing call and redirect it to the attacker. To implement Call Redirection, attackers employ different APIs based on the target Android API level. We detail the Android APIs corresponding to each API level in Section 3.2. Additionally, vishing applications combine Display Overlay with Call Redirection to disguise the redirected outgoing number.
3.1.2. Display Overlay
Android supports APIs that facilitate the output of specific pop-up screens or full-screen displays, known as Display Overlay [23,24]. Vishing applications exploit these APIs to perform malicious activities by using Display Overlay to cover the victim’s device screen with a fake interface, thereby deceiving the victim about the outgoing caller ID. Specifically, when vishing applications change the outgoing call, they overlay a fake screen to disguise the altered outgoing number display. When victims receive a call from attackers, vishing applications overlay a screen that appears to be from a legitimate institution (e.g., police, bank, public institution) to deceive the victim.
3.1.3. Duplicated Contacts
Android provides APIs that allow developers to retrieve and modify contacts stored on a device. Vishing applications exploit these APIs to exfiltrate or modify contacts on a victim’s device for malicious purposes. Additionally, we introduce a malicious behavior called Duplicated Contacts. Duplicated Contacts involves adding the attacker’s number to the victim’s contact list under an existing contact name or adding it to an existing contact as a secondary number. According to Finley et al. [25], many smartphone users store contacts on their devices rather than memorizing phone numbers, making Duplicated Contacts effective in tricking victims into thinking they are receiving a call from someone they know. Android users may overlook the actual number displayed beneath the contact name, and iOS users do not see the number at all, allowing attackers to exploit this limitation.
3.1.4. Application Repackaging
Application repackaging is a technique that involves decompiling an application, modifying its code, and recompiling it into a distributable format (such as Android Application Package (APK) or iOS Application Archive (IPA)). Attackers exploit this method by decompiling signed applications, adding malicious code, and repackaging them [26]. The repackaged applications maintain the layout of the original applications, making it difficult for victims to detect the alterations [27].
Attackers have deceived victims by utilizing vishing applications that encompass the previously mentioned behaviors. Experts caution that individuals and businesses incur annual financial losses amounting to millions of dollars as a result of vishing-related fraud [2,3]. To prevent vishing fraud, researchers have proposed various approaches to detect vishing behaviors using different methods [4,28,29,30]. Unfortunately, despite these efforts, incidents of financial loss resulting from vishing continue to rise [31]. Therefore, we propose an approach to detect malicious behaviors, specifically Display Overlay, Call Redirection, and Duplicated Contacts, without modifying the Android OS.
3.2. Vishing Apps According to Android API Levels
The Android platform provides various APIs (e.g., TelephonyManager API, LocationManager API) that request information from the device or manipulate it directly, facilitating flexible feature development for applications. Attackers exploit these APIs to create malicious applications, such as vishing applications. Google has recognized these issues and has been consistently implementing security patches [32,33], removing APIs that were exploited by malicious apps, restricting the values that apps request from APIs, and requiring additional permission checks before app installation, among other measures. Specifically, Google introduced security patches in the API level 29 update to prevent the abuse of Call Redirection [34]. In this section, we introduce the alterations in Call Redirection and Display Overlay Attacks in vishing applications based on this security patch.
Below API Level 29. In Android API levels 28 and below, vishing applications implement Call Redirection using BroadcastReceiver. At this level, the Android system broadcasts NEW_OUTGOING_CALL just before an outgoing call is made. Vishing applications retrieve or modify the broadcast value in the onReceive() callback function, invoking getResultData() to obtain the number dialed by the victim and setResultData() to redirect the call. Vishing applications also use Display Overlay to present a counterfeit screen when specific call numbers are displayed. To do so, these applications employ APIs to detect phone events and retrieve the incoming or outgoing numbers. Vishing applications utilize the onCallStateChanged() and onCallAdded() callback functions to retrieve call states and the incoming number, as shown in Table 3. Specifically, onCallStateChanged() provides both incoming and outgoing phone numbers through EXTRA_INCOMING_NUMBER.
API Level 29 and Above. In Android API level 29 and above, Android modified the call-related APIs and categorized the phone permissions in more detail. First, to prevent Call Redirection that altered broadcast values, Android restricted access to broadcast values and introduced the CallRedirectionService. Android modified the behavior of the BroadcastReceiver so that the setResultData() method called just before an outgoing call returns null. Furthermore, for an application to utilize the CallRedirectionService, user consent is required, and the application must be registered as the default call redirecting app. Second, Android restricted the APIs that provided incoming or outgoing numbers in API level 28 and below by introducing the CallScreeningService. In earlier versions, Android apps with permissions related to TelephonyManager could call APIs within callback functions to obtain call status as well as incoming or outgoing numbers. In API level 29 and above, Android added the CallScreeningService to manage call events and introduced settings for the default app for caller ID and spam. Therefore, for an Android app to access the call numbers, it must implement the CallScreeningService and obtain user consent to be registered as the default app. Finally, the APIs required to execute Display Overlay are available across all Android API levels.
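The split between the two API-level ranges can be summarized as a small lookup. The following is a minimal sketch in plain Java with no Android dependencies; the class name, method names, and returned labels are hypothetical, and only the mechanism names inside the labels are taken from the description above.

```java
/** Hypothetical helper: which mechanism each API-level range exposes, per the text above. */
final class ApiLevelMechanisms {
    // API level at which the call-related APIs changed.
    static final int SPLIT_LEVEL = 29;

    /** Mechanism a vishing app would rely on to redirect an outgoing call. */
    static String callRedirection(int apiLevel) {
        return apiLevel >= SPLIT_LEVEL
                ? "CallRedirectionService (user consent and default-app registration required)"
                : "BroadcastReceiver on NEW_OUTGOING_CALL with setResultData()";
    }

    /** Mechanism used to learn the incoming or outgoing number. */
    static String numberAccess(int apiLevel) {
        return apiLevel >= SPLIT_LEVEL
                ? "CallScreeningService (user consent and default-app registration required)"
                : "onCallStateChanged() / onCallAdded() callbacks";
    }
}
```

A detector that wants to cover both ranges would need to observe the behavior itself (which number ends up connected, what the screen shows) rather than a single one of these entry points.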
In this section, we present an approach for detecting malicious behavior in vishing applications running in the background. Our approach identifies malicious behaviors, including Call Redirection, Display Overlay, and Duplicated Contacts.
3.3. Call Redirection Detection
Vishing applications include Call Redirection Attacks to alter the outgoing call on the victim’s phone using setResultData() or CallRedirectionService. These applications also implement Display Overlay Attacks that obscure changes to the outgoing number, making it challenging for victims to detect Call Redirection. To detect Call Redirection Attacks, we retrieve and compare both the original outgoing number and the redirecting outgoing number using Android APIs. Additionally, we obtain the number displayed on the call standby screen and compare it with the actual outgoing number for double verification. This approach reduces the false positives, thereby protecting users from both inadvertent and malicious call alterations.
3.4. Display Overlay Detection
Vishing applications employ Display Overlay Attacks that present a fake screen during call events (incoming or outgoing), preventing victims from identifying the actual numbers. To detect these attacks, we compare the number displayed on the screen with the actual number retrieved using Android APIs. We capture a screen during call events and extract the phone number from the captured image using OCR. We also retrieve the actual incoming or outgoing number by invoking the Android APIs in the background application, and then we compare the extracted number from the screen with the actual number. Our approach enhances the detection of Call Redirection Attacks by double verification.
3.5. Duplicated Contacts Detection
Vishing applications conduct Duplicated Contacts Attacks to make victims mistake an attacker’s number for an existing contact. As shown in Figure 1, these applications either add the attacker’s number as a new contact whose name matches an existing contact or edit an existing contact to include the attacker’s number as a secondary phone number. To counter this, we identify contacts with duplicate names or secondary phone numbers and prompt the user to review and delete any suspicious duplicates.
To effectively counter Call Redirection and Display Overlay Attacks, we searched for real phone numbers and distinguished them from those obtained through OCR. We also adjusted the timing of number comparisons to account for network delays, as OCR begins when incoming and outgoing call events are triggered, and the actual phone number is retrieved.
4. Design
In this section, we present Ventinel, a system designed to detect the malicious behavior of vishing applications without necessitating modifications to the Android OS. As detailed in Figure 2, Ventinel operates as a background application, detecting malicious activities from vishing applications in real time and promptly notifying the user. Ventinel is composed of two core modules: Sentry and Signaller. Sentry continuously monitors vishing applications for signs of malicious behavior, and Signaller sends alerts to users when Sentry transmits information.
4.1. Sentry of Ventinel
The Sentry of Ventinel operates in the background to monitor the malicious behavior of vishing applications. It utilizes Android APIs to gather various types of information, such as incoming and outgoing numbers, to verify the presence of malicious activities. Sentry detects specific malicious behaviors, including Call Redirection, Display Overlay manipulation, and Duplicated Contacts. When the Android device is in the IDLE state and not engaged in a call, Sentry performs a Duplicated Contacts Verification. When the user attempts to initiate a call, Sentry conducts a Call Redirection Verification. Following this, Sentry carries out a secondary check through Display Overlay analysis. It captures the screen displayed by the device during both incoming and outgoing call states and performs a Display Overlay Verification to detect any potential malicious activity.
4.1.1. Duplicated Contacts Verification
The Sentry module conducts a search for duplicate contacts within the contacts stored on the Android device while in IDLE. Any contacts identified as duplicates are forwarded to the Signaller module for further action. Vishing applications add or modify contacts without the victim’s awareness, causing the attacker’s number to appear on the incoming or outgoing call screen.
Algorithm 1 gives the pseudocode that the Sentry of Ventinel follows to perform Duplicated Contacts Verification. The Sentry accesses the complete contact list, SMS messages, and call logs from the Android device. For the Sentry to do so, the user must explicitly grant the following permissions:
READ_CONTACTS
READ_SMS
READ_CALL_LOG
Algorithm 1 Duplicated Contact Verification
contacts ← List of existing Contacts
nameTable ← initialize empty hash table
for contact in contacts do
    name ← name of contact
    numbers ← phone numbers of contact
    if name not in nameTable then
        nameTable[name] ← numbers
    else (name already in nameTable)
        Add numbers to nameTable[name]
    end if
end for
duplicates ← initialize empty hash table
for each name in nameTable do
    if nameTable[name].length > 1 then
        duplicates[name] ← nameTable[name]
    end if
end for
history ← List of Recent Call and SMS history
whiteList ← ∅
for record in history do
    number ← phone number in record
    Add number to whiteList
end for
for each name in duplicates do
    for number in duplicates[name] do
        if number ∈ whiteList then
            Delete number in duplicates[name]
        end if
    end for
end for
Once these permissions are obtained, the Sentry identifies any contacts with the same name or multiple phone numbers associated with a single name as duplicates. Additionally, phone numbers that exist in recent call and text records are created as a whiteList, assuming that they are related to the user. The Sentry then removes these whitelisted numbers from the identified duplicates, generating a list of suspicious duplicate contacts, which is subsequently forwarded to the Signaller.
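The grouping and whitelisting steps described above can be sketched in plain Java as follows. This is an illustrative sketch under stated assumptions, not Ventinel's implementation: the `Contact` type, the `findSuspicious` method, and the digits-only `normalize` step are hypothetical, and on a device the two input lists would come from the contacts, call log, and SMS providers rather than from in-memory lists.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DuplicateContactCheck {
    /** Hypothetical minimal contact record: one stored entry with its phone numbers. */
    static final class Contact {
        final String name;
        final List<String> numbers;
        Contact(String name, List<String> numbers) {
            this.name = name;
            this.numbers = numbers;
        }
    }

    /** Assumed normalization: keep digits only so differently formatted numbers compare equal. */
    static String normalize(String number) {
        return number.replaceAll("\\D", "");
    }

    /** Returns name -> numbers that share a name with other numbers and never appear in recent history. */
    static Map<String, List<String>> findSuspicious(List<Contact> contacts, List<String> recentNumbers) {
        // Group every distinct number under its contact name (duplicate names merge here).
        Map<String, List<String>> byName = new LinkedHashMap<>();
        for (Contact c : contacts) {
            List<String> bucket = byName.computeIfAbsent(c.name, k -> new ArrayList<>());
            for (String n : c.numbers) {
                String norm = normalize(n);
                if (!bucket.contains(norm)) bucket.add(norm);
            }
        }
        // Numbers seen in recent calls or SMS are assumed to belong to the user's real contacts.
        Set<String> whiteList = new HashSet<>();
        for (String n : recentNumbers) whiteList.add(normalize(n));

        // A name with more than one number is a duplicate candidate; drop whitelisted numbers.
        Map<String, List<String>> suspicious = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : byName.entrySet()) {
            if (e.getValue().size() <= 1) continue;
            List<String> remaining = new ArrayList<>();
            for (String n : e.getValue()) {
                if (!whiteList.contains(n)) remaining.add(n);
            }
            if (!remaining.isEmpty()) suspicious.put(e.getKey(), remaining);
        }
        return suspicious;
    }
}
```

Merging by name before counting means that two separate entries called "Mom" and a single "Mom" entry with two numbers are treated the same way, which matches the two variants of the attack described in the text.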
4.1.2. Call Redirection Verification
The Sentry is designed to detect Call Redirection Attacks that intercept calls initiated by vishing applications and to alert the victim accordingly. To accomplish this, it utilizes Android APIs to request and validate call-related information from the device initiating the call. As such, Sentry requires the following permissions:
READ_PHONE_STATE
READ_CALL_LOG
For devices operating at API level 28 and below, an additional permission, PROCESS_OUTGOING_CALLS, is necessary, while devices operating at API level 29 and above require the permission READ_PHONE_NUMBERS. When a phone number is entered and the dial button is pressed in an Android dialing application, the operating system triggers the ACTION_NEW_OUTGOING_CALL action and transmits EXTRA_PHONE_NUMBER, which contains the dialed number, as an intent value. Because this value remains unaffected by any Call Redirection Attack, the Sentry requests it and stores it in a designated variable as the intended phone number. Subsequently, when the call state changes to EXTRA_STATE_OFFHOOK, the intent value is cleared, and the final dialed phone number is retrieved using getResultData(). The stored value (the intended phone number) is then compared with the final phone number received after dialing, and the result of this comparison is forwarded to the Signaller. This approach is effective for detecting Call Redirection Attacks on API levels 28 and below as well as 29 and above.
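The store-then-compare logic described above can be sketched without any Android dependencies. In the sketch below, the class and its two methods are hypothetical stand-ins for the handlers that would receive the outgoing-call event and the OFFHOOK state change; the digits-only normalization is also an assumption and does not reconcile country-code prefixes.

```java
final class CallRedirectionCheck {
    // Number the user intended to dial, captured when the outgoing call is initiated.
    private String dialedNumber;

    /** Assumed normalization: digits only. "+82 10..." and "010..." would still differ. */
    static String normalize(String number) {
        return number == null ? "" : number.replaceAll("\\D", "");
    }

    /** Hypothetical hook for the outgoing-call event carrying the dialed number. */
    void onOutgoingCall(String number) {
        dialedNumber = number;
    }

    /**
     * Hypothetical hook for the transition to OFFHOOK.
     * Returns true when the connected number differs from the intended one.
     */
    boolean onOffhook(String connectedNumber) {
        if (dialedNumber == null) {
            return false; // no outgoing call was recorded, so there is nothing to compare
        }
        boolean redirected = !normalize(dialedNumber).equals(normalize(connectedNumber));
        dialedNumber = null; // clear the stored value once the comparison is done
        return redirected;
    }
}
```

Clearing the stored number after each comparison keeps a stale value from one call from being compared against the next call.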
4.1.3. Display Overlay Verification
Algorithm 2 shows the pseudocode for Ventinel’s Sentry to conduct Display Overlay Verification. To monitor for Display Overlay, Sentry compares the phone numbers displayed on the screen during incoming and outgoing calls with the actual phone numbers being dialed. To initiate this process, Sentry first captures the screen at the beginning of an incoming or outgoing call and utilizes ML Kit [35], an OCR tool provided by Google, to extract the phone number displayed on the screen. This extraction process reads text that conforms to the phone number format from the top left to the right of the screen. Simultaneously, Sentry employs Android APIs to retrieve the outgoing or incoming number that the device is attempting to dial. For outgoing calls, it retrieves the number from the ACTION_NEW_OUTGOING_CALL action, while for incoming calls, it uses getStringExtra(incomingNumber). Finally, Sentry compares the two phone numbers obtained during the call event to determine whether a Display Overlay has occurred and forwards the results to the Signaller. To facilitate screen capture, Ventinel sets the foreground service type to mediaProjection and requires the following permissions:
READ_PHONE_STATE
READ_CALL_LOG
Algorithm 2 Display Overlay Verification of incoming call
screenshot ← Captured Screenshot
screenNumber ← Extracted Phone Number from screenshot
actualNumber ← null
state ← Phone Call State
if state is EXTRA_STATE_RINGING then
    actualNumber ← Incoming Phone Number
    if screenNumber is not actualNumber then
        Detect Malicious Behavior
    end if
end if
if state is EXTRA_STATE_OFFHOOK then
    actualNumber ← Outgoing Phone Number
    if screenNumber is not actualNumber then
        Detect Malicious Behavior
    end if
end if
For devices operating at API level 29 and above, an additional permission, READ_PHONE_NUMBERS, is necessary. To implement the APIs that verify the outgoing number, the permissions detailed in Section 4.1.2 must be obtained. Moreover, to ensure that Ventinel can run in the background together with other applications, it requires the FOREGROUND_SERVICE permission.
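The comparison step of Display Overlay Verification can be sketched as follows, assuming the OCR stage has already produced the recognized text lines in reading order. The class, the `Result` enum, and the phone-number pattern (7 to 15 digits with optional single spaces or dashes between them) are illustrative assumptions; ML Kit itself is not called here.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OverlayCheck {
    /** Outcome of comparing the on-screen number with the number reported by the system. */
    enum Result { MATCH, MISMATCH, NO_NUMBER }

    // Assumed phone-number shape: optional '+', then 7-15 digits, each optionally
    // preceded by a single space or dash.
    private static final Pattern PHONE = Pattern.compile("\\+?\\d(?:[ -]?\\d){6,14}");

    static String digits(String s) {
        return s.replaceAll("\\D", "");
    }

    /** Returns the digits of the first phone-like token in the OCR lines, or null if none. */
    static String firstNumber(List<String> ocrLines) {
        for (String line : ocrLines) {
            Matcher m = PHONE.matcher(line);
            if (m.find()) {
                return digits(m.group());
            }
        }
        return null;
    }

    /** Compares the number shown on screen with the actual incoming or outgoing number. */
    static Result check(List<String> ocrLines, String actualNumber) {
        String shown = firstNumber(ocrLines);
        if (shown == null) {
            return Result.NO_NUMBER; // policy for this case is left to the caller
        }
        return shown.equals(digits(actualNumber)) ? Result.MATCH : Result.MISMATCH;
    }
}
```

Returning a separate `NO_NUMBER` outcome is a design choice of this sketch: a call screen that shows only a contact name is not necessarily an overlay, so the caller can decide whether to retry the capture or raise a weaker warning.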
4.2. Signaller of Ventinel
In Ventinel, the Signaller is responsible for notifying users of the malicious behaviors identified by the Sentry in real time. The Signaller generates a vibration on the device and displays a warning pop-up to alert the user of any detected Display Overlay or Call Redirection Attacks. To address Duplicated Contacts Attacks, the Signaller presents the user with a list of identified suspicious contacts and requests their consent to delete these entries. Upon receiving the user’s approval, the Signaller proceeds with the deletion of the specified contacts.
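The Signaller's two response paths can be expressed as a small mapping from detection type to user-facing action. This is a hypothetical sketch: the `Detection` enum, the `Alert` fields, and `alertFor` are illustrative names, and the actual vibration, pop-up, and deletion calls are left out.

```java
final class SignallerSketch {
    /** Detection types reported by the Sentry, as named in the text. */
    enum Detection { CALL_REDIRECTION, DISPLAY_OVERLAY, DUPLICATED_CONTACTS }

    /** Hypothetical description of what the Signaller should do for one detection. */
    static final class Alert {
        final boolean vibrate;            // vibrate the device
        final boolean warningPopup;       // show a warning pop-up
        final boolean contactReviewList;  // show suspicious contacts and ask consent to delete
        Alert(boolean vibrate, boolean warningPopup, boolean contactReviewList) {
            this.vibrate = vibrate;
            this.warningPopup = warningPopup;
            this.contactReviewList = contactReviewList;
        }
    }

    static Alert alertFor(Detection detection) {
        switch (detection) {
            case CALL_REDIRECTION:
            case DISPLAY_OVERLAY:
                // Call-time attacks: immediate vibration plus warning pop-up.
                return new Alert(true, true, false);
            case DUPLICATED_CONTACTS:
                // Contact attacks: nothing is deleted until the user approves the list.
                return new Alert(false, false, true);
            default:
                throw new IllegalArgumentException("Unknown detection: " + detection);
        }
    }
}
```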
6. Discussion
Ventinel is engineered to detect attacks from vishing applications (Call Redirection, Display Overlay, and Duplicated Contacts) while operating in the background without necessitating modifications to the Android OS. Vishing applications utilize diverse attack strategies across different Android API levels. To address this, we conducted a detailed analysis of the malicious behavior patterns associated with various API levels and developed Ventinel to effectively identify and mitigate these threats.
6.1. Lacking Visibility
In Section 5.4, Group B, which used Ventinel, exhibited an average response rate of 8.9%. Although the overall figure is notably low, we analyzed the experiment to understand why a specific part of Group B exhibited a high response rate. The analysis indicated that the Signaller’s warning window lacked sufficient visibility. Additionally, the widespread use of Bluetooth earphones and wearable devices led to instances where participants failed to notice the warnings. To address this issue, it is essential to enhance the delivery of alerts related to malicious activities by incorporating voice prompts or by connecting the wearable device to the user’s mobile phone for more effective notifications.
6.2. Optimize Performance
We used ML Kit, which requires API level 21 or higher, so the OCR-based check is unavailable on devices running an API level below 21. Such devices need to rely on alternative solutions such as OpenCV. Additionally, since ML Kit’s performance has been tested on a limited range of devices, it may encounter issues on less powerful hardware. Therefore, optimization is necessary for these devices.
6.3. API Permission Settings
Ventinel functions entirely within the confines of the application and requires explicit user authorization to operate effectively. It is essential that we provide appropriate notification to users to ensure that all necessary permissions are granted. Furthermore, Ventinel utilizes APIs that Google has mandated be disabled starting from API level 29. Consequently, if the minimum API level is set to 29 or higher, this feature will be completely deactivated. To address this limitation, it is essential to switch to an alternative API that gives the same results.
6.4. Relay Station Attack
Ventinel is designed to protect against the malicious techniques employed by vishing applications. However, it is important to note that if a relay station alters the phone number, Ventinel will be unable to detect this modification. To mitigate this limitation, it is necessary to utilize the relay station detection APIs provided by services such as Twilio [37], Hiya [38], YouMail [39], and others. Since domestic services are not supported, close collaboration with telecommunications providers such as SK Telecom [40], LG Telecom [41], and KT Telecom [42] will be essential for effective implementation.