1. Introduction
End-to-end encrypted instant messaging (IM) apps have attracted users because of growing security and privacy concerns. As a result, organizations and individuals have adopted various IM platforms for their regular communication. The salient features of IM apps, such as convenience, speed, immediate receipt, acknowledgment, and reply to messages, attract people of all ages. Organizations likewise enjoy inexpensive and effortless marketing and advertising opportunities on these platforms [1]. According to Juniper Research [2], the number of IM users reached 4.3 billion in 2020, a yearly increase of approximately 9%. The COVID-19 pandemic has further increased the use of IM applications because of work-from-home policies in most organizations [3]. Hence, IM apps play a vital role in today’s communications.
Common IM apps such as Signal, WhatsApp, Facebook Messenger, WeChat, and many more employ end-to-end encryption during transmission to provide security and privacy for user data. The Signal app states that it uses a secure protocol, the Signal Protocol, during communication. Malicious actors also take advantage of the end-to-end encryption offered by IM apps, so this security feature makes the apps an attractive communication medium for digital crime and fraudulent activities. Consequently, end-to-end encryption in IM apps increases the effort required for the forensic investigation of digital crimes and raises the likelihood that an investigation fails [4].
For investigation purposes, both device and network forensics strategies need to be followed. Device forensics helps to determine the file structure and to extract useful information stored on a device. Numerous device forensics strategies for social media apps have been proposed in the literature, each with its merits and demerits [5,6,7,8,9,10,11]. Some of them also investigated encrypted databases and file structures to identify evidence [12,13]. However, device forensic strategies cannot be fully utilized when the user’s or the app’s data are end-to-end encrypted [5,7,10,11,13,14,15]. Our literature review also showed that IM applications have been analyzed extensively, but mostly with respect to the internal structure of on-device databases.
On the other hand, network forensics is a type of forensic analysis in which events are reconstructed after an incident or cybercrime. In other words, the major concern is tracing communications through (live or backup) network packets, i.e., identifying traces of app or user behavior, data breaches, and other illicit activities. Encrypted network traffic analysis thus uncovers network-related artifacts that can benefit the forensic investigator, and it has attracted great interest [16,17]. However, end-to-end encryption protects the data during transmission, which makes it challenging for the forensic investigator to identify app or user behavior from encrypted traffic. Only a limited number of encrypted network forensics strategies have been proposed in the literature, especially for IM apps [8,18,19,20,21]; these can be used to find important events such as connection establishment, the encryption protocol, and payload sizes for analysis [21]. Moreover, they can identify basic user or app behavior, e.g., calling and texting [8,18,19]. However, there is still room to explore the characteristics of encrypted IM app packets in order to identify app or user behavior and the client and server IP addresses involved, in support of forensic investigation.
In this research, we studied the popular IM app Signal. Because Signal’s communication data are end-to-end encrypted and the keys are unknown, the data are difficult to investigate, and it is often impossible to obtain the corresponding plaintext. Therefore, with the proposed strategy we demonstrate how artifacts of potential value can be extracted through analysis of the Signal app’s encrypted network traffic.
The motivation behind choosing the Signal app is that it is a popular cross-platform encrypted messaging service developed by the Signal Foundation (previously known as Open Whisper Systems), powered by the open-source Signal Protocol (2013), and available for Android, iOS, and desktop. It offers common features such as audio calls, video calls, voice messages, text and media messages, stickers, typing indicators, and many more. Each conversation has a safety number that the communicating parties can verify. The Signal application encrypts all messages using the Signal Protocol, which is based on secure cryptographic algorithms. Other well-known IM apps, such as WhatsApp, Facebook Messenger, and Skype (in private conversation mode), have also adopted the Signal Protocol to achieve secrecy and privacy. The protocol consists of the Double Ratchet algorithm, XEdDSA and VXEdDSA, X3DH, and Sesame. XEdDSA allows a single key pair to be used for both elliptic curve Diffie–Hellman and signatures. VXEdDSA extends XEdDSA to make it a verifiable random function (VRF) [22]. X3DH (Extended Triple Diffie–Hellman) is used to mutually authenticate the users and provides forward secrecy [23]. After a common secret key has been shared, messages are exchanged between users using the Double Ratchet encryption algorithm [24]. For session management, Sesame (session management for asynchronous message encryption) is used with the Double Ratchet algorithm and X3DH [25]. Owing to these security services, the app has been endorsed by technologists and cryptographers [26].
We performed an extensive traffic analysis of the Signal app, incorporating a firewall into the investigation. The firewall helps to understand the pattern of connectivity and communication activities. We forced the Signal client to connect to its server in a controlled environment, and this arrangement revealed the obscured design of the Signal app. Further, we monitored the live traffic to dissect the byte- and payload-based patterns and to identify the different activities of the Signal app, i.e., calls, text, typing indicators, etc. The major steps of the proposed strategy for analyzing the encrypted network traffic of IM apps are shown in Figure 1.
1.1. Contributions
The motivation behind this research is to demonstrate how artifacts of potential value can be extracted from the Signal app. To this end, this study makes the following novel contributions:
The identification of artifacts generated by the Signal app from the encrypted network traffic;
An extensive experimental analysis to validate the traffic patterns of the respective activities;
A strategy that can be applied to block or intercept app activities in real time in live network traffic by tuning a dynamic set of firewall rules. Moreover, the proposed strategy is fully applicable to Android devices.
1.2. Organization
The remainder of this study is structured as follows: Section 2 reviews previous work on the forensic analysis of social networking apps. The experimental setup is discussed in Section 3. Section 4 presents the results of the experiments, along with the performed activities and their corresponding network communication patterns. Section 5 discusses use cases of the proposed encrypted network forensic strategy, e.g., crime scene reconstruction with a case study. Section 6 describes the benefits of the proposed strategy. Section 7 describes its limitations and future work. Lastly, Section 8 concludes the study.
2. Literature Review
Owing to their extensive use and popularity, social media and instant messaging apps have been studied from both device and network forensic perspectives. Regarding device forensics, Zhang et al. [5] showed that the database artifacts of popular social media apps such as Messenger, Hangouts, and Line were unencrypted, unlike WhatsApp, which keeps them encrypted. After rooting the Android device, the chat messages, timestamps, and contact lists were found in files. However, network forensics was not performed on these apps. Similarly, Awan [27] performed a forensic analysis of Facebook, Twitter, and LinkedIn on four different platforms, i.e., iOS, Android, Windows, and Blackberry. The study revealed many important artifacts that can be recovered from device memory, except on the Blackberry device. The extracted artifacts include important locations and databases with information related to visited profiles, names, tweets, contacts, profile IDs, etc. [27].
Gregorio et al. [11] investigated the device forensics of various messaging applications, such as Telegram Messenger, on Windows Phone. The study successfully analyzed information structures such as user data, chats, and conversations. However, it is limited to device forensics. Similarly, Al-Mutawa et al. studied applications such as Facebook, Twitter, and MySpace on Android, iOS, and Blackberry devices. Their device forensic analysis showed that valuable data reside in device memory and can be extracted after logical acquisition, except from Blackberry-based devices [6]. He et al. presented a study in which they identified mobile applications from encrypted network traffic by correlating multiple factors such as IP addresses and DNS queries [28].
Recently, Knox et al. [29] targeted the Happn social dating app for forensic analysis and successfully identified various artifacts for investigation purposes. The fact that such artifacts can be extracted from the Happn app poses a personal security risk and can also result in privacy violations of various kinds. Choi et al. [13] analyzed the locations and file formats of personal data files in the KakaoTalk, NateOn, and QQ instant messaging applications. They examined the encryption and decryption procedures for the internal databases and found that some encrypted database files can be recovered without the user’s password. The study also found that QQ messenger stores the encryption key for its database files on an external server. In a similar study, Anglano et al. [7] targeted the Telegram Messenger app. Decoding and correlating the stored information helped to reconstruct the contact list, the contents of messages, and the logs of voice calls. Most experiments in that research were performed on virtualized Android smartphones, and a subset was repeated on a physical device to validate the results. These results can help in solving cases during forensic investigations. However, the study did not perform any network forensic analysis of end-to-end encrypted traffic.
Walnycky et al. [8] analyzed the device storage and network traffic of 20 applications. They were able to reconstruct messages and collected evidence traces such as passwords, screenshots, and pictures. Similarly, Yusoff et al. [30] performed a network forensic analysis of social networking apps, i.e., Facebook, Twitter, and Telegram, on the Firefox Mobile OS simulator. In this study, the traffic generated by the simulator was sniffed with the Wireshark network monitoring tool. The authors successfully showed the IP addresses, ports, domains, and subdomains. However, their strategy was unable to decode the different artifacts, patterns, events, and activities. Recently, a device and network forensic analysis of the IMO app was reported in [18]. In this work, the authors studied the IMO app file structure on the Android and iOS platforms. The strategy successfully identified the patterns of activities in encrypted network traffic; however, it was limited to the IMO app. In other work, Conti et al. [15] used a machine learning technique to decode user actions based on domain filtering, packet filtering, packet intervals, and timeouts. The results showed that user activities can be inferred from the traffic of the Facebook, Gmail, and Twitter apps. This strategy can help in learning the behavior of one or more users, thereby facilitating intelligent decision making by both adversaries and security experts. The authors also note that countermeasures, such as adding padding during communication, can be taken to frustrate such attacks [15]. Anglano et al. [14] proposed an automated tool named AnForA for Android forensic analysis. The targeted Android applications are installed on a virtualized Android device, and a set of activities is then performed while the device storage artifacts are monitored. In another study, the authors analyzed the WeChat, Viber, WhatsApp, and Telegram apps. The results show that the encrypted database files of apps such as WeChat and Viber can be retrieved from a rooted phone [12]. Similarly, Wu et al. analyzed the WeChat app and explored different scenarios, such as acquiring the user data and decrypting the encoded database [9]. Anglano et al. [10] analyzed the artifacts of the WhatsApp messenger app; the study decoded and interpreted the artifacts and showed the correlation of artifacts among activities. In another study, the authors performed a network analysis of WhatsApp. Their strategy managed to decrypt the network traffic and obtain forensic artifacts related to WhatsApp calling activity [19]. A dedicated study evaluating the security of the Signal app’s key exchange service was presented in [31]. A man-in-the-middle (MITM) attack was employed to compromise the key-exchange service. In the experiments, the authors launched the attack on a rooted device with Cydia Substrate [32] and SSL Trustkiller [33] installed. Further, a PC was designated to intercept the traffic using an MITM proxy and to provide a WLAN hotspot for the smartphones. For traffic interception, a dedicated script was written, while a modified version of the Signal app was installed on the attacker’s device to launch the MITM attacks. As a result, the authors demonstrated that 21 out of 28 users failed to verify the identity of the other users by comparing encryption keys. However, the experiment was conducted on, and is only applicable to, an obsolete version of the Signal app.
Most of the aforementioned studies focused on the device forensics of social media applications and reported interesting artifacts. However, limited work was found on network forensic analysis, especially on the encrypted network traffic of instant messaging apps. A comparison of existing work can be seen in Table 1.
There is still room to study encrypted network traffic in order to identify various user and app activities that can then serve as direct, indirect, or supporting evidence in investigations. Therefore, the motivation of this study is to propose a strategy for investigating encrypted network traffic to uncover various patterns and artifacts. The strategy is demonstrated using the Signal app as a case study.
3. Experimental Setup
To capture network traffic, the network infrastructure depicted in Figure 2 was designed, inspired by [18]. The experimental setup includes a firewall, an access point, and a PC that displays the traffic of the mobile devices passing through the network. The mobile device running the Signal app was connected to the internet through a wireless access point, and all internet traffic of the access point was routed through a PC-based pfSense firewall [34]. The role of the firewall was to monitor, sniff, and capture the entire network traffic, and the trace files it generated were saved for analysis. The configured firewall filtered out the Signal app’s internet traffic in a controlled environment. The access point connected other devices, such as mobiles and PCs, to the firewall.
For network traffic analysis, Wireshark [35] was used to inspect the encrypted traffic in the trace files. It is noteworthy that the IP addresses and ports were in plaintext, while the payload was encrypted for secrecy and privacy. The IP addresses and ports created an opportunity to determine the behavior of the app and the various activities performed during communication. During the experiments, we collected a large number of trace files for analysis, some of which are available at [36] for better understanding. Table 2 shows the details of the devices and tools used during the experiments.
To analyze the Signal app’s behavior, e.g., how client and server communication works, the proposed strategy sniffed the encrypted traffic between the client and its servers, which revealed a list of different servers with their IP addresses and ports. Identifying ports and IP addresses aids in imposing new rules in the network. For example, during the experiments we observed that different app servers provide separate services such as text, media, and calls (Section 5). The details of user activities such as opening the Signal app, typing indicators, text messages, and audio calls are discussed in the Results and Analysis section.
3.1. Justification of Firewall
Placing a firewall in the investigation network provides the control needed to observe app behavior effectively. Firewall rules are used to confirm the default app behavior. Furthermore, restrictions can be applied so that alternate app behaviors are brought to light. Default and alternate traffic patterns can assist forensic examiners in solving cases involving Signal or any other app. The firewall also helps in observing the client-server connectivity design, such as TCP/UDP ports and server ranges. In this scenario, firewall configuration requires defining rules according to the network requirements for both WAN and LAN. By default, both inbound and outbound traffic is blocked by the firewall. LAN rules are defined to allow access to internet resources. As a standard practice, no rules are defined for the WAN NIC, which means that all inbound traffic is blocked unless configured otherwise. If there is a web server in the network, a WAN rule can be defined to allow access to that server.
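The default-deny, first-match behavior described above can be illustrated with a minimal sketch. The `Rule` type, its fields, and the sample LAN rule are hypothetical and do not reproduce pfSense’s actual rule format; the sketch only assumes that the first matching rule decides and that unmatched traffic is blocked.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified firewall rule (not pfSense's real schema)."""
    action: str               # "pass" or "block"
    interface: str            # "LAN" or "WAN"
    source: str               # source network in CIDR notation
    proto: str                # "tcp", "udp", or "any"
    dst_port: Optional[int]   # None matches any destination port


def evaluate(rules, interface, src_ip, proto, dst_port):
    """Return the action of the first matching rule; block when none match."""
    for rule in rules:
        if rule.interface != interface:
            continue
        if ip_address(src_ip) not in ip_network(rule.source):
            continue
        if rule.proto not in ("any", proto):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return "block"  # default deny, as on an interface with no rules


# Illustrative rule set: LAN clients may reach TCP port 443; no WAN rules.
LAN_RULES = [Rule("pass", "LAN", "192.168.0.0/24", "tcp", 443)]
```

With this rule set, a LAN client reaching TCP port 443 is passed, while any WAN-side packet falls through to the default block, mirroring the configuration described in the text.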
The packet capture option in the firewall interface is used to capture the live traffic of the target device, as shown in Figure 3. The Signal app traffic of a specific smartphone can be captured explicitly at the firewall, or at any other device in the network, by filtering on IP addresses, as shown in Figure 4. This reduces the reliance on the port mirroring deployed in the setup of [18]. The live network traffic captured on the LAN/WAN interfaces is saved as trace files, which are then analyzed using Wireshark.
3.2. Configuration
The WAN interface is assigned the IP address 10.2.31.180/24, whereas the LAN interface has the static IP address 192.168.0.6/24 and serves a DHCP pool of LAN addresses ranging from 192.168.0.1 to 192.168.0.40.
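As a small sanity check of this addressing plan, the following sketch uses Python’s standard `ipaddress` module to test whether an address belongs to the LAN subnet or the DHCP pool. The helper names `in_lan` and `in_pool` are illustrative and not part of the original setup.

```python
from ipaddress import ip_address, ip_interface

# Interface addresses and DHCP pool as stated in the configuration above.
WAN = ip_interface("10.2.31.180/24")
LAN = ip_interface("192.168.0.6/24")
POOL_START = ip_address("192.168.0.1")
POOL_END = ip_address("192.168.0.40")


def in_lan(addr):
    """True when addr lies inside the LAN subnet."""
    return ip_address(addr) in LAN.network


def in_pool(addr):
    """True when addr lies inside the DHCP pool (inclusive bounds)."""
    return POOL_START <= ip_address(addr) <= POOL_END
```

Such a check can be used to separate traffic of DHCP clients on the monitored LAN from other traffic in a capture.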
4. Results and Analysis
The Signal app uses encryption algorithms to secure data for communication. Due to the encrypted payload, only packet sizes, frequency of the packets, and repetitive patterns of packets can be exploited for analysis. Therefore, certain payload sizes may help to determine the specific activity types. Although the privacy and secrecy are not compromised, extensive inspection of packet sizes (patterns, frequencies) reveals the prominent activities of an app. In this study, an extensive analysis of traffic dumps was performed, which revealed bytes and payload patterns of different activities. Although patterns emerged, as expected they did not help in the reconstruction of actual data. However, through the experimental setup explained in the earlier section, regulatory authorities and law enforcement agencies can determine the behaviors of the Signal app by observing consistent connectivity patterns. The complete summary can be seen at the end of the Results and Analysis section.
For better understanding, we created the following users with descriptions to conduct the app activities:
User A: the target user, whose entire set of activities is monitored;
User B: the user with whom User A communicates.
4.1. Identification of Ports and Server Ranges
After extensive monitoring of the encrypted Signal traffic passing through the firewall, we found that TCP port 443 is used throughout the communication process. Blocking port 443 on the firewall would also block other services on the smartphone that rely on the same port. Hence, blocking port 443 is not a suitable way to analyze the other connectivity patterns of the Signal app. Instead, the firewall rules were updated to evaluate the network patterns produced by different activities of the target user. The network patterns of activities such as calling, texting, and typing were monitored and analyzed repeatedly to confirm the results.
Unlike many XMPP (Extensible Messaging and Presence Protocol)-based applications, which use specific TCP ports such as 5222, 5223, and 5228, the Signal app uses port 443 for communication. For voice and video calls, the Signal app uses random open UDP ports. However, by repeatedly performing various Signal app activities, we found a common set of chat server IP addresses specific to our geographic location, as shown in Table 3 (details in Section 5). Similarly, for call services, the Signal client app uses Google’s servers in combination with the observed server IP addresses shown in Table 3. This may be done for better connectivity or load balancing of traffic.
4.2. Identification of Signal App Traffic
To identify the activities associated with the Signal app, we dumped the network traffic from the targeted Android device. As the captured traffic was encrypted it could not be decrypted without the cryptographic key. Therefore, we extensively performed and captured the various activities to uncover traffic patterns against the respective activities. User A is a target device on which multiple activities are performed. For better understanding, we analyzed and classified target User A (device) in the capacity of the sender, caller, and receiver, and its respective activities on the app are as follows:
Accessing/opening the Signal app;
Target device (User A) typing patterns;
User B typing patterns;
Call initiated by User A (caller);
Call received by User A (callee);
Media messages;
Video calls.
4.2.1. Accessing/Opening the Signal App
The opening of the Signal app activity represents the start of app traces on the Android device. Initially, it was difficult to classify the Signal-only traffic from the shared network traffic packets. Therefore, an extensive number of trace files were captured and monitored during the opening of the Signal app. The process of the above activity is as follows.
We observed that a standard DNS query was sent from the client to resolve the Signal server, i.e., textsecure-service.whispersystems.org. In response, a list of eight servers with IP addresses was returned to the client to establish a connection. In most cases, only the first two servers in the list were used to establish communications. Further, we noticed that the Signal app used two random client ports together with server port 443 for authentication. The authentication process is shown in Figure 5. A TCP connection was established between client and server through the three-way handshake (SYN, SYN-ACK, and ACK). After that, a TLS handshake took place using TLSv1.2, in which certificates and key material were exchanged between the client and server. Next, the client sent its Client Key Exchange and Change Cipher Spec messages, followed by an encrypted handshake message, to the server. In response, the server sent a new session ticket, its Change Cipher Spec message, and an encrypted handshake message to the client device.
After the negotiation of session keys between the client and the server, the following packet patterns with payloads were observed between the server and client ends.
The Signal client app (IP: 192.168.0.5) sent packets of (449, 372, 127) bytes, with payloads of (383, 306, 61) bytes, to the Signal servers;
In response, the server (IP: 54.175.47.110) sent packets of (261, 261, 137) bytes, with payloads of (195, 195, 71) bytes, to the Signal client.
The above activity was performed multiple times to confirm the patterns of opening the Signal app as shown in
Figure 6. Hence, we concluded that the above patterns of bytes transmission indicate the opening or accessing of the Signal application.
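The relation between the packet and payload sizes reported above can be made explicit with a short sketch. In every reported pair the payload is 66 bytes smaller than the packet; interpreting this as Ethernet, IPv4, and TCP header overhead is our assumption, and the function names are illustrative.

```python
# Assumption: 66 bytes of per-packet overhead (e.g., 14 B Ethernet + 20 B IPv4
# + 32 B TCP with options). The value matches the figures reported above.
HEADER_OVERHEAD = 66

# Packet lengths reported for opening the Signal app.
OPEN_CLIENT_FRAMES = (449, 372, 127)   # client -> server
OPEN_SERVER_FRAMES = (261, 261, 137)   # server -> client


def payload_sizes(frame_lengths, overhead=HEADER_OVERHEAD):
    """Subtract the fixed header overhead from each packet length."""
    return tuple(length - overhead for length in frame_lengths)


def looks_like_app_open(client_frames, server_frames):
    """True when both directions show the reported packet lengths in order."""
    return (tuple(client_frames) == OPEN_CLIENT_FRAMES
            and tuple(server_frames) == OPEN_SERVER_FRAMES)
```

An exact match is used here for simplicity; a practical filter would likely need some tolerance, since the sizes were observed for one app version and network setup.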
4.2.2. Target Device (User A) Typing Pattern
When target User A starts typing a message in a chat window, certain flow patterns with fixed payload sizes were noticed. To be certain, these patterns were observed several times in the trace files before results were deduced, as shown in Figure 7, where target User A is highlighted with IP 192.168.137.69. When User A started typing in a chat window, data packets of 1181, 1182, and 1183 bytes with payload sizes of 1115 and 1116 bytes were sent to the Signal server. In response, the server sent data packets of 139 or 140 bytes with payload sizes of 73 or 74 bytes to the Signal client. The Signal client then sent an empty packet of 66 bytes (payload size 0) to the server as an acknowledgment of the previous packets. We also noticed that the Signal app reported the typing status to the Signal servers even though no message was being sent to the other party.
4.2.3. User B Typing Patterns
In this section, we analyze the traffic patterns of User B. Once User B started typing a message to target User A in his/her chat window, we consistently observed packets of 729 bytes with 663 bytes of payload sent from the server to the client on target User A’s device. The client on target User A’s device sent a 122-byte packet with 56 bytes of payload in response. The client also sent a response packet of 105 bytes. The server then kept sending alert packets of 101 and 97 bytes, with payload sizes of 35 and 31 bytes, to the client, as shown in Figure 8. This pattern was observed and confirmed repeatedly through multiple iterations.
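The typing patterns of Sections 4.2.2 and 4.2.3 can be expressed as direction-aware packet-length signatures and searched for in a trace. The sketch below is illustrative only: the signature table, the direction labels, and the requirement that the packets appear as one contiguous run in the stated order are our assumptions, not properties confirmed by the experiments.

```python
C2S, S2C = "c2s", "s2c"  # client-to-server and server-to-client

# Each signature is a list of (direction, allowed packet lengths) steps,
# using the packet lengths reported in the text.
SIGNATURES = {
    "user_a_typing": [(C2S, {1181, 1182, 1183}), (S2C, {139, 140}), (C2S, {66})],
    "user_b_typing": [(S2C, {729}), (C2S, {122}), (C2S, {105})],
}


def find_activities(packets, signatures=SIGNATURES):
    """Return the names of signatures found as a contiguous run in packets.

    packets is a list of (direction, packet_length) tuples in capture order.
    """
    found = []
    for name, steps in signatures.items():
        width = len(steps)
        for start in range(len(packets) - width + 1):
            window = packets[start:start + width]
            if all(direction == want_dir and length in want_lengths
                   for (direction, length), (want_dir, want_lengths)
                   in zip(window, steps)):
                found.append(name)
                break
    return found
```

For example, the trace `[(C2S, 1182), (S2C, 139), (C2S, 66)]` matches the `user_a_typing` signature. Real traces interleave other packets, so a practical matcher would probably allow gaps between steps.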
4.2.4. Call Initiated by User A (Caller)
Once User A dialed a call, the IP address of the receiver was also captured. This differs from sending text messages, where the IP address of the receiver is never revealed, even if the users are on the same network. Hence, the binding between both users is explicitly visible. We observed that the Signal app connected to at least three servers before finally connecting with the receiver (callee). When a user opens the app, it connects to a Signal server. Next, the user opens the chat window of the receiver to place a voice call. When the user dials the call, the client app makes DNS queries for Google’s Session Traversal Utilities for NAT (STUN, formerly Simple Traversal of UDP through NATs) server (stun1.l.google.com) and for the Traversal Using Relays around NAT (TURN) server (turn1.whispersystems.org). In response to these DNS queries, the STUN query resolves to a Google server such as 64.233.161.127, and the TURN query resolves to a whispersystems.org (Signal) server such as 52.47.185.9, as shown in Figure 9.
The STUN protocol is used for UDP calls to connect devices located behind NAT. It resolves connectivity issues between devices deployed behind different NATs. The TURN protocol is an extension of STUN that is used to relay messages between two devices after a connection is established. Binding is thus performed through the STUN server, while further communication between the two devices is relayed through the TURN server. These patterns were observed while the audio call was being established; reassembled PDUs of 1388 bytes were sent to the Signal server, indicating audio call activity. When the binding request between the users succeeded, the private IP address of the other user was also visible. In Figure 10, the IP address of the receiver was 192.168.10.10, and the IP address of the initiating user was 192.168.137.206.
The port numbers used by the TURN servers were 80 and 3478. In most cases, the public IP of the user was mapped with the TURN server. When the call was successfully established, multiple UDP packets of length 22 were transmitted between both users, as shown in Figure 11. When the call ended, ICMP packets indicating that the destination host was unreachable were sent from the caller to the server. The traffic patterns indicating the end of a call are shown in Figure 12.
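The call indicators described in this section (DNS queries for the STUN and TURN hosts at call setup, and ICMP destination-unreachable messages at call end) suggest a simple way to bracket a call in a trace. The following sketch assumes the trace has already been reduced to `(timestamp, kind, detail)` tuples; that event format and the `kind` labels are hypothetical.

```python
# Host names queried at call setup, as reported in the text.
CALL_SETUP_NAMES = {"stun1.l.google.com", "turn1.whispersystems.org"}


def call_window(events):
    """Return (start, end) timestamps of the first call found in events.

    events is an iterable of (timestamp, kind, detail) tuples in time order,
    where kind is e.g. "dns" (detail = queried name) or "icmp_unreachable".
    Either value is None when the corresponding indicator is not seen.
    """
    start = None
    for timestamp, kind, detail in events:
        if start is None:
            if kind == "dns" and detail in CALL_SETUP_NAMES:
                start = timestamp
        elif kind == "icmp_unreachable":
            return start, timestamp
    return start, None
```

The difference between the two timestamps gives an approximate call duration, under the assumption that the first ICMP message after setup marks the end of the call.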
4.2.5. Call Received by User A (Callee)
When another user dialed a call to target device A, certain patterns were observed. Initially, the server sent the client a TCP segment of a reassembled PDU with a packet length of 1454 bytes and a payload size of 1388 bytes. In response, the client sent 66-byte packets to the server. Later, UDP packets were observed flowing between the caller and the receiver, each 64 bytes long with a payload of 22 bytes. The IP address of the caller was also captured during this activity. As shown in Figure 13 and Figure 14, the server’s IP was 76.223.92.165, the IP of User A was 192.168.137.252, and the IP of User B was 192.168.10.3. The traffic patterns of incoming calls to User A are shown in Figure 13, and the flow of UDP packets during a call is shown in Figure 14.
4.2.6. Media Messages
Media messages include pictures, videos, and stickers. These messages were sent from target User A’s device to another user, and trace files of the media messages were captured and saved for detailed inspection. We noticed that the byte patterns were the same for all three activities (pictures, videos, and stickers); hence, distinguishing among these message types (e.g., image versus sticker) was difficult. The patterns consisted of multiple groups of reassembled PDUs delivered from client to server. The size of each PDU was 1454 bytes with a payload size of 1388 bytes. The client used only one port to send messages to server port 443. After the messages were sent, the server sent a corresponding acknowledgment for the PDU list, of size 108 bytes with a payload size of 42 bytes, as shown in Figure 15. Moreover, the IP address of the receiver was also difficult to identify.
4.2.7. Video Calls
During a video call, patterns similar to those of audio calls were observed, as shown in Figure 16. The only difference was the length of the UDP packets, which varied from 800 bytes up to 1250 bytes.
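The difference between audio and video calls can be sketched as a simple length-based rule. The text does not state whether the 800 to 1250 byte range refers to packet or payload length, so the sketch assumes that the same length column is used as for the 22-byte audio packets; the function and constant names are illustrative.

```python
AUDIO_UDP_LENGTH = 22               # UDP length reported for audio calls
VIDEO_UDP_RANGE = range(800, 1251)  # 800 to 1250 bytes inclusive, per the text


def call_kind(udp_lengths):
    """Guess the call type from the UDP lengths seen between two peers."""
    lengths = list(udp_lengths)
    # Check for video first: a video call may also carry small packets.
    if any(length in VIDEO_UDP_RANGE for length in lengths):
        return "video"
    if AUDIO_UDP_LENGTH in lengths:
        return "audio"
    return "unknown"
```

This is a heuristic under the stated assumptions and says nothing about the call content.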
A summary of the observed encrypted traffic patterns is shown in Table 4. Having learned the behavior of all major activities, we can now filter out the application traffic by looking for these patterns. From a generic network traffic dump, the communicating parties can be identified without any prior knowledge of them. By looking at the protocol patterns, forensic experts, network administrators, and security engineers can first filter the Signal packets and then inspect them further to identify user activities, the type of communication, etc., in order to examine cases depicting various scenarios.
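One possible first filtering step is to learn the Signal server addresses from DNS answers in the dump and then keep only flows that involve those addresses. The sketch below assumes the dump has already been parsed into `(name, ip)` DNS answers and flow dictionaries with `src` and `dst` keys; this input format and the function names are hypothetical.

```python
# Host names observed in the experiments for the Signal service.
SIGNAL_HOSTNAMES = {
    "textsecure-service.whispersystems.org",
    "turn1.whispersystems.org",
}


def learn_server_ips(dns_answers):
    """Collect IPs that DNS answers map to known Signal host names.

    dns_answers is an iterable of (queried_name, resolved_ip) pairs.
    """
    return {ip for name, ip in dns_answers
            if name.rstrip(".").lower() in SIGNAL_HOSTNAMES}


def signal_flows(flows, server_ips):
    """Keep flows whose source or destination is a learned Signal server."""
    return [flow for flow in flows
            if flow["src"] in server_ips or flow["dst"] in server_ips]
```

The retained flows can then be examined for the packet-length patterns summarized in Table 4.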
4.3. Performance Comparison
In this section, we discuss and compare the identified potential artifacts from the proposed and existing strategies as shown in
Table 5. The strategies in [
7,
9,
12,
29] did not target network traffic analysis. Meanwhile, Karpisek et al. [
19] targeted the network forensics of WhatsApp and successfully identified video/audio call activity (initiation, duration, termination) along with the involved IP addresses. However, artifacts of messaging, typing, and accessing or closing the app were not explored. Furthermore, those results were obtained from WhatsApp version 2.11.x.x, which was released on 5 March 2015, almost 6 years ago. Similarly, Walnycky et al. [
8] successfully identified artifacts such as calls and sent/received messages on various apps, but not on the Signal app. Recently, Sudozai et al. [
18] successfully identified artifacts, but only on the IMO app. In contrast, the proposed strategy successfully identified app opening/closing, typing, media messages, and calling, together with the involved IP addresses, from the encrypted network traffic of the Signal app.
5. Discussion and Use Cases
In this section, we discuss various use cases to validate and verify the evidence identified by the proposed forensic strategy for the Signal app.
5.1. Blocking List of Servers
While performing all the above-mentioned activities, we deduced certain results. For example, during multiple activities, we observed a group of Signal servers in the responses to the DNS query. While monitoring the network packets during Signal app opening and connection times, we noted that the app always connects to one of the servers returned in response to the DNS query for
textsecureservice.whispersystems.org. During the experimentation, a group of eight servers (type A records) was discovered in most DNS responses, as shown in
Figure 17. It was also observed that the application establishes a connection with one of these servers using two random ports. The details of all those servers can be found in the packet details in Wireshark, as shown in
Figure 18.
The Signal app attempts to connect either to the first server using one of the two ports or to the first two servers in the list using either of the two ports. It was also noticed that one port was used to establish a connection that indicated the typing patterns, while the other port was used to send the actual data of text messages. Almost 10 servers are used to connect with clients. The chat servers found during the study are shown in
Table 3. It is also worth mentioning that the server addresses are common to every list, although their order may change. To validate the server IP addresses of the different services, firewall rules were applied to check the response of the app on the client end, because the firewall provides a means to control and monitor the ongoing activities within the network.
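Since the same addresses appear in every DNS response but in varying order, server lists should be compared as sets rather than position by position. The sketch below illustrates this under stated assumptions: each DNS response is represented as a plain list of IP strings, the function names are hypothetical, and the addresses in the usage note are documentation-range placeholders, not the servers observed in the study.

```python
def same_server_set(dns_answers):
    """Return True if every DNS response lists the same addresses,
    regardless of their order.

    dns_answers: list of lists of IP strings, one inner list per response.
    """
    distinct = {frozenset(answer) for answer in dns_answers}
    return len(distinct) <= 1


def common_servers(dns_answers):
    """Return the addresses that appear in every DNS response."""
    if not dns_answers:
        return set()
    return set.intersection(*(set(answer) for answer in dns_answers))
```

For example, `same_server_set([["192.0.2.1", "192.0.2.2"], ["192.0.2.2", "192.0.2.1"]])` treats the two responses as equivalent, and `common_servers` yields a stable candidate list for firewall rules.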
5.2. Blocking Chat Servers
In this section, we verify the chat servers’ IPs by blocking them using firewall rules on the pfSense software [
34]. Blocking these servers reveals the connectivity pattern of the client app. For example, once the chat servers are blocked, chat messages should not be delivered to the other end. For the experiments, we blocked the traffic to the IP addresses of the chat servers (listed in
Table 3) for LAN and WAN through firewall rules. As a result, the Signal client app was not able to connect to these servers and the messages were not sent, as depicted in
Figure 19. Signal servers use port 443 to connect to the client. This port cannot be blocked outright because other apps also use 443 for communication (443 is the standard port for SSL/TLS traffic, which is designed to resist eavesdropping and interception). Remote servers providing services on this port are generally treated as trusted once their certificates are verified. Similarly, web servers listen on this port to accept and establish secure connections for web browsers that require strong communication security. Hence, blocking these servers by IP address, rather than blocking the port, helps to stop the Signal app services.
If we look at the Signal app chat behavior on a mobile device in
Figure 19, the app tried for almost 16 min to deliver the text message but failed. Meanwhile, the app tried to connect to all the above-mentioned chat servers. The Wireshark-based analysis shows the details of these servers in
Figure 20. When the client app tried to connect to these servers to deliver a message to another user, the servers either did not respond or a 0 was sent in response to the client requests. It was noted that all services worked correctly when the IP addresses were not restricted on the firewall. The firewall rules used to block/delay the chat activities are listed in
Table 6. With the help of these rules and the list of servers, the chat service was disrupted. The device of target User A tried to send messages to other servers but failed to do so. Similarly,
Figure 21 shows that the client application tried to connect to the main servers to send messages. However, the connection was denied by each of them, as highlighted in black.
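The effect of the blocking rules can be checked in a trace by counting connection attempts that never received a reply. The following is a minimal sketch under simplifying assumptions: the capture is reduced to hypothetical `(server_ip, kind)` events, where `"syn"` marks a client connection request and `"synack"` marks a server reply; the function name is illustrative and the addresses in the test data are placeholders.

```python
from collections import defaultdict


def unanswered_attempts(events):
    """Return {server_ip: count} of connection requests with no reply.

    events: iterable of (server_ip, kind) tuples, where kind is
    "syn" (client request) or "synack" (server reply).
    """
    requests = defaultdict(int)
    replies = defaultdict(int)
    for server_ip, kind in events:
        if kind == "syn":
            requests[server_ip] += 1
        elif kind == "synack":
            replies[server_ip] += 1
    # Report only servers with more requests than replies.
    return {ip: n - replies[ip]
            for ip, n in requests.items() if n > replies[ip]}
```

With the firewall rules active, every blocked chat server would be expected to appear in the result; without them, the result should be empty.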
5.3. List of Call Servers and Blocking
Through experiments, many trace files were captured and saved during call activity. As discussed earlier, the IP address of the call receiver can be captured during call activity, unlike during message activity. It is worth noting that at least three servers were involved. The initial server authenticated the client, and the DNS query response carried two more server IP addresses to the client for further communication. The client connected to the first server (i.e., Google’s), which helps to find the receiver’s IP address. The second server helps to relay the communication between the caller and receiver. After collecting this information, the client was connected to the desired receiver, as shown in
Table 7.
5.4. Crime Scene Reconstruction
The results of this study can help in investigating cases that involve live monitoring and capture of Signal app traffic. The forensic examiner can deduce ongoing calls, messages, and typing patterns. The communicating parties can also be identified by their IP addresses during a real-time call. The forensic examiner needs to obtain access to the target network with the help of the specific organization. Dumps of the ongoing traffic need to be captured at the time of the suspect’s activities. With the help of the results deduced in the Results and Analysis section, the specific traffic can be filtered out of the ongoing traffic, and specific activities can be distinguished using the predefined patterns. Further, the information of the users associated with the captured IP addresses can be verified with ISPs.
5.5. Use Case (Hypothesis)
As we noticed, encrypted network traffic analysis can help corroborate evidence from device forensics. For example, an investigator may correlate the evidence collected from the device and from network packets to infer or validate the results. Therefore, the proposed strategy may help to analyze real-time/offline application behavior (from encrypted network packets), which can be useful for the investigator.
Use Case: Suppose that sensitive official digital media was leaked from an organization using the Signal app. To trace the activity, firewall logs can help, assuming that the log collection/retention policy is implemented in such a way that an investigator can obtain information about devices, source IPs, destination IPs, payload sizes, services, protocols, and timestamps. Investigators can use the behavioral analysis of the application from this study to interpret the logs. Therefore, the proposed strategy for the analysis of encrypted app traffic (such as Signal) can support deducing the activities and the users involved. For example:
If documents were leaked and received by users on 5th April within a specific time window, such as 21:20 to 23:00;
Traffic dumps for the duration of interest were taken.
Figure 22 depicts that the Signal app communicated with its DNS for that specific duration;
Identify and analyze the traffic for a particular activity, as shown in
Figure 23, which depicts media message patterns observed in a particular timeline with a specific source IP;
Finally, the investigator identified the source IP address from the DHCP list;
A temporal analysis of a particular incident can be seen in
Table 8.
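The steps of this use case can be sketched as a small log-filtering routine. This is an illustrative example under stated assumptions, not the tooling used in the study: firewall log rows are assumed to be dictionaries with `time`, `src`, and `dst` keys, the DHCP list is assumed to be a mapping from IP address to device name, and the function name is hypothetical. The year and all addresses in the usage note are placeholders.

```python
from datetime import datetime


def suspects_in_window(log_rows, server_ips, start, end, dhcp_leases):
    """Map each source IP that contacted a Signal server inside the
    time window to its device name and the sorted contact times.

    log_rows: iterable of dicts with "time" (datetime), "src", "dst".
    dhcp_leases: dict mapping IP address to device name.
    """
    hits = {}
    for row in log_rows:
        in_window = start <= row["time"] <= end
        if in_window and row["dst"] in server_ips:
            hits.setdefault(row["src"], []).append(row["time"])
    # Resolve each source IP against the DHCP list.
    return {ip: (dhcp_leases.get(ip, "unknown"), sorted(times))
            for ip, times in hits.items()}
```

For instance, calling the function with `start=datetime(2021, 4, 5, 21, 20)` and `end=datetime(2021, 4, 5, 23, 0)` (the year is assumed here) restricts the analysis to the window of interest; the flows of the returned hosts can then be checked against the media message pattern.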
In our lab, we had a controlled environment in which to study and analyze the behavior of the application, its ports, and its IP ranges. The use case therefore rests on the following assumptions:
Controlled environment, admin has control over network and firewall;
Logs are being maintained according to the specific time duration;
After an incident, investigators can obtain the firewall logs to analyze the performed activities.