1. Introduction
In recent years, edge computing has advanced rapidly, and this progress has led many large enterprises and institutions to adopt edge-based models [1]. Domestically, sectors such as government, energy, and finance, as well as small and medium-sized businesses, have moved much of their infrastructure to the edge. Edge computing uses virtualization technology to pool resources such as computing, storage, networking, and applications over the internet and provide them to users as on-demand services. This model improves resource utilization, reduces operational costs, and increases overall efficiency.
Container technology is widely used in edge computing infrastructures for packaging, isolating, and reusing applications, owing to its lightweight isolation mechanisms. Containers are increasingly replacing hypervisor-based virtual machines (VMs) because they offer faster startup times, lower resource consumption, and better I/O performance [2]. In this context, container security becomes crucial.
The Linux kernel’s namespace and cgroup mechanisms form the foundation of container technology [3]. They provide lightweight isolation and resource control, allowing containers to run applications independently while remaining isolated from the host and from other containers. However, because the kernel is shared with the host, this isolation is weaker than it appears, leaving containers vulnerable to threats such as container escape [4,5]. Container escape occurs when an attacker exploits vulnerabilities within the container to take control of the host system. For instance, CVE-2019-5736 enables attackers, under specific conditions, to exploit a flaw in runc and execute code that grants them control over the host machine; this vulnerability highlights the inherent risks posed by the kernel shared between containers and the host. Another notable example is CVE-2021-22922, which pertains to file permission configurations in Docker and allows attackers to access sensitive host data via malicious operations in containers. Such cases illustrate that container escapes can result not only in data breaches but also in significant service disruptions and other security incidents.
This is a serious security threat. Once the attacker controls the host, they can access sensitive data, modify applications, or launch attacks like Distributed Denial-of-Service (DDoS). These activities pose severe risks to the host’s integrity. Therefore, improving container security and preventing container escape is essential.
Container escapes can be classified into three categories: those caused by insecure configurations, vulnerabilities in related components, and kernel vulnerabilities. However, most research focuses on only one or two of these types, leaving gaps in detection coverage. Further research is needed to develop mechanisms that detect all types of container escape.
This paper proposes a container escape detection method based on a dependency graph. This method addresses all three types of container escapes. First, we introduce a method for identifying container processes on the dependency graph through label generation and propagation. Second, we propose a dependency threat model. Finally, to overcome the limitations of existing detection methods, we present a container escape detection approach based on file access control within the dependency graph.
2. Related Work
Several methods have been proposed to detect specific container escape attacks. Zhiqiang Jian et al. [
6] observed that after a container escape, the process operates in a different namespace from its parent. They used this as a basis for detection. Ke Xu et al. [
7] suggested mitigating container escape damage by using mandatory access control mechanisms. These mechanisms restrict escaped processes from accessing files illegally. Smith [
8] conducted a comprehensive study on container escape vulnerabilities in Docker environments. The study focused on analyzing these vulnerabilities and proposed detection methods to mitigate container escape threats. He et al. [
9] analyzed cross-container attacks in edge environments using eBPF, demonstrating how attackers could exploit eBPF to bypass container isolation. They proposed a new permission framework to address these security vulnerabilities. M. Reeves et al. [
10] studied 59 CVEs across 11 container runtimes. They recommended using user namespace-enabled containers to prevent attackers from exploiting vulnerable host components. M. Abbas et al. [
11] used a dependency graph to detect container escape attacks. Their approach flagged read/write operations from low-privilege namespaces to high-privilege namespaces as illegal. This method extended beyond the Docker environment to Kubernetes. Tao Zhang et al. [
12] modeled container escape behaviors caused by kernel vulnerabilities. They selected key process attributes as observation points and used privilege escalation as the detection criterion. They minimized the dependency graph size by recognizing container boundaries and built a heterogeneous observation chain based on the Open Provenance Model (OPM). VS D P et al. [
13] examined precaution levels and mitigation strategies for container security, offering insights into potential vulnerabilities and the current state of research. They also identified key areas for future exploration, particularly in server-based and serverless containers.
Despite these advances, there are two main limitations in current Docker container escape detection research. First, container escapes can be categorized into three types: those caused by insecure configurations, component vulnerabilities, or kernel vulnerabilities. Existing studies typically address only one or two types, failing to cover all three. Second, in real-world environments, container escapes often involve multi-stage, continuous attacks. Each stage may have different objectives and impacts. However, most current methods focus on detecting individual escape behaviors. They lack the ability to reconstruct the entire attack process.
Provenance-based Intrusion Detection Systems (PIDSs) [
14] offer a promising approach to address these issues. PIDSs detect intrusions using provenance graphs, also known as dependency graphs. These graphs contain various nodes and edges, representing diverse system behaviors. They can detect a broader range of attack types. Moreover, they capture the contextual relationships between system events during an attack. This enables attack sequence reconstruction using directed graphs. Most PIDS research currently focuses on detecting Advanced Persistent Threats (APTs).
Han et al. [
15] explored the opportunities and challenges of PIDSs. Li et al. [
16] proposed a PIDS framework with modules for data collection, data management, and threat detection. They evaluated recent approaches and discussed future research trends. M. Zipperle et al. [
14] provided a comprehensive literature review and emphasized the importance of benchmark datasets for future research. Wong et al. [17] performed threat modeling on the container ecosystem using STRIDE and surveyed existing mitigation strategies, assessing their strengths and weaknesses. L. Zhao et al. [18] proposed a robust soliton distribution-based zero-watermarking method for securing semi-structured power data, ensuring data integrity and tamper detection in power systems. Y. Yang et al. [19] introduced EPA-GAN, a model that uses Generative Adversarial Networks (GANs) to anonymize electric power data while balancing privacy and data utility. The earliest use of dependency graphs for intrusion detection dates back to 2003, when King et al. [20] introduced Backtracker, which traced the origins of an attack using event dependency graphs. In recent years, PIDSs have gained increasing attention from researchers. M. Du et al. [
21] developed DeepLog. This method uses multi-classifiers to predict subsequent events based on previous sequences. It applies LSTM to detect anomalies. M. Garchery et al. [
22] introduced ADSAGE. This system models application log sequences with Recurrent Neural Networks (RNNs). It predicts future events and uses Feedforward Neural Networks (FFNNs) to assess event validity and predict anomaly scores. Y. Song et al. [
23] developed a hierarchical dynamic risk assessment framework for the power data lifecycle, improving security through scenario-adaptive methods. M. Hossain et al. [
24] proposed SLEUTH. This method detects intrusions on a dependency graph through label propagation. It assigns trust and confidentiality labels to nodes. Predefined policies detect intrusions, such as when a low-trust entity accesses a high-confidentiality object. However, SLEUTH faces the issue of dependency explosion. The team later addressed this by introducing MORSE [
25], which reduces the impact of dependency explosion through a label decay strategy.
These dependency-graph-based intrusion detection methods do not account for container-specific provenance data during graph generation. Hassaan et al. [
25] proposed CLARION, a namespace- and container-aware solution. It identifies container boundaries using clone and unshare calls. It also detects container initialization patterns. However, this solution only supports Linux kernels up to version 5.7.
3. Methods
Container technology is widely used in edge computing environments. However, it faces significant security challenges, especially the threat of container escape. This paper proposes a container escape detection method based on a dependency graph to address these challenges. Existing detection methods suffer from limited coverage and low detection accuracy, which hinders detection and makes the full attack process difficult to trace.
To overcome these limitations, our approach uses a dependency graph. This enables comprehensive detection of the three main types of container escape attacks. The method not only improves detection accuracy but also enhances the ability to reconstruct complex attack chains.
This section presents a comprehensive overview of the method’s architecture and design, focusing on the global framework and the relationships between its key modules. It also details the construction of the dependency graph and explains the core principles guiding the design of the container escape detection process, aimed at providing an innovative and effective solution for container security.
3.1. Overall Architecture
The overall architecture of this solution is shown in
Figure 1. It consists of two core components:
Container process identification based on label generation and propagation within the dependency graph: First, by analyzing the process behaviors of the container, relevant container attributes (such as container-id, container-dir, etc.) are generated and propagated to the corresponding process nodes. Then, by generating nodes and edges, a dependency graph for the container is constructed. This process provides the foundational data structure for subsequent security threat detection.
Container image vulnerability detection based on file access control: This component constructs a security threat model for the container and uses the dependency graph to track the associations between file nodes and process nodes, thereby detecting potential security threats. This module primarily analyzes file access behaviors inside and outside the container to determine if there are potential escape behaviors or other security vulnerabilities.
3.2. Dependency Graph Design
3.2.1. Container Process Behavior Analysis
The Linux kernel uses a Namespace mechanism to isolate container processes. It restricts the resources that containers can access, such as processes, file system mount points, and network stacks. The cgroup mechanism further limits resources like CPU, memory, and network bandwidth. Additionally, security mechanisms such as AppArmor, Seccomp, and SELinux apply restrictions to container processes to ensure security.
Aside from these mechanisms, container processes do not fundamentally differ from other system processes.
By analyzing the container startup process, it is observed that containers follow a fixed procedure. The behavior of processes within the container falls within a defined range of activities.
The container startup process is shown in
Figure 2, and the specific steps are as follows:
The client sends a request to the daemon to create a container.
After receiving the request, the daemon (dockerd) completes operations such as configuring the container working directory and sends instructions to the container runtime engine (containerd) via gRPC.
containerd starts a containerd-shim process for each container, which is responsible for creating the new container.
containerd-shim invokes the runc process to initialize the container. The parameters passed to runc specify the configuration path of the container (i.e., the location of config.json), and the root path of the container is also prepared. The container startup process formally begins.
The runc child process, runc:[0:PARENT], forks another child process, runc:[1:CHILD], which creates new namespaces via the unshare system call.
runc:[1:CHILD] then spawns a further child process, runc:[2:INIT], which completes container initialization, such as setting up /rootfs, /proc, and the network stack.
Finally, runc:[2:INIT] executes the execve system call to run the container’s ENTRYPOINT program (such as sh or apache), which becomes the container’s init process, i.e., process 1.
In summary, the container is started by containerd-shim, which invokes runc to launch the container. After the container starts, runc exits, and containerd-shim becomes the parent process of the container. It is responsible for collecting the container’s process status and reporting it to containerd. When the container’s initial process (i.e., process 1) exits, containerd-shim cleans up the remaining child processes within the container to prevent zombie processes. During the container’s runtime, the ps command on the host shows containerd-shim as the parent process for each container.
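For illustration, the short sketch below uses the third-party psutil Python library to list containerd-shim processes on the host and the container processes beneath them. It is only one way to observe the parent–child relationship described above, not part of the proposed method; the function name list_shim_containers is ours.

```python
import psutil  # third-party library: pip install psutil

def list_shim_containers() -> None:
    """Print each containerd-shim process and the container processes beneath it."""
    for proc in psutil.process_iter(attrs=["pid", "name", "cmdline"]):
        name = proc.info.get("name") or ""
        if name.startswith("containerd-shim"):
            cmdline = " ".join(proc.info.get("cmdline") or [])
            print(f"shim pid={proc.info['pid']} cmdline={cmdline}")
            # Every process running inside the container descends from this shim.
            for child in proc.children(recursive=True):
                print(f"  container process pid={child.pid} name={child.name()}")

if __name__ == "__main__":
    list_shim_containers()
```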
Based on the above analysis, in the dependency graph, the containerd-shim process node can be seen as the starting point of a container. Since it contains container ID information, we can use this node to assign container attribute labels to the containerd-shim process and its child processes. This helps in identifying container processes. The specific method will be introduced in the following sections.
3.2.2. Node and Edge Design in the Dependency Graph
Based on the analysis above, we can distinguish which process nodes in the dependency graph belong to a specific container by following the behavior patterns of Docker container processes. The process nodes contain the attribute labels shown in
Table 1. This table summarizes key attributes used to identify and track process nodes in a dependency graph, with each attribute playing a specific role in distinguishing processes and recording critical metadata. The
Type attribute differentiates between node types, such as processes, files, or network sockets. Name records the name of the process, while
Pid and
Ppid store the Process ID and Parent Process ID, respectively, allowing for the identification of processes and their hierarchical relationships.
Uid and
Gid log the user and group associated with the process, providing insight into access control and permissions. The
Exe attribute captures the executable file that initiated the process, linking it to specific binaries. For containerized environments,
Container-id tags the container to which a process belongs, and
Container-dir indicates the container’s file directory on the host system. Finally,
Command-line records the command executed by the process, offering a detailed view of the process’s actions. These attributes are essential for tracking processes within containers, ensuring accurate monitoring and detection of potential security risks or anomalous behaviors in edge computing and containerized environments.
Table 2 presents the key attributes for file nodes within a dependency graph.
Type is used to distinguish node types, such as process, file, or network socket nodes.
Inode records the file’s inode, a critical identifier in the file system that stores metadata about the file.
Path logs the file’s location in the directory structure, allowing for easy tracking of where the file resides.
Permissions captures the access rights associated with the file, such as read, write, or execute permissions (e.g., 0644). These attributes are essential for monitoring file behavior and ensuring proper access control in the system.
Table 3 lists key attributes for network socket nodes in a dependency graph.
Type distinguishes the node type, while
IP records the socket’s IP address and
Port captures the socket’s port number, providing essential information for network-related processes.
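To make the node design concrete, the following sketch encodes the attributes of Tables 1–3 as simple Python data classes. The field names mirror the tables, and the optional container fields are left empty for processes that do not belong to any container; this is an illustrative structure rather than the exact schema of our implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProcessNode:
    """Process node attributes (Table 1)."""
    pid: int
    ppid: int
    name: str
    uid: int
    gid: int
    exe: str                              # executable file that started the process
    command_line: str                     # command executed by the process
    container_id: Optional[str] = None    # 64-character container ID, if any
    container_dir: Optional[str] = None   # container's file directory on the host
    type: str = "process"

@dataclass
class FileNode:
    """File node attributes (Table 2)."""
    inode: int
    path: str
    permissions: str                      # e.g., "0644"
    type: str = "file"

@dataclass
class SocketNode:
    """Network socket node attributes (Table 3)."""
    ip: str
    port: int
    type: str = "socket"
```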
2. Edge Design
Edges represent relationships between nodes. This paper focuses on four types of edges, corresponding to the Open Provenance Model (OPM):
used,
wasTriggeredBy,
wasGeneratedBy, and
wasDerivedFrom. In addition to the type attribute, edges also have the attributes
eventId (event ID) and
time (the time of the event) to determine the event’s sequence. The attributes
syscall (system call that triggered the event) and
operation (operation corresponding to the event) are used to record the system call and operation of the event, as shown in
Table 4.
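A matching sketch of the edge structure in Table 4 is given below. The four OPM edge types are encoded as an enumeration; the src and dst endpoint identifiers are added here for illustration only and are not attributes listed in the table.

```python
from dataclasses import dataclass
from enum import Enum

class EdgeType(Enum):
    USED = "used"                        # process used an artifact (e.g., read a file)
    WAS_TRIGGERED_BY = "wasTriggeredBy"  # process was triggered by another process
    WAS_GENERATED_BY = "wasGeneratedBy"  # artifact was generated by a process (e.g., write)
    WAS_DERIVED_FROM = "wasDerivedFrom"  # artifact was derived from another artifact

@dataclass
class Edge:
    """Edge attributes (Table 4)."""
    edge_type: EdgeType
    event_id: int        # eventId: identifies and orders audit events
    time: float          # time of the event
    syscall: str         # system call that triggered the event (e.g., "execve")
    operation: str       # operation corresponding to the event (e.g., "read", "write")
    src: str             # source node identifier (illustrative)
    dst: str             # destination node identifier (illustrative)
```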
3.2.3. Label Propagation in the Dependency Graph
Each containerd-shim process represents a Docker container. Any processes running inside the container are derived from the containerd-shim process. Based on this, the paper proposes a method for generating and propagating container attribute labels. These labels are used to identify container processes during the generation of the dependency graph.
First, the method generates a container-id label for each containerd-shim process node. This label indicates that these containerd-shim processes represent different containers. The container-id is a 64-character string composed of lowercase letters and digits. It can be obtained by querying the command-line of the containerd-shim process and matching it using regular expressions.
Once the container-id is obtained, a container-dir (container directory) label is generated for the process node. This label indicates the file directory of the container on the host. From the host’s perspective, Docker container files have specific paths. These paths can be viewed by executing the command docker inspect <container-name> or docker inspect <container-id>.
Finally, processes derived from
containerd-shim inherit its container attribute labels. This indicates that these processes originate from the same container.
Figure 3 summarizes the label generation and propagation process.
In the dependency graph, when a new process node is added, its container attributes are determined. If the process is a containerd-shim process, the container-id (i.e., the container it belongs to) can be obtained from the command-line property. The container-dir (container directory) can be retrieved using the command docker inspect --format '{{.GraphDriver.Data.MergedDir}}' <container-id>.
If the process is not a containerd-shim process, the method checks if it has a parent process. If no parent process exists, the new process node does not have container attributes. However, if the parent process has container attributes, the new process inherits those attributes.
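A minimal sketch of this label generation and propagation logic is shown below. It assumes the 64-character container ID can be matched with a regular expression against the command line and that docker inspect is available on the host; the function names (assign_container_labels, resolve_container_dir) are illustrative, and error handling is omitted.

```python
import re
import subprocess
from typing import Optional

# 64-character lowercase alphanumeric container ID, as described above.
CONTAINER_ID_RE = re.compile(r"\b[0-9a-z]{64}\b")

def resolve_container_dir(container_id: str) -> str:
    """Resolve the container's merged directory on the host via `docker inspect`."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{.GraphDriver.Data.MergedDir}}", container_id],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def assign_container_labels(node: dict, parent: Optional[dict]) -> None:
    """Attach container-id / container-dir labels when a new process node is added."""
    if node.get("name", "").startswith("containerd-shim"):
        match = CONTAINER_ID_RE.search(node.get("command_line", ""))
        if match:
            node["container_id"] = match.group(0)
            node["container_dir"] = resolve_container_dir(node["container_id"])
    elif parent and parent.get("container_id"):
        # Processes derived from a labeled parent inherit its container attributes.
        node["container_id"] = parent["container_id"]
        node["container_dir"] = parent["container_dir"]
```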
3.3. Container Escape Detection
3.3.1. Container Escape Model
Currently, there is no unified definition of container escape. This paper characterizes it by two main objectives: (1) gaining command execution capability on the host, and (2) gaining access to files on the host.
Container escape behavior typically involves the escaped process accessing files on the host. As shown in Figure 4, container process 1 gains the ability to execute commands on the host after escaping; to run a process (the escaped process) on the host, it must load that process’s binary file. Similarly, container process 2 gains access to files on the host after escaping and then performs operations, such as opening files, to steal data from the host.
3.3.2. Container Escape Detection Method
This section proposes a container escape detection method based on file access control within the dependency graph. The method determines if container escape has occurred by detecting whether processes from the container have accessed files outside the container. Specifically, when a container process node in the dependency graph is associated with any file node (via an edge of type used or wasGeneratedBy), the method checks if the file and the process belong to the same container. If they do, the association is considered legitimate. If they do not, it is considered a container escape.
Figure 5 shows a simple dependency graph with three process nodes and four file nodes. Process 2’s parent is Process 1, and Process 1’s parent is the
containerd-shim process. According to the container attribute label generation and propagation method described earlier, all three processes are labeled as belonging to the same container. Four files are associated with Process 2.
Among these, container-internal file 1 and container-internal file 2 belong to the same container as Process 2. Therefore, these associations are legitimate. However, container-external file 1 and container-external file 2 do not belong to the same container as Process 2. Their association is illegal and represents container escape behavior.
Linux Audit records do not indicate whether an event occurred on the host or inside a container. Even if the event happened inside a container, the audit record cannot directly identify which container was involved. For instance, if the /etc/passwd file is read inside a container, the audit record of type=PATH shows name="/etc/passwd". However, it does not specify whether the file belongs to a container or the host. To resolve this, the paper uses the inode (index node) provided by the Linux Audit record to determine the file's location.
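As a simple illustration, the snippet below extracts the name and inode fields from a type=PATH audit record. The record text is a shortened, hypothetical example, and the regular expressions cover only the quoted-path form of the name field.

```python
import re

# Shortened example of an audit record of type=PATH produced when /etc/passwd is read.
RECORD = 'type=PATH msg=audit(1680000000.123:456): item=0 name="/etc/passwd" inode=1835008 dev=fc:02 mode=0100644'

def parse_path_record(record: str):
    """Return (path, inode) from a type=PATH audit record, or None if a field is missing."""
    name = re.search(r'name="([^"]+)"', record)
    inode = re.search(r"inode=(\d+)", record)
    if name and inode:
        return name.group(1), int(inode.group(1))
    return None

print(parse_path_record(RECORD))  # ('/etc/passwd', 1835008)
```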
An inode is one of the most important on-disk structures in the Linux file system. It describes metadata such as file size, permissions, and the location of file blocks. Unfortunately, an inode does not directly provide the file name or path. While it is possible to locate the corresponding file by searching for its inode (e.g., with the shell command find <path> -inum <inode>), this is inefficient. In contrast, given a file name, the inode can be retrieved directly.
Based on this, when a container process node in the dependency graph is associated with any file node, and the container-dir property (the container's directory on the host) and the file node's path and inode are known, the check is straightforward: it is only necessary to verify whether <container-dir>/<path> exists and whether its inode matches the file node's inode. This determines whether the file and process belong to the same container, and thus whether container escape has occurred.
There are exceptions for default Mount Namespace mount points within the container. These can be viewed by running the command
cat /proc/self/mountstats inside the container, as shown in
Figure 6. A whitelist can be created to exclude directories like
/proc,
/sys, and
/dev, as well as files such as
/etc/resolv.conf,
/etc/hostname, and
/etc/hosts from path checks. This whitelist can be customized based on the actual scenario.
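The sketch below combines the path/inode comparison with the whitelist just described. The whitelist entries follow the examples above, and the function names (same_container, check_file_access) are illustrative; a real deployment would tune both to the actual scenario.

```python
import os

# Default mount points and files excluded from the path check (customize per deployment).
WHITELIST_PREFIXES = ("/proc", "/sys", "/dev")
WHITELIST_FILES = {"/etc/resolv.conf", "/etc/hostname", "/etc/hosts"}

def is_whitelisted(path: str) -> bool:
    return path in WHITELIST_FILES or path.startswith(WHITELIST_PREFIXES)

def same_container(container_dir: str, file_path: str, file_inode: int) -> bool:
    """Return True if <container-dir>/<path> exists and its inode matches the file node's inode."""
    host_path = os.path.join(container_dir, file_path.lstrip("/"))
    return os.path.exists(host_path) and os.stat(host_path).st_ino == file_inode

def check_file_access(process: dict, file_node: dict) -> str:
    """Classify a process-to-file edge as legitimate or as suspected container escape."""
    if not process.get("container_id"):
        return "host process: not checked"
    if is_whitelisted(file_node["path"]):
        return "whitelisted mount point: legitimate"
    if same_container(process["container_dir"], file_node["path"], file_node["inode"]):
        return "legitimate access"
    return "container escape suspected"
```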
4. Evaluation
In the previous section, we provided a detailed explanation of the overall architecture of our solution. This included the design and generation of the dependency graph, as well as the detection of container escape and the reconstruction of the attack process. The use of a dependency graph makes the detection process, especially the attack reconstruction, more structured and comprehensive. This approach allows for the detection of all three types of container escapes. It also enables the full reconstruction of multi-stage attack processes. This section focuses on evaluating the effectiveness of the detection and reconstruction capabilities claimed by our method.
4.1. Experimental Environment
The purpose of the experiments is to verify the following aspects of our method:
The ability of the dependency graph generation method to identify container processes and its performance overhead;
The detection capability for all three types of container escape;
The effectiveness of the attack reconstruction method.
To validate these capabilities, we set up the experimental environment with the following specifications: Ubuntu 22.04 as the operating system, a 5.19.0-38-generic kernel, Docker 18.03.1 as the container runtime, Kubernetes 1.23.1 for orchestration, and Neo4j 4.1.1 as the graph database. Additionally, the Linux audit rule was configured as "-a always,exit -F arch=b64 -S fork -S vfork -S clone -S execve". Because the kernel version exceeds 5.7, kernel-module-based namespace recognition is not supported, so we did not consider the unshare and setns system calls.
4.2. Container Escape Detection Experiments
To test the effectiveness of our container escape detection method, we replicated six common container escape attack techniques in the experimental environment. We then verified the method’s effectiveness.
As this paper focuses on container escape threats, we assume that the experimental environment contains only risks related to container escape. All other components, including the Linux Audit system and all code (both native SPADE and our modifications), are assumed to be secure. Additionally, it is assumed that Linux Audit logs remain intact and uncompromised. The Neo4j database is also assumed to be untampered with. Vulnerabilities in other layers, such as web applications or hardware, are beyond the scope of this study.
Container escape attacks typically fall into three categories: kernel vulnerabilities, insecure configurations, and vulnerabilities in container-related components. The detection experiments for these three types are described below.
4.2.1. Escape via Insecure Configurations
Insecure configurations, such as dangerous mounts and permissions, can lead to container escape. Unlike software vulnerabilities, these risks are often caused by human error during the container setup. In development environments, developers or system administrators may use improper configurations for convenience. Attackers can then exploit these misconfigurations.
This section lists one common example of container escape due to insecure configurations, along with the test results.
Table 5 shows a common insecure configuration leading to container escape.
The privileged mode was initially designed to enable Docker-in-Docker functionality, but due to its extensive privileges, it poses significant security risks to the host. In privileged mode, all capabilities are enabled for the container, and security mechanisms like AppArmor, Seccomp, and SELinux are disabled, exposing host devices to the container.
A privileged container can mount the host’s disk, bypassing the isolation of the file system and enabling access to host files.
We replicated a privileged container escape in the experimental environment, generating the provenance graph shown in
Figure 7. (a) contains all the events on the host during the experiment, while (b) focuses on container-related events.
Since the original graph generated by Neo4j is too complex to show the attack process clearly, we manually extracted the core steps and marked where the container escape detection rules were triggered, as shown in
Figure 8. Subsequent experiments follow a similar approach.
The experiment shows that privileged container escape can be detected. After mounting the host disk, access to host files triggered the detection rules.
Figure 8 provides a clearer view of the core steps of the container escape. In steps 1–6, the current directory in the container was inspected, and a host folder was created. Steps 7–11 show how the host device was mounted to the host folder using the privileged container. Steps 12–17 demonstrate file access, including reading files on the host. In step 17, the cat process in the container accessed the host’s
/etc/passwd file, triggering the detection rule.
4.2.2. Escape via Component Vulnerabilities
Container clusters in production environments involve many components, which may have vulnerabilities, including container-related software. This section lists one common component vulnerability, as shown in
Table 6. We replicated this vulnerability and tested the effectiveness of our container escape detection method.
CVE-2019-5736 is a well-known vulnerability in runc, where an attacker can overwrite the runc binary on the host to execute arbitrary commands.
The vulnerability allows the attacker to obtain the file descriptor (fd) of the runc process via the /proc/[PID]/exe file and inject a payload into runc, which is executed the next time runc is run. The attack requires interaction between the container and the host’s runc process.
We replicated this vulnerability, and the resulting provenance graph is shown in
Figure 9. (a) shows all events, while (b) focuses on container processes. The dense links between nodes in (b) represent the exp process trying to write payloads into
runc.
The experiment shows that CVE-2019-5736 can be detected. In steps 1–10, shown in
Figure 10, the exploit is downloaded. Steps 11–13 execute the exploit, rewriting
/usr/bin/sh inside the container. Steps 14–15 involve capturing
runc’s
PID, and step 16 involves writing a malicious payload to
runc. Steps 17–22 show the execution of
runc, now modified to run the payload, resulting in file creation on the host in step 22 (touch
/tmp/pwn-success).
Steps 16 and 22 triggered the detection rules: step 16 represents the container’s exp process overwriting runc on the host, and step 22 represents the execution of the payload.
4.2.3. Escape via Kernel Vulnerabilities
Kernel vulnerabilities are highly impactful, especially for containers, which share the host's kernel. Because containers and the host share the same kernel, any kernel vulnerability affects all containers running on the host. However, not all kernel vulnerabilities can be exploited for container escape. This section replicates two harmful kernel vulnerabilities, shown in
Table 7, and tests their detection.
CVE-2022-0847, or DirtyPipe, is a file overwrite vulnerability affecting Linux kernel 5.8 and above. It allows any user to overwrite arbitrary files, similar to DirtyCOW.
DirtyPipe allows the modification of the file cache via a pipe’s buffer, enabling file overwrites. While DirtyPipe cannot directly cause container escape, it can be combined with CVE-2019-5736 to overwrite runc and achieve escape.
We replicated the
DirtyPipe vulnerability by overwriting
runc, generating the provenance graph shown in
Figure 11. (a) shows all events, while (b) focuses on container processes. The dense links between nodes represent the exp process writing payloads into
runc using
DirtyPipe.
The experiment demonstrates that using CVE-2022-0847 to overwrite
runc for container escape can be detected. Although the process of overwriting
runc via the kernel cannot be detected, the subsequent execution of
runc triggered the detection rules. Steps 1–7, shown in
Figure 12, involve executing the exp process in the container, which rewrites
/bin/sh. Steps 8–11 show the interaction with
runc. Steps 11–15 show the execution of
runc, now modified by
DirtyPipe, resulting in file creation on the host.
This section validated the effectiveness of the dependency-graph-based method for detecting container escape attacks through experimental results, illustrating its ability to identify and reconstruct various attack types. The experiments first evaluated the method's ability to generate dependency graphs and recognize container processes, which it did reliably, particularly in the insecure-configuration scenario. The method's flexibility and scalability allow it to adapt to emerging or evolving escape techniques. In particular, for combined attacks that exploit kernel vulnerabilities to tamper with the runc binary, the method can update the dependency graph in real time, monitor potential anomalies, and quickly adjust detection strategies, reducing reliance on traditional detection systems. This adaptability improves the response to novel attacks and provides effective support against increasingly complex cybersecurity threats.
In testing component vulnerabilities, the method successfully reproduced CVE-2019-5736, capturing the process by which attackers exploit the runc vulnerability for container escape. This finding affirms the method’s detection capabilities and highlights the security risks associated with real-world deployments, urging development and operations teams to prioritize security updates for software components. Additionally, experiments focused on kernel vulnerabilities, particularly CVE-2022-0847, underscore the significant implications of these vulnerabilities for container security, emphasizing the critical need for heightened awareness of kernel security when implementing container technologies.
The findings of this section confirm the application of dependency graphs in container security detection, indicating that this method excels in detection while also providing structured support for attack tracing and reconstruction through its inherent flexibility and scalability, thereby establishing a foundation for future research in container security.