In smart homes that lack gateways (e.g., Raspberry Pi-based gateways) to host security functions, IoT devices have no dedicated defense against cyberattacks. Therefore, in this work, we embedded an ML-based IDS into a smart thermostat built with an ESP32 microcontroller. The ESP32 provides 448 KB of ROM and 520 KB of SRAM for program storage and execution. To implement the IDS, the smart thermostat extracts features from its network traffic using the lwIP library. The smart thermostat communicates with an HTTP server hosted in the cloud through HTTP POST requests.
In our previous work [19], we evaluated the feasibility of embedding RF-, XGBoost-, DT-, and ANN-based IDSs in terms of memory, inference time, and accuracy. Our results showed that the XGBoost-based IDS outperformed the DT-, ANN-, and RF-based IDSs for binary classification. In this work, we evaluated the performance of an optimized CatBoost-based IDS on a smart thermostat for binary and multi-class classification and compared the results with our previous work. In this section, the implementation of the IDS with CatBoost is discussed. First, we present the simulation results of CatBoost for binary and multi-class classification of attacks on a smart thermostat, and we examine the impact of FS on CatBoost's performance when implemented with a reduced feature set. Finally, we discuss the implementation of the CatBoost IDS on the ESP32-based smart thermostat and compare it with XGBoost.
4.5. Quantization of CatBoost-Based IDS
Quantization in ML compresses large model data. The ESP32 has limited memory and processing power; therefore, running a large ML model may not be feasible or may require a long inference time. The weights and parameters of a trained CatBoost model are saved in double precision. Floating-point numbers can be represented in half (16-bit), single (32-bit), and double (64-bit) precision. The ESP32's floating-point unit accelerates only single-precision arithmetic, while double precision is emulated in software, which requires more memory and takes longer to complete each operation. Therefore, in this work, we used post-training quantization, in which the parameters and weights of the CatBoost model are stored in single precision; this lowers memory consumption and speeds up inference. Quantization may slightly decrease model accuracy due to reduced numerical precision. However, in CatBoost models, where tree-based learning creates discrete splits, the impact is generally minimal: an input rarely falls inside the tiny rounding gap around a split threshold, so almost all split decisions are unchanged. This trade-off is acceptable for models that do not rely on high-precision calculations, particularly for inference tasks on the ESP32.
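The effect of this conversion can be illustrated with a minimal NumPy sketch; the parameter arrays below are made-up stand-ins for a trained model's split thresholds and leaf values, not the actual CatBoost export format:

```python
import numpy as np

# Made-up stand-ins for the split thresholds and leaf values of a trained
# tree ensemble, stored in double precision as CatBoost saves them.
thresholds_f64 = np.array([0.5321779, 12.004318, 0.0009342], dtype=np.float64)
leaves_f64 = np.array([-0.731225, 0.418823, 0.902214, -0.115601], dtype=np.float64)

# Post-training quantization: re-store the same parameters in single
# precision, which the ESP32 FPU accelerates in hardware.
thresholds_f32 = thresholds_f64.astype(np.float32)
leaves_f32 = leaves_f64.astype(np.float32)

# Memory for these parameters is halved.
print(thresholds_f64.nbytes, thresholds_f32.nbytes)  # 24 12

# Tree splits are discrete decisions "x <= t"; rounding t to float32 only
# changes the branch if x falls inside the tiny rounding gap, which these
# sample inputs (like most real inputs) do not.
x = np.array([0.53, 12.0, 0.001])
same = np.array_equal(x <= thresholds_f64,
                      x.astype(np.float32) <= thresholds_f32)
print(same)  # True
```

Because every split decision is preserved, the quantized ensemble routes each input to the same leaf and produces the same class prediction.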
To support this discussion, we compared the detection accuracy of CatBoost (Depth = 6, Trees = 200) with and without quantization for binary classification in Figure 3; the results illustrate that the accuracy did not decrease. Figure 3 shows the detection of benign (0) and attack (1) traffic for 50 samples. The quantized CatBoost results were obtained from the model running on the ESP32, while the unquantized results were obtained from the model running on a local machine. We can therefore conclude that applying quantization on the memory-limited ESP32 did not compromise detection accuracy.
4.6. IDS Implementation on Smart Thermostat for Real-Time Intrusion Detection
The implementation of IDS on a smart thermostat is challenging due to its limited memory and processing power. In our previous work, we implemented an XGBoost-based IDS for binary classification without using an FS technique. In this section, we evaluate the implementation of a CatBoost-based IDS on a smart thermostat for both binary and multi-class classification. Additionally, we compare the performance of the CatBoost-based IDS with the XGBoost-based IDS on the smart thermostat.
The ESP32 has limited RAM and program storage, which prevents embedding an IDS with a large number of trees or a deep model. As shown in Table 7, for Depth = 6, a CatBoost model with 200 trees can be embedded for binary classification; beyond 200 trees, the ESP32's memory overflows. For Depth = 7, a CatBoost model with 120 trees can be embedded without FS, and with FS, a model with 140 trees can be embedded. The implementation results for the maximum possible number of trees at each depth are likewise shown in Table 7. For binary classification, the CatBoost-based IDS achieved the highest accuracy of 98.71%, outperforming the XGBoost-based IDS developed in our previous work [19], which had an accuracy of 97.66%. These results indicate that the CatBoost-based IDS is both more accurate and faster than the XGBoost-based IDS.
For multi-class classification, the highest accuracy of 97.51% was achieved by the CatBoost-based IDS (90 trees, depth = 6). In comparison, the highest accuracy achieved by the XGBoost-based IDS (50 trees, depth = 7) was 96.70%, with an inference time of 2111 μs, whereas the inference time for CatBoost was only 267 μs.
4.7. Discussion
In this study, we developed an IDS for IoT devices in smart homes where no gateway is available to host it. The IDS was embedded in a smart thermostat for real-time intrusion detection without relying on a gateway.
We evaluated the feasibility of implementing the IDS for binary and multi-class classification. In simulation, the CatBoost model (Trees = 200, Depth = 10) achieved the highest accuracy of 99.03% for binary classification, both with and without FS. However, due to the limited program storage and RAM of the ESP32, embedding a CatBoost model with a depth greater than eight was not possible. The IDS performance comparison of XGBoost and CatBoost for binary and multi-class classification on the smart thermostat is shown in Figure 4. The highest number of trees that could be embedded at Depth = 7 was 140, and at Depth = 8 it was 70. Therefore, the maximum accuracy achieved by the CatBoost-based IDS on the smart thermostat was 98.71%, using a depth of six and 200 trees. The CatBoost-based IDS outperformed the XGBoost-based IDS from our previous work [19] in terms of accuracy, inference time, and program storage: it improved accuracy by 1.06%, decreased inference time by 92.14%, and reduced program storage by 14.09%.
Similarly, for multi-class classification, the highest accuracy in simulation was 98.15%, achieved by the CatBoost-based IDS (Depth = 7, Trees = 200). On the smart thermostat, the highest accuracy of the CatBoost-based IDS was 97.51%. The CatBoost-based IDS again outperformed the XGBoost-based IDS in terms of accuracy, inference time, and program storage: it improved accuracy by 0.83%, reduced inference time by 87.35%, and reduced program storage by 11.32%.
Increasing the number of trees improves accuracy by capturing more complex patterns, but it also raises the computational burden, so inference time and memory consumption increase. Likewise, a deeper CatBoost or XGBoost model captures more complex patterns at the cost of more memory and a longer inference time. As shown in Table 3, increasing tree depth generally improves accuracy. However, on memory-limited devices such as the ESP32, Table 7 demonstrates that as depth increases, fewer trees can be used before memory overflow occurs. For example, at a depth of six, the maximum number of trees is 200 for binary classification and 90 for multi-class classification; at a depth of eight, the maximum drops to 70 for binary classification and 25 for multi-class classification.
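This depth-versus-trees trade-off follows from the geometry of the model: each additional level doubles the number of leaves per tree. A back-of-the-envelope sketch makes the trend concrete; the byte sizes and the 64 KB flash budget below are illustrative assumptions, not the actual CatBoost export layout or the real ESP32 budget:

```python
# Rough flash-size model for a symmetric-tree ensemble: each tree stores
# `depth` splits (feature index + float32 threshold, assumed 8 bytes each)
# and 2**depth float32 leaf values. All constants here are illustrative
# assumptions, not the real CatBoost export layout.
def tree_bytes(depth, split_bytes=8, leaf_bytes=4):
    return depth * split_bytes + (2 ** depth) * leaf_bytes

def max_trees(budget_bytes, depth):
    return budget_bytes // tree_bytes(depth)

budget = 64 * 1024  # hypothetical flash budget reserved for the model
for depth in (6, 7, 8):
    print(depth, max_trees(budget, depth))  # prints: 6 215 / 7 115 / 8 60
```

Because the 2**depth leaf term dominates, each extra level of depth roughly halves the number of trees that fit in a fixed budget, which mirrors the trend observed in Table 7.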
The FS method improved accuracy for binary classification while using fewer features; Table 4 shows that accuracy improves with the reduced feature set. However, on the microcontroller, the FS technique did not reduce inference time or program storage. Inference time depends on the number and depth of the trees, not on the number of input features; since the number of trees with FS was the same or greater, the inference time did not improve significantly.
CatBoost and XGBoost are both based on the gradient boosting framework, using decision trees as base learners. However, CatBoost is particularly well-suited for constrained devices like the ESP32 due to its native support for categorical data, eliminating the need for preprocessing. CatBoost’s ordered boosting technique efficiently handles training and inference without storing large amounts of data, making it ideal for memory-limited environments.
Additionally, CatBoost prevents target leakage through ordered boosting: the statistics for each example are computed using only the examples that precede it in a random permutation of the training data. This approach maintains high accuracy, reduces storage requirements, and prevents overfitting, as depicted in Figure 4. Unlike XGBoost, CatBoost uses symmetric (oblivious) trees, which require fewer computations per prediction, contributing to faster inference. This is critical for implementing an IDS on resource-constrained devices like the ESP32, where minimal inference time is crucial.
Furthermore, CatBoost's symmetric tree structure allows it to use fewer trees and shallower depths while maintaining performance, which further enhances inference speed, as shown in Table 7. These characteristics make CatBoost a more efficient choice than XGBoost for IDS deployment on microcontrollers.
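Why symmetric trees are cheap to evaluate can be seen in a minimal sketch: every node at a given level shares the same split, so a depth-d tree needs only d comparisons, and the leaf index is simply the d comparison bits concatenated. The feature indices, thresholds, and leaf values below are made-up illustrative numbers:

```python
# Inference over one oblivious (symmetric) tree of the kind CatBoost uses:
# level k applies the same (feature, threshold) split to every node, so a
# prediction is d comparisons and the leaf index is the d answer bits.
def oblivious_predict(x, features, thresholds, leaf_values):
    idx = 0
    for f, t in zip(features, thresholds):
        idx = (idx << 1) | (1 if x[f] > t else 0)
    return leaf_values[idx]

# Depth-3 tree: 3 shared splits, 2**3 = 8 leaves (made-up values).
features = [0, 2, 1]
thresholds = [0.5, 10.0, -1.0]
leaf_values = [0.11, -0.40, 0.25, 0.07, -0.19, 0.33, -0.02, 0.48]

x = [0.7, -2.0, 3.5]  # bits: x[0]>0.5 -> 1, x[2]>10.0 -> 0, x[1]>-1.0 -> 0
print(oblivious_predict(x, features, thresholds, leaf_values))  # -0.19 (leaf 0b100 = 4)
```

By contrast, a general asymmetric tree must fetch a different node record at every step of the traversal, so the symmetric layout also yields a compact, predictable memory access pattern, which is well suited to a microcontroller.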
In smart homes, IoT devices have limited computational resources and are used in sensitive applications such as door locks, CCTV cameras, and smart thermostats. While these devices interact with the cloud, an adversary can halt their operation with a DoS attack or intercept sensitive information through an MITM attack. A lightweight IDS that detects these attacks quickly and with minimal computational load can enhance security for homeowners. Rapid detection of DoS attacks is especially crucial, as it allows mitigation before the device is overwhelmed. In this work, we demonstrated that a CatBoost-based IDS can detect these attacks in under 276 μs without sacrificing accuracy.