1. Introduction
IaaS (Infrastructure as a Service) is a new type of computing platform that provisions and manages hardware over the cloud [1]. As IaaS can quickly scale resources up and down according to user demands, it avoids the expense and complexity of buying and managing each user’s own physical infrastructure. IaaS is billed according to the amount of resources used, and thus it is suitable for applications whose computing demands fluctuate dynamically [2]. Despite this merit, users cannot instantly see the cost of the resources they have used. Some IaaS consumers need to know the expected cost of the services as soon as possible and adjust their cloud utilization policies accordingly. However, cloud service providers (CSPs) do not report billing information instantly, making it difficult for users to adjust their policies proactively [3]. In practice, evaluating the IaaS cost at the CSP level is a large-scale batch task, and thus it takes several hours or even a couple of days for IaaS users to be informed of the charges for the services.
To determine the IaaS cost, the usage of resources in the cloud should be measured and then evaluated [4,5]. The three representative resource types that need to be evaluated in IaaS are CPU, network, and storage. For CPU, the base price of an instance model is determined by the computing power of the model, and the cost is charged based on the unit price of the model and the service time. For network resources, the amount of data transferred over the network is evaluated and charged accordingly. For storage resources, both the storage volume used and the number of I/O operations are evaluated and charged. Although the cost of IaaS is determined by this resource usage, the evaluation is performed entirely by CSPs, and users cannot examine the billing procedure; they simply pay the amount requested by the CSP.
In this article, we present an IaaS cost estimation model for forecasting the costs of public cloud resources. Specifically, our model estimates the cost of IaaS instantly by monitoring the resource usage of virtual machine instances. As the monitoring is performed by a user-side metering daemon that we developed, the measured usage is precise and close to the resource usage evaluated on the CSP side. To validate our model, we ran a PC laboratory service for 50 students in two classes on a public cloud during a full semester. Experimental results show that the accuracy of our model is over 99.3% in comparison with the actual charge of the public cloud.
The remainder of this article is organized as follows. Section 2 briefly summarizes the related work of this study. Section 3 explains the cloud PC laboratory platform we developed for IaaS services. Section 4 describes the cost estimation model for IaaS in public cloud. In Section 5, we apply the cost estimation model to our cloud PC laboratory platform. Section 6 quantifies the accuracy of our model by applying it to two classes in the 2018 fall semester. Finally, Section 7 concludes this article.
2. Related Works
There have been a number of studies on cost models for IaaS. Mazrekaj et al. compare IaaS models and their pricing schemes across different CSPs [6]. Martens et al. present a formal mathematical cost model from the viewpoint of the TCO (total cost of ownership) in cloud computing [7]. They argue that the analysis of relevant cost types and factors is an important pillar of decision-making in cloud computing management. Belusso et al. propose a cost model based on simple linear regression [8].
Hedonic approaches for the IaaS cost model assume that the price should reflect embodied characteristics valued by some implicit nonfunctional features such as QoS (quality of service) [9,10]. Mitropoulou et al. show how to calculate and predict the cloud price accurately and how to avoid the price estimation bias [9].
Hinz et al. develop a cost model reflecting the usage of processing power [11]. Aldossary et al. introduce a cloud system architecture and evaluate an energy-aware model that enables a fair attribution of a PM’s energy consumption to homogeneous and heterogeneous VMs based on their utilization and size, which reflect the physical resource usage of each VM [12]. They also propose an energy-aware cost prediction framework that can predict the resource usage and power consumption and estimate the total cost of the VMs during the operation of cloud services.
Sadeghi et al. propose a cost model for hybrid cloud architectures [13]. They define cost factors for hybrid clouds and propose a resource allocation model that takes their cost model into account. Tang et al. address the resource allocation problem with a cost model that considers fairness [14].
Some research groups propose price models for multi-cloud environments. Wang et al. propose a taxonomy of pricing problems in cloud resource and service markets, and demonstrate how relevant theories from game theory, control theory, and optimization theory can be leveraged to address smart pricing problems for resource and service allocation in cloud markets [15]. Xu et al. address the problem of multi-resource negotiation considering both the service-level agreement (SLA) and cost efficiency [16].
Recently, IaaS cost estimation with dynamic cost models has been proposed [17,18,19,20,21]. Agarwal et al. present a method for predicting spot prices using artificial neural networks and report experimental evaluation results on various instances of Amazon Elastic Compute Cloud (EC2) [17]. Meng et al. propose a parametric pricing approach that formulates pricing variables, which represent pricing factors, and derives a regression relation between the pricing variables and the price [20]. They demonstrate the effectiveness of the proposed methodology with real-world data from an organization in China. Their experimental results show that the proposed method achieves significant generalization performance with the best mean squared error (MSE) and reliable results with respect to the randomness of ensemble learning.
Most existing cost estimation models focus on estimating the cost that will be charged over a long-term period in the future based on previously reported resource usage. As this requires predicting future resource usage, the models are complicated and thus difficult to implement as an instant monitoring module. In contrast, our approach estimates the current charge of the cloud services promptly by embedding the estimation module in the user-side metering daemon, which incurs minimal overhead. Thus, the main focus of our design is to minimize the real-time estimation overhead by simplifying the formulation, which is the aspect that distinguishes our cost estimation model from existing models.
3. The Cloud PC Laboratory Platform
Managing a PC laboratory for a specific programming class is not an easy matter as the PCs in a laboratory are usually shared by other classes and users. In particular, the requirements of multiple classes and users are difficult to satisfy with the same PC because the operating system and the software stack of a PC must be set up with predefined configurations and then kept fixed.
To resolve these issues, various approaches have been attempted, including multi-booting, operating system streaming, and PC virtualization. However, none of these offers a complete solution in terms of cost and management. A public cloud can be an alternative solution that provides a unique environment for each student or class by supporting virtualized PC environments in a template-based fashion [22].
In this article, we develop a PC laboratory solution based on a public cloud called CLABO (CLoud LABOratory) [23]. In CLABO, each student is assigned a virtual machine with his/her own computing environment and configurations, which can be used regardless of location. Instructors can build a customized virtual machine template for a class by defining virtual machine images and administrative attributes including access permissions. Once a template is created, instructors can generate and distribute a large number of virtual machines to students in a batch. CLABO also supports easy installation of the libraries and applications required for classes, which can simply be mirrored to the virtual machine of each student.
If the virtual machine of a student is turned on but idle for a long time, CLABO stops it automatically to avoid unnecessary public cloud costs. The maximum time for which each student may use his/her virtual machine can also be set according to the cloud utilization policy. CLABO instantly provides the estimated public cloud cost of all students’ virtual machines based on the IaaS cost estimation model proposed in this article. We have developed and operated the CLABO service for three semesters at Ewha University, and it is now publicly available on the web [23].
4. An IaaS Cost Estimation Model
In this section, we describe the cost estimation model for IaaS in public cloud. Each virtual machine in a cloud is called an instance, and the state of an instance can be either active or inactive. Thus, we estimate the cost of an instance differently depending on its state, i.e., the active instance cost $C_{active}$ and the inactive instance cost $C_{inactive}$. The active instance cost $C_{active}$ is the sum of the CPU cost, the storage cost, and the network cost as described in Equation (1). The CPU cost is determined by the instance type $T$ and the usage time $t$ of the instance. The storage cost is determined by the instance type $T$, the usage time $t$, and the storage usage function $S$, which evaluates the total number of I/O activities for a given time. The network cost of an instance is determined by the usage time $t$ and the network usage function $N$, which evaluates the total amount of data transferred over the network.
$$C_{active} = C_{cpu}(T, t) + C_{storage}(T, t, S) + C_{network}(t, N) \tag{1}$$
Now, let us see how the cost of each resource type can be defined. First, the CPU cost $C_{cpu}$ is determined by multiplying the unit price of CPU in instance type $T$, i.e., $P_{cpu}(T)$, and the usage time $t$.
$$C_{cpu}(T, t) = P_{cpu}(T) \cdot t \tag{2}$$
Second, the storage cost $C_{storage}$ is composed of the storage volume cost $C_{vol}$ and the storage I/O cost $C_{io}$. $C_{vol}$ is the cost of the storage space, which is expressed by multiplying the usage time $t$ and the unit price of the storage volume in instance type $T$, i.e., $P_{vol}(T)$. $C_{io}$ is the cost for storage I/O activities, which is determined by multiplying the unit price of storage I/O, i.e., $P_{io}$, and the number of I/Os during the usage time $t$, which is accumulated by $S(i)$ that counts the number of I/Os at each monitoring interval $i$, where $n$ is the number of monitoring intervals within $t$.
$$C_{storage}(T, t, S) = C_{vol} + C_{io} = P_{vol}(T) \cdot t + P_{io} \cdot \sum_{i=1}^{n} S(i) \tag{3}$$
The network cost $C_{network}$ is determined by multiplying the amount of data transferred and the unit price of the network $P_{net}$, which consists of in-bound and out-bound costs. The costs depend on the unit price of the in-bound and out-bound data, i.e., $P_{in}$ and $P_{out}$, respectively, and $N_{in}(i)$ and $N_{out}(i)$ denote the amounts of in-bound and out-bound data transferred at monitoring interval $i$.
$$C_{network}(t, N) = P_{in} \cdot \sum_{i=1}^{n} N_{in}(i) + P_{out} \cdot \sum_{i=1}^{n} N_{out}(i) \tag{4}$$
The active instance cost $C_{active}$ can finally be expressed as Equation (5), where $P_{cpu}(T)$ and $P_{vol}(T)$ are fixed once the instance type $T$ is determined, and $P_{io}$ and $P_{net}$ (i.e., $P_{in}$ and $P_{out}$) are also predefined by CSPs. Thus, the active instance cost can be determined by the usage time $t$ and the amount of storage and network resources used, which are accumulated during the usage time.
$$C_{active} = P_{cpu}(T) \cdot t + P_{vol}(T) \cdot t + P_{io} \cdot \sum_{i=1}^{n} S(i) + P_{in} \cdot \sum_{i=1}^{n} N_{in}(i) + P_{out} \cdot \sum_{i=1}^{n} N_{out}(i) \tag{5}$$
Now, let us see how the inactive instance cost can be estimated. As the storage volume is the only resource that will be charged for an inactive instance, the cost of an inactive instance $C_{inactive}$ can be estimated by multiplying the inactive time $t_{inactive}$ and $P_{vol}(T)$ as expressed in Equation (6).
$$C_{inactive} = P_{vol}(T) \cdot t_{inactive} \tag{6}$$
Finally, the total cost of an instance, $C_{total}$, is estimated by adding the active instance cost and the inactive instance cost as shown in Equation (7).
$$C_{total} = C_{active} + C_{inactive} \tag{7}$$
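The model described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CLABO implementation: the class and function names, the per-hour and per-GB units, and the idea of passing one sample per monitoring interval are all hypothetical choices made for the example, and the prices are placeholders to be taken from the CSP's price list.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class UnitPrices:
    """Hypothetical unit prices for one instance type (not real AWS rates)."""
    cpu_per_hour: float     # CPU unit price, fixed by the instance type
    volume_per_hour: float  # storage volume unit price, fixed by the instance type
    per_io: float           # price of one storage I/O operation
    per_gb_in: float        # price of one GB of in-bound traffic
    per_gb_out: float       # price of one GB of out-bound traffic


def active_cost(p: UnitPrices, hours: float, io_counts: Sequence[int],
                gb_in: Sequence[float], gb_out: Sequence[float]) -> float:
    """CPU + storage (volume and I/O) + network cost of an active instance.

    Each sequence holds one sample per monitoring interval.
    """
    cpu = p.cpu_per_hour * hours
    storage = p.volume_per_hour * hours + p.per_io * sum(io_counts)
    network = p.per_gb_in * sum(gb_in) + p.per_gb_out * sum(gb_out)
    return cpu + storage + network


def inactive_cost(p: UnitPrices, hours: float) -> float:
    """Only the storage volume is charged while the instance is inactive."""
    return p.volume_per_hour * hours


def total_cost(p: UnitPrices, active_hours: float, inactive_hours: float,
               io_counts: Sequence[int], gb_in: Sequence[float],
               gb_out: Sequence[float]) -> float:
    """Total instance cost: active cost plus inactive cost."""
    return (active_cost(p, active_hours, io_counts, gb_in, gb_out)
            + inactive_cost(p, inactive_hours))
```

Because every term is a product of a fixed unit price and an accumulated counter, the estimate can be refreshed at each monitoring interval with a handful of additions, which is in line with the low-overhead goal stated for the model.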
5. Validating the Estimation Model
To determine the cost of an instance, metering of the instance’s resource usage is necessary. This is the core process for billing IaaS, and thus the CSP performs the metering and periodically reports the results to users [24]. However, as the CSP does not perform real-time monitoring for metering, users cannot see the cost of IaaS instantly, which makes quick decisions about their services difficult. CLABO provides a real-time resource monitoring feature via metering daemons running on the guest operating system.
The accuracy of CLABO’s real-time usage monitoring can be validated by comparing it with the “AWS cost and usage report” [25]. To do so, we execute various instance scenarios on AWS and compare the resource usage of each scenario as monitored by CLABO with that in the AWS usage report. Specifically, we generate 19 instance utilization scenarios as shown in Table 1. Each scenario is represented by four classifiers, i.e., the operating system, the application characteristics, the instance access method, and the instance type. For the operating system, “L” and “W” represent Linux and Windows, respectively, as listed in Table 1. For the application characteristics, our scenarios consider five applications commonly used in a software laboratory class, namely GNU development tools “g”, Visual Studio “v”, web browser “w”, Eclipse “e”, and Android Studio “a”, together with terminal tasks “t” and idle session “i”. The instance access method depends on how VDI (virtual desktop infrastructure) is established, that is, “x” for no VDI connection, “n” for windowed VDI, and “f” for full-screen VDI. The instance type of a virtual machine is denoted by “m” for t2.micro, “s” for t2.small, and “M” for t2.medium.
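The four-character scenario names used in this section (for example, Lixm or Lwfs) can be decoded mechanically. The sketch below assumes that the characters appear in the order in which the classifiers are listed above (operating system, application, access method, instance type); the function name and the dictionary layout are illustrative only.

```python
# Lookup tables taken from the classifier description in the text.
OS = {"L": "Linux", "W": "Windows"}
APP = {
    "g": "GNU development tools",
    "v": "Visual Studio",
    "w": "web browser",
    "e": "Eclipse",
    "a": "Android Studio",
    "t": "terminal tasks",
    "i": "idle session",
}
ACCESS = {"x": "no VDI connection", "n": "windowed VDI", "f": "full-screen VDI"}
INSTANCE = {"m": "t2.micro", "s": "t2.small", "M": "t2.medium"}


def decode_scenario(code: str) -> dict:
    """Decode a scenario name such as 'Lixm' into its four classifiers.

    Assumes the character order: OS, application, access method, instance type.
    """
    if len(code) != 4:
        raise ValueError(f"scenario code must have 4 characters: {code!r}")
    os_key, app_key, access_key, type_key = code
    try:
        return {
            "os": OS[os_key],
            "application": APP[app_key],
            "access": ACCESS[access_key],
            "instance_type": INSTANCE[type_key],
        }
    except KeyError as exc:
        raise ValueError(f"unknown classifier {exc.args[0]!r} in {code!r}") from None
```

Under this reading, Lixm is an idle Linux session on t2.micro without a VDI connection, and Lwfs is a web browser workload on a Linux t2.small instance accessed through full-screen VDI.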
We run each scenario on AWS for 20 min and compare the results of our estimation model with the “AWS cost and usage report”. Figure 1 shows the instance usage time monitored by CLABO and that extracted from the “AWS cost and usage report”. In this graph, the results from the Windows platform are excluded as AWS EC2 charges Windows instance usage on an hourly basis. As shown in the figure, the usage time monitored by CLABO and that of the “AWS cost and usage report” are very similar. Specifically, the accuracy of our model is 99.2% on average.
Figure 2 shows the instance usage time as time progresses for the Lixm scenario. The x-axis in the figure denotes the planned execution time of the scenario, and the actual time may vary somewhat while the user executes it. For example, booting, configuring, or shutting down an instance can cause variation in the usage time. The time reported by AWS is the exact usage time of the instance, whereas the usage time monitored by CLABO may include some inaccuracy as it is measured by the metering daemon every 20 s. As shown in the figure, the usage time monitored by CLABO is consistently close to that of the AWS report. Specifically, the accuracy improves over time; it was 99.7% when the execution time was 20 min and reached 99.9% at 70 min. In some cases, CLABO over-estimates the usage time as the detection of instance termination is delayed by the periodic execution of the instance status checker. However, this error is very small compared to the total instance time and becomes negligible as time progresses.
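The reason this error shrinks over time can be illustrated with a simple bound. The sketch below assumes that the only source of error is a termination that is detected at most one polling interval late; this is a simplification for illustration, and the function name is hypothetical.

```python
def max_relative_overestimate(poll_interval_s: float, usage_time_s: float) -> float:
    """Worst-case relative over-estimation of usage time.

    Assumes termination is detected at most one polling interval late,
    so the absolute error is bounded by poll_interval_s.
    """
    if poll_interval_s < 0 or usage_time_s <= 0:
        raise ValueError("interval must be non-negative and usage time positive")
    return poll_interval_s / usage_time_s


# With the 20 s metering interval mentioned in the text:
bound_20_min = max_relative_overestimate(20, 20 * 60)  # about 1.7%
bound_70_min = max_relative_overestimate(20, 70 * 60)  # about 0.5%
```

The absolute error stays bounded by one interval while the usage time keeps growing, so the relative error decreases in inverse proportion to the running time. The errors implied by the reported accuracies (0.3% at 20 min and 0.1% at 70 min) fall within these worst-case bounds.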
Let us now examine the network resource usage. Figure 3 and Figure 4 show the in-bound and out-bound network traffic for each scenario, respectively. As shown in the figures, the difference between the network traffic monitored by CLABO and that of the AWS report is very small. For the in-bound network traffic, the amount monitored by CLABO is slightly larger than that reported by AWS. Lwfs and Wwfs incur a large network traffic of 100 MB or more, but the difference is still small.
Figure 5 and Figure 6 show the number of storage I/Os in terms of read and write operations, respectively. Unlike the other resources, the storage usage monitored by CLABO and that reported by AWS show a noticeable gap. This is because most CSPs use an I/O count as the cost metric of storage activity, but it is difficult to count the exact number of I/O operations at the instance level. In particular, capturing the initial I/O activities is difficult in CLABO as its monitoring routine starts after they have occurred. Moreover, I/O requests issued by an instance are usually merged by storage-layer policies.
We account for this by using an additional parameter that estimates the number of I/O operations occurring before CLABO monitoring starts, and we add it to the accumulated count. This parameter is derived by running dummy scenarios, which have almost no storage activity at the instance level, and comparing the results of CLABO with the AWS report. Although the estimation of storage activity is not as accurate as that of the other resources, the accuracy is greater than 95% when the usage of storage resources is over a certain level. In all Windows scenarios, the number of storage operations tends to be large because Windows platforms have complex software stacks with higher I/O requirements and incur additional paging I/Os due to their large memory usage. All numerical results comparing CLABO monitoring with AWS reporting are summarized in Table 2.
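One plausible way to derive such a correction parameter is sketched below. The text does not specify the form of the parameter, so this example assumes a constant additive offset obtained by averaging the gap between the AWS-reported and CLABO-monitored I/O counts over the dummy scenarios; the function names and the sample numbers used in the usage note are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def calibrate_io_offset(dummy_runs: Sequence[Tuple[int, int]]) -> float:
    """Estimate the I/O count missed before monitoring starts.

    dummy_runs holds (monitored_count, reported_count) pairs from scenarios
    with almost no storage activity. The additive-offset form is an assumption.
    """
    if not dummy_runs:
        raise ValueError("at least one dummy run is required")
    return mean(reported - monitored for monitored, reported in dummy_runs)


def corrected_io_count(monitored_count: int, offset: float) -> float:
    """Add the calibrated offset to the count accumulated by the daemon."""
    return monitored_count + offset
```

For instance, two dummy runs with (monitored, reported) counts of (100, 1100) and (200, 1300) would give an offset of 1050 operations, which is then added once to each instance's accumulated I/O count.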
6. Applying the Model to Real-World Applications
CLABO can promptly track the cost of all created instances and also provide the cost breakdown analysis such as cost per user or cost per class. To quantify the accuracy of our cost estimation model, we utilized CLABO in two classes that require cloud PC laboratory platforms during the 2018 fall semester. The number of students was 50, and we generated 50 instances on AWS and compared the total cost charged by AWS and the cost estimated by our instant model.
Figure 7 shows the total costs charged by AWS for October, November, and December of 2018. As shown in the figure, CPU accounts for the largest portion of the cost in all months. The cost of the storage volume varies significantly across months, accounting for 15% to 49% of the total cost. Note that the cost of the storage volume mostly results from inactive instances, as the storage volume is charged even when an instance is not active. Our analysis shows that over 95% of the storage volume cost occurred when the instances were inactive. As shown in the figure, the costs of the network and the storage I/O are not significant. The overall cost of the 50 virtual machines is less than 25 USD as the price of a t2.micro Linux instance is very low and CLABO automatically turns off instances that are not used for a long time.
Figure 8 compares the total instance cost estimated by CLABO with the cost charged by AWS as time progresses. As the AWS usage report provides billing information at a granularity of 1 h, we compare the billing information provided by AWS and the estimated cost for each hour. As shown in the figure, the IaaS cost estimated by our model and the real cost charged are very similar. The error rate of the estimation is less than 1% in all cases although our estimation was performed instantly, and the accuracy is 99.3% on average compared to the actual charge of AWS.
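The hourly comparison can be expressed as follows. The text does not state the exact formula behind the error rate and accuracy figures, so this sketch assumes the usual definitions: the error rate is the absolute difference relative to the billed cost, and the accuracy is one minus the mean error rate. The function names are hypothetical.

```python
from typing import Sequence, Tuple


def error_rate(estimated: float, billed: float) -> float:
    """Relative error of an estimate against the billed cost (assumed definition)."""
    if billed <= 0:
        raise ValueError("billed cost must be positive")
    return abs(estimated - billed) / billed


def mean_accuracy(hourly: Sequence[Tuple[float, float]]) -> float:
    """Average accuracy over (estimated, billed) pairs, one pair per hour."""
    if not hourly:
        raise ValueError("at least one hourly pair is required")
    rates = [error_rate(est, billed) for est, billed in hourly]
    return 1.0 - sum(rates) / len(rates)
```

Feeding this function with the hourly cost pairs taken from the metering daemon and the usage report yields a single accuracy figure under the assumed definition.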
We conduct some additional experiments to validate the proposed model. In particular, we execute two types of workloads, rendering and web service, and compare the estimated cost of our model with the real AWS charge. Note that rendering is a computing-intensive workload, whereas web service is an I/O-intensive workload, which are the two representative types of workload that can be executed on cloud.
Figure 9a depicts the accumulated hourly costs of our model and AWS charge when web requests were generated by the Apache benchmark. As AWS adopts the auto-scaling functionality, the number of web servers increases or decreases according to the evolution of workloads. However, as shown in the figure, the errors of estimated costs by our model (denoted as Web) are less than 1% for all cases.
Figure 9b shows the results for the rendering workload. This experiment was conducted by using eight rendering servers, which are set to terminate automatically when each assigned job is finished. Rendering is a computing-intensive workload and the cost is mainly dependent on the instance running time, and thus the cost graph is almost linear as time progresses before the completion of the rendering job. As shown in the figure, we can again see that our model estimates the real AWS charge precisely and the error rate is below 0.9% for all cases.
We perform another experiment to compare our model with an existing model that estimates the cost of a certain period in the future based on the history of resource usage. As existing cost models predict future resource usage by adopting complicated modeling, they are difficult to implement as an instant monitoring module, so prediction must be performed periodically. Figure 10 compares the cost estimated by the proposed model with that of the existing model, which estimates the cost every 3 h. In this experiment, eight web servers ran for 12 h. As shown in the figure, our estimation model tracks the actual cost more closely than the existing model: our model estimates the resource cost instantly, whereas the existing model overestimates or underestimates the IaaS cost depending on the previous resource usage, even though it can be adjusted by the periodic usage report. Note that this experiment does not imply the superiority of our model, because our model monitors the resource usage instantly whereas the existing model predicts future resource usage beforehand, and thus the goals of the two models are different.
7. Conclusions
Cloud is an attractive computing platform as resources can be scaled up and down promptly according to the user’s computing demands. However, unnecessary cost can be incurred due to excessive idle instances and/or temporary spikes in usage [7,24]. Thus, customers of cloud services must monitor their charges continuously. Unfortunately, this is difficult as it takes at least several hours or even a couple of days for the IaaS charge to be reported. This article presented an instant cost estimation model for IaaS resources and applied it to a real system called CLABO, which monitors the resource usage of a public cloud by making use of a metering daemon. To validate our model, we ran PC laboratory services on AWS for a full semester. Experimental results showed that the accuracy of our model is over 99.3% in comparison with the actual charge of AWS.
One constraint of our model is that it is applicable only when real-time resource usage monitoring is possible. If any of the metering components for IaaS cost estimation is not available, instant cost estimation will be difficult. However, as we have shown with AWS, monitoring of instance-level resource usage is quite accurate and can be verified against the usage reports of CSPs. In particular, major CSPs increasingly offer resource metering services such as AWS CloudWatch and Azure Monitor. Thus, we plan to extend our cost model by using CSP-level metering facilities and further validate it on other public cloud platforms such as Azure and GCP.
As our approach is based on a user-side metering daemon, it incurs a slight overhead, which can be considered a weakness of our model. However, we observed that our metering daemon incurs less than 0.1% of the total cost, as our design focuses on minimizing the estimation overhead by simplifying the formulation.