1. Introduction
Sensors that obtain information about the physical world are a core component of Cyber-Physical Systems (CPS) [
1]. They enable capturing highly relevant environment data, including geo-spatial context information, to successfully operate an embodied CPS. Light detection and ranging (LiDAR) technology has been used to create detailed scans of physical objects, which can become the basis for digital representations of these objects. LiDAR technology is, e.g., used to capture real-time data concerning the surroundings of autonomous vehicles to detect and recognize objects [
2]. It is also applied in the context of indoor navigation of autonomous systems for the complex task of simultaneous localization and mapping (SLAM) [
3]. Scan data has also become a crucial building block for the creation of digital twins [
4].
Digital twins (DTs) are traditionally seen as simulated representations of a physical object or a set of objects (cf. one of the earliest definitions of the term provided by NASA in [
5]). Currently, digital twins are often used in the context of industry to represent objects that are part of production (production systems, material, products, …). Especially in the context of Industry 4.0 and smart manufacturing, digital twins are seen as a promising technology—cf. [
6] with application areas pertaining to product design, production, and prognostics and health management. Digital twins can serve as a design and engineering prototype for physical objects and allow for testing of different functionality [
7]. In the context of CPS, executable behavior specifications in the form of subject-oriented process models were proposed for exploring and validating CPS behavior in the form of a digital twin thread [
8,
9].
Digital twins aim to serve as a single hub of information about the real-world object(s) for a user, linking systems and realizing the structuring, monitoring, and exploitation of data [
10]. Therefore, structuring the data produced by a CPS during its operation is a crucial aspect of digital twins to enable the monitoring of the system and its environment while it executes its behavior in the respective CPS context, e.g., business processes. To this end, the information needs to be structured in a form that makes it accessible to the relevant stakeholders, leading to usage-driven deployment of digital twins and thus user acceptance [
11].
Although geo-spatial information is considered to provide useful context data, one challenge concerns deriving relevant information from the created raw scan data (i.e., 3D point clouds). In this context, semantic segmentation, i.e., classifying every point in a scene and assigning it a label to give the data semantic meaning, has been tackled extensively in the literature through a variety of approaches, recently propelled forward by deep learning (see [
12]). Performing such segmentation tasks is often a part of the early stages in the digital twinning process (see [
13,
14]). Different formats for 3D models have furthermore been proposed in a variety of different contexts to structure the relevant produced information (e.g., data formats for digital city models (see, e.g., the City Geography Markup Language (CityGML) at
https://www.ogc.org/standards/citygml, retrieved 8 June 2022) that highlight buildings, vegetation, …).
A further challenge concerns the (real-time) visualization of created digital representations (cf. [
15]). Depending on the intended DT use, the produced geo-spatial context needs to be potentially integrated with other information to serve a specific purpose (see the DT created in [
16], which incorporates data from different auxiliary sources). In the context of CPS, the produced geo-spatial data needs to be linked with other information, e.g., be embedded with other sensor data, since one of the core added value propositions of DTs lies in making use of linked information [
10].
In this article, we explore the structuring of geo-spatial information, as derived from 3D point cloud data, in order to consolidate it with relevant information required for designing and running digital twins as digital threads (cf. [
17]). Hence, we presume that the raw 3D scan data has already been processed to be further utilized as geo-spatial context information. Focusing on geo-spatial context integration requires an appropriate representation and visualization of this information [
18,
19,
20].
In cyber-physical settings, scans conducted with LiDAR technology are used to capture the operational environment of CPS, and to enable a CPS to accomplish its tasks as part of a business case or process [
21]. We showcase how a wider view of the concept of digital twins, where not only data about the physical object, but also its behavior is captured, can facilitate the structuring of scan data produced alongside the operation of the system in a way that it becomes better accessible to relevant stakeholders. Such a conception of digital twins has recently emerged in CPS research (see [
8]) and has potential for structuring a variety of data produced during operation. Process modeling (which is the basis for such a kind of behavioral twins) was also identified as a currently emerging research direction in a recent bibliometric analysis of the field of digital twins [
22]. We contribute to this new research trend by providing a digital twin architecture based on subject-oriented process models [
23].
We exemplify a logistics showcase of a digital process twin architecture and demonstrate its models’ potential for context-sensitive geo-spatial applications. The behavioral specification of the CPS is created utilizing subject-oriented process models. These (executable) models are used as a basis to structure scan data alongside the conducted behavior of the CPS (components). Compared to simply structuring scan data in the form of a time series, using process models has the advantage of associating produced information directly with the behavior of a system at a certain point in time. This additional context enriches the digital twin with a behavior perspective for each system component and the components’ interplay representing the overall system behavior. Once this information can be conveyed to relevant stakeholders, the environmental context at different points in time can facilitate the integration of cyber-physical systems into business operation. This context can be particularly considered for supporting decision makers and process monitors.
The article is structured as follows:
Section 2 introduces the vaccine logistics use case, giving insight into the general workflow, and how and when LiDAR technology can be applied to obtain geo-spatial context information. Based on this scenario,
Section 3 provides the proposed digital twin architecture and exemplifies models of LiDAR-relevant parts of the logistics process. In particular, the integration of behavior models with scan data is explored. Along with the architecture, a methodology is introduced to guide implementing the proposed architecture. An application for behavior monitoring is also outlined.
Section 4 highlights related work. First, the use of LiDAR technology and the produced scan data in digital twin research is examined. The proposed architecture is positioned in relation to existing approaches and its difference and potential are detailed. Secondly, methodologies for the creation of digital twins are explored to reflect on the presented methodological guidance and to identify topics of future research with respect to development procedures.
Section 5 discusses the extent to which the knowledge base on subject-oriented design and engineering could be enriched. Finally,
Section 6 briefly summarizes the presented work and discusses the results. Limitations and possibilities for future investigations are outlined.
2. CPS-Based Vaccine Logistics
The use case that we utilize throughout this article was inspired by issues concerning the distribution of vaccines in the context of COVID-19, with different types of vaccine requiring different transport and storage conditions. It pertains to supporting the handling and packaging of vaccines as part of the distribution process (during the “last mile”, i.e., local redistribution) through automation and IoT technologies. The result is a CPS encompassing different components. To support the monitoring of transport conditions, transport boxes are dynamically equipped with sensors, depending on the specific requirements of the payload (i.e., the vaccine). Examples of potential sensors include temperature sensors for vaccines that require constant cooling to a certain temperature (as is the case with the COVID-19 vaccine of Pfizer-BioNTech), humidity sensors to detect potential spills, and shock sensors to indicate potential damage. Transport boxes equipped with such sensors can be used for real-time monitoring of conditions, with abnormalities triggering immediate interventions, such as notifications of the relevant stakeholders. They can similarly be used for auditing purposes, to showcase the integrity of transported payloads. The tasks necessary for preparing the boxes are automated and carried out by a robot. This robot is equipped with an arm unit for interaction with its environment and a scanner to sense and assess its surroundings. For showcasing the proposed digital twin architecture and the related concepts, the specific type of robot to be used is not important (due to the used abstraction mechanisms that will be detailed further below), as long as it can execute the required behavior and provide the required data. For a prototype set-up that is in the works at the time of writing this article, Boston Dynamics’ robot dog Spot (see
https://www.bostondynamics.com/products/spot, retrieved 8 June 2022) is used, due to availability and since there exist options for the required periphery (see
https://www.bostondynamics.com/sites/default/files/inline-files/spot-arm.pdf and
https://www.bostondynamics.com/sites/default/files/inline-files/spot-eap.pdf, retrieved 8 June 2022). It should be noted that in an industrial warehouse setting, there exist potentially cheaper options that can provide the necessary features for implementing such a set-up, such as various Automated Guided Vehicles (AGVs). The environment the robot operates in includes three types of shelves that, respectively, house vaccine packages, the transport boxes, and available sensors.
2.1. General Workflow
The set-up of the scenario and the minimal number of involved steps for transport preparation can be seen in
Figure 1. The process starts with a package containing vaccines arriving at the intended input location (in our simplified set-up this would be the door of the room). A photoelectric barrier sensor that is part of the room registers the arrival. This triggers the further handling of the package through the robot.
(Step 1) The robot moves to the location of the package and reads a label attached to it, which provides the necessary information for the following steps (i.e., the type of vaccine and the specific transport assignment, and following from that the sensors that are required). The robot picks up the package.
(Step 2) The robot moves the package to the shelf on the left side of the room for intermediate storage.
(Step 3) The robot moves to the shelf on the upper right side of the room, which contains a selection of sensors to equip a transport box with. Based on the vaccine-specific information read from the label previously, the robot picks up a sensor (due to the capabilities of the arm only one sensor can be carried at a time).
(Step 4) The robot moves over to the shelf housing the transport boxes on the lower left of the room. The robot uses the arm unit to insert the sensor into the intended cavity on the side of the transport box. Upon successful insertion, the sensor is activated and begins actively sensing its environment. Sensor information can be accessed by the transport box, which can execute different actions based on the read values and additional information concerning the payload and transport assignment. In case further sensors are required, the robot returns to the shelf containing the sensors and repeats the preceding step, until the box is equipped with all the necessary sensors.
(Step 5) Once a box has been successfully prepared, the robot moves over to the shelf on the left side of the room and picks up the vaccine package intended for the transport container.
(Step 6) The robot transports the package to the shelf containing the transport boxes and puts it into the box it prepared in the previous steps. The robot connects to the box (which has network capabilities) to share information. This includes information about the payload and the transport assignment (influencing which actions the box takes in response to certain sensor readings) and the sensors that were equipped. When the box is notified that the preparation steps have been completed successfully and receives the corresponding information, it closes itself automatically and sends the robot a confirmation that it is ready for transport.
(Step 7) Finally, the robot picks up the box, moves it to the output destination, and places it there.
Abstracting the behavior of the robot, it can be seen as a loop repeating the following two actions for each step: move to way-point and interact with environment.
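The abstraction above can be illustrated with a minimal sketch. It assumes that moving and interacting are exposed as two callbacks; the names `Step`, `run_workflow`, and the way-point and interaction labels are hypothetical and only serve the illustration, they are not part of the prototype.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Step:
    """One workflow step: where to go and what to do there."""
    waypoint: str      # hypothetical way-point label
    interaction: str   # hypothetical interaction label


def run_workflow(
    steps: List[Step],
    move_to: Callable[[str], None],
    interact: Callable[[str], None],
) -> List[Tuple[str, str]]:
    """Repeat 'move to way-point' and 'interact with environment' for each step."""
    performed = []
    for step in steps:
        move_to(step.waypoint)
        interact(step.interaction)
        performed.append((step.waypoint, step.interaction))
    return performed


# Steps 1 and 2 of the workflow, with placeholder callbacks that only record calls.
calls: List[str] = []
steps = [
    Step("input location", "read label and pick up package"),
    Step("storage shelf", "place package"),
]
performed = run_workflow(
    steps,
    move_to=lambda wp: calls.append(f"move:{wp}"),
    interact=lambda act: calls.append(f"interact:{act}"),
)
```

In a real set-up, the two callbacks would wrap the robot's navigation and arm control interfaces, which are not specified here.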
2.2. Using LiDAR Technology for Obtaining Geo-Spatial Context
Assuming the intended use case, the layout of the room is known in advance. Thus, the positions of the different navigation way-points in the room and the locations that objects are picked up from or placed down at are known (with the robot keeping track of the status of each shelf position). The movement of the robot between the different way-points can thus be pre-planned, allowing for the avoidance of complex SLAM tasks.
For situations requiring the robot to actively be aware of contextual information pertaining to its surroundings, the scanner equipped on the robot is used to obtain the current geo-spatial status of its operational environment. This allows the robot’s behavior to be adapted if objects are not positioned as they should be (or are missing altogether) or if obstacles have appeared in its path.
During the movement between the different way-points an unexpected obstacle could block the robot. In this case, the robot is required to navigate autonomously around it, or to notify the person in charge of overseeing the process so that they may intervene. Similarly, when reaching a way-point, such as standing before a shelf, adjustments in the robot’s positioning could be needed to ensure that it is in the right position to interact with an object that is not placed properly. These adjustments can then be computed based on scan data. Different object detection approaches for point clouds have been proposed in the literature and can be adapted for such a task (see, e.g., approaches such as PointNet [
24] and its successor PointNet++ [
25], VoxelNet [
26], or the approach described in [
27]). Once objects in the robot’s environment have been detected and classified, their distance and position in relation to the robot can be computed to determine its actions for re-positioning itself to be in the right place to interact with the object. In case re-positioning proves impossible (due to an object being placed in such a way that the robot cannot align itself to execute the intended interactions), a human needs to be called to assist and restore the operational environment.
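To illustrate the re-positioning computation, the following sketch derives a turn angle and a forward distance from the points that a detector has assigned to one object. It rests on simplifying assumptions that are not taken from the use case: points are given in the robot's own frame (x forward, y to the left, in meters), the object is summarized by its centroid, and the robot should stop at a fixed stand-off distance. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def centroid(points: Iterable[Point]) -> Point:
    """Mean position of the points labeled as belonging to one object."""
    pts = list(points)
    if not pts:
        raise ValueError("no points for this object")
    n = len(pts)
    return (
        sum(p[0] for p in pts) / n,
        sum(p[1] for p in pts) / n,
        sum(p[2] for p in pts) / n,
    )


def repositioning_offset(
    object_points: Iterable[Point], standoff_m: float
) -> Tuple[float, float]:
    """Return (bearing in radians, distance to advance in meters).

    A negative advance means the robot is closer than the stand-off
    distance and would have to back up.
    """
    cx, cy, _ = centroid(object_points)
    distance = math.hypot(cx, cy)   # planar distance to the object
    bearing = math.atan2(cy, cx)    # angle to turn towards the object
    return bearing, distance - standoff_m


# Two points of an object straight ahead, centered 3 m away.
bearing, advance = repositioning_offset(
    [(2.0, 0.0, 0.5), (4.0, 0.0, 0.5)], standoff_m=1.0
)
```

A practical implementation would additionally check whether the computed pose is reachable, and escalate to a human otherwise, as described above.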
3. A Digital Twin Architecture Linking CPS Behavior and Produced Operational Data
We now use the case described in the previous section as the basis for specifying the proposed digital twin architecture. The CPS of the illustration scenario encompasses a variety of different components. These components showcase different behavior and interact with each other as part of the transport preparation process. In this section, we first introduce the used conception of behavior-centered digital twins that can capture these aspects of a CPS. We briefly outline subject-oriented modeling as an approach for the creation of digital behavior models. Following that, we showcase models for a part of the process of the illustration case. Then, we detail the stakeholder-centered structuring of environmental context data through the process models. For the sake of run-time engine independence, we provide an abstract description of the required run-time behavior to execute the digital behavior models. Subsequently, we integrate the introduced concepts into an architecture proposal and introduce a methodology as guidance for realization. We also discuss some of the associated challenges and considerations. Finally, we exemplify an application and reveal benefits when using the architecture in practice for the case.
3.1. Behavior-Centered Digital Twins
We build upon the behavior-centered conception of digital twins of CPS as it was outlined in [
8], with subject-oriented process modeling [
23] as an approach for creating behavior specifications. This particular understanding of digital twins is depicted in
Figure 2. As outlined in [
8], besides the digital model of the CPS, run time support connected to the cyber-physical components is available for model execution. This allows for changes to the digital model during operation that propagate to the components executing the behavior (and thus the model). The authors refer to this aspect as “design-integrated engineering”, entailing that the models of the CPS are subsequently refined to the point that they can be executed. As can be seen in the figure, data (such as collected sensor data) also flows back from the CPS to its digital representation. Therefore, this digital twin conception also provides an infrastructure for the monitoring of CPS component behavior and is used as the basis for integrating geo-spatial context.
Now we specify the central elements of subject-oriented process models, as they are relevant for the digital twin architecture that utilizes them. Subject-oriented modeling accounts for some of the unique characteristics of CPS, such as allowing heterogeneous components to be depicted in a unifying way by focusing on their behavior [
8,
28]. Subject-oriented modeling originated from the structure of natural language, specifically the components of sentences, i.e., subject, predicate, and object [
23]. This allows for modeling inspired by natural language sentence formulation. Furthermore, the minimal number of symbols used by the associated standard modeling notation makes it more easily accessible. This is indicated by a recent research effort [
29] to empirically examine the properties of both a control flow modeling paradigm (e.g., BPMN) and a communication modeling paradigm (e.g., subject-oriented process models).
The subject-oriented approach uses two types of models to specify processes [
23]: Subject Interaction Diagrams (SIDs) and Subject Behavior Diagrams (SBDs). SIDs depict different subjects and their interactions with each other as part of a process. These interactions are modeled as messages that are exchanged between the different subjects. In this kind of depiction, a subject is understood as an encapsulation of a certain behavior that it executes as part of the process, so each subject in the SID has an associated SBD. Different subject behavior is executed in parallel, with messages serving to synchronize behavior. The instantiation of the subject is furthermore left open during the modeling process. Whether a subject behavior is executed by a human, an organization, a software component, a cyber-physical component, etc. is decided at a later point in time. SBDs now depict behavior as a sequence of states, including function states (performing some local action), send states (sending a message to another subject), and receive states (receiving a message from another subject). Furthermore, one state needs to be marked as the start state of the sequence and there needs to be at least one end state (multiple end states are possible). States are connected through transitions (usually notated as arrows labeled with the result of the previous state). Different tools have been proposed that support the creation, validation, and execution of subject-oriented models. Such run-time engines originate from both the commercial (see the suites Metasonic (
https://www.metasonic.de/en/, retrieved 9 June 2022) and Compunity (
https://compunity.eu/, retrieved 9 June 2022)) and the academic sector (e.g., [
30,
31]).
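The structure of an SBD described above (function, send, and receive states, one start state, at least one end state, and transitions labeled with the result of the previous state) can be captured in a small data structure. The following is only an illustrative sketch and does not reflect the internal model of any of the mentioned tools; all class, state, and result names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class StateKind(Enum):
    FUNCTION = "function"   # performs a local action
    SEND = "send"           # sends a message to another subject
    RECEIVE = "receive"     # receives a message from another subject


@dataclass(frozen=True)
class State:
    name: str
    kind: StateKind
    is_end: bool = False


@dataclass
class SubjectBehavior:
    subject: str
    start: str
    states: Dict[str, State]
    # (current state, result of that state) -> next state
    transitions: Dict[Tuple[str, str], str]

    def validate(self) -> None:
        """Check the structural rules: known start state, at least one end state."""
        if self.start not in self.states:
            raise ValueError("start state is not part of the diagram")
        if not any(s.is_end for s in self.states.values()):
            raise ValueError("at least one end state is required")
        for (source, _), target in self.transitions.items():
            if source not in self.states or target not in self.states:
                raise ValueError("transition refers to an unknown state")

    def next_state(self, current: str, result: str) -> str:
        return self.transitions[(current, result)]


# Hypothetical excerpt of a robot behavior: move, request a scan, wait, pick up.
robot = SubjectBehavior(
    subject="Robot",
    start="move_to_sensor_shelf",
    states={
        "move_to_sensor_shelf": State("move_to_sensor_shelf", StateKind.FUNCTION),
        "request_scan": State("request_scan", StateKind.SEND),
        "await_scan_result": State("await_scan_result", StateKind.RECEIVE),
        "pick_up_sensor": State("pick_up_sensor", StateKind.FUNCTION, is_end=True),
    },
    transitions={
        ("move_to_sensor_shelf", "arrived"): "request_scan",
        ("request_scan", "sent"): "await_scan_result",
        ("await_scan_result", "scan received"): "pick_up_sensor",
    },
)
robot.validate()
```

Keeping the executor of a subject out of this structure mirrors the point made above that the instantiation of a subject is decided at a later point in time.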
Subject-oriented business processes are embedded in organizations within the framework of Subject-oriented Business Process Management (S-BPM) [
23]. Thereby, the processes are considered in different phases, which are called activity bundles. These activity bundles can be run sequentially in an iterative way, as shown in
Figure 3. However, since some activity bundles can also be skipped or used within an arbitrary sequence due to the self-contained nature of each step, the methodological frame has also been termed “open control cycle” [
23]. The activity bundles address the analysis, modeling, validation, optimization, organization-specific implementation, IT-implementation, and monitoring of processes. In the course of analysis, all process-relevant information is captured, while in modeling this information is brought into a subject-oriented process model. Consequently, the activity bundles of analysis and modeling are essentially about which subjects perform which activities on which objects utilizing which tools, and in which way the subjects interact to achieve the desired process goals and results. Validation verifies that the specified process is effective, i.e., that it produces the expected output in the form of a product or service. Optimization means finding an optimal design of a process with regard to process parameters such as duration, costs and frequency. Validated and optimized processes are embedded in the organization during organization-specific implementation, and this includes any adaptation of the process and organizational structure that represent the social environment. The IT implementation of a process means mapping it as an IT-supported workflow by integrating a suitable user interface, the flow logic and the IT systems involved. Ongoing monitoring gathers measurement data during process execution and can be used, for example, to calculate the actual values for the key performance indicators defined during analysis and modeling [
23].
3.2. Sample Model Variants for the Logistic Use Case
The general modeling approach is depicted in
Figure 4 based on the use case scenario described in the previous section. Each active component is modeled as its own subject. The connections between the subjects contain messages to implement the entire CPS behavior.
As part of the proposed digital twin architecture, subject-oriented models are used to depict the individual behavior of CPS components and their interactions with each other as part of different processes.
Figure 5 shows the subject interaction diagram of the components involved in the transport preparation process on a high level of abstraction. All models were created using the Compunity suite, which uses slightly different terminology, e.g., component interaction diagram instead of subject interaction diagram and step instead of state. For the sake of understandability, we chose to utilize the more general terms common to subject-orientation consistently throughout this article.
Figure 5 shows which messages flow between the different subjects that are part of the process, with the notification of the room triggering the robot’s behavior, and the robot issuing commands to its attached add-ons (here modeled as separate subjects). The content of the messages is detailed in terms of an abstract data structure that needs to be derived from the respective use case. It can also be aligned with existing data models. Depending on the chosen level of abstraction, the behavior of the robot and its add-ons could be unified into a singular subject, or separated even further. As was mentioned in the previous sub-section, if the digital behavior models are used in the sense of design-integrated engineering, a refinement to a fine-grained level will be necessary to put them into operation. Considering that the digital twin should meet the requirements of its potential users, a high level of abstraction will still be useful in cases where detailed information on behavior is not needed and might overwhelm a CPS stakeholder.
To demonstrate the modeling of component behavior, we detail the part of the transport preparation process concerning the selection of a sensor by the robot (step 4). The behavior of the robot, as described above, is modeled in a SBD, as seen in
Figure 6. It follows the general activities of “move to way-point” and “interact with environment” as outlined above. According to the SID in
Figure 5, the arm unit and scanner were modeled as separate subjects with their encapsulated behavior. The behavior is again depicted on a high level of abstraction.
3.3. Stakeholder-Centered Structuring of Environmental Context Data
In the described use case, the robot uses the scanner periodically as part of its tasks to assess its surroundings. The data provided by the scanner is used as a source for deriving contextual information to dynamically adjust the robot’s behavior to its environment. In the proposed digital twin architecture, another core purpose of these data is to save them and integrate them with other information to create a representation of the CPS in a way that supports process stakeholders. The results of scans can be periodically saved to create a timeline of how the operational environment of the CPS has changed, with the different objects (the robot, vaccine packages, sensors, transport boxes) moving/being moved and leaving/entering the area. From this information, a digital twin of the CPS environment is created that can be used to document and assess these environmental states and changes for different purposes (monitoring, auditing, …). We have already described situations in
Section 2.2 where process participants may need to access this information. There exist other information sources as well. The room outlined as part of our proposed set-up is equipped with different sensors, with the integration of the produced information promising to create an even more complete picture of the operational environment at certain points in time.
We propose the use of the behavioral twins of the CPS components outlined above (i.e., the executable, synchronized process models) to structure both the captured geo-spatial context and the information produced by other components. This integration is realized through associating the information provided by certain scans with the concrete steps in the process model during which these scans occur (or during which they are used) as an instance of the behavior model is executed synchronously with the system component showcasing the behavior.
We describe the envisioned structure of information for process model instances independently from details of concrete execution engines. The focus is on the basic elements of subject-oriented process models (specifically considering behavior diagrams) and using generic concepts for documenting the execution of processes. In the context of process monitoring (see [
32] for the formal definitions of the terms used in this paragraph), various information regarding the execution of a process instance is generally recorded in the form of an event. An event was defined in [
32] as a tuple $(a, c, t, (d_1, v_1), \ldots, (d_m, v_m))$, with a being the activity name (in the case of subject-oriented models this would refer to a state of a SBD), c the case identifier (i.e., referring to a process instance), t the timestamp (in [33] both start and end timestamps are listed), and $(d_1, v_1), \ldots, (d_m, v_m)$ (where $m \geq 0$) denoting the event attributes, with their names and values. The events that are generated by a process instance are referred to as a trace, with a so-called event log storing all completed traces pertaining to a process model.
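As a minimal sketch of these concepts, the event tuple (activity name a, case identifier c, timestamp t, and attribute name/value pairs) can be written as a small record type, with traces obtained by grouping events by case. The class and function names as well as the sample values are hypothetical and not taken from [32].

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    activity: str                                  # a: here, a state of a SBD
    case_id: str                                   # c: the process instance
    timestamp: datetime                            # t
    attributes: Tuple[Tuple[str, Any], ...] = ()   # (d_1, v_1), ..., (d_m, v_m)


def traces_by_case(events: Iterable[Event]) -> Dict[str, List[Event]]:
    """Group events into one time-ordered trace per process instance."""
    traces: Dict[str, List[Event]] = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        traces.setdefault(event.case_id, []).append(event)
    return traces


# Sample events, deliberately given out of order (values are made up).
events = [
    Event("Scan environment", "case-1", datetime(2022, 6, 8, 10, 0, 5),
          (("point_count", 1200),)),
    Event("Move to shelf", "case-1", datetime(2022, 6, 8, 10, 0, 0)),
    Event("Move to shelf", "case-2", datetime(2022, 6, 8, 10, 1, 0)),
]
traces = traces_by_case(events)
```

An event log in the sense used above would then be the collection of those traces whose process instances have completed.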
These concepts already provide a framework for documenting relevant process data, such as timestamps for start and end times (cf. [
33], where end times from event logs are used for remaining time prediction). The event attributes allow for the storage of various data produced during a process. Completed traces stored in the form of an event log document previous process instance executions. Furthermore, traces that are in the process of being recorded document an ongoing process instance execution.
By the activity name stored with an event, event data is set in relation to the model from which the process instance was created. It is assumed that states across the different SBDs of a process model have unique identifiers. Looking at the particularities of subject-oriented modeling, a few topics need to be considered for uniquely relating recorded events to states of SBDs. The same subject behavior can, e.g., be instantiated multiple times as part of process execution (Multi-Subjects, see
https://i2pm.net/wp-content/uploads/2020/04/20200223-Standardbuch-PASS.pdf#page=22, retrieved 4 July 2022). In cases where multiple instances of behavior modeled in a SBD are executed in parallel, this potentially leads to multiple event tuples with the same values for a, c, and t. The addition of a SBD instance identifier makes it possible to still uniquely relate events in such a case. It also supports the generation of traces not only on the level of whole process instances, but also on the level of instances of the individual subject behaviors that are part of the process instance. This way, the sequence of states that were performed (documented through events in a trace) can be showcased in relation to the options permitted in the models (one state in a SBD can occur multiple times as part of execution in case of a loop and certain modeled states might not occur in case of exclusive paths). A graphical representation of current or completed process model instances with their performed states is still required for a stakeholder-centric use of event data.
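The effect of the additional identifier can be sketched as follows. The example assumes that the SBD instance identifier is simply stored as a further field of each event; the field name `sbd_instance`, the helper function, and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class StateEvent(NamedTuple):
    activity: str       # a: state of a SBD
    case_id: str        # c: process instance
    timestamp: float    # t: simplified to seconds for this sketch
    sbd_instance: str   # assumed extra field identifying the behavior instance


def traces_per_behavior_instance(
    events: Iterable[StateEvent],
) -> Dict[Tuple[str, str], List[str]]:
    """Build one time-ordered trace per (process instance, SBD instance)."""
    traces: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        traces[(event.case_id, event.sbd_instance)].append(event.activity)
    return dict(traces)


# Two parallel instances of the same subject behavior produce events that
# agree in a, c, and t and differ only in the SBD instance identifier.
events = [
    StateEvent("Read sensors", "case-1", 10.0, "box-A"),
    StateEvent("Read sensors", "case-1", 10.0, "box-B"),
    StateEvent("Close lid", "case-1", 12.0, "box-A"),
]
traces = traces_per_behavior_instance(events)
```

Without the extra field, the first two events would be indistinguishable, and the state sequence of each behavior instance could not be reconstructed.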
Considering the digital twin thread concept outlined in
Section 3.1, during each state instance (i.e., a state in the behavioral model that is currently being executed by a CPS component as part of a process instance), data can be provided by the system (component). So each state instance can potentially have associated data entries, as visualized in
Figure 7 in the form of a simple UML class diagram. In this diagram, the data entry class is not further specified and left abstract, to reflect that different data can come from the system based on the scenario. For a CPS this can be sensor data (or other relevant data produced during the behavior of the system component), either captured or utilized during the respective state instance.
Utilizing the general concepts for documenting process executions outlined above, a state instance can be documented through an event, with the attributes of the event tuple holding the various data entries (with d denoting the data entry and v storing the data within the entry).
A very basic data entry containing a description of the data entry type, a timestamp, and the associated data needs to be put into the relevant context in order to depict data in a specific structure. This can be realized by inheriting from and specializing the abstract class, with
Figure 8 showcasing a few possible data entry types concerning sensor data from the illustration case (here depicted in UML through extending the abstract data entry class).
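The relationship between state instances and data entries can also be sketched in code. The following example assumes an abstract data entry with a description and a timestamp, and two specializations for a temperature reading and a raw point cloud; the concrete class and attribute names are hypothetical and do not reproduce the classes shown in the figures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class DataEntry(ABC):
    """Abstract data entry: what kind of data it is and when it was recorded."""
    description: str
    timestamp: datetime

    @abstractmethod
    def summary(self) -> str:
        """Short human-readable rendering for a process stakeholder."""


@dataclass
class TemperatureEntry(DataEntry):   # hypothetical entry type
    celsius: float

    def summary(self) -> str:
        return f"{self.description}: {self.celsius:.1f} degC"


@dataclass
class PointCloudEntry(DataEntry):    # hypothetical entry type
    points: List[Tuple[float, float, float]]

    def summary(self) -> str:
        return f"{self.description}: {len(self.points)} points"


@dataclass
class StateInstance:
    """A state of a behavior model as executed in one process instance."""
    state_name: str
    entries: List[DataEntry] = field(default_factory=list)


now = datetime(2022, 6, 8, 10, 0, 0)
state = StateInstance(
    "Scan environment",
    [
        TemperatureEntry("box temperature", now, 4.5),
        PointCloudEntry("raw scan", now, [(0.0, 0.0, 0.0), (1.0, 0.5, 0.2)]),
    ],
)
summaries = [entry.summary() for entry in state.entries]
```

Because `DataEntry` is abstract, further entry types can be added per scenario without changing how state instances store them.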
Concerning the enrichment of the behavioral model instances and their documentations with LiDAR-provided geo-spatial context information, we distinguish between different possibilities based on how much other context information concerning the CPS is directly integrated with the geo-spatial representation. The data structure highlighted in
Figure 8 shows the lowest level of integration, where the raw 3D point cloud data is stored as part of one specific data entry type, and different types of context data have their own separate entry types. A higher level of semantic richness of geo-spatial context information can be achieved by storing a semantically segmented point cloud as part of a data entry. On higher levels of integration, data from previously separate data entry types is incorporated directly into a 3D representation of the operational environment, such as sensor readings being displayed next to the digital model of the physical sensor producing the data. Creating such a representation would require more computational effort. As part of the illustration case, processing of raw point cloud data is already required as part of CPS component behavior, since positional adaptions need to be realized based on detected object positions. The results can therefore be given to the digital twin and stored. In cases where point cloud data is not already processed in some form as part of the behavior of the physical twin, the digital twin itself would have to provide the necessary functionality to generate representations meeting the requirements of DT stakeholders.
The organization of data outlined in this section allows a process stakeholder to view important information of the CPS associated with process model instances as they are currently being executed. A stakeholder can access data of the current state in the behavior model instance or of previous states that were already completed. In the case of geo-spatial context information, it also needs to be visualized in a way that facilitates the needs of process stakeholders. This aspect has been explored in the literature through the use of virtual reality technology and game engines (see, e.g., [
34]). Information concerning the completed process model instance executions can furthermore be saved to document the system’s previous behavior and the environmental states and data associated with it.
3.4. Integrating the Introduced Concepts
The overall proposed approach to the creation of digital (process) twins, whose key concepts and elements were introduced in the previous sub-sections, is presented in
Figure 9 in an integrated form. The graphic delineates the different involved elements, from the initial models depicting a number of business-relevant, technology-supported processes, to the created instances and their assigned cyber-physical behavior carriers producing event data. The event data is used to keep instances up-to-date with the state of the real-world processes. It is furthermore processed and integrated with the originally modeled processes to create different visualizations of the operational environment at different points in time, depending on stakeholder needs.
For the realization of the introduced architecture, we furthermore propose a number of development steps, intended to serve as a general guideline. This methodology is based upon the general subject-oriented approach discussed in
Section 3.1 and extends it towards incorporating the necessary steps to implement the architecture. It is also used as a framework to point out some of the challenges and considerations related to the required technology and design decisions. The basic assumption is that the system for which the architecture should be implemented is already known and specified to a certain degree.
First, it is necessary to determine the set of organizational processes for which the twin could be implemented. Candidates are all processes that are, at least partially, enacted/supported by cyber-physical components.
Next, it will be necessary to determine the overall goal and the envisioned use cases of the digital twin. This involves gathering stakeholder needs and requirements. These are used as input for the following steps and will determine various aspects of the final twin, such as behavior specification granularity and the data that needs to be gathered from the twinned system. Already at this point, some attention should be given to the technical capabilities of components with regard to the provision of data (performance, network capabilities). Based on this information, processes are chosen from the candidate list.
Subsequently, it is required to create the subject-oriented specification of the selected processes. The processes are already put in place and are executed as part of the organizational environment. Thus, if certain process documentations are already present (even if the used notation is not subject-oriented), they can be used as input. An example would be a natural language description, such as the illustration case outline that was used as input for creating the sample model variants in
Section 3.2. Similar artefacts, such as documentations of control software and system models, can be used to gain knowledge about the behavior of cyber-physical components. There also exist techniques for the elicitation of stakeholder knowledge with regard to business processes. Due to the heterogeneous nature of CPS, this step will require inputs from many different domain experts to specify the process. This constitutes a major challenge that one needs to be aware of during architecture implementation. Once sufficient knowledge about the processes has been gathered (sufficiency will depend on many factors, such as the required level of process granularity), the modeling can start. Generally, one of the first steps of subject-oriented process modeling is to determine the subjects and the messages that they exchange. This is a question of decomposing a system and finding the appropriate level of abstraction. With regard to CPS, this entails that choices need to be made with regard to representing different components through their relevant behavior. Once subjects have been determined, their individual behavior is detailed in the form of an SBD. Finally, the models need to be validated through the stakeholders holding the relevant knowledge. Like the modeling itself, this step can be supported by tools that allow for static checking and interactive enactment of the models to ensure syntactic and semantic correctness. Any of the activities associated with this step may be repeated to refine the models until they are deemed appropriate, given the outputs of the previous step. The final result of this step should be one SID and a set of SBDs for each chosen process. Ideally, the models should be in a format that allows for standardized data exchange (e.g., the semantic exchange standard proposed in [
35]). However, existing modeling and run-time environments from the commercial sector often make use of their own formats.
Following model creation, the next step will be to set up a run-time environment for the subject-oriented process models and deploy them to it. If the previous steps already made use of such tools, then they can simply be re-used. A key requirement for the run-time environment is the ability to keep the running model instances synchronized with the state of the twinned process. This can be accomplished through a dedicated feature of the run-time environment itself, or by the environment allowing arbitrary code to be executed as part of an SBD state, which makes it possible to implement features that request or receive status information from the twinned system to control instance execution (as sketched in
Figure 9).
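The synchronization requirement can be illustrated with a strongly simplified sketch. It assumes a purely sequential behavior without branches, and the class `ModelInstance`, its method `sync`, and the status callback are hypothetical; they do not correspond to the interface of any existing subject-oriented run-time environment.

```python
from typing import Callable, List


class ModelInstance:
    """Minimal stand-in for a running SBD instance (sequential states only)."""

    def __init__(self, states: List[str]) -> None:
        self.states = states
        self.index = 0

    @property
    def current(self) -> str:
        return self.states[self.index]

    def sync(self, get_status: Callable[[], str]) -> bool:
        """Request the state reported by the twinned component and advance
        the instance if the report matches the modeled successor state.
        Returns True if the instance advanced."""
        reported = get_status()
        if reported == self.current:
            return False  # physical twin is still in the same state
        has_next = self.index + 1 < len(self.states)
        if has_next and reported == self.states[self.index + 1]:
            self.index += 1
            return True
        raise ValueError(f"reported state {reported!r} is not permitted by the model")


# Hypothetical sequence of status reports from a cyber-physical component.
reports = iter(["Scan shelf", "Select sensor", "Select sensor", "Install sensor"])
instance = ModelInstance(["Scan shelf", "Select sensor", "Install sensor"])
changes = [instance.sync(lambda: next(reports)) for _ in range(4)]
```

A report that is not permitted by the model raises an error in this sketch; in practice, such deviations would be an item of interest for the stakeholders monitoring the process.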
The next step concerns establishing the infrastructure for receiving and storing the required information from the twinned process and the physical behavior carriers. The data also needs to be provided to future consumers (visualization). Based on the identified goals of the DT, the explored use cases, and the gathered requirements, the data needed from the system is determined. Once the required data is known, a decision needs to be made with regard to how it should be stored. The general format for the event log was outlined in the previous sub-section and an example is shown as part of
Figure 9, but the choice of, e.g., the type of database(s) to be used should take into account the expected types of data and their other characteristics (e.g., volume, velocity). Access to the database(s) can be encapsulated through various services built on top of it.
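One possible storage set-up is sketched below using an in-memory SQLite database. The table layout, the column names, and the helper functions `log_event` and `history` are assumptions made for illustration; a relational store is only one of several options, and large point clouds would more plausibly be kept in a separate store and referenced from the event.

```python
import json
import sqlite3
from typing import Any, Dict, List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE event (
        id INTEGER PRIMARY KEY,
        process_instance TEXT NOT NULL,
        sbd_instance TEXT NOT NULL,
        state TEXT NOT NULL,
        ts REAL NOT NULL,
        attributes TEXT NOT NULL  -- JSON-encoded data entries
    )
    """
)


def log_event(conn: sqlite3.Connection, process_instance: str, sbd_instance: str,
              state: str, ts: float, attributes: Dict[str, Any]) -> None:
    """Store one event together with its data entries."""
    conn.execute(
        "INSERT INTO event (process_instance, sbd_instance, state, ts, attributes) "
        "VALUES (?, ?, ?, ?, ?)",
        (process_instance, sbd_instance, state, ts, json.dumps(attributes)),
    )


def history(conn: sqlite3.Connection,
            process_instance: str) -> List[Tuple[str, float, Dict[str, Any]]]:
    """Return the events of one process instance in chronological order."""
    rows = conn.execute(
        "SELECT state, ts, attributes FROM event "
        "WHERE process_instance = ? ORDER BY ts",
        (process_instance,),
    ).fetchall()
    return [(state, ts, json.loads(attrs)) for state, ts, attrs in rows]


# Hypothetical events; the point cloud is referenced, not embedded.
log_event(conn, "p1", "robot-1", "Select sensor", 2.0, {"temperature": 4.5})
log_event(conn, "p1", "robot-1", "Scan shelf", 1.0, {"point_cloud_ref": "scan-0001"})
```

Wrapping database access in functions such as these corresponds to the encapsulation through services mentioned above.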
Next, the cyber-physical components that enact the modeled behavior need to provide the needed data to the storage infrastructure that was set up. This means that the components that need to provide data require some form of network connectivity. Furthermore, adaptions and extensions to their existing behavior will be necessary. Here, the impact that the additional communication would have on the performance of the components needs to be carefully considered. It is likely that not all the data that was initially identified as potentially interesting for the twin can actually be provided without impacting the system in a significant manner. This is why these considerations were already mentioned in the early steps of the proposed methodology. Furthermore, the technologies used for realizing communication need to consider Quality-of-Service requirements that might exist. To summarize, one of the most important technical considerations during implementation relates to getting the relevant data from each component.
Once the infrastructure for getting the required data is in place, a user interface needs to be constructed that makes use of this data to provide various visualization features to DT users. Alongside event data, the created models can provide process context information, as shown in
Figure 9.
Before the system can be put into active operation, it needs to be evaluated with regard to the various requirements that were established. This step requires incorporating the DT’s intended users and other involved stakeholders to ensure that it supports them as envisioned. Depending on the results of the evaluation, some of the previous steps might need to be repeated to further fine-tune the digital twin.
Once evaluation has been passed successfully, the created twin passes into the phase of active operation and utilization.
The last step that we outline concerns maintaining the created digital twin system. If the twinned process changes, adaptions to the models and twin will be necessary. Similarly, new and changed stakeholder requirements might come up that require, e.g., additional data from components and new visualization features.
The steps described above are again summarized in
Figure 10. The next sub-section will outline some of the possible usages and benefits of the behavior-centered approach using the illustration case.
3.5. Exemplifying Use Case Scenarios
As outlined in the previous section, process model execution information is integrated with the (geo-spatial) context provided by the system at certain relevant points of the process. This supports the tracking of the logistics process and makes the system observable for the stakeholders.
Figure 11 illustrates this for the communication between the robot and the transport box upon successfully completing the outfitting of the box.
Different cross-cutting concerns with regard to the processes, the system, and the environment can be assessed by stakeholders. This goes beyond the monitoring of standard behavior. Considering the outlined vaccine use case, it would be possible to show the structural integrity of vaccine packaging (by, e.g., recognizing surface deformities) at different points in the process, to help rule out that damage occurred during transport preparation. The two following examples should help illustrate other possible applications of the proposed twin:
Digital Process Twin Application Example 1: After the successful transportation of a box to its destination, the organization in charge of transport preparation receives the complaint that a sensor that should have been installed was missing. The person in charge of supervising the process accesses past process model instances and looks at the one belonging to the package in question. Both the model execution history and the scan results associated with states show that the sensor in question was installed properly, indicating that it went missing after the package left the care of the organization. The behavioral twin of the box itself (enriched with the produced sensor readings) shows that it was operating as intended during the transport process as well, indicating that the sensor was probably removed at a later point in the process.
Digital Process Twin Application Example 2: The person in charge of supervising the transport preparation process receives a notification on their smartphone from the robot that it cannot continue with its normal behavior due to unexpected changes in the environment that it cannot compensate for. The supervisor looks at the process model instance that is being executed and sees that the robot is currently in the process of selecting a sensor for installation. Upon accessing and viewing the scan data associated with the current point in the process, it can be seen that the shelf was moved and sensors had fallen to the ground due to a minor earthquake or some other disturbance. Subsequently, the supervisor can react and restore the original environment based on what is known about the set-up (both a priori and from earlier scans as part of the process).
Instead of having to view the environmental data as a time-series of scan data without additional context, stakeholders can use the created process model instances (and the past model instance documentations stored) to immediately access the points in the process that they are interested in. The saved data can be used for both manual and automated analysis to gain insights regarding the CPS-supported business process and the CPS itself. These aspects of the proposed digital twin of the CPS are akin to process monitoring and auditing, as it is enabled through workflow execution engines in the context of business process management (the field from which subject-oriented modeling originates). Furthermore, considering the proposed DT model, one can opt to only create “snapshots” of the operational environment at crucial states in the process and store them, instead of continuously saving point clouds throughout the whole operation of the system, to minimize the data that needs to be stored.
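The snapshot idea can be sketched as a simple selection over captured scans. The function names, the state names, and the representation of a scan as a pair of state name and point list are assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

Point = Tuple[float, float, float]
Scan = Tuple[str, List[Point]]  # (state name, point cloud captured in that state)


def select_snapshots(scans: Iterable[Scan], crucial_states: Set[str]) -> List[Scan]:
    """Keep only the scans captured during states marked as crucial."""
    return [(state, cloud) for state, cloud in scans if state in crucial_states]


def stored_points(scans: Iterable[Scan]) -> int:
    """Total number of points that would have to be stored."""
    return sum(len(cloud) for _, cloud in scans)


# Hypothetical scans, one per executed state, with 1000 points each.
scans = [
    ("Move to shelf", [(0.0, 0.0, 0.0)] * 1000),
    ("Select sensor", [(0.1, 0.2, 0.3)] * 1000),
    ("Move to box", [(0.0, 0.0, 0.0)] * 1000),
    ("Install sensor", [(0.4, 0.5, 0.6)] * 1000),
]
kept = select_snapshots(scans, {"Select sensor", "Install sensor"})
```

In this constructed example, storing only the two crucial states halves the number of stored points; the actual saving depends on how many states are marked as crucial and on the scan rate of the system.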
5. Discussion
In the previous section, we reviewed relevant related work pertaining to existing DT approaches that use geo-spatial context provided by LiDAR technology and methodologies for the creation of DTs. We discussed our research contribution in light of the existing approaches. Given that DT-related approaches have so far not considered process twins and subject orientation for geo-spatial CPS development, the presented approach extends the existing knowledge base of subject orientation and the application of this behavior-centered approach for this type of CPS and DTs (which were outlined in
Section 3.1).
Specifically, our approach was built on the digital twin conception presented in [
8]. The authors of this article focused on designing a subject-oriented DT as a means of developing a CPS through behavior specification. They detailed how to depict certain aspects of CPS structure and behavior (such as variability of behavior) through subject orientation and associated modeling techniques and exemplified this through evolving models for a traffic management scenario. Data exchange between the DT and the CPS was established as a core aspect of the concept, without going into further detail on how this data can be structured. With our contribution in this article, we detailed the initial concept of a behavior-centered DT in terms of what the integration between the data of a CPS and the process models at the core of this DT-type can look like and how it can work effectively in the course of system design and engineering. We described this integration specifically centered around geo-spatial context data, a type of data with considerable importance in the CPS context, as was outlined in the introduction. With the architecture depicted in
Figure 9 and the guiding methodology steps in
Figure 10, including the associated considerations, we specified how an infrastructure set-up for this type of twin can be realized. We also discussed some of the modeling-related aspects and provided sample model variants for the illustration case, in the respect that they are relevant to the proposed architecture and methodology.
As was also identified in the previous section, a point that is still open for exploration concerns the automatic deployment of subject-oriented models. This is also meant in the sense of a run-time environment ensuring automatic execution of models through the CPS and its components, as well as them providing the relevant data to the environment without further configuration and engineering activities (as they are currently required in the proposed methodology). Furthermore, this also represents a core step towards the idea of design-integrated engineering presented in [
8], where the CPS is developed and adapted dynamically through the models in the vein of Model-Driven Engineering and based on the idea of the DT as a virtual entity connected with the twinned entity in such a way that changes to one are reflected in the other.
6. Conclusive Summary
LiDAR technology has the potential to provide valuable information for the operation of Cyber-Physical Systems. This information is considered useful for stakeholders in charge of monitoring the system as it supports and realizes related business processes. A big part of the vision and potential of digital twins lies in linking relevant information from multiple sources to support different tasks and users. A digital twin of a CPS should therefore integrate and provide geo-spatial context information in a form that facilitates stakeholder accessibility.
In this article, we showed the potential of a behavior-centered digital twin conception that utilizes subject-oriented process models to structure the data provided by a CPS. We referred to concepts from process monitoring to document the past behavior of a system (and its components), including the associated context information. We demonstrated that geo-spatial information can be integrated with other data provided by the CPS to create a detailed snapshot of the system and its operational environment, which can be associated with a concrete behavior state of a CPS component. We proposed a corresponding architecture concept and methodology guiding realization.
Utilizing an illustration case from the logistics domain, we explored some of the benefits this type of twin brings to stakeholders in charge of monitoring processes and the CPS. Stored information can be accessed directly from the relevant points in the process, ideally supported through a visual representation of both process instances and the associated context information. The logged data can also be used as input for automated inspection. Finally, we examined related work concerning the combination of digital twins and geo-spatial information provided by LiDAR technology. We could position our approach as not only advancing model-based engineering towards behavior-driven development, but also providing it up-front as the primary core of the twin.
Most of the existing research focuses on using the LiDAR-generated data to create a digital representation of an asset (sometimes through integration with other relevant data). For scenarios where CPS are used to support and help realize business processes, the behavior of active agents is of crucial importance compared to monitoring a mostly “static” asset. It is for such cases that our DT conception offers contextual knowledge and stakeholder value. In this respect, the body of knowledge of subject orientation in the areas of CPS and behavior-centered DTs was enriched.
Through examining existing methodologies for the creation of DTs, we also identified some shortcomings of our research in terms of the supported development scenario and automation support of methodological steps. A limitation of this work is that the proposed DT architecture, as presented conceptually, requires future work on implementation. Its efficacy for operational CPS scenarios needs to be evaluated involving process stakeholders from relevant application domains to provide empirical evidence. Potential for future research also exists with regard to the visualization of process model instances and the connected geo-spatial information. Extensions to subject-oriented run-time engines for this purpose are required to ensure a suitable presentation of DT information. Finally, support for automating the deployment of models in cyber-physical settings and establishing the required data exchange through subject-oriented run-time environments is a desirable feature to ease the adoption of the outlined concept. The adaption of automated process monitoring and mining techniques to the presented CPS context is also of interest to optimize and further develop systems and processes based on DT data.