1. Introduction
The paradigm of the manufacturing industry that started with craft production has evolved from mass production to mass customization [1]. Craft production can be tailored to individual orders, but it relies heavily on skilled craftsmen and manual labor. Because this method requires specialized skills and experience, its low efficiency and high cost make large-scale production impractical. Mass production, in contrast, is a standardized production process that produces products of uniform quality on a large scale [2]. In 1913, Henry Ford introduced the world's first moving assembly line, a revolutionary production method widely regarded as the turning point that made mass production possible [3]. Mass customization, in turn, enables the mass production of goods that meet consumer needs by letting customers configure products from a set of options. However, as personalized needs increase, the manufacturing industry of the future is expected to shift beyond mass customization to mass personalization [4]. Like mass customization, this approach aims to provide products that fit the user's demands; however, while mass customization offers a predetermined set of standardized options to choose from, mass personalization allows for the production of tailored products that fully meet individual requirements [5].
The manufacturing paradigm is shifting from mass production to customized production, and more flexible production systems are required to cope with this shift. Research on robot-based production and inspection systems has been a primary focus in responding to these changes, with the aim of improving flexibility and efficiency. For example, Ulrich et al. [6] presented a method to shorten inspection cycles through path planning optimization in an inspection system. Kim et al. [7] addressed the optimal scheduling of cluster tools using dual-arm robots and proposed a flexible scheduling approach to improve processing efficiency. In addition, Foumani et al. [8] proposed a scheduling approach that maximizes productivity by utilizing a stop–resume processing mode while a multifunctional robot performs tasks in two-machine robotic cells. These studies offer essential approaches for flexible production in complex manufacturing environments and can contribute to increased efficiency. However, they still rely on fixed rules and predefined scenarios, which limits their ability to meet the specialized needs of individual users.
Transitioning to a new manufacturing paradigm requires the integration of advanced ICT such as digital twins and the metaverse. The metaverse is a service that blurs the boundaries between the real and virtual worlds, thereby expanding our scope of activities [9]. In restricted situations, such as the COVID-19 pandemic, the metaverse was extensively utilized in service platforms such as gaming and social networking. The industrial metaverse applies this concept to industrial settings, enhancing efficiency through the interaction of virtual and physical objects. It extends not only the assets of the manufacturing site but also the manufacturing space and processes into the virtual realm, supporting operational optimization through real-time monitoring and analysis. In this context, digital twins (DTs) play a crucial role by serving as the data conduit between the metaverse and the real world, facilitating seamless integration and interaction.
A federated digital twin (fDT) is a set of digital twins whose interconnection, whether homogeneous or heterogeneous, enhances the capabilities of its individual members. This enables the assessment, prediction, and optimization of specific service states. An fDT can assemble the digital twins needed to perform the functions a service requires and optimize that service through scheduling among them. One example of a federated digital twin is the optimal manufacturing process of a software-defined factory, which consists of a large number of reconfigurable manufacturing robots that can adapt to dynamic requirements. The individual capabilities and deployment order of the manufacturing robots can change to meet new requirements, such as a change in product design or the addition of features. Process optimization is achieved by configuring and scheduling the robot DTs so that they perform specific functions in an optimal sequence.
Traditional factories, which can be described as hardware-defined factories, rely on fixed hardware equipment to perform static manufacturing functions. This rigidity limits the adaptability of the production processes required by the future manufacturing paradigm and makes it difficult to respond to the individual needs of users. To address this, the software-defined factory (SDF) is proposed, as shown in Figure 1. The SDF represents an advanced form of smart factory: an intelligent production system that offers customized manufacturing services by reconfiguring manufacturing functions defined in software to meet user needs. This allows equipment functions to be intelligently reconfigured based on user requirements, providing flexibility that traditional static factories cannot. This shift in manufacturing paradigms is essential for manufacturers to maintain competitiveness and meet customer expectations, and the SDF plays a core role in this process. In particular, the SDF is expected to establish itself as an innovative platform that maximizes production efficiency and responds effectively to user requirements through data-driven decision-making and flexible scheduling.
The scheduling of digital twins is critical in fDTs, which provide services through a federation of individual digital twins [10]. In recent research, reinforcement learning (RL) has been widely used to deal with the many variables and exceptions that occur in the real world [11]. However, RL-based methods often rely on fixed rules and predefined scenarios, making it difficult to respond dynamically to the specialized needs and situations of different users. Furthermore, the complexity of data relationships in industrial metaverses, with their intertwined physical and virtual resources, requires new scheduling approaches to manage and analyze them efficiently. In particular, adaptive scheduling techniques that can respond instantly to dynamic changes in data and user demands are essential. Large language models (LLMs) can effectively address the limitations of RL-based scheduling, which often struggles with fixed rules. LLMs excel at understanding and processing the semantics of natural language, allowing them to derive optimal decisions from complex data inputs. This capability enables the analysis of intricate data relationships and facilitates customized scheduling that accounts for various KPIs. In doing so, LLMs enhance the flexibility and adaptability of the manufacturing process, optimizing interactions between digital twins and supporting decision-making.
In this paper, we propose a federated digital twin scheduling method using LLMs and deep reinforcement learning (DRL). The literacy module of this study uses an LLM to analyze user requirements and weight the data used for scheduling. These data are then used by a DRL-based scheduling module to perform user-centric scheduling that can respond flexibly to the requirements. The contributions of this research are as follows. First, we propose a software-defined factory that enables the realization of the mass personalization paradigm through the intelligent reconfiguration of software-defined manufacturing functions. Second, we implement an LLM-based literacy module for user requirements analysis to enable an in-depth understanding of requirements. Third, we develop a DRL-based scheduling module whose rewards are dynamically adapted to the characteristics of the scheduling data, supporting optimized scheduling in real-world environments.
In Section 2, we define the industrial metaverse through a comparison with common metaverses and introduce the software-defined factory, a future manufacturing plant utilizing the industrial metaverse. We then discuss the importance of scheduling problems with federated digital twins and review existing scheduling methods based on reinforcement learning. Section 2 concludes with a generational overview of the evolution of LLMs. In Section 3, we propose a literacy DRL method for understanding and analyzing different user requirements and describe how it works. Section 4 describes the experimental environment and analyzes the results. Finally, Section 5 concludes this paper with future work.
3. Proposed Methods
3.1. Literacy DRL-Based Federated Digital Twin Scheduling
Traditional RL-based scheduling relies on fixed rules that make it difficult to respond flexibly to user needs. In contrast, literacy DRL-based federated digital twin scheduling is a user-centric method that uses reinforcement learning to schedule tasks based on user requirements analyzed by an LLM-based literacy module.
Figure 4 is a simple schematic illustration of how literacy DRL could work in a future manufacturing plant such as the SDF. User requirements, including product specifications, features, designs, costs, etc., are analyzed by the literacy module. The literacy module (Figure 5) understands the user requirements expressed in natural language and allows the most relevant digital twin attributes to be prioritized in the scheduling process. While traditional scheduling is inflexible due to fixed rules, the proposed method is based on analyzing user requirements, which can be expected to improve the flexibility and efficiency of the production process. Literacy DRL scheduling also offers the possibility of scaling flexibly to production environments of different sizes. To achieve this, the system dynamically adjusts its scheduling approach based on resource availability and job demand. If production requirements change unexpectedly, for example due to equipment failure, the framework can reevaluate task prioritization and resource allocation to meet the new demand effectively. This adaptability enables robust performance across a wide range of manufacturing scales and operational scenarios and demonstrates how flexibly dynamic manufacturing conditions are handled. The digital twin attributes are fed into graph neural networks (GNNs) along with the state of the machines and operations to be scheduled, as shown in Figure 6. Traditional AI, which requires structured inputs, is difficult to apply to scheduling problems with the high complexity and unstructured characteristics found in the real world. However, GNNs are powerful tools for modeling complex interactions and relationships using a graph structure of nodes and edges, which can be used to extract unstructured data into a structured set of features [25]. For example, machines and operations are represented as nodes in a graph, and the interactions and dependencies between them are represented by edges.
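As an illustration, the following sketch shows one way such a scheduling state could be encoded as a heterogeneous graph. PyTorch Geometric and the specific feature layout are assumptions made for the example, not the implementation used in this paper.

```python
# A minimal sketch of the scheduling state as a heterogeneous graph,
# using PyTorch Geometric (an assumption; any GNN library would do).
import torch
from torch_geometric.data import HeteroData

def build_state_graph(op_feats, mach_feats, precedence, compatibility):
    """op_feats: [num_ops, d_op] operation features (status, DT attributes);
    mach_feats: [num_machines, d_m] machine features;
    precedence: [2, num_prec] directed op -> op arcs (job order);
    compatibility: [2, num_comp] op -> machine edges (op can run on machine)."""
    g = HeteroData()
    g['op'].x = op_feats
    g['machine'].x = mach_feats
    # Directed arcs: fixed order of operations within a job.
    g['op', 'precedes', 'op'].edge_index = precedence
    # Op-machine edges: which machine can process which operation.
    g['op', 'runs_on', 'machine'].edge_index = compatibility
    return g

# Example: 3 operations with 4 features each, 2 machines with 3 features.
graph = build_state_graph(
    op_feats=torch.randn(3, 4),
    mach_feats=torch.randn(2, 3),
    precedence=torch.tensor([[0, 1], [1, 2]]),            # O0 -> O1 -> O2
    compatibility=torch.tensor([[0, 1, 2], [0, 1, 0]]),   # O0->M0, O1->M1, O2->M0
)
```

A GNN operating on this graph would then produce the node embeddings consumed by the policy network described next.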
The reinforcement learning policy network in Figure 5 takes the feature sets of the machines and operations as input and generates the set of actions the agent can perform. It then performs scheduling by selecting the action best suited to the environment from among them. After executing the selected action, the agent receives a reward from the environment and learns to maximize this reward. During this process, the agent adapts to different situations and conditions and gradually improves its performance through iterative training. The end result is flexible, efficient scheduling that effectively reflects user requirements.
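A compact sketch of such a policy network is given below: it scores the feasible operation–machine pairs and samples one as the scheduling action. The architecture, names, and dimensions are illustrative assumptions, not the exact network shown in Figure 5.

```python
import torch
import torch.nn as nn

class SchedulingPolicy(nn.Module):
    """Scores every feasible (operation, machine) pair and samples an action.
    The embeddings are assumed to come from a GNN over the state graph."""
    def __init__(self, d_op, d_mach, hidden=64):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(d_op + d_mach, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, op_emb, mach_emb, feasible_pairs):
        # feasible_pairs: LongTensor [num_pairs, 2] of (op_idx, machine_idx).
        pair_feats = torch.cat(
            [op_emb[feasible_pairs[:, 0]], mach_emb[feasible_pairs[:, 1]]], dim=-1)
        logits = self.scorer(pair_feats).squeeze(-1)
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()            # index into feasible_pairs
        return action, dist.log_prob(action)

# Example usage (dimensions are illustrative):
policy = SchedulingPolicy(d_op=32, d_mach=32)
op_emb, mach_emb = torch.randn(3, 32), torch.randn(2, 32)
pairs = torch.tensor([[0, 0], [1, 1], [2, 0]])   # feasible (op, machine) pairs
action, logp = policy(op_emb, mach_emb, pairs)
chosen = pairs[action]    # the (operation, machine) pair to dispatch next
```

The returned log-probability would feed a standard policy-gradient update (e.g., PPO) driven by the reward signal described above.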
3.2. LLM-Based User Requirements Analysis Module
Key performance indicators (KPIs) are used to effectively analyze user requirements. KPIs allow organizations to set important performance metrics for achieving their goals and to continuously evaluate and adjust their progress. They foster collaboration across teams and serve as an important tool for driving performance improvement through data-driven decision-making. In the manufacturing industry, manufacturing cost and productivity are key factors in maximizing cost efficiency and staying competitive. Lowering manufacturing costs allows resources to be used more efficiently, while increasing productivity allows products to be made more efficiently, enabling sustainable growth [26]. An LLM is used to understand the meaning of requirements more deeply and select the most relevant KPIs. LLMs perform well at understanding general words and sentences but have difficulty with the specialized vocabulary used in specific domains. This is where fine-tuning can improve the model's performance [27]. The fine-tuned language model in the literacy module in Figure 5 gains a better understanding of the terms and expressions used in a particular domain, which enables it to assess the relevance of user requirements to KPIs more accurately.
The literacy module performs an analysis that takes into account the structure and meaning of sentences rather than simple keyword matching. Cosine similarity measures the similarity between two vectors using the angle between them, independent of their magnitudes, which allows the relevance of a user requirement to a KPI to be evaluated quantitatively. The user requirement is compared not simply to the KPI's name but to the sentence that defines the KPI. A high similarity score indicates a high degree of relevance between the requirement and the KPI and is used to select the most appropriate KPI. Because this method compares similarity by vectorizing context and meaning, it improves the accuracy of requirements analysis over traditional methods such as word matching and frequency analysis.
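The following sketch illustrates this matching step. The embedding model and the KPI definition sentences are illustrative assumptions; the literacy module described in this paper uses a fine-tuned language model rather than an off-the-shelf one.

```python
# A minimal sketch of the literacy module's KPI matching step.
# Model name and KPI definition sentences are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

kpi_definitions = {
    "manufacturing_cost": "Total expense incurred to produce one unit, "
                          "including material, energy, and machine time.",
    "productivity": "Number of finished products completed per unit of time.",
}

requirement = "I want the product built as cheaply as possible."

req_vec = model.encode(requirement, convert_to_tensor=True)
scores = {
    kpi: float(util.cos_sim(req_vec, model.encode(text, convert_to_tensor=True)))
    for kpi, text in kpi_definitions.items()
}
best_kpi = max(scores, key=scores.get)   # expected: "manufacturing_cost"
```

Note that the requirement is compared against each KPI's defining sentence, not its name, so phrasing such as "as cheaply as possible" still maps to the cost KPI even though the word "cost" never appears.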
3.3. Assigning Digital Twin Attribute Data Weights
Digital twin attribute data consist of metadata, which describe the structure, format, and semantics of the data, and actual data, which include status information, measurements, and more. For user-centered scheduling, digital twin attribute data are selected based on the KPIs and comprehensively reflect the data required for scheduling. The selected attribute data are then weighted according to how strongly they must be considered during scheduling. For example, attribute data that are closely related to the KPIs are assigned a higher weight and treated as more important in the scheduling process. These data are fed into the GNN along with the machine and operation feature data so that they are reflected in the agent's action set. As a result, digital twin attribute data weighted by user requirements enable flexible, user-centric scheduling.
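One simple way to turn the KPI relevance scores from the literacy module into attribute weights is sketched below. The softmax mapping and the attribute-to-KPI assignment are illustrative assumptions rather than the exact weighting scheme used in the paper.

```python
import numpy as np

def weight_attributes(attr_values, kpi_scores, attr_to_kpi):
    """attr_values: {attribute -> raw value}; kpi_scores: {kpi -> similarity};
    attr_to_kpi: {attribute -> the kpi it informs}. Softmax the similarity
    scores so the KPI weights sum to 1, then scale each attribute by the
    weight of its associated KPI."""
    kpis = list(kpi_scores)
    sims = np.array([kpi_scores[k] for k in kpis])
    w = np.exp(sims) / np.exp(sims).sum()          # softmax over KPIs
    kpi_weight = dict(zip(kpis, w))
    return {a: v * kpi_weight[attr_to_kpi[a]] for a, v in attr_values.items()}

weighted = weight_attributes(
    attr_values={"unit_cost": 12.5, "throughput": 40.0},
    kpi_scores={"manufacturing_cost": 0.81, "productivity": 0.34},
    attr_to_kpi={"unit_cost": "manufacturing_cost", "throughput": "productivity"},
)
# Cost-related attributes dominate because the requirement matched the cost KPI.
```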
3.4. The Reinforcement Learning-Based Scheduling Module
Graphs have the advantage of modeling complex relationships. In scheduling, the operations performed by a job are represented as nodes; directed conjunctive arcs represent the precedence between two operations of the same job, and undirected disjunctive edges connect operations that can be performed on the same machine. A scheduling problem is an optimization problem that determines the order in which to process a given set of jobs, which amounts to converting the undirected disjunctive edges into directed ones [28]. In traditional formulations, the graph consists of operation nodes and machine nodes, with processing times attached to the edges between them. The GNN generates feature embeddings for each node, which are then fed into the policy network. This allows the policy network to make informed decisions based on a comprehensive understanding of the scheduling state rather than relying solely on local information. Using GNNs improves the model's ability to generalize across different problem sizes and configurations, resulting in better scheduling performance.
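The disjunctive-graph encoding can be made concrete with a small sketch: solving the scheduling problem amounts to orienting each undirected disjunctive edge while keeping the resulting directed graph acyclic. The job and machine names below are hypothetical.

```python
import networkx as nx

# Conjunctive arcs: fixed operation order within each job (directed).
conj = nx.DiGraph()
conj.add_edges_from([("J1_O1", "J1_O2"), ("J2_O1", "J2_O2")])

# Disjunctive edges: operations competing for the same machine (undirected).
disj = nx.Graph()
disj.add_edge("J1_O1", "J2_O2", machine="M1")
disj.add_edge("J1_O2", "J2_O1", machine="M2")

# A schedule orients every disjunctive edge,
# e.g. run J1_O1 before J2_O2 on machine M1:
conj.add_edge("J1_O1", "J2_O2")
assert nx.is_directed_acyclic_graph(conj)   # a valid schedule stays acyclic
```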
In this study, we applied digital twin attributes extracted from the analysis of user requirements instead of the processing times considered in existing methods. This approach can reflect more complex requirements than traditional scheduling, which considers processing time alone. For example, an optimal schedule can be generated by considering a variety of factors, such as machine health or performance data and the priority of each operation. This does not mean simply placing jobs in chronological order but rather deriving an optimal sequence of jobs that reflects the complex variables of a real-world production environment. The proposed approach therefore enables more flexible scheduling than traditional methods and can effectively reflect the various variables and conditions that arise in real-world manufacturing environments.
3.5. Dynamic Environments Based on Large Language Models
An environment in reinforcement learning, which consists of states, actions, rewards, and state transitions, interacts with an agent and returns rewards in response to its actions. The reward provided by the environment is feedback on those actions and is an important factor in helping the agent achieve its goal. An LLM that generates appropriate responses based on user input has an interaction pattern similar to that of a reinforcement learning environment. As such, LLMs can play the role of the environment by interpreting states and rewarding agents for their behavior. An LLM-based environment provides the flexibility to dynamically adjust the reward function as requirements or conditions change. This dynamic adjustment allows the LLM to reflect changes in user requirements in real time and generate appropriate feedback based on them, helping agents make optimal decisions. This approach can produce a reward structure more effective than a traditional fixed environment, further improving the agent's learning efficiency.
The digital twin attributes that affect scheduling differ in nature. For example, lower manufacturing cost is better, whereas higher throughput is better. Such differing data characteristics require different reward structures. An LLM can help agents learn optimal behavior by understanding these characteristics and dynamically applying the appropriate reward function for each situation. The user-based data selected by the literacy module are applied to the LLM-based reinforcement learning environment, which applies a reward structure matching the characteristics of the data. This ensures that the agent is rewarded in line with user requirements and learns the optimal actions to perform well in various production situations. Such a dynamic reinforcement learning environment is far more flexible than a simple rule-based system and allows the agent to adapt to complex, real-time changes. As a result, agents can continuously improve their performance across different production scenarios and make more efficient and accurate scheduling decisions.
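The following sketch shows a direction-aware reward of this kind. In the proposed framework, the per-KPI direction and weights would be supplied by the LLM-based environment; here they are hard-coded purely for illustration.

```python
# Illustrative sketch of a direction-aware reward. In the proposed framework,
# the KPI direction ("minimize" vs. "maximize") and the weights would be
# produced by the LLM-based environment, not the hard-coded table below.
KPI_DIRECTION = {"manufacturing_cost": "minimize", "productivity": "maximize"}

def reward(prev_metrics, new_metrics, weights):
    """Reward improvement on each KPI, with the sign chosen per KPI."""
    total = 0.0
    for kpi, w in weights.items():
        delta = new_metrics[kpi] - prev_metrics[kpi]
        sign = -1.0 if KPI_DIRECTION[kpi] == "minimize" else 1.0
        total += w * sign * delta
    return total

r = reward(
    prev_metrics={"manufacturing_cost": 110.0, "productivity": 38.0},
    new_metrics={"manufacturing_cost": 100.0, "productivity": 41.0},
    weights={"manufacturing_cost": 0.7, "productivity": 0.3},
)   # cost fell and productivity rose, so r > 0
```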
5. Conclusions
The software-defined factory (SDF) has emerged to produce the personalized products demanded by the evolving manufacturing industry. It represents an autonomous production system that delivers customized manufacturing services through the digital transformation of manufacturing functions and processes. This enhances competitiveness through production flexibility that is difficult to achieve in traditional factories, where manufacturing software is tightly coupled with hardware. The industrial metaverse of the SDF allows users to create tailored products that reflect their specific requirements for performance, design, and characteristics. The manufacturing services for production are provided through the reconfiguration of open assets. Open assets are hardware or software components that can easily be added, replaced, or updated to meet requirements in the SDF. This adaptability allows them to remain flexible rather than tied to a fixed configuration, thereby maximizing the efficiency of the production process. In addition, open assets can accommodate hardware adjustments as necessary.
To produce personalized products, manufacturing planning must dynamically satisfy user requirements. However, current fixed, single-rule methods struggle to accommodate diverse needs and varying characteristics. To address this problem, this study introduces literacy DRL-based federated digital twin scheduling, which combines LLMs and DRL. The LLM-based literacy module analyzes requirements expressed in natural language and translates them into scheduling factors by weighting digital twin attributes. In addition to fine-tuning, the literacy module can further enhance domain expertise and understanding through retrieval-augmented generation (RAG) to better reflect user requirements. The scheduling module utilizes DRL to perform optimal scheduling by selecting the job–machine pairs that receive the best rewards in the environment. We also demonstrated that the proposed method responds to various requirements more flexibly than traditional DRL-based scheduling methods by incorporating diverse requirements such as manufacturing cost and manufacturing productivity. Additionally, the LLM-based DRL environment can set the reward function according to the characteristics of the data. The proposed method can be effectively applied to large-scale and diverse tasks such as the SDF, where the relationship between tasks and requirements is highly complex.
While the proposed method has demonstrated greater flexibility in addressing different requirements than DRL-based scheduling approaches that consider factors such as manufacturing cost and productivity, it is important to recognize its limitations in real-world applications. The computational overhead associated with LLM-based systems may lead to challenges regarding both performance and scalability in dynamic manufacturing environments. This could make them particularly difficult to apply in scenarios where rapid decision-making is critical. Implementing such systems within existing production systems is also expected to require significant effort. Future research will focus on developing strategies that facilitate the effective integration of LLMs with existing resources to improve computational efficiency. This could include lightweight AI models that reduce computational demand or novel algorithms that reduce the computational resource burden while maintaining LLM performance [29,30]. By addressing these challenges, we expect that the techniques proposed in this paper will not only contribute to the hyper-personalized manufacturing paradigm but also lay the foundation for future innovations in the manufacturing industry.