It can be shown for JSP articles that there is usually no detailed information about the test data. The following sections explain how product models can be structurally described using characteristic values. Therefore the JSP is first described as a graph (see
Section 3.1). Then topologies in general (see
Section 3.2) and in particular JSP-context are explained (see
Section 3.3), in order to make structures of product models visible and to make the associated problem complexity (see
Section 3.4) comprehensible. These structures are quantified with the help of indicators of social network analysis (see
Section 3.5 and
Section 3.6).
3.1. Structural Representation of Jobs and Tasks
The graph theory found its origin in the 18th century with Leonard Euler, as a branch of discrete mathematics [
63]. The graph theory represents a decisive cornerstone for network analysis [
64]. With the help of graph theory, objects and the relationships of objects to each other can be seen. A priori no exact definitions of the objects and the meaning of their relation to each other have to be made [
63].
A graph contains nodes which are connected by directed or undirected edges (see
Figure 1) [
64]. Directed edges create a connection between nodes that points only in one direction [
63].
There are several application areas for graphs. Thus they can be used outside of problem solving or mapping networks [
64]. In the context of JSP, it can be applied by interpreting tasks as nodes. Predecessor and successor relationships of tasks are represented as directed edges.
Edges can connect two nodes on the one hand, but can also loop back to the same node on the other. Two nodes can also be connected to each other by several edges. This phenomenon is called multiple edges and is contained in a multigraph. If a graph has one of these features, it is no longer called a simple graph [
63,
65].
Graphs can be displayed in the form of diagrams or as matrices or lists [
66]. However, diagrams can be used to show optically logical relationships in a clearer way. Matrices such as adjacency and incidence matrices can be used for the formal representation of networks. It should be noted that the modified representation does not communicate additional information, but that the representation of a graph as a matrix can be more suitable for further processing of the information [
67,
68]. The relationship between edges and nodes is called adjacency [
69]. Adjacency matrices represent the interconnected nodes in the form of a
matrix, where
n is the number of nodes (see
Figure 2).
3.2. Topology
The relational position of objects to each other is described with topologies. As different as the positional relationships between objects can be, so extensive is the possibility of representation through topological patterns. In product models, the topology reflects the logical processing sequence of tasks.
The basic patterns of this network structure include the lines-, ring-, tree-, star-, and bus-topology, as well as the meshing and the full meshing [
70].
In the tree topology, there is only one path between two nodes of the net. All edges—over any other nodes—lead together to a single point. The ring topology reflects a form of connection in which each element has a predecessor and a successor [
71]. In the bus topology there is an edge to which all elements of the network are connected. The line topology has a start node and an end node. All elements between these two nodes have exactly one pre- and one successor. If an element exists at which all other elements are connected directly, i.e., directly, it is called a star topology [
71]. The tree topology is already a hybrid topology, since it is structured like a star topology, but further nodes connect to those nodes in level 1 [
70,
72]. If elements of a network are connected several times by further elements, it is a meshing. If there is a direct connection between every single element, it is a full meshing [
70]. Other basic patterns can be derived from the topologies mentioned, most of which are based on the tree topology.
Topologies can also occur in the form of a grid or hypercube [
72]. Another example would be the cell topology, which is mainly used in wireless networks [
73].
3.4. Complexity of Product Models
Although product models have different topologies, they are serialized in a machine allocation plan as part of a specific scheduling process. This means that there is a binding sequence in which the individual tasks are assigned to a specific time slot on a specific machine.
For products that are created from several tasks running in parallel, first a decision must be made to resolve this parallelism. If, for example, there are two parallel preparatory operations A and B, whose output is then processed by task C to produce a final result, the actual start may begin with either A or B. Depending on resource requirements and other jobs, even those two options can lead to radical changes in the optimization result.
If each work step has two predecessors (that is, seven instead of three work steps in total), the number of valid serializations increases from two to 80 variants (see
Table 6). If one more production stage is added, there are already more than 21 million possible combinations, which are caused by only one product of a plan. If, on the other hand, only linear product models are used, there is always only one valid sequence for each product size (seet
Table 7). With this extreme simplification, however, they do not cover the real variety of product models.
If only the number of tasks a job consists of is specified for test data, this is not a clear statement of the actual complexity of the problem (see
Table 6). With the flow shop scheduling problem, the problem complexity of individual product models is low. If only sequential task sequences can be assumed, there is only one valid serialization variant for each product model (see
Table 7).
While the structure of the product models primarily determines the dependencies between individual work steps, the dominance of individual work steps is determined by their processing time. Short work steps are more likely to find a gap in machine schedules. The probability that allocation conflicts will occur on a machine increases with the length of a work step. However, the absolute amount of time is irrelevant. Whether a work step must be regarded as long or short can be determined relatively and only by comparing it with the times of the tasks in other product models.
Minimum and maximum time limits are also specified. Sometimes there are products which have to cool down, dry or mature after one work step. In this case, a certain waiting time must elapse before the next work step can be carried out on the product (minimum waiting time). There are also working steps after which the product may only wait a certain period of time before it can be further processed, e.g., before it unintentionally hardens, contaminates or disintegrates (maximum waiting time).
In the overall view, there are further factors influencing the complexity of the JSP. These include the production plan, which compiles a set of product models and production quantities. On the other hand, it also includes the constraints specified by the production system itself: The number of machines, the capacity of the individual machines, the number of personnel, and the setup times required to prepare a machine for the next work order. However, the definition of these influencing factors is independent of the product models. In practice, however, it is a thoroughly relevant question which modifications to these constraints can most favour a fixed sequence of tasks in execution. These constraints also include the option that a work step can be executed on alternative machine types. The efficiency of the machining process can vary depending on the machine.
All these time specifications and constraints have no influence on the size of the solution space (problem complexity), but on the occurrence of capacity bottlenecks in the concrete task planning in a machine scheduling plan. The solution space is only defined by the number of orders and their number of work steps, product model structure and machine assignment. In order to be able to distinguish basic production types from each other, only the structure of product models is considered at first.
3.5. Metrics of Social Network Analysis
Social Network Analysis (SNA) is applied in divergent domains, both scientific and practical. The method can be applied interdisciplinarily and is useful for the analysis of networks of different types [
74].
If actors are connected through relationships, they form a network which can be examined with the help of social network analysis. It is not necessary for the actors of a network to be individuals [
75]. Likewise, an analysis of other related objects can also be performed, e.g., tasks within a product model.
Data about actors in a network can be displayed in tabular form, as a series of rows, with the corresponding properties in the columns. They can also be arranged in a matrix, in which the rows represent senders and the columns receivers of information. Thus, each cell always contains information about the relationship between two actors in the network [
76].
In the following, metrics of the SNA are described, which are used for the analysis of product models. In this context, product models are regarded as networks. Tasks are represented by nodes. Predecessor/successor relationships between tasks are represented by directed edges.
- Mean degree:
Also called average degree, is a measure that reflects the interconnection of all nodes of a network by using the arithmetic mean. Here not only directionally connected nodes but all adjacent nodes are considered. The mean weighted degree differs from the mean degree: Its calculation only considers directed, connected nodes and not adjacents [
77]:
The mean degree
is determined by the average of the degrees of the edges
k, which is adjacent to the sum
N of the nodes. This corresponds to dividing the double of the edge sum
L by the sum of nodes
N.
- Network diameter:
The network diameter describes the minimum number of edges between the most distant nodes in a network [
78]. The network diameter can be determined with a breadth search for larger networks. This is an algorithm which is called the “
breadth-first search algorithm” [
79].
- Edge density:
This is the ratio between the number of existing edges and the number of potentially possible directed or undirected edges in the network. The value is between 0 and 1 [
80]. The edge density
D can be represented by a division of the connections present in the network and the connections possible in the network (see formula
2):
Here
m denotes the number of edges and
N the number of nodes in the network.
- Modularity:
It describes the simplicity of breaking down a network into individual cliques. A high degree of modularity reflects the highly complex structure of the network [
81]:
The modularity
Q is determined by the decomposition of the network into
m cliques, the number of
L edges, the number of
edges between nodes of the
s-th clique and the sum of
degrees of nodes of the
s-th clique [
81].
- Strongly Connected Component:
In a network, a strongly connected component is a group of nodes in which all nodes are directly or indirectly connected to each other. The length of the paths is not decisive here. In contrast to strongly connected components, the direction is not considered for weakly connected components [
82]
- Mean path length:
This is the arithmetic mean of the minimum length of all paths between points in a network [
83]:
where the mean path length
is described by the formation of the arithmetic mean of the minimum distance
d between each two nodes
i and
j of all nodes
N of a network [
79].
3.6. Quantitative Systematization through Metrics
Topological features and metrics of the SNA are used to establish further quantitative distinctiveness of product models. This allows product models to be assigned to classes, which enables them to be sorted, compared and grouped. On the other hand, the differences or similarity between two product models can be precisely quantified using the metrics of the SNA.
To illustrate this connection, the SNA metrics are applied to extended basic topological forms and compared in tabular form (see
Table 8). With almost the same number of tasks, there are some extreme differences in the SNA results. But even smaller differences are sufficient to recognize and prove topological differences.
The value of Weakly Connected Components must be 1 in product models. Otherwise it means that there are several separate submodels. Strongly Connected Components, on the other hand, provide feedback on how many different cycles are present in the model. The smaller this value is, the larger are the existing cycles. The maximum number of components corresponds to the existing number of nodes. This implies that there are no built-in feedback loops in the product plans. If the Average Weighted Degree is greater than or equal to 1, this means that parallel or cyclic subprocesses exist. Consistently linear or tree-like structures always have a value below 1, because only a minimal number of edges are used to connect all nodes to each other. A high Network Diameter is an indication for long partial sequences in the model. The value for the Average Path Lenght increases with the Network Diameter. This increase is accelerated by existing cycles.
It should be noted that some special features of structures cannot be derived from the SNA metrics. The metrics do not differ from each other if the models to be compared have the topology of a tree running apart and a tree running together. In order to be able to distinguish these structures from each other, the models must be examined for basic topological forms.
With the exception of the differentiation of tree structures, all topological patterns can be distinguished from each other by the combination of SNA indicators (see
Table 9). This is only true for models of the same size. If now two models are compared, an absolute evaluation of structural properties can be made on the basis of the pure metrics (e.g., as for the models in
Table 8). Or a relative differentiation can be determined whether, for example, the second model is more similar to another topological basic pattern compared to the first model. In the context of JSP, concrete statements can be made as to which real production scenario these product models can be assigned to (e.g., flow production, recycling, distributed production, variant production, product improvement).
A further problem is that the metric values change with increasing number of nodes but constant topological basic form (see
Figure 4,
Figure 5 and
Figure 6). Inferences to a concrete topology on the basis of these indicators are therefore not trivial. It becomes even more difficult or impossible if a product model combines several basic topological forms and is analysed as a complete model. However, this does not diminish the purpose to prove the difference of two product models.
These metrics can bring benefits if test data is generated in large quantities by random generators. By simultaneously calculating the corresponding metrics, they can be used to form similarity measures for the generated test data. A subsequent cluster analysis makes it possible to recognize patterns in the models’ composition [
84]. The composition of the models can be checked for equal distribution by means of the metrics.