1. Introduction
The general idea of an assistant system, independent of its field of employment and intended function, is to support its users in certain activities and objectives. As an additional requirement, such a system should also provide an easy and natural interaction so that its assisting capabilities can be applied with little effort. Such a system, whether realised as a device, tool or service, is a staple of modern human–machine interaction (HMI) research, as seen in Biundo et al. [
1]. The current state-of-the-art systems, especially in the area of virtual assistance systems, which are the main exemplary focus of this work, are capable of providing (simple) automations and data-retrieval tasks. Regarding these developments, we see an evolution from simple text-based settings, mainly tied to desktop systems, toward systems which can be applied in real environments (as seen in Biundo et al., Hasegawa et al. and Deng et al. [
1,
2,
3]). Such embodied devices allow for a more direct interaction since they have a physical representation (e.g., in Ötting et al. and Marge et al. [
4,
5]). Unfortunately, there is still a gap between this representation and the underlying concept of behaviour (as an overall control paradigm)—the main aspect of the current manuscript—and communication skills. According to Ötting et al. [
4], two indicators for behaviour can be distinguished, namely
task performance and
cooperation. Current systems are mainly focussed on task performance, which indicates how well a technical system and its user handle a task or perform together as a team, as seen in Blackler et al. [
6]. Therefore, such devices are often optimised for voice control or similar human-like interactions for an easy
understanding and
integration into the lifestyle of the user. Additionally, there is the idea of a truly (semi-)autonomous system capable of providing continuous support and oversight of its user's activities, as well as a more seamless integration into its user's lifestyle without the need for constant manual activation, as seen in Wendemuth et al. [
7]. This idea is directly linked to the second indicator, which is mentioned in Ötting et al. [
4] and also is the focus of the manuscript: cooperation. For this, the approach requires much more autonomy on the part of the system, more than most current state-of-the-art systems/devices can provide, as it represents a step from the current reactive approach to a more proactive paradigm of capabilities and structures. Such a system would fill the niche of a
true companion system, or a true peer, as presented by Biundo et al. and Weißkirchen et al. [1,8], and would continuously and pre-emptively care for its user, specifically without requiring overt input from the user. This not only subsumes typical personal assistance systems, but also general assistance systems in industrial environments (e.g., smart manufacturing and service stations). Current systems instead focus more on integrated Smart Home Solutions as examined in Thakur et al. [
9], or more efficient voice-controlled applications in a personal environment as presented in Valli et al. [
10]. Additionally, the currently developing area of new and improved interface options, such as eye-tracking-based systems or even brain–computer interfaces, can lead to extensive further improvements of interface technologies, as examined in Vasseur et al. and Simanto et al. [
11,
12]. The requirements for a responsible and sufficient control instance increase with this further integration of human and technical system. With the inclusion of new human–machine interface technologies, studies imply a greater impact of interactive media and information on the mental state of the user, reflected in higher attentiveness and brain activity, as shown in Katona et al. [
13,
14,
15]. Without allowing a technical system to analyse and rate its own impact, this can lead to potentially harmful influences on the user. These systems by themselves, as a result, do not directly improve the capabilities of the assistance itself, which remain constrained by their reactive nature.
We have already proposed an approach capable of providing this kind of human-like assistant capability (see
Section 3.1 for further details, as well as [
8]), specifically a system equipped with a human-like decision-making process and its own set of priorities. Such a system works alongside a user, in the sense of collaboration and cooperation. This system provides “peer”-like capabilities (cf. [
8] and
Section 3.1) as an assistant, since it aims to continuously search for possible ways to assist the designated user (in the sense of a partner on equal footing). Further, this kind of system tries to solve potential problems before they arise, or at least before they pose an imminent impairment. This is tackled through a combination of comprehensive situational awareness, an adaptive and trainable representation of the most likely aims and priorities of the user, and—most importantly—the independent objectives of the system itself, which actively control the way the system may solve potential impasses between the other aspects of the system. Based on these aspects, we establish the (cooperative) behaviour of a peer-like system or True Artificial Peer (as shown in Weißkirchen et al. [
8]) as a meta-level overseeing the main goals, achievements, and actions, providing an overarching strategy; this is a consistent adaptation of the ideas presented in Schulz et al. [16], who argue for a (biologically inspired) ability of strategy changes as “a core competence of [technical, adaptive] systems”. Therefore, the behaviour triggers underlying concepts like sensor and activation control, dialogue management, etc., which handle the specific tasks in a very particular manner, ideally in an adaptive fashion. Regarding the “proactivity levels” presented in Meurisch et al. [
17], which are separated into reactive, proactive and autonomous levels, we are dealing rather with the autonomous part of the range. In this sense, we aim to extend the system towards more system-individual capabilities, going beyond the current, rather task-oriented view of adaptability already discussed in Chaves et al. [
18]. This meta-information can also be included in the level of reporting required for each decision, depending on the level of trust the user affords the system.
The advantage of this kind of method is not only a better integration into the daily lives of the user, but also the perceived empathy it conveys towards the interlocutor, which is often lacking in contemporary applications.
Moreover, in combination with the internal objectives and characteristics of the system, it allows the system to experience empathetic reactions, and furthermore elicits the same from the user towards the system. This is achieved by effected and affected actions during an HMI, resulting in anticipatory decisions on the system’s side, as discussed in the works of Thun et al., Valli et al. and Vinciarelli et al. [
10,
19,
20].
Before going into details regarding True Artificial Peers and our realisation of behaviour in the mentioned context, we provide an overview of our understanding of the relevant terms:
Efficiency is usually linked to the time necessary to solve or complete a task. In terms of communication, this refers to the number of turns needed in an interaction to obtain the expected information, as explained in Mitev et al. [
21]. From our perspective, in HMI, the combination of both mentioned aspects leads to a holistic understanding of collaboration in settings where humans and technical systems interact with each other. Given interlocutors collaborating as partners and peers, a team can be formed that is more likely to focus on the task, which of course depends on the current setting and task, as also discussed in Hancock et al. and Mitev et al. [
21,
22].
The
satisfaction of the user is defined, according to Ötting et al. [
4], “as the extent to which user responses (physical, cognitive, and emotional) meet user needs and expectations”. For True Artificial Peers, this is directly linked to how the system’s internal goals and objectives relate to the interlocutor’s expectations.
From our perspective, user satisfaction is also connected to
acceptance. Venkatesh et al. [
23] define acceptance as “the attitudinal antecedent of usage”, thus arguing that mere usage implies an intrinsic acceptance of the (system’s) capabilities and limitations. Taking this into account, an accepted interaction is considered to be any communication that does not break down after a few turns.
Satisfaction as well as acceptance are coupled to the indicator
trust. In relation to Lee et al. and Ötting et al. [
4,
24], trust is the user’s belief that the technical system will help to achieve common and shared goals “in a situation of uncertainty” as cited by Ötting et al. [
4]. From this, we argue that trust can be achieved either through a highly capable and highly secure operation—in the sense of being application-oriented—or through an interaction to establish common goals. In this manuscript, we foster the latter approach, allowing the system to also benefit from its own and shared goals.
In our work we also mention the term
“empathy”, for which there is a variety of definitions in human–human and human–machine interactions, for example, in Cuff et al. [
25]. We use it specifically concerning the traceability of actions. The empathy factor in this case is the ability of a human user, or interaction partner, to assume what the system may do next. The same empathy also includes the ability of a technical system to assume a human decision-making process. This is of course only a small aspect of true human empathy, but it is still an improvement over typical technical user profiling, which often merely approximates repeating actions, paired with the user’s assumption that the system will latch onto these repeating actions; this reflects the current trend for the improvement of assistance systems and smart environments, as also examined in Murad et al. [
26].
The general development is towards a system capable of understanding human emotions, intentions and decisions as described in Yoon et al. and Pelu et al. [
27,
28]. For “empathy” of user states, it is more important to recognise emotions or mental states, while “empathy” for an interacting peer is more focussed on retracing the human decision-making process. While the detection of emotional states with good results is already a staple of machine-learning-based classifiers as, for example, carried out in Schuller et al. [
29], the interpretation of these emotions into a human-like “empathy” is more complicated, as it requires technical alternatives for the understanding of the recognised emotion classes. Emotional classes range in that case from discrete classes, such as fear or happiness, to more indirect representations, such as a valence–arousal axis representation, as presented in Russel et al. [
30] and can give helpful indicators for the general state and satisfaction of the user.
This research is primarily concerned with the different options through which a system may engage the user. With these, a system can, for example, control interactions in a more efficient and expedient manner. One of the main aims of the behaviour control is the choice of the optimal interaction strategy for each situation. Especially during the initialisation and self-adjustment to its user, it is instrumental to generate an overview of the priorities and specificities of the user and the situation. At the same time, the system has to provide a certain amount of satisfaction, allowing continued use of the system, specifically to prevent a potential breakdown of the interaction due to inconsequential actions. These two aims can conflict in their implementation, as the data-generation process can be repetitive and error-prone, while higher user satisfaction is often coupled with fast and correct decisions from a potential assistant system.
The current state-of-the-art, as well as the base for this work is given in
Section 2. This includes explanations of our “peer”-like concept and the conceptually similar BDI architecture. Given this introduction and motivation of our work, the main contributions are briefly summarised here and will be discussed in detail in
Section 3:
- C1:
Providing an extension of the True Artificial Peer concept.
- C2:
Providing a perspective to situation-adaptive characteristics of technical systems in interactions.
- C3:
Providing a modelling approach for behaviour of technical systems, combining and extending concepts of BDI (cf.
Section 2.1) and ACT-R (cf.
Section 2.2).
- C4:
Providing a framework for the realisation of autonomous behaviour of technical systems in general and True Artificial Peers in particular.
Finally, an outlook and conclusion are provided in
Section 4.
3. Results
To establish a True Artificial Peer with its very own characteristics, objectives and goals, we propose the behaviour of these peers in the current section. For this, we define the term “behaviour” as follows:
Behaviour describes the actions, and more importantly the reactions, the system takes concerning its human interaction partner, or partners in the case of multi-user applications. These actions primarily include the validation by the user of the steps taken so far, so as to ensure that the user is sufficiently informed about the system’s intention, as well as the internal shift from one set of objectives to another.
In the following, we emphasise and elaborate the theoretical concepts and methods, aiming for behaviour in True Artificial Peers.
3.1. True Artificial Peers
The idea of a True Artificial Peer is based on our former work presented in Weißkirchen et al. [
8]. There we extended the concept of a companion technology in Biundo et al. [
1], which not only follows the passive role of a command interpreter, but is also equipped with its own set of objectives and priorities. These objectives are held concurrently with those of the human interaction partners.
Depending on the particular design, this can be used to provide a certain amount of stability for the interaction between humans and the system. For instance, orders or applications in conflict with a previously specified set of safe priorities will either be ignored or, in the case of a user mistake, actively communicated to the user. This reduces the real problem of erroneous activations and misrecognised commands. As seen in the article of the Verge [
61], this is a real and current problem which leads, for example, to false marketplace activation in an assistant system, with negative consequences for a user. This may only become more serious if the functionality of an assistant system is integrated even more deeply into daily life and such a system controls further personal assets.
Another important aspect is decision making under uncertainty, or decisions without direct user input, as current systems rely primarily on a constant outside influx of orders and commands to continue working. Even assuming an infallible interpretation of these inputs, the user would still need to constantly interact with the system to ensure a satisfactory result of the delegated task, as the system lacks the capability to perform several independent steps of an overarching task. While this current model may reduce the workload of certain tasks to a degree, by providing an efficient and fast user interface, for example, to provide crucial information in a short time or to remind the user of important events and dates, it most often simply shifts an interaction from one control method to another. By providing an assistant system with its own set of objectives and the additional ability to generate further adapted sets based on an interaction, the system gains the ability to act in situations with minimal information, either by engaging the user for clarification or even by solving problems based on former experiences. This not only solves problems during communication but also allows the system to provide assistance when it is not specifically asked for. In this sense, the system would be reasonably assured of the intention of the user using a kind of “empathy” (cf.
Section 3.2).
The last aspect was already described in Weißkirchen et al. [
8]; it builds the foundation of the current research and is the issue of traceability of the decision-making process. Every new task or objective for an assistant system can be interpreted as a handing over of responsibility from the human user to the technical system. While this can be a relatively small responsibility for current systems, it may inadvertently increase for more integrated applications which control further aspects of the user’s lifestyle. This, of course, can and should lead to a certain wariness, especially if the system is faulty in its examination of the user’s objectives. To reduce this uncertainty, it is imperative that the user remains the informed partner of the interaction, since the final responsibility for all actions remains with him/her, as described in Beldad et al. and Mohammadi et al. [
62,
63]. However, a full informational breakdown of every decision the system makes, even for repeating actions, would simply shift the dynamic from the user constantly engaging the system (to give information) to the user constantly receiving information, which still imposes a high workload, as the system would have to be supervised continuously.
To reduce this constant alienation, we propose a specific set of “behaviour” on the system’s side which may change based on engagement of the user, level of information and relative level of habituation, helping to facilitate an improved level of trust from the user towards the system during any interaction.
3.2. Perspectives on Empathics
To explain the underlying principle of our behaviour approach, we introduce the concept of “empathy” and distinguish two main “empathic” aspects in HMI: First, the “empathy” of the system itself concerning its human user. This includes aspects of typical user profiling, but advances upon it through the idea of typical and untypical situations and reactions. Specifically, it assumes that a system should be able to recognise and react to situations which are without direct precedent or opposed to the typical profile. The system does not only recognise the active influx of commands; rather, it actively designs a user profile including typical actions and reactions, either by recording given information or, preferably, by engaging the user in an interaction designed to extract or generate new information. Through this, the system may be able to map the decisions and preferences of the user onto a continuous representation instead of a list of singular occurrences which may be recalled in case they reappear. As a result, the continuous learning aspect not only recognises the action itself, but also the surrounding situation necessitating the action. For example, a typical user interaction with a current assistant system may include the search for a specific restaurant based on distance, price and available cuisine. Based on that and on former search results, a current system may be able to generate a preference for a particular restaurant; for example, if a short distance from home was the user’s primary search criterion. A changing interest of the user, or a specific occurrence which changes these priorities, cannot easily be included in this kind of user representation. While a system may rewrite or over-write a set of information, it would have problems differentiating between a singular occurrence and a shift of priorities. This leads either to further specification by the user or to non-optimal results by the system.
In our approach, the system would include questions and separate options to measure the dependence between distance, price and available cuisine to enrich the representation, as well as to include information on specific re-occurrences which may influence the decision process of the user. For example, such changing (single) events could be an anniversary or a bank holiday. Importantly, it would also recognise a change of typical behaviour, such as a missed lunch, a traffic jam or similar obstructions to the usual profile. It would then engage in solving the apparent problem of a potentially hungry user as well as, if possible, the source of the obstruction. This would be based on architectures designed to approximate human decision processes, such as ACT-R or similar systems. The particular procedure is visualised in
Figure 2 and
Figure 3.
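To make this idea more tangible, the following Python sketch shows one possible way to store preferences together with the situational context in which they were observed, rather than as a flat list of past choices. The class names, context keys and scoring rule are hypothetical illustrations, not part of the proposed architecture.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContextualPreference:
    """A user preference enriched with the situation in which it was observed."""
    item: str                      # e.g., a particular restaurant
    context: Dict[str, str]        # e.g., {"occasion": "lunch", "day": "weekday"}
    weight: float = 1.0            # reinforced by repeated observations


@dataclass
class UserModel:
    """Continuous representation of user decisions instead of a flat list."""
    preferences: List[ContextualPreference] = field(default_factory=list)

    def observe(self, item: str, context: Dict[str, str]) -> None:
        # Reinforce an existing entry if item and context match, else add a new one.
        for pref in self.preferences:
            if pref.item == item and pref.context == context:
                pref.weight += 1.0
                return
        self.preferences.append(ContextualPreference(item, context))

    def rank(self, context: Dict[str, str]) -> List[str]:
        # Score by contextual overlap, so a single deviating event
        # (e.g., an anniversary) does not overwrite the general profile.
        def score(pref: ContextualPreference) -> float:
            overlap = sum(pref.context.get(k) == v for k, v in context.items())
            return pref.weight * (1 + overlap)
        return [p.item for p in sorted(self.preferences, key=score, reverse=True)]


model = UserModel()
model.observe("Pizzeria Roma", {"occasion": "lunch", "day": "weekday"})
model.observe("Sushi Bar", {"occasion": "anniversary", "day": "weekend"})
print(model.rank({"occasion": "lunch", "day": "weekday"}))  # Pizzeria Roma first
```

In such a representation, a single anniversary visit does not overwrite the weekday lunch preference; it simply adds a differently contextualised entry.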
The second aspect is the “empathy” from the user concerning the system. This has to be separated from the natural adaptation most users undergo during the use of technical systems. Such adaptation can be easily recognised when using voice commands; users approach the system most often in a comparably unnatural and stiff way. To reduce faulty detections and wrong interpretations from the system’s side, they reduce contractions and clearly list their command grammatically, so that the technical interpreter can parse the information more easily.
In contrast, we envision “empathy” concerning the decisions and actions of the system. The decisions taken by the system are based on reproducible information, and this information can be presented to the user at any time. Moreover, the users themselves may recognise arising problems or situations of lacking information before the situation occurs. In a practical application, this means that the user would “know” which commands have to be used and which could be abridged. A recurring order for a specific dish as takeout food could be phrased as “give me the usual” instead of a clear listing of commands. The empathy in this case would be the assurance of the user that the system will recognise “the usual” the same way the user intends it. Going further, it also includes the assurance to the user that the system reports and explains its decision-making process. In a situation where the system lacks information, it will additionally not stop working, but will actively engage and solve this problem. In the process, it will also display the current state the system is in, which further includes the user in the interaction. This allows not only for a more natural interaction, but also presents a better integration of the technical system into the user’s lifestyle. This is what we termed the “peer level” of the True Artificial Peer in Weißkirchen et al. [8], as it approaches a human–human level of interaction.
To facilitate both these empathic dynamics, the system has to assure two things at the same time: (1) the ability to change its inner architecture based on new situations and (2) remaining mostly static in its decision-making process so as not to lose the understanding of its user. To reconcile these contrasting requirements, we propose “behaviour” as a controlling aspect of the internal dynamic adaptation and the external static appearance, which makes the system understandable.
3.3. Types of Behaviour
We distinguish three primary types of behaviour the system can employ at any moment, as can also be seen in
Figure 2. Each of these contains its own set of priorities during any (inter-)action between the system and the user; they will be characterised in the following.
The first “behaviour” is grounded in a rule-based approach, alternatively called a pre-designed approach. This comprises most state-of-the-art systems, which are (mainly) pre-programmed to react to each stimulus in the same manner every time. Specific users or specific situations usually do not influence the system’s action. Rather, users have to define preferences by themselves to account for particular characteristics or actions, often during user profile generation. In this case, the behaviour is objectively traceable but may pose problems when the current user is not aware of the underlying rules. Therefore, it leads to the aforementioned adaptation of the user to the system, since the user adapts his/her interaction towards the desired reaction, which may lead to unnatural expression on the user’s side. To reduce the adaptation required on the human side of the HMI, which may frustrate the user, the system should be capable of communicating its internal rules implicitly or explicitly. This is especially important if unnatural or overly acted behaviour from the user side is otherwise necessary. While this approach may allow for better acceptance, it still does not allow for a “true peer” level adaptation of the system. As an example of a typical rule-based interaction, the user would need to specify the desired action and the specific parameters, and acknowledge the process after a call back from the system. A command could be something like: “System, order a pizza with salami toppings at 3.00 p.m. to my address at the nearest pizzeria.”, which translates to: “(Addressee: System), (action: order) (parameter object: a pizza with salami toppings) (parameter time: at 3.00 p.m.) (parameter location: to my address) (action location: at the nearest pizzeria).” This would be parsed against a specific set of rules concerning the task under these parameters.
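As a minimal sketch of how such a fixed command grammar could map an utterance onto predefined slots, consider the following Python fragment; the regular expression and slot names are hypothetical and only illustrate the rule-based parsing idea.

```python
import re
from typing import Optional

# Fixed, pre-designed grammar: the user has to phrase the command so that
# every slot can be filled; unknown phrasings are simply rejected.
COMMAND_PATTERN = re.compile(
    r"^System, (?P<action>\w+) (?P<object>.+?) at (?P<time>[\d.: ]+[ap]\.m\.) "
    r"to (?P<location>.+?) at (?P<provider>.+?)\.$"
)


def parse_command(utterance: str) -> Optional[dict]:
    """Map an utterance onto the rule-based slot structure, or fail."""
    match = COMMAND_PATTERN.match(utterance)
    return match.groupdict() if match else None


slots = parse_command(
    "System, order a pizza with salami toppings at 3.00 p.m. "
    "to my address at the nearest pizzeria."
)
print(slots)
# {'action': 'order', 'object': 'a pizza with salami toppings',
#  'time': '3.00 p.m.', 'location': 'my address', 'provider': 'the nearest pizzeria'}
```

Any utterance that deviates from this fixed pattern would simply not be parsed, which is exactly the rigidity that forces users to adapt to the system.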
Second, we distinguish the rule-based behaviour from the “opposite behaviour” based on an
exploratory approach. The exploratory behaviour allows for system actions in situations of high uncertainty and low information. In this state, the awareness of the situation and the generated conclusions may change rapidly in reaction to the newly generated information, until the system has obtained enough information to generate a new rule, which requires an adequate understanding of the situation. While this method allows the later generation of new rules, it may incur a heightened load on the user, as the required active engagement rises significantly. To soften this impact, the system either informs the user about its change into the exploratory stage or otherwise minimises the steps taken to such an extent that the user is still able to intervene. Without direct information about the behaviour change, some users may (subjectively) rationalise this behaviour, potentially assuming objective rules where none have been generated yet. Generally, this behaviour may result in a decrease of trust and empathy from the user the longer the system employs this approach, as it lacks the clearly defined rules of the other approaches. While the measures taken reduce the impact on trustworthiness, this behaviour cannot be followed indefinitely before the user stops the interaction. This reflects a form of exploitation/exploration dilemma, as the system has to find a way to balance the employment of exploratory and rule-based behaviour so as to lengthen the possible interactions. The main method to alleviate some of these problems is explicit communication, reducing possible decisions which are opposed to the wishes of the user. As an example, the system would, based on its knowledge of the user’s behaviour and former activities, ask: “Are you hungry? Would you like to order something to eat?” The intention of the system to engage with this question is not only based on a timer since the last known meal of the user, but also includes typical daily routines, known interruptions which might have led to a skipped meal, and measurable behaviour changes of the user which might imply hunger on their part. The system needs to anticipate and engage the user, ideally before a problem or situation becomes imminent, to convey its empathy towards the user’s decision-making process. This might lead either to a rejection or an acknowledgment by the user. In particular, the acknowledgment would result in further interactions, clarifying the type, time and further parameters of the order. This in turn would follow a known pattern, which the user recognises as the system’s desire to learn more about the user and his/her situation. Importantly, the system would also engage in questions like: “Do you generally prefer this kind of food?” or “When is the usual time you like to eat?”. These questions are used to generate a deeper understanding of the user’s decisions for the system. Therefore, a continuous feedback evaluation is necessary to stop these questions as soon as the user appears to be dissatisfied, as described in Nielsen et al. [
64]. In case of a denial by the user, the system would attempt to re-evaluate its knowledge of the user and his/her situation, practically changing its beliefs towards a more correct version which better represents the needs of the user. Importantly, this approach not only employs typical user profiling, but also recognises deviations from the norm, taking positive, negative and new situations into account. As a result of the continuing learning process, the periods in which the system engages in this behaviour will give way to the last type of behaviour, as new rules are generated specifically for the user.
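The following Python sketch illustrates, under simplifying assumptions, how such an exploratory episode could be bounded by continuous feedback: clarifying questions are only asked while an (externally estimated) satisfaction value stays above a threshold, and confirmed answers are kept as candidate rules. The function names, threshold and the stubbed user are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

SATISFACTION_THRESHOLD = 0.4  # assumed cut-off below which exploration stops


def explore(questions: List[str],
            ask_user: Callable[[str], Tuple[str, float]]) -> Dict[str, str]:
    """Ask clarifying questions until the list is exhausted or the user
    appears dissatisfied; return the collected answers as candidate rules."""
    candidate_rules: Dict[str, str] = {}
    for question in questions:
        answer, satisfaction = ask_user(question)  # satisfaction in [0, 1]
        if satisfaction < SATISFACTION_THRESHOLD:
            break  # stop exploring before the interaction breaks down
        candidate_rules[question] = answer
    return candidate_rules


# Stubbed user for demonstration: returns an answer plus a satisfaction estimate.
def stub_user(question: str) -> Tuple[str, float]:
    return ("around noon" if "time" in question else "yes"), 0.8


rules = explore(
    ["Do you generally prefer this kind of food?",
     "When is the usual time you like to eat?"],
    stub_user,
)
print(rules)
```

The collected answers would then feed the rule generation of the data-based behaviour described next.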
The third “behaviour” is grounded in a
data-based approach. It is an intermediate level combining perspectives of the two previous behaviours, since the system generates new rules (rule-based behaviour) based on the exploratory phase (exploratory behaviour). While it mirrors the rule-based approach, the rules here are directly based on typical user profiles, situational awareness and personal preferences of the user. This allows for a far more integrated and adapted system than a typical off-the-shelf method (as given by a generally trained system) would achieve. It also allows the system to generate its own priorities and objectives for external situations, which are not directly tied to the user, but which are necessary for the continued functioning of the system. The important part is the implicit generation of rules without the direct supervision of either the original designer or the specific user. This fits our understanding of the “peer” level, as the user may subjectively understand the rules perfectly, since they are based on the data generated during common interactions, while for a different human observer, not directly involved in the exploratory interaction, the decision-making process may still appear random or non-deterministic. We assume that a system operating in this way would elicit more trust and acceptance from the user, based on the completeness and correctness of the situational awareness generated during the common exploratory phase. This completeness can be measured as the amount of additional data which supports the decision reached by the system, while the correctness would primarily depend on the feedback from the user. In fact, we have not performed user studies yet, but we can, however, argue based on literature dealing with similar aspects. In Ötting et al. [
4], the authors validate several hypotheses considering issues of HMI and how these are influenced by the autonomy and adaptivity of the technical system (cf. especially hypotheses 2 to 4). Given the respective findings, we see that adaptability has a positive effect on satisfaction and acceptance of the user. For autonomy, the study reveals no negative influences on acceptance. Unfortunately, for trust a positive effect could not be confirmed; however, this “need[s] to be interpreted with caution because [it is] based on a small amount of effect sizes”, see Ötting et al. [
4]. In contrast, the meta-analysis of Hancock et al. [
22] shows that, for trust, additional characteristics (e.g., appearance) might influence human expectations and assessments. In this sense, the entire setting—and not only the system’s behaviour—ultimately influences the overall trust in the technical device, which is, however, beyond the scope of this manuscript. Additionally, the user’s expectations influence the way the system is used, trusted and perceived. In their study—related to health care and well-being with AI systems’ support—Meurisch et al. [
17] report that they “revealed significant differences between users’ expectations in various areas in which a user can be supported”, where expectations can be in a range from technically feasible to unrealistic based on the user’s technical understanding and knowledge. For mental health support, for instance, “most users tend to prefer reactive support or no support at all” see Meurisch et al. [
17]; rejecting adaptive systems is usually correlated with expectations of privacy violation. Since the study comprises data from European and North American countries, the outcomes reflect (to some extent) cultural differences. This should also be kept in mind for the approach we suggest, as, for instance, participants from Canada preferred less proactive systems, as shown in Meurisch et al. [
17]. From Figure 5 in [
17], we see rather negative feelings of the users the more the system tends to autonomy, which slightly contradicts the findings of Ötting et al. [
4]. However, the negative view “can be partly explained by their attitudes and beliefs towards the particular AI system”, referring to the study’s participants, among whom a general openness toward a novel system can be seen. Given these studies performed by Hancock et al., Meurisch et al. and Ötting et al. [
4,
17,
22], we conclude that our approach can contribute to the discussion on trust in autonomous systems and might lead to a better understanding as well as a levelling of expectations and (system) outcomes. This is also related to the particular (difficult) selection of parameter settings, especially the timing of behaviour changes. Therefore, the transition time between different behaviour stages should be chosen to minimise the additional load on the user, as each change imposes additional mental and cognitive load. The resulting interaction would be as described on the peer level, for example, either the user requesting “the usual (stuff)” or the system asking whether “the usual (stuff)” shall be provided. On top of this direct interaction, there is also the possibility of the system providing the assistance without a given command. Both the system and the user are sufficiently sure what “the usual (stuff)” means, comprises and entails, providing the highest user satisfaction with the least necessary input. As the system gradually learns the needs and wishes of the user, it will start to prepare “the usual” ahead of time, as well as recognise when it is not needed due to a sudden change of situation. One term used in this research is “proactivity”; it is used in different contexts with different meanings, for example, in Chaves et al. [
18]. Generally, in conjunction with AI tools such as assistant systems, it describes different levels of independent decision making. This potential independence is seen from the view of the user and ranges from fully reactive support, over proactive decision making after checking with the user, to autonomous decisions without direct user input, as described by Meurisch et al. [
17]. The first step fully corresponds to the rule-based and break-down situation, where every action is communicated by the user themself. The check-up with the user is mainly part of the exploratory stage, while it is also part of the data-based stage in conjunction with the ability to follow its own decision-making process. Additionally, the decisions of the system may even go a bit beyond this paradigm of proactivity and become autonomous in cases where the decisions are not directly part of a user-support process. The architecture allows the system to engage and solve problems even when no user is present or when there are only transient users, such as in a supervisory function in factory settings or in other open environments.
The term proactivity is also used in conjunction with the less general capabilities of a dialogue manager, specifically a chatbot, as described by Chaves et al. [
18]. Here, proactivity also describes the effect of a system engaging a user without direct former input or signal. Importantly, while this also describes an interaction process, this aspect is subsumed into our general behaviour control. The interaction may well be on a dialogue basis, but can also be based on actions and reactions, facial expressions or other indicators of user state, intention and non-verbal interaction. Here again, this engagement can also start without the user being present, autonomously improving the situation for the system itself. This interaction is not reduced to dialogue from the system only, but includes, among other things, sensor platforms, technical appliances or even semi-autonomous drones/agents.
The currently used behaviour is decided by the “behaviour control” or “control unit” (as seen in
Figure 2). The specific methods to decide the change are given in
Section 3.4, but generally the change follows an algorithm based on information and user satisfaction, while an override by the system also remains possible.
As visualised in
Figure 2, the particular behaviour is chosen and controlled by a
behaviour control unit, which is explained in detail in
Section 3.4. The control unit initially selects the rule-based paradigm, which serves as a starting point for the exploratory paradigm when a new situation arises, given that the user is still cooperative. If the user shows a lower level of satisfaction, the system changes back to the traditionally used rule-based approach, which does not depend on exploration. In contrast, given a suitable amount of analysable data, the system will switch to the data-based paradigm, which also contains rules, but generated from specific user information.
Given the considerations in
Section 3.2 as well as in the current section, an advanced conceptual relation of True Artificial Peers can be achieved (cf. contribution C1). In particular,
Figure 2 conceptualises our perspective on an adaptive and situation-related technical system. This allows an argumentation concerning how those systems in general and True Artificial Peers in particular could be set up (cf. contribution C2). Moreover, it lays foundations for the adaptation of behaviour which will be discussed in
Section 3.4, realising the framework for the control unit.
3.4. Adaptation of Behaviour through the Control Unit
During the lifecycle of such a system, the behaviour may change dynamically as soon as the topic or situation changes during the interaction, based on the level of information and the reaction of the user. This is controlled by a delay between each change, so as not to impair the general satisfaction. The process is observed and directed by the control unit, which declares the relevant behaviour for each time step at which the system is active. By default, the system employs the rule-based approach as a final fall-back strategy and as the first approach during initialisation. During an interaction, the system may recognise a new situation, either because a command has not been used before or because the user issues a command in a context which would imply a change of the general priority order. To resolve this situation, the system applies the exploratory behaviour to fill this lack of information, either by reaffirming the most likely solution based on former interactions or by directly asking the user for clarification of new approaches. Depending on the level of complexity, the interaction may evolve to further topics. This can be seen in
Figure 3, where the possible developments are shown.
Depending on the level of uncertainty, for example, the inability of the system to apply any former information to the particular new problem, the system decides the behaviour based on the satisfaction of the user. Given a general agreeableness of the user, the system may change freely towards the most exploratory behaviour possible. If no intervention by the user is detected, the system automatically applies exploratory behaviour to collect appropriate new data to compensate for the lack of information. In case of an imminent negative reaction of the user or detected user dissatisfaction, the system immediately reverts to the standard rule-based approach. As the system continuously generates novel information, these rules are replaced by adapted “rules” generated from user-specific data. Given this adaptation process, the proposed system is much more grounded in the current situation and exceeds approaches which only rely on a general rule-based method.
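A minimal sketch of this control logic is given below, assuming that normalised knowledge and satisfaction values are already available; the concrete thresholds are hypothetical placeholders for the membership functions discussed in Section 3.4.

```python
from enum import Enum, auto


class Behaviour(Enum):
    RULE_BASED = auto()
    EXPLORATORY = auto()
    DATA_BASED = auto()


def select_behaviour(knowledge: float, satisfaction: float,
                     current: Behaviour) -> Behaviour:
    """Simplified transition logic of the behaviour control unit.

    knowledge, satisfaction: normalised values in (0, 1].
    """
    if satisfaction < 0.3:
        # Imminent dissatisfaction: fall back to the traditional rule-based mode.
        return Behaviour.RULE_BASED
    if knowledge < 0.4:
        # New or poorly understood situation and a cooperative user: explore.
        return Behaviour.EXPLORATORY
    if knowledge > 0.7:
        # Enough user-specific data: apply the rules generated from it.
        return Behaviour.DATA_BASED
    # Otherwise keep the current behaviour to avoid abrupt switches.
    return current


print(select_behaviour(knowledge=0.2, satisfaction=0.9,
                       current=Behaviour.RULE_BASED))  # Behaviour.EXPLORATORY
```

A delay between consecutive changes, as mentioned above, would additionally be enforced around this selection to avoid overly frequent transitions.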
This kind of interaction and behaviour can be described in a mathematical way. The formula used is chosen to mirror the typical activation description employed in the ACT-R architecture, as explained by Bothell et al. [
39] and Equation (
1). This is not only used as a visualisation but also as a potential option to combine both systems. By using the already given cognitive architecture, this could be added relatively easily in a modular fashion to a typical workstep of ACT-R. Information which contains similar patterns or is retrieved together frequently within short timeframes automatically receives an increasing connection weight. This allows the system to retrieve not only one data segment but also further in-depth information, which may contain relevant background or topical data.
Introducing the term knowledge value $K_T$ for each specific topic $T$ allows us to combine information in the current situation, interaction or similar occurrences:
$$K_T = \sum_{i=1}^{n} \left( B_i + \sum_{j=1}^{m} w_{i,j}\, C_{i,j} \right), \tag{3}$$
where $B_i$ are all beliefs or information directly concerning $T$, with $n$ being the amount of available information, $C_{i,j}$ is the contextual information concerning $T$, with $m$ being the amount of contextual information for each $i$, and $w_{i,j}$ is the weighted importance which connects the context to the original topic $T$.
Each of these is part of the expanding memory; during exploration, new data are generated and connections are created where possible. By including not only the directly connected information, but also the contextual information, the system aims for a deeper understanding. This context is part of the exploratory process. Specifically, it includes the extended knowledge, for example, different preferences depending on time, location or preliminary actions.
This leads to high $K_T$ values for topics $T$ for which many direct and contextual information items, as described by Böck et al. [60], are given. In contrast, low $K_T$ values are reached in cases where the topic is unknown.
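The computation of the knowledge value can be sketched as follows; the data layout (belief strengths paired with weighted context items) is an assumption made for illustration and mirrors the reconstructed form of Equation (3).

```python
from typing import List, Tuple

# Each entry: (belief strength B_i, list of (weight w_ij, context value C_ij)).
TopicInformation = List[Tuple[float, List[Tuple[float, float]]]]


def knowledge_value(topic_info: TopicInformation) -> float:
    """K_T = sum_i ( B_i + sum_j w_ij * C_ij ), cf. the reconstructed Equation (3)."""
    return sum(belief + sum(w * c for w, c in context)
               for belief, context in topic_info)


# Example: two pieces of information on the topic "lunch", each with weighted
# context items such as time of day or location.
lunch_info: TopicInformation = [
    (0.8, [(0.5, 0.9), (0.2, 0.4)]),
    (0.6, [(0.7, 0.3)]),
]
print(knowledge_value(lunch_info))  # ≈ 2.14; would be normalised to (0, 1] for Figure 4
```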
In addition, the system generates a user (-satisfaction) score $U_i$, which is based on the feature vector $\vec{x}_i$ for each user $i$:
$$U_i = f(\vec{x}_i). \tag{4}$$
The function
f is dependent on the available external sensors which are used for user observation. These can be unimodal, like voice or visual, but also multimodal. The composition of this score depends on the specific system, but contains measures of user characteristics such as those shown in Böck et al. and Vinciarelli et al. [
20,
60]. In particular, $\vec{x}_i$ covers feature values for emotion, mental load and/or similar indicators of user satisfaction. Each of these values can either be taken directly from a connected sensor array, or indirectly by mapping extractable sensor information to the user states; this can be done with machine learning solutions as shown by Schuller et al. and Weißkirchen et al. [
29,
65]. The combination of these values into a general user satisfaction is then related to the personal perception of the user concerning different states, or alternatively the situation. For example, during a dialogue, user emotion is more important; during an assisted task, the mental load of the user is the more important indicator.
It also covers direct observations concerning the reactivity and interactivity with which the user replies to each inquiry. While the weighting of the different aspects of the satisfaction score may change depending on typical user behaviour and type of inquiry, it helps to ensure that the user does not “give up” on the interaction. This includes the aspect of whether the user agrees to continue with the current course of action. A high value indicates satisfaction and consent, while a low score implies dissatisfaction concerning the last actions of the system.
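One possible realisation of the score as a weighted combination of sensor-derived indicators is sketched below; the particular features, weights and the linear form of f are illustrative assumptions, since the concrete composition is left to the specific system.

```python
import numpy as np

# Hypothetical feature vector x_i for user i: values in [0, 1] derived from the
# available sensors (e.g., classifier outputs for emotion or mental load).
x_i = np.array([0.7, 0.4, 0.9])  # positive emotion, low mental load, responsiveness

# Situation-dependent weights: emotion matters more during a dialogue,
# mental load more during an assisted task.
weights_dialogue = np.array([0.5, 0.2, 0.3])
weights_assisted_task = np.array([0.2, 0.5, 0.3])


def user_score(x: np.ndarray, weights: np.ndarray) -> float:
    """One possible instance of U_i = f(x_i): a weighted mean clipped to (0, 1]."""
    return float(np.clip(np.dot(weights, x) / weights.sum(), 1e-6, 1.0))


print(user_score(x_i, weights_dialogue))       # 0.70 -> emphasises emotion
print(user_score(x_i, weights_assisted_task))  # 0.61 -> emphasises mental load
```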
The knowledge value in combination with the user score spans a behaviour space, where each change of situation may lead to a transition of behaviour. As visualised in
Figure 4, the previously described behaviours can be matched to reasonable areas within this space. As both the satisfaction and the knowledge value are normalised to the range (0,1], negative values are not possible.
Finally, we achieve behaviour values $B_{T,s}$ which combine the characteristics of both the user and the situation concerning the topic $T$ for each interaction step $s$:
$$B_{T,s} = \left( K_{T,s},\; U_{i,s} \right), \tag{5}$$
where $K_{T,s}$ and $U_{i,s}$ are the same values as explained above, but mapped to a specific time step. The estimation of $B_{T,s}$ can be expressed as, for instance, membership functions known from Fuzzy Logic, as shown in Figure 5. Each membership function selects the particular behaviour executed by the system (cf. also
Figure 3 and
Figure 4). Importantly, the overlap of the membership functions—usually applied in Fuzzy Logic to allow flexibility in final outputs—enables the system not to switch directly to its new behaviour, but allows a smooth transition. Otherwise, this could be perceived as “fleeting” or jumping behavioural expressions. The gradual change, in contrast, is smoother and is further expected to be more natural or less abrupt for the user. The smoothness is primarily based on the continuous interaction flow, since the stepwise change is of course discrete. When the system changes its behaviour, the user will recognise the increase in interaction initiated by the system. This curiosity primes the user for a more in-depth explanation. The system also remains in this state for a recognisable amount of time, allowing the user to adapt to this change. A typical system would simply choose the most likely option, declare the inability to parse a command (in the hope that the user may re-phrase the request more clearly) or simply not act at all. This aspect is essential since—as also known from psychology—behaviour is a longer lasting characteristic of both humans and future technical systems. In contrast, a device is able to select and react to sensor inputs, which might be acquired at short intervals (some milliseconds to seconds), which might result in abrupt changes. In Böck et al. [
66], this issue is discussed and respective approaches to handling this aspect are presented. These can be combined or also integrated in the suggested Fuzzy-like method, see
Figure 5. This will result in a more naturalistic interaction and communication with the user.
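To make the Fuzzy-Logic-style selection concrete, the following sketch defines overlapping membership functions over the knowledge axis and a satisfaction-driven fall-back; the shapes and breakpoints are hypothetical stand-ins for the memberships of Figure 5.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership function on [left, right] with its apex at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def memberships(knowledge: float, satisfaction: float) -> dict:
    """Degree of membership of the behaviour value (K, U) in each behaviour type.

    Overlapping functions avoid abrupt switches: near the breakpoints the
    system can blend behaviours instead of jumping between them.
    """
    return {
        # Poorly understood topics favour exploration, provided the user is
        # still cooperative (scaled by satisfaction).
        "exploratory": triangular(knowledge, -0.1, 0.2, 0.6) * satisfaction,
        # Sufficient user-specific data favours the rules generated from it.
        "data_based": triangular(knowledge, 0.4, 0.8, 1.1),
        # The rule-based mode is the ever-present fall-back and dominates
        # whenever the user appears dissatisfied.
        "rule_based": max(0.2, 1.0 - satisfaction),
    }


mu = memberships(knowledge=0.15, satisfaction=0.9)
print(mu)
print(max(mu, key=mu.get))  # "exploratory": high satisfaction, little knowledge
```

Because the memberships overlap, the dominant behaviour changes gradually as knowledge and satisfaction evolve, which is exactly the smooth transition argued for above.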
Regarding the main contributions stated in
Section 1, we summarise: The current section provides a basic concept concerning the adaptation of behaviour of technical systems. This is based on the current knowledge (cf. Equation (
3)) the system derived from its beliefs and the situation/context. For this, a scoring-like handle is achieved, combining main ideas of the BDI and ACT-R model, showing a theoretical concept (cf. contribution C3). In our research, the system’s behaviour was mapped to an adaptive behaviour value (cf. Equation (
5)), which can be interpreted and operationalised from a Fuzzy Logic perspective, see
Figure 5. Considering the theoretical argumentation in combination with
Figure 3, a concept for a framework is given, realising the behaviour of technical systems (cf. contribution C4). The assistance systems mentioned in the state-of-the-art, compare
Section 2.3, allow for a greater integration of technical systems into the daily lives of their human users, but at the same time (still) lack the ability to truly interact on a human-like, naturalistic level (in the sense of Valli et al. [
10]). This is a weakness, especially when such a system is faced with its own set of objectives and priorities, which it has to accomplish concurrently with or in spite of the task requested by the user. This potentially perceived distance between the assumed responsibilities of such assistance systems and their real capabilities to recognise and solve problems ought to be bridged. Therefore, systems need to engage their users more intensely without repulsing engaged (interaction) partners. Our solution allows (1) the system to engage as much as possible, trying to avoid the aforementioned critical state, and (2) the system to be better integrated into the decision-making process of the user.
4. Discussion
In the current manuscript, we sketched a procedure and method to establish behaviour in True Artificial Peers as described by Weißkirchen et al. [
8]. Those peers are technical systems or devices that extend current state-of-the-art systems, intended to be assistive devices in a general sense, see
Section 2.3; this is described in Cowan et al., Marge et al. and Weißkirchen et al. [
5,
8,
45]. From our perspective, True Artificial Peers do not have a passive role in an interaction; rather, they take action by themselves, following their own objectives and goals, see
Section 3.1; this is described by Weißkirchen et al. [
8]. Given this additional quality of technical systems, these systems need their very own behaviour that results also in a better understanding of the system’s characteristics by the user. In particular, a valuable interaction relates to a (grounded) understanding of the interlocutors’ characteristics as described by Thun et al. and by Marge et al. [
5,
19], which is encouraged by interpretable and consistent behaviour (cf. contributions C1 and C2 in
Section 1). Therefore, we aimed for such behaviour, allowing for (1) the active pursuit of the system’s own objectives and goals and (2) an interpretable understanding of the system’s reactions in an “empathic” way (cf.
Section 3.2). To reach this goal, three types of behaviour were considered (see
Section 3.3), namely rule-based, data-based and exploratory behaviour, which are controlled by a central unit called behaviour control (see
Figure 2 and
Section 3.4). The relation and transitions between the particular behaviour settings are visualised in
Figure 3. Any transition is based on the behaviour value
(cf. Equation (
5)) that can be realised and interpreted, for instance, in the sense of membership functions, being well-known from Fuzzy Logic (see
Figure 5). This enables, on the one hand, the system to derive suitable selections of behaviour and its characteristics and, on the other hand, the human interlocutor to interpret the system’s reactions by implicitly predicting the memberships (see
Figure 5). Further, this type of modelling results (usually) in a smooth transition between the respective behaviour types. The generation of behaviour in technical systems was, regarding
Section 3, elaborated in a theoretical way (cf. contribution C3) as well as sketched in the sense of a framework and algorithm, visualised in
Figure 3 (cf. contribution C4).
Since we presented mainly the theoretical considerations of our approach for the behaviour of True Artificial Peers and only discussed the beneficial implications in
Section 3.3, building on the work of, for instance, Ötting et al. [
4], we consequently plan to integrate the approach into (some) assistive devices for application-based research, especially in (1) voice-based assistant systems and (2) a setting related to ambient assisted living. We consider these two particular settings for the following reasons: A voice-based device allows direct communication between the interlocutors (cf.
Section 3.3 and also by Marge et al. [
5]); instructions, needs, desires, etc. can be negotiated; and finally, the “shared goal” can be achieved. In contrast, an ambient assisted living setting links the “interaction” partners in a different way. The “system”, compiled from various sensors and multiple actuators, needs more “empathy” towards the inhabitant(s) (see also
Section 3.2), which further has to be combined with the objectives and goals to support the user(s), where an implicit exchange of information is favoured. Therefore, we have the ability to study the interplay of different behaviour types, as visualised in
Figure 3 and discussed in
Section 3.3. Furthermore, this allows investigations into the influence of the contextual information and history, as described in Böck et al. [
60], being used in the transition between the three behaviour types.
Generally, the interplay with practical implementations of our concept allows further fine-tuning of the underlying theoretical foundations as well as a validation of the method, also in the sense of Ötting et al. [
4], and Marge et al. [
5]. Therefore, respective user studies in both a lab environment (mainly in short-term interactions) and “in the wild” (long-term interactions) are planned.