Exploring Data-Driven Components of Socially Intelligent AI through Cooperative Game Paradigms
Abstract
1. Introduction
2. Prior Work
2.1. Social AI in Human–Robot Interaction
2.2. Interactive Speech Systems
2.3. Virtual Agents and Virtual Reality
2.4. Mental Models of AI and Game Theory
3. Methods
3.1. Cooperative Game Environment
3.2. Experiments
4. Results
4.1. Speech Hierarchy Development
4.2. Interaction Component Development
- Responsiveness to human player communication (including direct questions);
- More natural sentence variation;
- AI awareness of its own recent speech (e.g., not repeating itself too often);
- AI commentary on direct player interactions (e.g., sharing food), rather than only the game environment;
- More suggestive speech from the AI (e.g., talking about future plans).
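As an illustration of the self-awareness point above, the following is a minimal sketch of a recent-speech memory that keeps an agent from repeating itself; the window size and category names are our assumptions, not the system's actual implementation:

```python
from collections import deque

class RecentSpeechMemory:
    """Tracks the agent's last few utterance categories so it can
    avoid repeating itself too often (hypothetical window size)."""

    def __init__(self, window_size=5):
        self.recent = deque(maxlen=window_size)

    def allow(self, category):
        """Return True if this category has not been used recently."""
        return category not in self.recent

    def record(self, category):
        self.recent.append(category)

memory = RecentSpeechMemory()
for utterance in ["inform_torch", "inform_morning", "inform_torch"]:
    if memory.allow(utterance):
        print(f"say: {utterance}")
        memory.record(utterance)
    else:
        print(f"suppress repeat: {utterance}")
```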
- Priority 1:
  - a. Direct human questions: any ASR response to these; and
  - b. Any “critical” speech that social norms dictate cannot be omitted (e.g., not replying to “hello” would be a violation).
- Priority 2:
  - a. Fight-related content (attacking, defending); and
  - b. Existence-related content (e.g., dying and starving).
- Priority 3:
  - a. Any non-answer ASR response to human speech (i.e., a comment rather than a direct question);
  - b. Situational content not related to fighting or existence; and
  - c. Anything else that does not fit into Priority 1 or 2.
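To make the arbitration concrete, here is a minimal sketch of how such a three-tier hierarchy could select the next utterance from a queue. The category names, tie-breaking policy, and example texts are our assumptions for illustration, not the system's actual implementation:

```python
import heapq

# Hypothetical mapping of utterance types to the priority tiers above
PRIORITY = {
    "asr_direct_answer": 1,   # Priority 1a: answers to direct questions
    "social_critical": 1,     # Priority 1b: e.g., replying to "hello"
    "fight": 2,               # Priority 2a: attacking, defending
    "existence": 2,           # Priority 2b: dying, starving
    "asr_comment": 3,         # Priority 3a: non-answer ASR responses
    "situational": 3,         # Priority 3b: other situational content
    "other": 3,               # Priority 3c: everything else
}

def next_utterance(queue):
    """Pop the highest-priority (lowest tier) pending utterance."""
    return heapq.heappop(queue)[2] if queue else None

queue = []
for i, (utype, text) in enumerate([
    ("situational", "The campfire is getting low."),
    ("asr_direct_answer", "Yes, there is food to the north."),
    ("existence", "I'm starving!"),
]):
    heapq.heappush(queue, (PRIORITY[utype], i, text))  # i breaks ties (FIFO)

print(next_utterance(queue))  # -> the Priority 1 answer to the question
```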
4.3. Deep Learning Models for Interaction Planning
4.4. Facial and Gestural Recognition
5. Discussion
5.1. Main Summary
5.2. Future Work and Broader Impact
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Type | Utterance Category | Average | Median
---|---|---|---
ASR | “hello.*” | 1.67 | 1
ASR | “.*food.*” | 1.83 | 1.5
ASR | “.* monster.*near.*” | 2.33 | 2.5
ASR | “.*attack.*” | 1.67 | 2
ASR | “.*where.*go.*” | 1.17 | 1
ASR | “.*night.*” | 2.67 | 3
ASR | “.*campfire.*” | 2.50 | 3
ASR | “.*help.*” | 1.17 | 1
Self-Generated | “inform_morning” | 3.00 | 3
Self-Generated | “inform_starving” | 1.50 | 1.5
Self-Generated | “inform_defense” | 1.67 | 2
Self-Generated | “inform_torch” | 2.83 | 3
Self-Generated | “inform_near_light” | 2.50 | 3
Self-Generated | “inform_only_axe” | 2.67 | 3
Self-Generated | “inform_a_few_monsters” | 2.33 | 2.5
Self-Generated | “inform_generic_expression” | 2.83 | 3
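The ASR utterance categories above are regular-expression patterns matched against the recognized transcript. Below is a minimal sketch of that matching step over a subset of the table's patterns, assuming a first-match policy and case-insensitive matching; the response-category names are hypothetical:

```python
import re

# A subset of the patterns from the table; response categories are hypothetical
ASR_PATTERNS = [
    (re.compile(r"hello.*", re.IGNORECASE), "greet_back"),
    (re.compile(r".*food.*", re.IGNORECASE), "comment_food"),
    (re.compile(r".*attack.*", re.IGNORECASE), "comment_attack"),
    (re.compile(r".*where.*go.*", re.IGNORECASE), "suggest_direction"),
    (re.compile(r".*help.*", re.IGNORECASE), "offer_help"),
]

def categorize(transcript):
    """Return the first response category whose pattern matches the transcript."""
    for pattern, category in ASR_PATTERNS:
        if pattern.fullmatch(transcript):
            return category
    return None

print(categorize("hello there"))          # -> greet_back
print(categorize("do we have any food"))  # -> comment_food
```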
Type | RF Acc | RF AUC | GB Acc | GB AUC | SVM Acc | SVM AUC | DL Acc | DL AUC
---|---|---|---|---|---|---|---|---
Night | 0.81 | 0.9753 | 0.78 | 0.9182 | 0.75 | 0.7697 | 0.87 | 0.9224
Resources | 0.42 | 0.5117 | 0.46 | 0.5175 | 0.47 | 0.4792 | 0.59 | 0.6446
Monster | 0.79 | 0.8980 | 0.75 | 0.8562 | 0.69 | 0.7907 | 0.85 | 0.9214
Build Stuff | 0.89 | 0.9561 | 0.83 | 0.9539 | 0.82 | 0.9621 | 0.88 | 0.9033
Social Interaction | 0.52 | 0.5080 | 0.52 | 0.4844 | 0.55 | 0.6565 | 0.50 | 0.5015
External Events | 0.86 | 0.9656 | 0.81 | 0.9604 | 0.81 | 0.9632 | 0.84 | 0.9546
Average | 0.72 | 0.8025 | 0.69 | 0.7818 | 0.68 | 0.7702 | 0.76 | 0.8080
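For readers who want a concrete starting point, here is a minimal sketch of the kind of model comparison summarized above (random forest, gradient boosting, and SVM), assuming scikit-learn and synthetic stand-in data; the actual game-derived features and the deep-learning (DL) model are not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

# Synthetic stand-in for the in-game interaction features
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "RF": RandomForestClassifier(random_state=0),
    "GB": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(probability=True, random_state=0),  # enables predict_proba
}

for name, model in models.items():
    model.fit(X_train, y_train)
    prob = model.predict_proba(X_test)[:, 1]
    pred = model.predict(X_test)
    print(f"{name}: Acc={accuracy_score(y_test, pred):.2f}, "
          f"AUC={roc_auc_score(y_test, prob):.4f}")
```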
Name | Description | Weighted | Threshold | Accuracy (%)
---|---|---|---|---
Criterion 1 | Choose Max Probability | x | - | 9.2
Criterion 1 | Choose Max Probability | o | - | 29.7
Criterion 2 | Choose Max above Threshold | x | 60 | 13.7
Criterion 2 | Choose Max above Threshold | x | 70 | 19.0
Criterion 2 | Choose Max above Threshold | x | 80 | 24.5
Criterion 2 | Choose Max above Threshold | o | 60 | 42.9
Criterion 2 | Choose Max above Threshold | o | 70 | 43.0
Criterion 2 | Choose Max above Threshold | o | 80 | 34.7
Criterion 3 | Crit. #1 + First/Last Frame Match | x | - | 34.7
Criterion 3 | Crit. #1 + First/Last Frame Match | o | - | 62.8
Criterion 4 | Crit. #2 + First/Last Frame Match | x | 60 | 44.9
Criterion 4 | Crit. #2 + First/Last Frame Match | x | 70 | 50.7
Criterion 4 | Crit. #2 + First/Last Frame Match | x | 80 | 56.5
Criterion 4 | Crit. #2 + First/Last Frame Match | o | 60 | 70.2
Criterion 4 | Crit. #2 + First/Last Frame Match | o | 70 | 70.2
Criterion 4 | Crit. #2 + First/Last Frame Match | o | 80 | 66.1
Criterion 5 | Combined Criteria Pred Model | - | - | 80.1
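Here is a minimal sketch of how thresholded criteria like Criterion 2, plus the first/last-frame agreement check of Criterion 4, might turn per-frame class probabilities into a single gesture label. This is our reconstruction from the table above; the threshold value, array shapes, and aggregation by mean are illustrative assumptions:

```python
import numpy as np

def classify_gesture(frame_probs, threshold=0.70, require_endpoint_match=True):
    """frame_probs: (n_frames, n_classes) per-frame class probabilities.

    Criterion 2 (our reading): pick the class with the highest mean
    probability, but only if it exceeds the threshold.
    Criterion 4 adds the check that the first and last frames also
    agree with that class (first/last frame match).
    """
    mean_probs = frame_probs.mean(axis=0)
    best = int(mean_probs.argmax())
    if mean_probs[best] < threshold:
        return None  # no sufficiently confident prediction
    if require_endpoint_match:
        first, last = frame_probs[0].argmax(), frame_probs[-1].argmax()
        if not (first == best and last == best):
            return None
    return best

# 30 frames of 4-class probabilities, consistently favoring class 2
probs = np.tile([0.1, 0.1, 0.7, 0.1], (30, 1))
print(classify_gesture(probs, threshold=0.60))  # -> 2
```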