Communication
Peer-Review Record

Multiparty Dynamics and Failure Modes for Machine Learning and Artificial Intelligence

Big Data Cogn. Comput. 2019, 3(2), 21; https://doi.org/10.3390/bdcc3020021
by David Manheim
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 28 February 2019 / Revised: 24 March 2019 / Accepted: 29 March 2019 / Published: 5 April 2019
(This article belongs to the Special Issue Artificial Superintelligence: Coordination & Strategy)

Round 1

Reviewer 1 Report

The paper needs to be rewritten in a more elaborate way. At present, there is no clear motivation for the work, nor an explanation of how it is achieved. The authors should modify the paper by adding (a) the contribution, (b) the motivation, and (c) the objective to the revised version of the manuscript. Further, the gaps in the existing work and the paper's contribution are not highlighted in the manuscript. A conclusion section is missing from the paper. The authors should address the above points thoroughly to improve the manuscript.

 


Author Response

Dear Reviewer 1,

Thank you for your suggestions for clarifying the points of the paper and its structure. In addition to various textual changes, clarifications of individual points, and additions requested by other reviewers, I have restructured the paper somewhat and clarified its goals, as detailed in the responses and specific actions below.


> At present, there is no clear motivation for the work, nor an explanation of how it is achieved.

The paper is a communication that provides an overview of an issue, rather than presenting ongoing work. I have modified the paper by adding sentences explaining this to both the abstract and the body of the paper.


> The authors should modify the paper by adding (a) the contribution, (b) the motivation, and (c) the objective to the revised version of the manuscript.

These are hopefully now made clear in the newly titled "Motivation and Contribution" section. There the paper states, "By outlining how and why these multi-agent failures can occur, and providing an overview of approaches that could be developed for mitigating them, the paper hopes to spur system designers to explicitly consider these failure modes in designing systems, and urge caution." I have also noted more explicitly some of the ancillary contributions.


> Further, the gaps in the existing work and the paper's contribution are not highlighted in the manuscript.

I was unfortunately unclear about the structure in the earlier draft. Because this is an overview, the discussion of how the extant literature fails to address these problems is deferred until after the problems themselves are explained. The now-revised discussion section clearly addresses the literature and how it fails to consider the failure modes and dynamics discussed in the paper. I hope this addresses your concern.


> A conclusion section is missing from the paper.

The final subsection has now been rewritten to more clearly discuss conclusions, and has been retitled to reflect that.

Reviewer 2 Report

The paper presents a theoretical work on the failure modes of multi-agent systems, in which the author identifies five failure modes, describes them, and presents examples. The paper is well structured, and the language and grammar are good, with only minor spelling mistakes. The multi-agent failure modes are described in connection to single-agent systems, and the examples build on existing literature to explain the failures of multi-agent AI systems.


My minor concerns about the paper are the following.

- Present all of the values in equations 1 and 2 (M, G) for Model 1.

- Do the same for Model 3 in equation 4.

- The titles of Models are not consistent. Choose a style and use it throughout the paper.

- In line 277 you cite the literature, but do not provide the reference. Or is the paper itself meant to be the literature that you're referring to?

- The example in line 317 is more aligned with model 3.1, where one agent acts in a way that negatively influences the workings of the opponent agents.

- How does model 4.2 differ from model 4.1? Altering the distribution of input data is generating new input data, or is it? Please provide additional explanation for this.

- One part of model 5.3 seems the same as model 4.1 - manipulating the input data. If not, please provide additional explanation.

- There are some research papers on the topic of safety in multi-agent AI systems which the author did not cite. It would be interesting to see how the existing research corresponds to the failure models identified in this paper.

[1] Shalev-Shwartz, S., Shammah, S. and Shashua, A., 2016. Safe, multi-agent, reinforcement learning for autonomous driving. arXiv preprint arXiv:1610.03295.

[2] Worley III, G.G., 2018. Robustness to fundamental uncertainty in AGI alignment. arXiv preprint arXiv:1807.09836.


Author Response

Dear Reviewer 2,


Thank you for your helpful suggestions. In addition to responding to them, I have retitled and re-organized some of the sections in response to feedback from other reviewers, and done further editing to clarify and streamline the text. 


To your specific points:

> Present all of the values in equations 1 and 2 (M, G) for Model 1.

I defined these in the introduction, but have now repeated the definitions of M_i and G_i for model 1 so that it is clearer. I hope this is sufficiently clear without needing to further repeat the definitions in each example.
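
For readers of this record, a rough paraphrase of the notation being clarified, offered as an illustrative sketch only and not as a reproduction of the paper's actual Equations (1) and (2): each agent i chooses actions by optimizing a measurable proxy metric M_i in place of its true goal G_i, and the failure modes of interest arise where the two diverge.

% Illustrative paraphrase only -- not the paper's Equations (1) and (2).
% Agent i selects its action by optimizing the proxy metric M_i, while the
% designer's intent is captured by the true goal G_i:
\[
  a_i^{*} \;=\; \arg\max_{a_i}\, M_i(a_i, a_{-i})
  \qquad\text{versus}\qquad
  \arg\max_{a_i}\, G_i(a_i, a_{-i}).
\]
% Multi-agent failures arise where these two optima diverge.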

> The titles of Models are not consistent. Choose a style and use it throughout the paper.

I have changed the formats of the examples to be uniform, and titled them so that the differences are (hopefully) clearer. 


> The example in line 317 is more aligned with model 3.1, where one agent acts in a way that negatively influences the workings of the opponent agents.

This has been clarified in the paper, both in the example and in the discussion of Model 3.1. The difference is admittedly small, but in financial markets canceled transactions of this sort are effectively spoofing, and are illegal; they are not bona-fide transactions that merely mislead other agents who incorrectly interpret them. As an example of how this spoofing occurs: one system posts a buy order for 1m shares at $10 in a market without any shares available, while the market price is $5/share in markets that do have shares available, and then cancels the order immediately. Another agent buys shares in an attempt to fill the (already-canceled) order, perhaps even from the system that entered the false price, which now sells at inflated prices. The buying system in this case has not misinterpreted the implications of an order; it has been given a false piece of evidence that, were it true, would not have led to a loss. By acting on the false information generated, it has been exploited.
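
As a purely illustrative sketch of this dynamic, here is a toy simulation; the prices, quantities, and agent logic are hypothetical and are not drawn from the paper or from any real market data.

# Toy sketch of order spoofing: a naive agent treats a posted (and immediately
# canceled) buy order as evidence of value and then overpays for shares.
# All numbers and the agents' logic are hypothetical, for illustration only.

FAIR_VALUE = 5.0  # price per share in markets where shares are actually available


def naive_estimate(order_book):
    """A deliberately naive value estimate: trust the highest observed bid."""
    bids = [price for side, price in order_book if side == "buy"]
    return max(bids, default=FAIR_VALUE)


def run_spoofing_episode():
    # The spoofer posts a large buy order at $10 in a market with no shares...
    order_book = [("buy", 10.0)]
    # ...the victim snapshots the book and updates its value estimate...
    victim_value = naive_estimate(order_book)
    # ...and the spoofer cancels the order before anyone can fill it.
    order_book.clear()

    # The victim now buys at $8, believing it is getting a bargain, possibly
    # from the spoofer itself, based on evidence that was false when it acted.
    purchase_price = 8.0
    if victim_value > purchase_price:
        loss = purchase_price - FAIR_VALUE
        print(f"Victim bought at ${purchase_price:.2f}; "
              f"fair value ${FAIR_VALUE:.2f}; loss ${loss:.2f} per share")


run_spoofing_episode()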


> How does model 4.2 differ from model 4.1? Altering the distribution of input data is generating new input data, or is it? Please provide additional explanation for this.

Models 4.1 and 4.2 differ in that there are cases where arbitrary spoofed data cannot be created, or would be prohibitively expensive to create, but true data can be hidden, or the reverse. This is now explained in the text.
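
A minimal sketch of the distinction may help; the data and function names below are hypothetical and not taken from the paper. One attack injects fabricated observations, the other selectively suppresses true ones; either can shift the distribution the victim system sees, but an attacker may only be able to afford, or only be capable of, one of them.

import random

# Hypothetical illustration of the two input-data attacks: injecting fabricated
# observations (cf. Model 4.1) versus hiding true ones (cf. Model 4.2). Either
# shifts the observed distribution, but the attacker's costs and capabilities
# for the two can differ sharply.

random.seed(0)
true_observations = [random.gauss(0.0, 1.0) for _ in range(1000)]


def inject_spoofed(observations, n_fake=200, fake_value=5.0):
    """Attack A: add fabricated data points the attacker invents."""
    return observations + [fake_value] * n_fake


def hide_true(observations, threshold=0.0):
    """Attack B: suppress real data points the attacker wants unseen."""
    return [x for x in observations if x >= threshold]


def mean(xs):
    return sum(xs) / len(xs)


print(f"clean mean:    {mean(true_observations):+.3f}")
print(f"spoofed mean:  {mean(inject_spoofed(true_observations)):+.3f}")
print(f"filtered mean: {mean(hide_true(true_observations)):+.3f}")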

> One part of model 5.3 seems the same as model 4.1 - manipulating the input data. If not, please provide additional explanation.

This is now clarified; Model 5 discusses modification at the point where the system trains on the data, which is potentially less expensive and less noticeable than spoofing data in the wild before ingestion.
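
To make the difference concrete, here is a hedged sketch of why tampering at the training step can be cheaper and quieter than spoofing observations in the wild; the pipeline below is hypothetical and not from the paper. The attacker only needs write access to the stored training set and never has to generate fake events that outside observers could notice.

import random

# Hypothetical sketch of data poisoning at the point of training (cf. Model 5):
# the attacker silently flips labels on a small fraction of stored examples.
# No publicly visible fake events are created, unlike spoofing in the wild.

random.seed(1)
training_set = [(random.gauss(i % 2, 0.5), i % 2) for i in range(1000)]


def poison_labels(dataset, fraction=0.05):
    """Flip labels on a small, hard-to-notice fraction of stored examples."""
    poisoned = list(dataset)
    flipped = random.sample(range(len(poisoned)), int(fraction * len(poisoned)))
    for idx in flipped:
        x, y = poisoned[idx]
        poisoned[idx] = (x, 1 - y)
    return poisoned


poisoned_set = poison_labels(training_set)
changed = sum(a != b for a, b in zip(training_set, poisoned_set))
print(f"{changed} of {len(training_set)} stored training examples silently altered")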


> There are some research papers on the topic of safety in multi-agent AI systems which the author did not cite...

I was unaware of the Worley paper, and have now cited it. I have also added other very recent papers that touch on multi-agent systems, including a new paper from DeepMind, https://arxiv.org/abs/1903.00742. These engage with the need for cooperation, but fail to address the need to prevent failures. 

The Shalev-Shwartz et al. paper is cited, and as noted it is susceptible to some of these failures. It provides a safety envelope, rather than a method to align the system's goals. (I discussed this with one of the authors several months ago, and while the paper does not discuss it, it is possible for these "safe" systems to be manipulated into being unable to reach the desired destination, for example.) This is effectively a form of limited optimization, a strategy which is also discussed in the paper. Unfortunately, doing so relies on a provably valid "physics model" to ensure the envelope is valid, and so the method can address only a very limited set of cases. Christiano, Shlegeris, and Amodei have a more generalized version of a similar idea, and the paper they have published is also cited. (They are seemingly in the process of publishing more on the topic, but I have not seen that work.)
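
To illustrate the distinction between a safety envelope and goal alignment, here is a minimal hypothetical sketch; it is not the Shalev-Shwartz et al. formulation, and all names and rules in it are invented for illustration. An envelope only vetoes actions its safety model flags, so other agents who can make goal-serving actions appear unsafe can stall the system indefinitely without the envelope ever being violated.

# Hypothetical sketch: a safety envelope as limited optimization. The planner
# proposes whatever action best serves its goal, and a separate check vetoes
# anything the (assumed-valid) safety model flags. Nothing here aligns the
# goal itself, so adversaries that keep the useful actions looking unsafe can
# prevent progress toward the destination while never causing an unsafe state.

GOAL_ORDER = ["merge_left", "accelerate", "hold_lane", "brake"]  # best first


def is_safe(action, world):
    """Stand-in safety model: an action is unsafe if a perceived obstacle blocks it."""
    return action not in world["perceived_obstacles"]


def choose_action(world):
    for action in GOAL_ORDER:
        if is_safe(action, world):
            return action
    return "brake"  # always-available fallback inside the envelope


# Nearby agents maneuver so the preferred actions stay perpetually "blocked",
# without ever creating an actual collision.
world = {"perceived_obstacles": {"merge_left", "accelerate"}}
print(choose_action(world))  # -> "hold_lane": safe, but never reaches the goal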

Reviewer 3 Report

The language usage throughout this paper needs to be improved; the author should proofread it.

It would be better for the author to discuss the scalability of the method.


Author Response

Dear Reviewer 3,


Thank you for your suggestions. In addition to various organizational changes, clarifications of individual points, and additions requested by other reviewers, I have attempted to address the concerns raised to the best of my ability.


The specific responses and modifications are below:

> The language usage throughout this paper needs to be improved; the author should proofread it.

I have reread and revised the paper and addressed a number of specific issues brought up by other reviewers; I hope this addresses the concern. I would also be happy to revise any remaining specific points of confusion in future rounds of revision if they are identified.


> It would be better for the author to discuss the scalability of the method.

I am unsure what this refers to, given that no specific method for addressing the outlined challenges is proposed in this paper. I have not explicitly addressed the scalability of the methods discussed in the cited papers, but several of those papers note that finding safe methods is likely to be less scalable than finding unsafe ones. This is now mentioned in the paper as well. I hope this addresses your concerns.

Round 2

Reviewer 1 Report

The authors have not addressed my comments seriously. After reading the original and the revised manuscripts, I observed that the authors have merely changed the section headings, but failed to address the comments.

For example, Section 1 was previously titled "Introduction" and after revision is titled "Motivation and Contribution". Similarly, subsection 3.2, "Model Failures and Policy Failures" in the previous version, is now "Section 4: Conclusion: Model Failures and Policy Failures", and the text inside the manuscript remains mostly identical.


The authors have failed to improve the manuscript.

Author Response

See the attached PDF.

Author Response File: Author Response.pdf

Reviewer 2 Report

I have no further objections to the publishing of the paper.

Author Response

Dear Reviewer 2,


Thank you for your helpful comments earlier, and I am glad I was able to address the concerns. Because other reviewers had a few remaining concerns, I have further restructured the paper, and included several new references and examples for the single-agent case.


Thank you again.

Reviewer 3 Report

Thanks for revising. This paper can be accepted once the following issues are addressed:

* Overall, the basic background is not introduced well, and the notation is not illustrated very clearly. I recommend that the authors employ intuitive examples to elaborate the essential notation.

* Generally, the paper looks good; however, some important literature on dynamics modeling and learning methods that could potentially be used for this problem is missing and should be cited, e.g., "Recognizing complex activities by a probabilistic interval-based model" and "From action to activity: Sensor-based activity recognition".


Author Response

Dear Reviewer 3,


Thank you for your constructive and helpful comments. I have now included a number of concrete examples at the very beginning of the paper, drawn from Krakovna's list of failures - including exploiting faulty physics engines, manipulating the wrong objects, etc. This will hopefully concretize the problem for the single-agent case and make the types of failures being discussed clearer.


In addition, I appreciate the pointers to "activity recognition" as an approach - I was previously unaware of this avenue, and it seems like a promising idea for minimizing many of the types of surprises identified. It also usefully reframes some of the identified problems with "goals," since identifying the action prevents many of the concrete failure modes above. For this reason, I cited the first paper as a different understanding of part of the problem, where "a complex activity can often be performed in several different ways," and the second paper as an avenue for addressing the problem.


I have also restructured the paper slightly further to address another reviewer's concerns.

Thank you again.

Round 3

Reviewer 1 Report

The present version of the manuscript can be accepted 
