1. Introduction
Within the framework that takes concepts to be mental representations, one of the central current debates concerns their format or vehicle. According to the traditional amodal approach, concepts are couched in a symbolic format that bears no structural similarity to perceptual states [1,2]. According to the modal or grounded approach, in contrast, concepts are couched in perceptual representations that become stored during perception and action, and can later be reenacted in the absence of the stimuli that originally produced them [3,4].
1 Thus, a way of distinguishing amodal and modal vehicles relies on their relation to perceptual and motor systems, but besides this distinction, there is no general consensus on the specific characteristics of each type of format.
Some authors distinguish the two types of representations in terms of the type of input to which they respond. For example, Dove states: “[…] modal-specificity is defined with regard to input. Representations contained within a mechanism that has been shaped by natural selection to handle inputs from multiple modalities are amodal by definition” [5] (p. 424). Other authors appeal to the relation between a concept’s structure and its content, and propose a distinction similar to that between icons and symbols. For example, Fodor identifies iconic representations with those that can be divided in any way such that each resulting part represents part of the content, whereas discursive symbolic representations are those that exhibit a canonical decomposition in which not every part represents part of the content [6]. In a similar vein, Martin opposes a “highly abstract, ‘amodal,’ language-like propositional format […]” to a “[…] depictive, iconic, picture-like format […]” [7] (p. 983).
2 Other versions are based on evidence of neural organization, and claim that modal concepts consist of representations located on modality-specific pathways, while amodal concepts reside in different independent systems [12]. Sometimes amodal representations are characterized in purely negative terms, as having a format that “differs from the format or formats of perceptual and motor representations” [13] (p. 1090), which has in turn elicited the response that they constitute “black holes in conceptual space” [12] (p. 1127). It has also been suggested that the properties of concepts’ vehicles are still unknown and in need of empirical research, yet that the debate can advance without specifying those characteristics [13], and even that the question of format lacks “practical significance”, given that we have no procedure for determining the format of the neural code [7].
I agree that the debate is advancing, even though the characteristics of each type of representation remain underspecified. While the grounded or modal approach is still growing and is supported by an increasing amount of empirical evidence (for a review see [14], for discussion of the strength of the evidence see [15]), there is also evidence supporting amodal views (for reviews see [5,13]). Moreover, there are new hybrid and pluralist views that postulate both modal and amodal concepts [5,16,17,18,19]. The newer versions within each approach also search for biological plausibility and take into account recent neurological evidence. I think that it is therefore worth looking into the main criteria proposed for distinguishing modal and amodal representations. I will argue that all of them run into difficulties, either conceptual or of empirical application. Taking approximate number representations as an example, I will also show that adopting different criteria leads to cross-classifications of representations. This type of representation constitutes an interesting example for several reasons. First, although approximate number representations are sometimes held up as direct evidence of amodal representations [5,20], their status as amodal is also disputed [21]. Second, there is a vast amount of evidence concerning the neural localization of these representations, as well as several functional models of how we encode this particular type of magnitude, which have considerable indirect support. Thus, these representations can be analyzed by all of the criteria that I will discuss.
In Section 2, I discuss the distinction in terms of the type of input to which the representations respond. I point out that this criterion offers an underspecified distinction, since it does not distinguish between multimodal and amodal representations. I also consider an alternative interpretation of this criterion that attributes a specific proprietary format to each modality, and that places this criterion closer to the one based on the neural systems of the representations, which I analyze in Section 4. In Section 3, I analyze the isomorphism criterion. I consider two versions: an isomorphism between modal representations and their contents, and an isomorphism between modal representations and the perceptual states that produced them. I review several objections to the first interpretation, and argue that the second one resembles the neural one. Finally, in Section 4, I present the neural base criterion, and claim that it is the clearest and strongest of the three. However, I argue that this criterion also faces difficulties. In particular, it presupposes ways of individuating modalities and of determining which systems are constitutively involved in conceptual tasks and which ones are auxiliary, both of which encounter empirical difficulties.
2. Input Specificity
Some authors suggest that what distinguishes modal and amodal representations is the type of input to which they respond. According to this criterion, modal systems respond to a specific class of input, such as sound waves or molecule shapes, whereas amodal systems respond uniformly to inputs from different perceptual modalities [4,5]. It is usually assumed that a distinctive feature of perceptual systems is that they respond to specific physical magnitudes. It is less clear how modalities are individuated [21,22]. I will address this issue in Section 4. The main difficulty with this criterion is that, even assuming a clear individuation of modalities, it is underspecified. First, treating responsiveness to perceptual inputs as a sufficient condition for modality would be too permissive. Given our physical constitution, every input route or modality of stimulation is perceptual. All of the information that we receive reaches us through our senses, and as a consequence every representation is to a certain degree connected to perception. Thus, under this interpretation every representation would be trivially modal, and that does not seem to be the notion of modality endorsed by either side of the debate on conceptual format. As a consequence, proponents of this criterion emphasize a distinction between responding to a single class of input and responding to several. However, this alternative does not allow us to distinguish between amodal and multimodal representations, since both types represent several input classes [23].
In fact, amodal and multimodal representations are sometimes conflated. Barsalou reports an informal survey carried out by Martin, who asked researchers what they meant by the term “amodal” and found that they actually meant “multimodal” [
12]. Sometimes the conflation is explicit. For example, Dehaene characterizes the horizontal intraparietal sulcus (hIPS), a brain area centrally involved in processing quantity, as amodal or plurimodal: “There are many indications that this region is indeed intimately involved with quantity, as opposed to other aspects of number. First of all, it responds to all the modalities of number presentation—whether the person is watching a set of dots, […] or looking at a symbol, like the Arabic numeral 3 or the written or the spoken word ‘three.’ This simple criterion, places the hIPS in what neuroscientists call ‘plurimodal’ or ‘amodal’ sectors of cortex—brain regions which, unlike the sensory areas, are not attached to a specific sensory modality, such as vision or touch, but lie at the meeting point of many input routes. If a brain region is to encode an abstract concept, one that is not tied to specific sensations, it is essential that it respond to all of the relevant modalities of stimulation in which the concept can be communicated” [
24] (p. 239).
The problem with adopting this criterion is that every representation would be amodal. The ways in which concepts can be communicated will most likely include language, even for concepts that are “tied to sensations”. Setting aside debates about phenomenal aspects of perceptual experience, and about whether there are phenomenal concepts (e.g., concepts of specific shades of a given color), for a vast number of sensations we not only have concepts, but also words to name those concepts. Thus, language constitutes one of the ways in which a concept can be communicated. As a consequence, every concept for which there is a term in natural language would be amodal, inasmuch as it responds to language in addition to the modality or modalities by which it can be communicated.
Considering the example of approximate number representations, the type of input to which they respond seems to be the main reason for considering them amodal. We have an ability to form approximate representations of the number of individuals in a set. This ability, which is shared by human adults, human infants, and nonhuman animals (in particular pigeons, rats, and monkeys), has two distinctive characteristics: the distance and magnitude effects. First, the discriminability between two approximate representations is a function of their ratio; that is, the greater the numerical distance between two quantities, the easier it is to distinguish them. Second, for equal numerical distances, the smaller the magnitudes of the collections involved, the easier it is to distinguish them. These effects distinguish it from our ability to count and to represent exact quantities [
24].
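The distance and magnitude effects can be illustrated with a minimal sketch. It assumes, as an illustration only and not as a claim drawn from the works cited here, that each approximate number representation is a noisy magnitude whose spread grows in proportion to the number represented (scalar variability). The function name `discriminability` and the Weber fraction of 0.2 are hypothetical choices made for the example.

```python
import math


def discriminability(n1: float, n2: float, weber_fraction: float = 0.2) -> float:
    """Separation between two noisy magnitude representations.

    Assumption (illustrative only): each numerosity n is represented with
    noise whose standard deviation is weber_fraction * n, so the combined
    noise of a comparison is weber_fraction * sqrt(n1^2 + n2^2).
    """
    combined_noise = weber_fraction * math.sqrt(n1 ** 2 + n2 ** 2)
    return abs(n1 - n2) / combined_noise


# Distance effect: 8 vs. 16 is easier to tell apart than 8 vs. 10.
easier_far = discriminability(8, 16) > discriminability(8, 10)

# Magnitude effect: for the same distance (2), 2 vs. 4 is easier than 8 vs. 10.
easier_small = discriminability(2, 4) > discriminability(8, 10)

# Ratio dependence: pairs with the same ratio (1:2) are equally discriminable.
same_ratio = math.isclose(discriminability(8, 16), discriminability(16, 32))
```

Under these assumptions, both effects fall out of a single property: the score depends only on the ratio between the two quantities, which is why neither effect holds for exact counting.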
Studies have shown that performance is not affected by the modality of the stimulus [
25]. Not only is performance in representing the approximate cardinality of visual and auditory stimuli similar, but there have also been studies of transfer from one modality to another [
26]. It has been argued that these results reveal an underlying amodal system [
5,
13,
20]. Dehaene explicitly refers to those representations as amodal [
24,
27,
28]. As the quote above shows, he stresses that they not only respond to inputs from several modalities, but also to symbolic notations and words. Thus, he identifies them as amodal inasmuch as they respond to several input routes. However, he also refers to this ability as a “sense”, talks of “number perception” (e.g., [
24] p. 237) and even refers to these representations as evolutionary precursors of our symbolic number representations [28]. In order to be precursors of a symbolic system, they must be different from symbols in a relevant sense. A candidate for this difference is that they are analog, and in that sense different from amodal representations. This distinction involves an alternative criterion, based on the way the internal structure of the vehicles is related to what they represent, which I will discuss in the next section. Dehaene also stresses evidence that the system’s operation precedes symbolic education, suggesting that the format it employs is not symbolic. The fact that these representations respond equally to different modalities makes them amodal by the input specificity criterion. However, there are other considerations that lead one to consider them modal. As I will argue, once the internal features of the representations and the relation they bear to perceptual states are taken into consideration, they turn out to be modal by alternative criteria.
I have argued that this criterion underspecifies the distinction, given that it collapses multimodal and amodal representations and, as a result, there is a sense in which every representation would be amodal. Furthermore, casting the distinction in terms of the input to which the representations respond does not mention any characteristics of the vehicles, but only a relational property with their inputs. A related way of presenting this criterion that takes into account internal properties of the vehicles—i.e., the nature of the code—is as opposing a single amodal code or format to the multiple codes of perceptual modalities. It is usually assumed that the fact that the senses respond to different classes of inputs supports the thesis that they have proprietary formats [
4,
12]. As Barsalou explains, it is likely that the representations in modality-specific pathways exhibit evolutionary constraints that led them to specialize in specific types of computations over different kinds of inputs. As a result, it is likely that the representations and processes are proprietary for each modality. Furthermore, well-supported theories of perceptual processing postulate distinct representational primitives [
4].
In order to classify approximate number representations by this last interpretation of the criterion, it is necessary to establish whether the way in which cardinality is encoded is shared by other magnitudes. If the codification is proprietary, i.e., specific for the codification of approximate numerosity and not shared by other systems, then these representations could belong to a distinct perceptual system. If, on the contrary, several magnitudes share the same type of code, then approximate numerosity would not have its own proprietary code. Still, these representations could be part of a broader perceptual system [
21]. Finally, if all of our conceptual representations share this type of code, then they would be the result of a single amodal encoding scheme.
Evidence suggests that there is a specific system dedicated to approximate numerosity detection and codification. As Jones argues, neurons that selectively respond to specific numerosities have been found in the macaque homologue of the human hIPS, and their functioning resembles that of selective neurons in the visual system, such as edge- and face-detectors. In that sense, the system for approximate number representation is closer to a perceptual than to an amodal system [
21]. Jones considers two possibilities: it could be itself a perceptual system dedicated to representing a specific numerical property in a proprietary code, or it could be a part of a broader perceptual system. I will address the particular way of encoding approximate numerosity in more detail in
Section 4.
Considerations of the type of code and the systems to which the representations belong bring this criterion closer to the other criteria, i.e., isomorphism and neural base. I will analyze these two criteria in the following sections and then turn to the interrelations between the three.
3. Isomorphism
It is sometimes assumed that the main criterion for distinguishing both types of representations is the relation they exhibit between some properties of their vehicles and some properties of their contents. For example, in a recent introduction to the state of the debate between grounded and amodal approaches, Mahon and Hickok present the distinction between the formats they postulate as depending on whether “there is an isomorphism between the format of conceptual representation and conceptual content” [
29] (p. 945). According to this criterion, whereas amodal representations are arbitrarily related to what they represent, modal representations, in contrast, are analog, i.e., they exhibit an isomorphism between their content and some properties of their vehicles [
29,
30]. In the context of this debate, isomorphism is not taken to be a strict one-to-one mapping between two sets or categories, but “a structure-preserving function from elements in one domain onto elements in another, […] [that] may abstract away from detail in each domain” [
31] (p. 177). This function also allows for “a certain degree of imprecision” [
31].
There are two prevailing interpretations of this criterion. As Mahon and Hickok’s introduction shows, it is commonly taken to be a correspondence between the format of conceptual representations and what they represent [
4,
11,
30,
31]. Although this criterion is usually attributed to Barsalou, this was not his original claim. As he states it: “The structure of a perceptual symbol corresponds, at least somewhat, to the perceptual state that produced it” [
3] (p. 578). That is, there is a correspondence between the format of conceptual representations and the format of perceptual states. Barsalou clarifies that the intended correspondence is not between modal concepts and the world, because it might be the case that not every modal representation bears a structural resemblance to the physical world [
3] (Note 1). Another way to state this criterion is in terms of whether or not the perceptual brain states that give rise to conceptual thoughts are constitutive of those thoughts [32]. The central tenet of the grounded approach is that at least some of the perceptual states caused by encounters with category members become a constitutive part of the concept of that category. The modal representations that become stored for later reuse when thinking about, e.g., marshmallows, are a subset of the perceptual states that arise in interactions with marshmallows. That is, they are recorded patterns of activation in sensorimotor systems that can be reactivated to think about marshmallows off-line, i.e., in their absence. Thus, the format of modal representations exhibits a structural correspondence with the format of the brain states that produce them, not necessarily with what is represented in the world.
An objection to this interpretation of the criterion could be that a correspondence needs to be between two independent sets or entities. If conceptual thought constitutively involves reactivating the very same representations used during perception, in what sense are there two different things to bear the correspondence? The answer lies in the account of concept acquisition provided by the grounded approach. Not every perceptual representation becomes a part of a concept. There is a process of abstraction and selection (guided by attention and contextual factors such as specific goals). Thus, the two sets are not equivalent and therefore a structural correspondence can be established between them.
Still, some of Barsalou’s claims against the amodal approach might suggest that he also has in mind a certain correspondence between the format of concepts and what they represent. One of the objections against amodal approaches calls into question the plausibility of representations that are arbitrarily related to their contents: “I continue to doubt that the brain contains amodal conceptual representations that are arbitrarily related to modality-specific representations and to their corresponding referents in the world” [12] (p. 1127). This suggests that what distinguishes amodal from modal representations is not only their relation with sensorimotor states, but also with what they represent. This is problematic in at least two senses. First, the use of the term “arbitrary” in this context is confusing. It is generally accepted that, unlike analog representations, symbols are arbitrarily related to what they represent. This is the standard way of distinguishing them. For example, in characterizing amodal symbolic representations, Wilson and Foglia state: “Such representations are arbitrarily related to their referents because the way in which they are formed and deployed, along with their characteristics, bears no relationship to the physical and functional features of the referents” [
30].
Nevertheless, in this context, the role of such arbitrariness is misleading. An empty symbol could be about anything, and there is nothing in its internal structure suggesting what content it will end up carrying. However, once it has acquired a content by whatever means are in place in the system, its relation to its content is no longer arbitrary (it could be, e.g., causal or nomological). That is, the semantic relation between a symbol and its referent is not necessarily arbitrary. Taking a paradigmatic example, words are arbitrarily related to their referents, in the sense of not bearing an internal correspondence or isomorphism to them, but are still systematically related to them [29]. Moreover, modal representations could bear the same type of semantic relation to their referents as amodal symbols do (e.g., an informational relation). This reveals a problem in trying to distinguish types of vehicles by involving semantic considerations. The difference must be cast in terms of properties of the format, i.e., properties of the encoding scheme. These include the nature of the basic representational units and their principles or modes of combination [
31].
A further problem with conflating format and semantics is that it might suggest semantic consequences for the grounded approach that do not depend on the thesis concerning the format of representations, and thus do not apply to every version included under the approach. For example, Wilson and Foglia point out: “[…] not only do cognitive and perceptual mechanisms share representational states, but cognitive processing essentially re-activates sensorimotor areas to run perceptual simulations. A further implication is that perceptual symbols are not independent of the biological system that embodies them and the content conveyed would be likely to vary if intelligent systems varied physically” [
30]. This consequence depends on the specific theory of content endorsed. Assuming a version of a dual factor theory of content that is compatible with embodied versions of format (e.g., [
4]), the content carried by a certain representation not only depends on its internal structure, and thus in a way on its format, but also on its causal and nomological relations to the world.
Second, an appeal to isomorphism between the format of concepts and their contents is problematic in the more general sense that it does not specify the precise type of mapping between the properties of vehicles and their referents. Moreover, by considering that the mapping admits certain imprecision, it is not clear to what extent the structure of the referents is preserved.
3 Thus, the interpretation as isomorphism with the world is more problematic than that of isomorphism with perceptual states.
4 The adoption of the former interpretation might be related to the fact that it overlaps with the iconic–symbolic distinction (e.g., [
6]). Fodor, one of the main advocates of amodal vehicles, differentiates iconic from discursive representations in terms of the relation between the internal structure of their vehicles and what they represent. Whereas every part of an iconic representation represents a part of its content, not every part of a discursive representation does. Discursive representations have canonical constituents, and iconic ones have mere parts. In other words, icons represent in a continuous analog way, and any decomposition of them yields parts that represent parts of the content. Discursive representations, in contrast, represent in a discrete manner and have a canonical decomposition.
Adopting the iconic–symbolic distinction as a criterion seems to presuppose that all the representations used by sensorimotor systems are iconic. However, it has been argued that isomorphism between vehicles and what they represent is neither a necessary nor a sufficient condition for modality. First, Machery claims that it might not be a necessary condition, since it is an open empirical question whether every modal representation is isomorphic with its content [
13]. The same cannot be argued with respect to the isomorphism to perceptual brain states. Until there is evidence of perceptual representations that do not exhibit a correspondence to the format of perceptual states, it seems that this offers a necessary condition for modality. Second, it has been argued that not every representation that is isomorphic with its content is modal. Machery mentions as an example the analog representations used by analog computers, that are isomorphic with their contents, but are not perceptual [
13]. Prinz provides two further examples: Wittgenstein’s suggestion that there is an isomorphism between true sentences and the world [
33], and the possibility of a language of thought that is isolated from perception and exhibits a one-to-one correspondence with the world [
4]. Analogously to the arguments against the necessity of isomorphism with the content, the arguments against its sufficiency have no weight against the interpretation as isomorphism with perceptual states. Understanding the isomorphism with perceptual brain states in the way I have presented it offers both a necessary and sufficient condition for modality. Yet, this interpretation has its own shortcomings. Since this interpretation is similar to the neural base criterion, I will address its difficulties in the next section. In the remainder of this section, I will analyze how the isomorphism criterion classifies approximate number representations.
Going back to the numerosity example, the isomorphism criterion suggests that approximate number representations are modal. For example, as Carey describes it, approximate cardinality is represented by “a physical magnitude that is roughly proportional to the number of individuals in the set being enumerated” [
34] (p. 118), that is, by a representation that is isomorphic to the quantity represented. Contemporary research in number representation supports this characterization. There are two main models of numerosity detection, both of which employ representations that are isomorphic to the brain states that produced them, as well as to what they represent.
The first one, the mode-control model by Meck and Church, proposes that numerosity is encoded sequentially in an iterative manner. The number of items in a set is encoded by an accumulator that receives an impulse from a pacemaker for each item in the set [
35]. As a result, the amount of energy registered by the accumulator is a function of the number of items. The second one, the numerosity detector model by Dehaene and Changeux, detects cardinality in parallel [
27]. It proposes an approximately constant number of active neurons for each item in a set, and summation units that pool activations from all the underlying units. Thus, the total neuronal activity estimates the number of items. The units that perform the summation operation become tuned to a selected range of values—i.e., a preferred numerosity—and in this way become numerosity detectors. Both models employ representations that add the number of items in the set in an analog way (either by an increased amount of energy in an accumulator or by an increased activation threshold in a neural cluster). Thus, approximate number representations seem to be analog and bear a correspondence both to what they represent, and to the perceptual states that produced them. Furthermore, their outputs seem to be couched in a proprietary code, specifically tuned to numerosity. They seem therefore modal.
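The contrast between sequential and parallel encoding can be made concrete with a toy sketch. It is a deliberately simplified illustration and not an implementation of either published model: the function names, the fixed pulse energy, the fixed number of active units per item, and the Gaussian tuning on a logarithmic scale are all assumptions introduced for the example.

```python
import math
from typing import Iterable


def accumulator_estimate(items: Iterable[object], pulse_energy: float = 1.0) -> float:
    """Sequential coding: one pulse is added to the accumulator per item."""
    level = 0.0
    for _ in items:
        level += pulse_energy  # the stored magnitude grows with each item
    return level


def summation_estimate(items: Iterable[object], units_per_item: int = 5) -> int:
    """Parallel coding: each item activates a fixed number of units,
    and a summation unit pools the total activity."""
    return sum(units_per_item for _ in items)


def detector_response(total_activity: float, preferred_numerosity: float,
                      units_per_item: int = 5, tuning_width: float = 0.25) -> float:
    """Hypothetical tuned unit: responds most strongly when the pooled
    activity corresponds to its preferred numerosity."""
    if total_activity <= 0:
        return 0.0
    estimated = total_activity / units_per_item
    log_ratio = math.log(estimated / preferred_numerosity)
    return math.exp(-(log_ratio ** 2) / (2 * tuning_width ** 2))


dots = ["dot"] * 4
energy = accumulator_estimate(dots)    # magnitude proportional to set size
activity = summation_estimate(dots)    # pooled activity proportional to set size
peak = detector_response(activity, preferred_numerosity=4)
```

In both functions the stored quantity scales with the number of items, which is the sense in which the vehicle mirrors the structure of what it represents. The tuned unit then responds maximally at its preferred value and less strongly for neighboring numerosities.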
Those who reject isomorphism as a characterizing feature of modal representations offer an alternative criterion based on the neural systems to which the representations belong. For example, Machery argues that, as the debate stands, the difference between both approaches is limited to the involvement of perceptual and motor systems in conceptual tasks. Thus, he proposes a purely negative characterization of amodal representations, as those recruited to solve conceptual tasks without involving perceptual and motor systems [13]. As a result, the problem of distinguishing modal and amodal representations is shifted to that of individuating perceptual and motor systems and, moreover, of determining their involvement in conceptual tasks. This is not without difficulties, as I will discuss in the next section.
4. Neural Base
Barsalou claims that the grounded approach does not actually focus on the format of representations, but instead on the neural reuse of the brain states that underlie perception [
12]. That is, whatever the type of format that could be involved in perceptual streams, the central claim of the grounded approach is that conceptual thought consists (at least partially) in reactivating such streams. As a result, the vehicles of conceptual representations will have the neural properties of the representations used in perception and action. In that sense, Barsalou seems to agree with Machery that, whichever properties modal representations exhibit, what crucially distinguishes modal from amodal approaches is a commitment to the constitutive involvement of sensorimotor systems in conceptual tasks. This is not a new version of the grounded approach. As I pointed out in Section 3, the grounded account of concept acquisition has always implied that conceptual thought involves the reactivation of representations that were first activated in perception and action systems (e.g., [
3,
36]). “Simulation”, “reenactment”, or “reuse” all refer to the same processes by which the conceptual system reactivates perceptual brain states in the absence of the entities or events that would have produced those states.
What is new is an explicit emphasis on the idea that the central grounded tenet is not about a specific type of format, but instead about the specific systems involved in cognition. Does this imply that format is actually not relevant to the debate? I will address this question at the end of this section, after analyzing in more detail reuse as a criterion for distinguishing types of representations. The proposal leaves the specific properties of conceptual vehicles underspecified and states instead that, as the empirical research stands, it can only go as far as to evaluate the involvement of perceptual systems in conceptual thought, and not to thoroughly describe the representations involved [
12,
13]. As a result, this criterion posits the involvement of perceptual systems, or the neural reuse of their representations, in conceptual tasks as a necessary and sufficient condition for modal representations. In Section 2 and Section 3, I have offered different interpretations of the type of input to which the representations respond and of whether they are isomorphic, and at least under some of those interpretations, these first two criteria overlap with the neural base criterion.
First, one of the ways of interpreting the type of input to which the representations respond, is opposing representations couched in a single domain general code, to representations couched in the multiple different codes of the perceptual modalities. This version is closely related to the notion of neural reuse, in that it opposes the reuse of the multiple codes used in perception and action to the transcription from those perceptual codes into a single amodal one. Thus, the difference between modal and amodal vehicles can be cast in terms of reuse or reenactment versus transcription. For example, as Vermeulen and colleagues describe: “The latter grounded or embodied cognition framework proposes that memory, language or judgments directly depend on simulations in sensory–motor systems that were active during the initial experience rather than on abstract symbols (i.e., amodal systems) that consist of transcriptions of neural states” [
37] (p. 468). Second, the interpretation of the isomorphism criterion as a resemblance or correspondence between the structure of conceptual representations and the structure of perceptual brain states also overlaps with the reuse criterion. This is so because the isomorphism between modal representations and perceptual states is a direct consequence of the grounded account of concept nature and acquisition, which centrally involves the thesis of neural reuse, reenactment, or simulation. These interpretations offer a way around the problems I presented for the first two criteria. However, adopting the neural base as a criterion also runs into difficulties.
The first problem is that the evidence concerning the involvement of sensorimotor systems in conceptual tasks is not clear-cut. The issue is how to determine whether the use of certain systems in a given task is constitutive or merely auxiliary. It is not the aim of this paper to discuss in detail the evidence offered in defense of either approach, but it is useful to look at it briefly in order to show the empirical problems of the neural base as a criterion. The modal approach has been defended with both behavioral and neuroimaging results that suggest the involvement of perceptual systems in conceptual tasks, such as evaluating whether certain features belong to a given category or whether a sentence is sensible (for a detailed review see [
14]). This strategy has met several objections that point to alternative interpretations of the evidence. For instance, in reply to this strategy and as part of his defense of the amodal approach, Machery advances the offloading hypothesis as an alternative interpretation of this evidence [
13]. According to this hypothesis, in order to solve some tasks we manipulate perceptual and motor representations. Machery speculatively suggests that this might be so because the information relevant to solving such tasks is not encoded in the conceptual system, or because certain tasks are solved more efficiently in that way.
An unintended consequence of this strategy is that it conflicts with the very criteria for identifying concepts. If perceptual and motor representations can be used to solve conceptual tasks by themselves, or, in Barsalou’s terms, in a “stand-alone” manner, then what prevents them from being conceptual? I would like to propose a reason why they seem to be concepts. If the offloading occurs as a strategy to solve certain tasks more efficiently, and this becomes the default way to deal with such cases, this would qualify those representations as concepts, or at least as parts of concepts, by Machery’s own criterion, since it identifies concepts with the “bodies of knowledge that are used by default in the processes underlying the higher cognitive competences” [
38] (p. 11). It is not clear whether Machery is proposing that the amodal systems are involved as well when tasks are offloaded to the perceptual systems, or that the conceptual tasks are solved by resorting solely to perceptual and motor representations. In any case, both alternatives entail that modal representations used by default in certain tasks are at least parts of conceptual representations. If the evidence under discussion is evidence of the involvement of perceptual systems in conceptual tasks, then the offloading hypothesis is not sufficient for discarding modal conceptual representations.
At the same time, an analogous argument can be pressed against some alternative interpretations suggested by modal advocates to explain the evidence supporting amodal theories. There is a growing body of evidence for amodal systems. One type of evidence comes from patients suffering from semantic dementia. In these cases, damage to the anterior temporal lobe is followed by impaired conceptual abilities that seem to be general across modalities. Semantic dementia is associated with multimodal deficits and, as a consequence, is taken as evidence that amodal concepts are stored in the specific damaged areas [
13,
39,
40]. However, there are alternative interpretations of this evidence. As Kiefer and Pulvermüller review, it has been argued that, instead of hosting amodal representations, the anterior temporal lobe acts as a facilitation area for conceptual processing, while the conceptual representations per se remain hosted in perceptual systems [
41]. On this interpretation, this brain region acts as a supramodal convergence zone that integrates information from different modalities and guides the reactivation of stored perceptual representations. In a way analogous to my objections to the offloading hypothesis, it is not clear what prevents these “supramodal” representations from being concepts. Both strategies illustrate the difficulty of discriminating the systems that constitutively host conceptual representations from those that are auxiliary in conceptual processing. As a consequence, they are not sufficient to discard either type of format.
A further problem in interpreting behavioral evidence is the “mimicry problem” [
42]. As Anderson argued, given that it is not possible to perform tests of types of representations in isolation, but only of pairs of representations and processes, it is always possible to exploit tradeoffs between representations and processes in order to offer alternative accounts of the evidence. He offered a formal proof showing that: “Given any representation-process pair, it is possible to construct other pairs with different representations whose behavior is equivalent to it. These pairs make up for differences in representation by assuming compensating differences in the processes” [
42] (p. 263). A usual response to this problem is the appeal to other notions for evaluating competing theories, such as their plausibility and complexity [
43]. Still, the assessment of a theory’s plausibility and complexity is not always straightforward, and the difficulties in interpreting behavioral evidence constitute a significant limitation for the empirical application of certain criteria.
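The tradeoff Anderson describes can be illustrated with a toy sketch. It does not reproduce his formal proof, and every name in it (`GRID`, `FACTS`, `left_of_grid`, `left_of_facts`) is a hypothetical choice made for illustration only. One pair combines a picture-like grid with a scanning process; the other combines a list of symbolic coordinate facts with a comparison process. The two pairs give the same answers to every "is X to the left of Y?" query, so this behavioral test alone cannot tell the representations apart.

```python
from itertools import permutations

# Pair A (assumed for illustration): a depictive, grid-like representation
# in which spatial layout is carried by position in the array.
GRID = [
    "A..",
    "..B",
    ".C.",
]

def left_of_grid(grid, x, y):
    """Pair A process: scan the grid and read off each item's column."""
    cols = {}
    for row in grid:
        for col, ch in enumerate(row):
            if ch != ".":
                cols[ch] = col
    return cols[x] < cols[y]

# Pair B (assumed for illustration): a symbolic representation that lists
# the same layout as explicit (row, column) facts.
FACTS = {"A": (0, 0), "B": (1, 2), "C": (2, 1)}

def left_of_facts(facts, x, y):
    """Pair B process: compare the stored column values."""
    return facts[x][1] < facts[y][1]

def behaviorally_equivalent():
    """Check that both pairs answer every ordered query identically."""
    return all(
        left_of_grid(GRID, x, y) == left_of_facts(FACTS, x, y)
        for x, y in permutations("ABC", 2)
    )

print(behaviorally_equivalent())
```

In this sketch the difference in representation is compensated entirely by the difference in process (scanning versus comparison), which is the kind of compensation the quoted passage refers to.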
5 The difficulties in providing an empirical distinction between modal and amodal representations are closely related to the difficulties in providing characterizing features for being a concept. At a minimum, authors seem to be working with an operational notion of a concept as what is constitutively involved in solving certain types of tasks, such as categorizing or understanding language. This raises a problem: it is unlikely that any of those tasks is solved by resorting exclusively to concepts, and, as a consequence, they do not provide an ultimate empirical test for distinguishing the representations and systems that implement our conceptual repertoire from merely auxiliary or concomitant ones. Thus, although in principle the involvement of perceptual systems or neural reuse of their representations should be both a necessary and sufficient condition for modal representations, and thus a criterion for distinguishing them from amodal ones, it is problematic to distinguish constitutive from auxiliary systems empirically.
The second problem is how to individuate perceptual systems. That is, even if there were a way of distinguishing constitutive from auxiliary systems, this criterion presupposes a way of identifying perceptual systems. As I pointed out in
Section 1, there seems to be an underlying assumption that perceptual systems are the ones that respond to a specific physical magnitude. However, as Machery argues, “perceptual systems are not delineated in a principled and uncontroversial manner” [
13] (p. 1091, see also [
22]). Although there are clear cases of at least parts of perceptual systems (such as the primary visual cortex or area V1), the limits of such systems are not clear. Thus, identifying the representations that are constitutively involved in conceptual tasks will not be sufficient as a criterion if there is no way to determine whether those representations belong to sensorimotor systems or not.
Finally, some authors argue that localizing the neural bases of conceptual processes is not informative of their format [
7,
44]. Martin argues that the format question has no “practical significance”, given that we have no procedure for determining the format of the neural code [
7]. As a consequence, the neural base would not serve as a criterion for distinguishing types of format. Martin’s point goes beyond the problem of determining whether a certain brain region belongs to a perceptual system or not. He calls into question whether knowing that certain representations belong to a perceptual system tells us anything about their format. Nevertheless, knowing more about how different areas of the brain encode information, and about the processes by which it is retrieved and combined, is not in principle impossible. In fact, there are steps towards “decoding” some neural populations. Going back to the example of approximate number representations, some studies suggest the way in which neurons in the intraparietal sulcus (IPS) might encode cardinality. For example, Roitman and colleagues recorded neurons in the lateral intraparietal area in macaques whose responses resembled the summation units from the two models of numerosity detection mentioned in the previous section [
45]. If Roitman and colleagues’ results were replicated in humans and, further, the discharge patterns proposed by a model of numerosity detection were found in a specific neural population, this would inform the classification. If certain neurons in the IPS present a discharge rate that increases or decreases in proportion to the cardinality of the stimulus, this suggests that they bear a structural isomorphism both to what they represent and to the perceptual representations that originated them. Furthermore, it offers a starting point for comparing how other magnitudes are encoded in order to determine whether approximate number representations have a proprietary format or not.
6 However, knowledge of how information is encoded is not available for every concept.
Therefore, the overlap between the three criteria, which takes into account the way in which the information is encoded, the systems to which the representations belong, and whether the representations underlying perception and action are reused, constitutes a strong interpretation that promises to allow a better assessment of the evidence. Even so, its empirical application still encounters limitations.
5. Conclusions
The debate between amodal and modal, or grounded, views is one of the central ones within the mental representations approach to concepts. Although much research focuses on determining the type of vehicles concepts are couched in, the distinction between the amodal and modal formats is not a clear one. On the one hand, there is no general consensus on the specific properties that distinguish modal and amodal representations, and different authors adopt different criteria, which leads to a cross-classification of certain representations depending on the chosen criterion. On the other hand, as I argued, the main criteria that have been proposed face difficulties. The distinction in terms of the type of input to which the representations respond is underspecified, given that it collapses multimodal and amodal representations. An alternative version of this criterion, which opposes a single amodal code to multiple proprietary codes corresponding to each modality, avoids this consequence and places this criterion closer to one based on the neural systems of the representations. The isomorphism criterion has two interpretations, both of which encounter difficulties. One of them, an isomorphism between modal representations and their contents, has been criticized as neither sufficient nor necessary for modality, given that it is unlikely that every perceptual representation is isomorphic with its content, and there are cases of isomorphic representations that are not perceptual. The other one, an isomorphism between modal representations and the perceptual states that produced them, is better suited for the distinction, given that it offers a necessary and sufficient condition for modality. Nevertheless, it resembles the neural base criterion and shares its problems.
There seems to be a convergence on the neural base criterion: a distinction between the reuse in conceptual tasks of the neural systems underlying perception and action, and the transcription from the sensorimotor systems to a different code in a separate system (or systems). Although this criterion offers a clearer conceptual distinction, it has limitations in its empirical application. As I argued, it presupposes ways of individuating modalities and of determining which systems are constitutively involved in conceptual tasks and which ones are auxiliary. Although the distinction between modal and amodal representations is at the center of one of the main debates on concepts, the available criteria for the distinction are problematic. Without a criterion that offers an adequate conceptual and empirical distinction, a proper assessment of the evidence for each type of format is not possible.