1. Introduction
Negation represents a fundamental aspect of human language that is rooted in cognitive processes. The expression of negation begins in early childhood (Morris 2003) and, as Horn (2010) claims, it is an essential device of the communicative system, since it furnishes speakers with the tools for denial, contradiction, misrepresentation, deception, and irony. Abandoning the simplistic view of negation as a mere binary operator that assigns truth values, recent research has described negation as a cognitive, linguistic, and logical device displaying complex syntactic, semantic, and pragmatic properties and functions (Prieto and Espinal 2020).
Indeed, languages possess morphological, syntactic, and semantic mechanisms that allow speakers to express negation verbally. In face-to-face communication, verbal expressions of negation are frequently supplemented by nonverbal cues, such as prosody (e.g., intensity, pitch, etc.) and gestural behavior (e.g., hand gestures, shoulder shrugs, etc.). These nonverbal devices can also operate independently; for instance, head shaking is associated with negation in certain cultural and linguistic contexts.
The evolution of negation spans from the basic act of refusal, a communicative behavior that is already present in early stages of language development and is shared with animals, to a sophisticated range of conceptually grounded uses exclusive to human beings. Negation serves a variety of communicative purposes, including the expression of falsity, absence, non-existence, denial, rejection, and correction (Vandamme 1972). In this respect, Roitman (2017) refines the perspective by asserting that linguistic negation encompasses three primary meanings: non-existence, rejection, and denial.
The aim of this paper is to examine one of these three functions, namely ‘denial’. We will focus on the distinctive realizations of denial analyzed from a multimodal perspective (i.e., gestures and prosody). Pragmatic aspects, such as how suspects refer to victims during denial instances, are also briefly considered, though less systematically than prosody and gestures.
Before looking at denial through the lens of multimodality, it is necessary to situate it within the broader context of negation. Hummer et al.’s (1993) study indicates that, from a developmental point of view, denial emerges later than other functions of negation, as it requires the simultaneous representation of two mental models: one reflecting the true state of the world and one reflecting its false counterpart. Equally significant is the study by Ripley (2020), which asserts that denying a certain claim involves performing an act that introduces new information, namely that the claim is ruled out. More broadly, van der Sandt (1991) defines denial as a means of objecting to utterances produced by previous speakers. In this paper, we use the term ‘denial’ to refer to a speech act encompassing verbal and/or nonverbal elements employed by a speaker to object to or correct the form, content, presuppositions, and implicatures of an utterance (Combei 2023).
This operationalization of denial allows us to examine how denial is encoded multimodally (through prosody and bodily conduct) and how this has been investigated in the literature. First of all, an important contribution to the study of multimodal denial is Harrison’s (2018) monograph, which argues that negation has clear grammatical and gestural manifestations and that there are regularities between the two elements in human communication. On a similar note, Bressem and Müller’s (2017) study on multimodal patterns of negation indicates that recurrent gestures display a fixed form–meaning pairing. It has also been mentioned that multimodality can influence the speech act of denial and its associated belief statuses (Combei 2023). Moreover, the review by Prieto and Espinal (2020) indicates that denial is expressed through various prosodic and gestural features across natural languages, mentioning, in particular, the use of high tones in tonal languages and pitch accentual prominence in intonational languages.
Equally interesting are the studies that explore denial as a deception mechanism from a multimodal perspective, including more recent attempts to automatically detect it. One of the first large-scale multimodal studies on deception is the work by Buller and Aune (1987). They investigate how deceivers manage nonverbal cues to convey nonimmediacy and create a positive image, while simultaneously revealing signs of arousal and negative affect. Buller and Aune’s (1987) research, involving 130 participants, claims that deceivers display nonimmediacy and arousal but fail to project a positive image. Additionally, the study indicates that deception cues are influenced by relational history and exhibit significant variability over time. Deceivers also appear to actively regulate their nonverbal behavior, attempting to suppress signs of arousal and negative affect.
A study by Vrij et al. (1996) explores how liars are often unaware that they reduce their movements during deception. Their research aims to determine how deceivers might respond if informed about this rigidity and how factors like tension, behavioral control, and cognitive effort relate to deception. In their experiment, subjects participated in two interviews: one truthful and one deceptive. In the information-present condition, participants were informed beforehand that deception typically involves decreased movement, while the information-absent condition provided no such insight. The findings show that, despite participants believing they increase their movements while lying, they actually exhibit a decrease. Interestingly, informing deceivers about deceptive behavior has no impact on their movements. The authors claim that the decrease is linked to efforts by deceivers to control their behavior and cognitive load, rather than the tension they feel.
Moving to NLP approaches, Soldner et al. (2019) note that deception frequently occurs in everyday conversations, yet conversational dialogues remain underexplored in the field of automatic deception detection. To fill this gap, their paper focuses on detecting multimodal deceptive cues in conversational settings. They introduce a multimodal dataset featuring deceptive conversations from the Box of Lies game on The Tonight Show Starring Jimmy Fallon, where participants attempt to discern whether their opponent’s object descriptions are truthful. The authors annotate various multimodal communication behaviors, including facial expressions and linguistic cues, and derive several features from these annotations. Initial classification experiments yield promising results, significantly outperforming both random and human baselines, with an accuracy of up to 69% in differentiating between deceptive and truthful behaviors.
Similarly, Jaiswal et al. (2016) present a data-driven approach for automatic deception detection using audio–video data from real-life trials in legal contexts, focusing, among other things, on visual and verbal cues of denial. They employ OpenFace for facial action unit recognition to analyze witnesses’ facial movements during questioning, and OpenSmile to study acoustic patterns. Additionally, the authors conduct a lexical analysis of the spoken words, focusing on pauses and breaks, and feed this data into a Support Vector Machine for deception prediction. They also explore a method that fuses visual and lexical cues through string-based matching. While human judgment accuracy ranged from 53% to 60%, their automated system achieved an average accuracy of 78.95%, with higher accuracy in truth videos (81.10%) than in deceptive ones (76.80%).
As the brief literature review above suggests, previous research has demonstrated that gestures play a significant role in shaping and emphasizing denials, functioning as complementary elements to verbal negation (Harrison 2009). To sum up the overview presented in this section, the multimodal characteristics associated with denials include, among others, head shaking, finger shaking, and palm-down hand gestures (Kendon 2002, 2004; Harrison 2010).
Building upon the research outlined above, this paper seeks to examine the multimodality of denial exhibited in English-language discourse within legal settings in the United States, with a specific focus on individuals accused of femicide (and eventually found guilty). We expect to identify distinctive and systematic patterns of prosodic and gestural features that characterize denials in these specific contexts. The findings of this exploratory analysis may contribute not only to improving our understanding of denial as a linguistic phenomenon, but also to uncovering how it is conveyed through a combination of verbal and nonverbal cues in legal contexts.
The rest of this paper is structured as follows: Section 2 presents the aims, motivations, and scope of the study; Section 3 explains the corpus and methodology; Section 4 outlines the results; and Section 5 provides a discussion of the findings, addresses the limitations of this study, and offers concluding remarks and future directions for our work.
2. Motivations and Aims
This paper presents a qualitative study that is part of a broader research endeavor exploring the multimodal dimension of denial within the legal sphere across the United States. A portion of this larger project, focusing on different data and excluding prosodic analysis, has already been published in Combei (2023). To validate and build upon the findings of the previous study, the present work examines gestural as well as prosodic discursive strategies used by femicide suspects to deny their involvement in crimes during post-crime interactions, such as police interrogations and cross-examinations in courtroom proceedings. The analysis of the suspects’ discourse may, in fact, uncover the complex ways in which gendered violence is implicated in denial. This section explains the rationale for examining the linguistic phenomenon of denial in this specific context, the importance of adopting a multimodal approach in this investigation, and what we aim to achieve with this study.
We concentrate on a specific legal context in which denials of involvement in femicide are uttered, namely situations where the suspect is acquainted with the victim. For the purposes of this paper, femicide is understood as “the killing of women and girls because of their gender” (United Nations 2013, p. 2), as was defined on the International Day for the Elimination of Violence against Women and the Vienna Declaration on Femicide. It should be stressed that femicide differs from general homicide as it is characterized by a disproportionate prevalence of intimate partner violence, familial abuse, and power imbalances (e.g., at home, at work) between victims and perpetrators.
This research centers on the discourse of suspects of femicide precisely because of their close relationship with the victims. We focus on suspects that know the victim well because they may deploy denial strategies that reflect the complex nature of their relationship with the victim (e.g., not admitting or trivializing the severity of the crime, deflecting responsibility, and shifting blame). More generally, analyzing the discourses of this kind of suspect may enhance our understanding of the dynamics of gendered violence and the ways in which such crimes are contested or minimized.
As mentioned above, our study adopts a multimodal perspective on denial, an aspect typically overlooked in forensic linguistics. The term ‘multimodality’ is used here in accordance with its understanding within the field of conversation analysis and following Mondada’s (2016, p. 338) definition as “the various resources mobilized by participants for organizing their action—such as gesture, gaze, facial expressions, body postures, body movements, and also prosody, lexis, and grammar”.
As Wang (2024, p. 163) notes, research on legal discourse from a multimodal perspective remains limited, and while gesture studies are advancing in theory and methodology, empirical research in forensic linguistics is still scarce, especially in the area of examining stance in legal discourse through gestures. Some notable exceptions that consider multimodality in analyzing discourse within legal contexts are the studies by Gregory Matoesian. For example, Matoesian and Gilbert (2016) illustrate the importance of multimodal and material actions that accompany speech, showing how attorneys use hand movements, physical objects, and verbal communication to emphasize key pieces of evidence for the jury. The authors also provide a theoretical framework explaining how beat gestures and material objects align with speech to enhance rhythm and highlight points of evidential significance, while also evoking semantic imagery.
The scarcity of multimodal research on legal language is likely attributable to the complexity and time-consuming nature of such analyses, which add to the challenges inherent in investigating legal discourse and content in general. In particular, multimodal analysis of spoken legal language requires the transcription and annotation of a wide range of features, including overlaps, pauses, hesitations, and bodily conduct. In addition, each of these features must be categorized into various classes, each comprising multiple levels (see Section 3 for an example).
Even though we acknowledge the challenges inherent in multimodal analysis, we believe that a close examination of nonverbal features offers a more comprehensive understanding of denial within legal interactions, such as those between suspects and law enforcement. In this regard, we follow Matoesian (2010, p. 541), who asserts that verbal and nonverbal elements function as “co-expressive semiotic partners—as multimodal resources—in utterance construction and the production of meaning”. Indeed, the multimodal analysis of discourses produced within legal settings may be useful to better outline the suspects’ profiles. With this in mind, our study investigates denial, aiming to describe how suspects negotiate credibility through multimodal resources as well as verbal strategies, before and after a confession or indictment. To this end, the following research question guides our exploratory research: How do suspects of femicide deny allegations?
3. Data and Methods
3.1. Materials
Due to the inherent sensitivity of the data, forensic linguistics corpus collection, storage, access, and distribution are often restricted by privacy legislation (Larner 2019). The ease of access to police recordings of custodial interrogations or court data varies from country to country, contingent on the stringency of the pertinent data protection laws. In general, however, building and storing a corpus of forensic data is challenging. For instance, the jurisdictions of Italy and Great Britain impose strict limitations on the accessibility of this type of data (Petyko et al. 2022). As regards the United States, the issue appears to some extent less complex, even though the audio–video recording process and data availability may vary in scope and by state (Bang et al. 2018). A large quantity of forensic multimedia data useful for linguistic analysis, such as police interviews, interrogations, cross-examinations, and trials, can be accessed via online platforms like YouTube. For these reasons, we decided to work with a corpus of multimedia data from North America, in English, and freely retrievable from online sources. The intent was to obtain an easily accessible dataset that would allow for a focused study of the gestures and prosody of denial.
The entire corpus comprises ten videos sourced from websites and open-access YouTube channels, including Fifth Estate, Red Circle Interrogations and Confessions, Law & Crime Trial Network, and Macon Telegraph Archive.¹ Five North American suspects, aged 20–44 and accused of femicide, are portrayed in the videos; all of them were eventually found guilty and convicted. In each instance, the perpetrator was either a close family member or had a close and/or intimate relationship with the victim (husband, boyfriend, or son-in-law). All the suspects denied the charges on several occasions, some even during and after the trial, when appealing the jury verdicts. Four of the suspects were recorded during police interviews. In one case, a suspect was recorded during cross-examination while his trial was in progress.
Initially, the decision to examine denial in two distinct legal contexts (police interviews and cross-examinations in courtroom proceedings) was driven by the goal of conducting a comparative analysis. This comparison was intended to explore how denial functions under different questioning situations. However, as the study progressed, we encountered significant challenges in gathering data from courtroom proceedings, which are scarce, or are not available as high-quality recordings. Given the exploratory nature of our study and its qualitative focus, we adapted our approach. Despite the imbalance in the corpus, we chose to retain the available cross-examination data, recognizing their value in contributing to our understanding of the denial phenomenon, even with a smaller sample size.
The corpus comprises a total of 10,655 tokens, corresponding to a duration of over thirteen hours of audio–video material. The duration and number of tokens in each video were determined by data availability and are, thus, independent of the research design and methodologies implemented. The audio quality of the videos is satisfactory, generally allowing automatic speech processing and analysis of the data. In terms of image quality, some of the data are less satisfactory, and this was reflected in some results (see Section 4). Even though all the videos were recorded in color, in some cases the image resolution was insufficient for the analysis of certain parameters, such as the subtle and swift movement of the eyes and eyebrows. Moreover, although the videos are publicly accessible, all identifying information, including names, sensitive details, and geographic references, was redacted, anonymized, or renamed.
3.2. Methods
The corpus data were used to pursue the examination of gestural manifestations of denial and the analysis of prosody associated with it, before and after the incrimination or admission of guilt. Some aspects related to the pragmatics of referencing the victim and the crime were annotated as comments. The data were processed in accordance with these research directions, so the implementation of distinct procedures was needed. These steps are summarized below, and each of them is discussed in greater detail in the following paragraphs.
To obtain audio data useful for the analysis of prosodic cues, the .mp4 video files were converted to .wav files using VLC Media Player and Audacity.² The resulting audio files were divided into approximately 10-minute samples to facilitate the forced alignment process, the .TextGrid creation, and the automatic annotation of pauses through an Automatic Speech Recognition (ASR) pipeline, provided by WebMAUS Services (Kisler et al. 2017).
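The splitting step can be illustrated with a minimal Python sketch. This is not the procedure used in the study, which does not specify the tool employed for splitting; it only assumes uncompressed .wav input, and the function name `split_wav`, the output naming pattern, and the 600-second default are hypothetical choices made for illustration.

```python
import wave


def split_wav(path, out_prefix, chunk_seconds=600):
    """Split a .wav file into consecutive chunks of at most chunk_seconds.

    Hypothetical helper: returns the list of written file paths.
    """
    out_paths = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:  # no audio left to read
                break
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channel count, sample width, and sample rate; the
                # frame count in the header is corrected when writing.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

The last chunk is simply shorter than the others, which matches the "approximately 10-minute" segmentation described above.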
The pipeline output ninety-nine .TextGrid files, one for each .wav file considered. Subsequently, a Python script was employed to automatically identify the pauses and extract their number and durations. The script produced Excel spreadsheets, which were used to store the values related to the pauses. Pitch and intensity values were instead extracted manually, using Praat (Boersma 2001) together with the .TextGrid files. Manual pitch and intensity analysis was preferred in this case, due to the inherent error susceptibility of automated approaches, particularly when considering the quality of the data at hand. The output of the ASR allowed us to use the automatic transcription of the speech as a base for examining the verbal dimension of denial.
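A pause-extraction step of this kind can be sketched as follows. The script used in the study is not reproduced here, so this is only an illustration under stated assumptions: the .TextGrid content is in Praat's long text format, it contains a single interval tier, and pauses carry the label `<p:>` (assumed here to be the pause symbol of the aligner output; it is exposed as a parameter). The function name `extract_pauses` is hypothetical, and the sketch returns plain Python values rather than writing a spreadsheet.

```python
import re

# Matches one interval entry of a long-format TextGrid:
#   xmin = <number>, xmax = <number>, text = "<label>"
INTERVAL_RE = re.compile(
    r'xmin = ([0-9.]+)\s*\n\s*xmax = ([0-9.]+)\s*\n\s*text = "([^"]*)"'
)


def extract_pauses(textgrid_text, pause_labels=("<p:>",)):
    """Return (start_time, duration) for every interval labeled as a pause.

    Assumes a single interval tier; with several tiers, the same pause
    could be counted more than once.
    """
    pauses = []
    for xmin, xmax, label in INTERVAL_RE.findall(textgrid_text):
        if label.strip() in pause_labels:
            start, end = float(xmin), float(xmax)
            pauses.append((start, round(end - start, 3)))
    return pauses
```

The number of pauses is then `len(extract_pauses(text))`, and the list can be written to a spreadsheet or CSV file for storage.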
As concerns gestural resources, the .mp4 files were processed directly and annotated with ELAN software.³ An annotation scheme was designed and implemented using ELAN, with multiple tiers allocated to distinct components of gestural manifestation. Each audio–video track was the object of complex annotation and analysis, with the focus on the conversational turns of the suspect (the process is detailed below).
3.3. Gestures: ELAN and the Annotation Scheme
This study employed ELAN for gestural annotation. A custom annotation scheme was developed to categorize bodily conduct across multiple tiers. Each tier corresponded to a distinct, predefined element created with the controlled vocabulary feature in ELAN. The MIT Boston Speech Communication Group’s ‘Gesture Coding Manual’⁴ was chosen as the annotation scheme for hand gestures. The other features were annotated using the annotation scheme detailed in Combei (2023). We also considered the Linguistic Annotation System for Gesture (LASG), proposed by Bressem et al. (2013), for the annotation of our data. Although LASG is well-structured and articulated, we decided not to adopt it because it was too fine-grained and took into account some linguistic parameters that were outside the scope of this research (such as syntactic or semantic aspects). At the same time, to the best of our knowledge, LASG lacked annotation patterns for other bodily parameters considered in our study (e.g., head movement, eyebrows, etc.). However, we acknowledge that it would be useful to use LASG for a different, more complex gestural annotation, both to assess the validity of the scheme we adopted and to explore parameters of multimodality that could not be included in this research. Since this qualitative study relied on a single annotator, future work should involve multiple annotators to measure inter-annotator agreement.
For the purposes of this research, any movement, shape, or orientation expressed by the suspects when uttering a denial act was considered to be a relevant gesture. The types of gestures of interest were restricted to those of the hands (especially their shape and positioning), the head and its movements, the direction of the gaze, the micro-movements of the eyebrows (when analyzable), and the posture during interrogations with police officers or cross-examinations in court. The annotation scheme was designed with six main tiers, each of which was associated with a specific gesture parameter:
‘Suspect HG’ (hand gesture): describes the shape of the suspect’s hand gesture while he is uttering the speech act of denial;
‘Suspect Gaze’: describes the suspect’s gaze direction while he is uttering the speech act of denial;
‘Suspect Eyebrows’: describes the suspect’s movement of the eyebrows while he is uttering the speech act of denial;
‘Suspect Head’: describes the movement of the suspect’s head while he is uttering the speech act of denial;
‘Suspect Posture’: describes the suspect’s body position while uttering the speech act of denial;
‘Handedness’ (dependent tier of the parent tier ‘Suspect HG’): specifies whether one hand or both hands were used to execute the annotated gesture.
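The tier structure listed above can be summarized as a small data model. The following Python fragment is an illustrative sketch only, not the ELAN template used in the study: the names `Tier`, `SCHEME`, and `is_valid` are hypothetical, and each vocabulary contains only the labels explicitly quoted in this paper, not the full controlled vocabularies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    vocabulary: frozenset          # partial: labels quoted in the paper only
    parent: Optional[str] = None   # set for dependent tiers


SCHEME = {
    tier.name: tier
    for tier in [
        Tier("Suspect HG", frozenset(
            {"open", "arms crossing", "counting", "measurement", "pinch"})),
        Tier("Suspect Gaze", frozenset({"towards other speakers", "down"})),
        Tier("Suspect Eyebrows", frozenset({"relaxed", "frowning"})),
        Tier("Suspect Head", frozenset({"front"})),
        Tier("Suspect Posture", frozenset({"sitting erect"})),
        Tier("Handedness", frozenset({"both"}), parent="Suspect HG"),
    ]
}


def is_valid(tier_name, label, scheme=SCHEME):
    """Check that a label belongs to the controlled vocabulary of a tier."""
    tier = scheme.get(tier_name)
    return tier is not None and label in tier.vocabulary
```

Restricting each tier to a closed set of labels is what makes counts comparable across videos, since the same gesture cannot be recorded under two different spellings.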
Diverging from the annotation scheme developed by Combei (2023), we did not include the ‘legs’ feature. Due to current resource constraints, we focused our efforts on the upper body and hand movements, ensuring a more in-depth examination of these areas for our research. In addition, we updated the ELAN controlled vocabulary for all of the intra- and inter-suspect recurring movements and positions that were not portrayed in the ‘Gesture Coding Manual’ (such as ‘arms crossing’, ‘counting’, ‘measurement’, ‘pinch’).
Furthermore, a tier named ‘Suspect’ was added for each video to collect the verbal transcript of the suspects’ speech (statements). This was used for transcribing their verbal expressions of denial. A tier for comments was also included in the annotation, which was used to highlight relevant elements or findings that went beyond the established labels and annotation scheme. Observations regarding the pragmatics of the suspects’ discourses (in particular the way victims and crimes were referenced by the suspects) were also indicated in the comments tier. In order to ensure consistency and facilitate comparison between suspects, the same annotation scheme was used for all videos.
3.4. Prosody
Regarding the prosody of denial, Praat was employed as a tool for speech processing and analysis. Praat functionalities for pitch and intensity analysis were used to extract statistical descriptors related to prosodic parameters of the episodes of denial uttered by the suspects. First of all, the intensity was normalized across all videos. Then, minimum, maximum, and average values of pitch and intensity were manually extracted for each instance of denial. To extract these values, we defined the boundaries of each ‘denial’ instance based on the discursive unit of the suspect. In particular, we considered the discursive unit to be the utterance in which the denial (whether verbal and/or gestural) occurred, extending up to the next pause in the interaction. This approach ensured that each denial was analyzed within its immediate context, capturing the communicative intent of the suspect.
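The descriptors themselves are simple to state. In the study they were read off manually in Praat; the sketch below only illustrates the same computation on a pitch or intensity track that has already been exported as (time, value) pairs, with `None` marking unvoiced frames. The function name `describe_interval` and the data layout are assumptions made for illustration.

```python
from statistics import mean


def describe_interval(track, start, end):
    """Minimum, maximum, and mean of a track within [start, end].

    track: list of (time_in_seconds, value) pairs; value is None where
    no measurement is available (e.g., unvoiced frames for pitch).
    Returns None if the interval contains no measured values.
    """
    values = [
        value
        for time, value in track
        if start <= time <= end and value is not None
    ]
    if not values:
        return None
    return {"min": min(values), "max": max(values), "mean": mean(values)}
```

Here `start` and `end` would correspond to the boundaries of the discursive unit, that is, from the onset of the utterance containing the denial to the next pause.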
In terms of speech processing and annotation, Praat was also used to control the pipeline output and check the automatic annotation of pauses. The algorithm’s accuracy in identifying the start and end of each pause was evaluated qualitatively and manual intervention was used to correct segment boundaries when necessary. Two primary categories of errors were identified. In the first case, the algorithm failed to accurately identify the onset and conclusion of spoken sequences, resulting in the misclassification of longer segments as pauses. In the second case, the error was more nuanced, involving the inclusion of vowels within the pause segment because the phonation was not correctly captured. All these issues were corrected manually.
4. Results
The research findings will be organized as follows: Section 4.1 will provide a general overview of the analysis with some information regarding the multimodal annotation, the pauses, and some pragmatic observations. Then, Section 4.2 and Section 4.3 will be dedicated to gestural and prosodic analysis, respectively.
4.1. General Overview
Table 1 provides a summary of some general results derived from both automatic (i.e., pauses) and manual (i.e., verbal denial) feature annotation. The number of pauses reported for each video depends on the length of the file. The count includes pauses of all types: from those occurring within the same conversational turn to those occurring between the conversational turns of the suspect and the police officer/lawyer/judge. Thus, we considered both ‘pre’-response pauses occurring before the suspect’s reply to the official’s question and ‘post’-response pauses that occur while awaiting the next question or the completion of the suspect’s response. In the analyses presented in Section 4.3, we only considered pauses occurring before the suspect’s response to questions posed by police officers and lawyers.
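The restriction to ‘pre’-response pauses can be expressed as a filter over turn boundaries. The following Python sketch is not the procedure applied in the study, which is not described at this level of detail; it assumes that turns are available as chronologically ordered (speaker, start, end) triples and pauses as (start, end) pairs, and the function name, the speaker label `"suspect"`, and the 0.05-second tolerance are hypothetical.

```python
def pre_response_pauses(turns, pauses, suspect="suspect", tol=0.05):
    """Select pauses lying between an official's turn and the suspect's reply.

    turns: chronologically ordered (speaker, start, end) triples.
    pauses: (start, end) pairs in seconds.
    tol: tolerance for small misalignments between boundaries.
    """
    selected = []
    for previous, following in zip(turns, turns[1:]):
        # Keep only transitions from another speaker to the suspect.
        if previous[0] != suspect and following[0] == suspect:
            gap_start, gap_end = previous[2], following[1]
            for p_start, p_end in pauses:
                if p_start >= gap_start - tol and p_end <= gap_end + tol:
                    selected.append((p_start, p_end))
    return selected
```

Pauses inside a single turn and pauses that follow the suspect's own speech are excluded, since neither precedes a reply to a question.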
Regarding the manual annotation, the fourth and fifth columns are dedicated to general denials and femicide denials, respectively. This distinction reflects a two-step manual procedure. First, all denial cases encountered during video listening and viewing were annotated, regardless of how closely they related to the femicide episodes. Subsequently, these annotations were manually checked to identify denials expressed by the suspects specifically regarding accusations of committed murder, fictitious statements about the murder weapon, innocence in the matter, concealment of bodies, etc. The column labeled ‘general denials’ was included in Table 1 for the purpose of comparison with the column ‘femicide denials’. The latter reports the number of denials associated with falsehoods identified in police interviews, because the denials strictly related to the femicides, reported in the fifth column, all turned out to be fictitious denials, intended to distort reality and avoid a guilty verdict.
‘Femicide denials’ were identified among the ‘general denials’ using the following categories as selection criteria: denials related to the timeline of events (for all the events related to the day of the murder itself), specific denials related to the murder weapon (e.g., gun, knife), and denials related to the harm done to the victim (e.g., physical assaults, body concealment). This differentiation between ‘femicide denials’ and ‘general denials’ has allowed us to distinguish more clearly between the general use of denial in the forensic context (e.g., the suspect’s response “No” to the officer’s question “Would you like a glass of water?”) and the use of denial for aspects strictly related to the femicides. Below are examples for each identified category of ‘femicide denials’ to provide insight into the observed data and how it is classified.
Timeline of events
Lawyer: Were you in the office when the woman was killed?
A.B.: No, I wasn’t in the office.
Police officer: So why would you call her if you were in the same house. From ten o’clock on. We are not making it up.
I.J.: No, I’m just saying I’m not recalling this you are talking about.
Murder weapon
Police officer: Do you have a gun?
K.L.: I’ve never touched a gun before.
Police officer: Did you have other experience where you just wake up and you don’t know what happened? Like ‘I just woke up and here I am, there was a gun and there was a knife, and drugs and I don’t know what was going on’, you know, and I understand that.
O.P.: I’ve never touched these knives. These knives they were just there. I’ve never—I’ve never touched them.
Harmed victim
Police officer: You know man, this is stuff we need to know to figure out what’s going on.
M.N.: I didn’t try to, I didn’t want to, I didn’t mean to at all!
Police officer: So what really happened that day?
O.P.: I didn’t do anything. I didn’t hurt anybody!
An exploratory review of the transcribed and annotated data revealed some interesting instances of pragmatic choices that align with findings in the work of Combei (2023), providing validation of both studies. In particular, there is an almost complete absence of direct references to the names of the victims in the discourses of the suspects. A plausible explanation for this vagueness is that the suspects strategically avoid naming the victim to deflect attention and minimize their emotional engagement as well as the consequences of the crime.
The victims’ names occur in only three cases throughout the entire 13-hour corpus. In example (4), the victim’s name is uttered in response to a direct and explicit question from the police officer; in example (5), the name appears only as an appellation used in reported speech; and in example (6), the name is uttered as a vehement way to distance oneself from the accusations made by the police officer. In all other instances where the suspects mentioned the victim, they used anaphoric expressions, typically referring to the victims with third-person singular pronouns (i.e., she or her).
(4)
Police officer: And what’s your wife’s name?
K.L.: Claire.
(5)
Police officer: And what did you do next?
O.P.: We said like “Have you talked to Jo?” I was like “No, have you talked to Jo?”
(6)
Police officer: O. you are under arrest for murder right now. The murder of Johanna.
O.P.: I didn’t murder Johanna! I don’t.
On the same note, another interesting pragmatic aspect is the lack of direct references to the act and the result of killing in the suspects’ discourses. In the annotated speech (‘Suspect’ tier), the word ‘murder’ appears only once (uttered by one suspect), while terms like ‘death’, ‘to kill’, and ‘dead’ are entirely absent from the dataset (0 occurrences each). Instead, we frequently find generic pronouns, nouns, verbs, or other anaphoric expressions used to refer to the crimes and their consequences. For instance, terms such as ‘it’ (126), ‘that’ (117), and ‘anything’ (75) occur frequently, as expected, especially following explicit references to the crime made by police officers. In this case, the vagueness could also be interpreted as both a mitigation strategy (i.e., it lessens the weight of the crime) and a detachment strategy from the victim.
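Term counts of this kind can be obtained with a few lines of code. The sketch below is a hedged illustration rather than the counting procedure used in the study: it assumes that the suspects' transcribed utterances are available as plain strings, it uses a deliberately simple tokenizer (lowercased sequences of letters and straight apostrophes), and the function name `term_counts` is hypothetical.

```python
import re
from collections import Counter


def term_counts(utterances, terms):
    """Count how often each term occurs as a whole token in the utterances."""
    tokens = []
    for utterance in utterances:
        # Simplified tokenization: contractions such as "didn't" stay whole.
        tokens.extend(re.findall(r"[a-z']+", utterance.lower()))
    counts = Counter(tokens)
    return {term: counts[term] for term in terms}
```

Applied to the two utterances of suspect O.P. quoted in this section ("I didn't do anything. I didn't hurt anybody!" and "I didn't murder Johanna! I don't."), the function counts one occurrence each of ‘murder’ and ‘anything’ and none of ‘dead’. Multiword or inflected items such as ‘to kill’ would require lemmatization, which this sketch does not attempt.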
4.2. Gestural Analysis
Regarding the gestural manifestation of denial, as detailed above, we annotated all videos based on posture, hand gestures, gaze, eyebrows, and head movements. The output of the annotation process allowed us to extract the most frequent features for various bodily characteristics,
5 namely ‘front’ for head position, ‘sitting erect’ for posture, ‘towards other speakers’ for gaze, ‘open’ for hand gestures, and ‘both’ for handedness. It should be mentioned that the frequency of occurrence of each type of feature is influenced by the different sizes of the five subcorpora; in particular, the disproportionate amount of data annotated for the O.P. suspect skews the final count of each entry. To address this issue and account for the specific distribution of the gestural features, information related to the subcorpus of each suspect considered is reported in
Table A1 in
Appendix A.
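The effect of unequal subcorpus sizes on raw counts can be illustrated with a minimal sketch. It assumes that each suspect's annotations are available as a flat list of labels; the suspect identifiers, labels, and counts below are hypothetical placeholders and are not taken from Table A1. Relative frequencies are computed per suspect and then averaged, so that a large subcorpus does not dominate the overall figure.

```python
from collections import Counter

def relative_frequencies(labels):
    """Return each label's share of the annotations for one suspect."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def macro_average(per_suspect, label):
    """Average a label's share across suspects, weighting each suspect equally."""
    shares = [freqs.get(label, 0.0) for freqs in per_suspect.values()]
    return sum(shares) / len(shares)

# Hypothetical hand-gesture annotations (placeholder values, not the study's data).
annotations = {
    "suspect_1": ["open"] * 6 + ["fist"] * 3 + ["relaxed"] * 1,
    "suspect_2": ["open"] * 1 + ["relaxed"] * 1,
}

per_suspect = {s: relative_frequencies(labels) for s, labels in annotations.items()}
open_share = macro_average(per_suspect, "open")
```

In this toy example, pooling all annotations would let the larger subcorpus drive the result, whereas the per-suspect average gives each suspect the same weight. Which of the two is more appropriate depends on the research question.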
Given the frequency of gestures and the acknowledged imbalance in our data, the results, though interesting, should be interpreted with caution. That being said, our findings largely point to a prototypical multimodal expression of denial, characterized predominantly by a simple gestural apparatus involving the head either in a frontal position or lowered downwards (presumably to obscure the gaze). Head shaking occurs in parallel with the downward head movement. This gesture also occurs autonomously multiple times, serving as a paraverbal signal of denial without any verbal expression. Following the head gestures, the gaze is engaged, primarily in the configuration ‘towards other speaker’ (mostly in concordance with the head in a frontal position) or in an avoiding configuration, in which the suspect evades the interlocutor’s gaze by keeping the eyes closed throughout the denial. Additionally, similar to the ‘avoiding disposition’ that characterizes closed eyes, there is the ‘down’ configuration, predominantly occurring with a ‘frontal’ head position.
The significant number of cases where annotation of eyebrow-related traits was not feasible (due to the video recording quality or the angles) makes this parameter challenging to assess. This highlights the importance of using high-quality multimedia material in multimodality studies. The most suitable situation for annotating eyebrows was found in the video of suspect A.B., filmed in the courtroom, where the high-quality close-up footage allowed precise observation of facial expressions, shapes, and movements. Despite technical issues, it is interesting to note that, in the cases where these features could be analyzed, denial was generally not supported by eyebrow movements (‘relaxed’ featured 179 occurrences). Nevertheless, paradoxically, the feature ‘frowning’ also occurs often within this dataset, denoting a very specific movement typically associated with negative emotions.
The most common posture associated with denial is the suspect seated, often upright (erect), and facing the interviewer. However, this posture is prevalent across the entire corpus and is not exclusive to denial scenarios. An interesting pattern in the corpus involves suspect O.P., who frequently adopts a forward-leaning posture, particularly after the confession, spending much of the interview hunched over. Since this posture is especially traceable in the second phase of the interviews, one plausible interpretation is that the suspect may feel increasingly pressured.
The situation concerning hand gestures is more complex. To begin with, we will focus on the less complex findings. Our observations indicate that the majority of manual gestures associated with denials are performed with both hands simultaneously. It would be interesting to investigate whether this aspect is specific to the multimodality of denials (and the forensic contexts) or if what is observed is a general trend in gestural expression. Next, moving to more complex aspects of hand gestures, we found that the most frequent categories for hand gestures are the forms ‘open’, ‘relaxed’, and ‘arms crossed’. We have intentionally excluded the ‘fist’ hand gesture from what we report as prototypical denial gestures, despite its frequency. In fact, the ‘fist’ label appears predominantly in O.P. videos and is therefore more indicative of an individual expression of denial than representative of broader patterns of this phenomenon. The ‘handcuffed’ label is excluded from our analysis because the mere presence of restrained hands does not qualify as a gesture. Handcuffs represent a state of limitation rather than an intentional communicative action. During the final “verdict” we traced an open-hand gesture performed with both hands while keeping the head in a frontal position in relation to the interlocutor. This is accompanied by a gaze mostly directed towards the interlocutor, while seated in an upright position.
As shown in
Table A1 in
Appendix A, the ‘not available’ category of gestures is very frequent as a result of A.B.’s framing in the video, which predominantly features close-up shots that obscure the visibility of arms and hands. It is important to mention that the ‘not available’ label, applied to all types of gestures (e.g., posture, gaze, etc.), does not represent a real feature. However, we documented instances where gestures were indistinguishable due to video quality or subtle movement limitations in order to capture the impact of visibility constraints on our findings.
An important point to emphasize is the difference between the gestural multimodality of denials as it occurs before the confession or ‘turning point’
6 during the police interview and the multimodality of denials as it occurs after these relevant moments. In our dataset there is a marked reduction in both verbal and nonverbal expression in the ‘post’-confession phases, to varying degrees, among all suspects. Below is an example illustrating both the verbal and multimodal behavior of one suspect at two distinct moments: first, prior to learning about their partner’s death resulting from their aggression, and then, following confirmation and the subsequent charges of femicide.
As can be inferred from
Figure 1, in the ‘pre’-phase, the suspect’s conversational turn is marked by heightened gestural dynamism, complemented by generally longer and more complex sentences. In
Figure 2, however, greater heaviness and stillness are observed in the physicality and gestures of the suspect. In the ‘post’-phase the curtailment in verbal expression is total, as nothing is uttered verbally; head shaking is the only element through which the suspect conveys his denial. It is interesting that the open hand shape is clearly visible, suggesting, in this case, an attitude of non-acceptance of the facts.
4.3. Prosodic Analysis
The prosodic analysis involved extracting the values of minimum, mean, and maximum pitch and intensity for each speech act of denial. This manual extraction was complemented by automatic extraction of pauses. Given the large amount of annotated data, we decided to work with the extracted values of ten sample denials selected for each suspect (the denials with their pitch and intensity values were first stored in an Excel spreadsheet and then randomly selected to avoid bias). All ten samples were selected from the subset of femicide denials. For each suspect, the collected denials were further subdivided: denials produced before the confession or turning point during police interviews were distinguished from denials produced after these moments. Of the ten denials selected for each suspect, five were randomly chosen from the ‘pre-confession’ denials, while the remaining five were selected from those produced by the suspect during ‘post-confession’. This choice is motivated by our interest in how pitch, intensity, and the number of pauses vary before and after the confirmation of the accusations or the suspect’s admission of guilt. The aim is to observe a possible systematic difference in the parameters between ‘before’ and ‘after’ instances, motivated by emotional and circumstantial reasons stemming from the exposure of lies and/or the formal accusation of femicide. It was assumed that there could be variation due to the strong emotional impact that being caught lying and/or being accused of murder entails, and that this could be found in all cases under consideration. Although this possibility is acknowledged, its quantification falls outside the aims of this study.
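The sampling design described above (five ‘pre’ and five ‘post’ denials drawn at random for each suspect) can be sketched as follows. This is only an illustration under stated assumptions: the study reports selecting the samples from an Excel spreadsheet and does not describe any script, so the record layout, the field names `suspect`, `phase`, and `id`, and the fixed seed are all hypothetical. The restriction to femicide denials is assumed to have been applied beforehand.

```python
import random

def sample_denials(denials, per_phase=5, seed=0):
    """Draw `per_phase` denials at random from each (suspect, phase) group."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    groups = {}
    for denial in denials:
        key = (denial["suspect"], denial["phase"])
        groups.setdefault(key, []).append(denial)
    sample = []
    for key in sorted(groups):
        pool = groups[key]
        if len(pool) < per_phase:
            raise ValueError(f"not enough denials for {key}")
        sample.extend(rng.sample(pool, per_phase))  # sampling without replacement
    return sample

# Hypothetical records: two suspects, eight denials per phase (placeholder data).
denials = [
    {"suspect": suspect, "phase": phase, "id": index}
    for suspect in ("S1", "S2")
    for phase in ("pre", "post")
    for index in range(8)
]

selected = sample_denials(denials)
```

Stratifying by suspect and phase before drawing guarantees the five-plus-five balance, which a single random draw over all denials would not.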
Table 2 shows the ‘pre’ and ‘post’ averages for pitch and intensity for each suspect. The results appear to offer some answers to the research question. In particular, variation is observed, within the constraints of an exploratory qualitative study, regarding the intensity and the pitch of denial expression. In two cases (K.L. and O.P.), the variation in pitch appears more contained. What should be noted is that this snapshot of observations does not seem to indicate a steady direction of variation for the parameters considered, particularly concerning pitch. For intensity, there is a tendency towards a decrease in decibels following the confession or turning point (and during the cross-examination phase) in four out of the five suspects. O.P. contradicts this trend with a significant deviation and an increase in intensity after the turning point of the interview. Regarding pitch, however, there is a tendency towards a decrease following the confession or during the cross-examination phase in the cases of A.B. and I.J., while for the remaining three, there is an increase in F0 under the same circumstances. It is important to mention, however, that the only case showing a significant increase in pitch is M.N., while K.L. and O.P. present a more subtle variation in which the increase may be due more to chance or idiosyncrasy.
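A comparison of this kind, with per-suspect averages for each phase followed by the direction of change, can be expressed in a few lines. The sketch below is not the procedure used in the study; the field names and all numeric values are invented placeholders and do not reproduce Table 2.

```python
from statistics import mean

def phase_means(rows, measure):
    """Average one measure for every (suspect, phase) pair."""
    grouped = {}
    for row in rows:
        grouped.setdefault((row["suspect"], row["phase"]), []).append(row[measure])
    return {key: mean(values) for key, values in grouped.items()}

def direction(means, suspect):
    """Describe how a suspect's average changes from 'pre' to 'post'."""
    delta = means[(suspect, "post")] - means[(suspect, "pre")]
    if delta > 0:
        return "increase"
    if delta < 0:
        return "decrease"
    return "no change"

# Placeholder measurements (hypothetical, for illustration only).
rows = [
    {"suspect": "S1", "phase": "pre", "intensity_db": 62.0},
    {"suspect": "S1", "phase": "pre", "intensity_db": 64.0},
    {"suspect": "S1", "phase": "post", "intensity_db": 58.0},
    {"suspect": "S1", "phase": "post", "intensity_db": 60.0},
]

intensity_means = phase_means(rows, "intensity_db")
trend = direction(intensity_means, "S1")
```

With only five samples per phase, the sign of the difference is descriptive at best; it says nothing about whether the change exceeds ordinary within-speaker variability.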
Table 3 complements the analysis of pitch and intensity by showing the average length of pauses in the ‘pre’ and ‘post’ phases. Regarding the average length of pauses observed in individual suspects, it appears that there are longer pauses in the post-phase for A.B., I.J., and M.N. Even if K.L. and O.P. do not confirm this trend, the gap between the ‘pre’ and ‘post’ phase in these two suspects is smaller than the gap observed in the other three. The ‘pre’ and ‘post’ totals reflect the majority trend. Regardless of the length, the number of pauses in the sampled denials is almost equivalent in the ‘pre’ and ‘post’ phases for all five suspects. However, this does not correspond to the distribution of pauses in the entire dataset, where the number of pauses is generally greater in the ‘post’ phase of each suspect.
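The two pause measures discussed here, the number of pauses and their average length per phase, can be derived from a list of pause durations. The following sketch assumes that each automatically extracted pause is stored as a (phase, duration in seconds) pair; this representation and the durations are hypothetical and do not correspond to the values in Table 3.

```python
from statistics import mean

def pause_summary(pauses):
    """Count pauses and average their length (in seconds) for each phase."""
    durations = {}
    for phase, duration in pauses:
        durations.setdefault(phase, []).append(duration)
    return {
        phase: {"count": len(values), "mean_length": mean(values)}
        for phase, values in durations.items()
    }

# Placeholder pause durations in seconds (hypothetical values).
pauses = [
    ("pre", 0.4), ("pre", 0.6),
    ("post", 1.0), ("post", 1.5), ("post", 0.5),
]

summary = pause_summary(pauses)
```

Keeping the count and the mean length separate matters because, as noted above, the two measures need not move together.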
5. Discussion, Limitations, and Conclusions
Although the exploratory nature of this qualitative study precludes the derivation of systematic generalizations, the results and the interpretation of the data described in
Section 4 highlighted the prototypical nature of both generic and femicide-specific multimodal expressions of denial. Here, we extend the above considerations by adding some comparisons with the relevant literature on gestures, particularly in relation to arm and hand movements.
Concerning arms, we interpret the feature ‘arms crossed’ as a gesture of closure and separation, typically suggestive of downplaying and avoidance (
Gallace et al. 2011). Therefore, we can claim that this element functions as a mechanism to express detachment and diminish the perceived importance of the crime in question, which is rendered as unexpected. Thus, this could indicate an intentional effort to downplay involvement in crime-related events.
In
Figure 3 and
Figure 4 we see two examples of the ‘arms crossed’ gesture, which we interpret as a cue of detachment from the crime under discussion. In the first case (
Figure 3), the position of the arms co-occurs with a feigned statement of desperate and sad astonishment (“I don’t know why anybody would do that”), aimed at avoiding possible allegations of involvement. In the second example (
Figure 4), the closed position with crossed arms is also used by the suspect to detach himself from the reality of the situation. In this particular instance, the selected image represents a frame within a ‘bump and grind’ phase of the police interview. During this phase, the suspect assumes and maintains a defensive position, responding to all questions posed by the police officer with a lie. More generally, at the corpus level, this gesture appears to be associated with a defensive stance.
Similarly, an ‘open’ hand gesture, in both configurations (palm up and palm down), falls under
Kendon’s (
2004) classification, which analyzes this type of hand shape in relation to manual actions of stop, refusal, denial, or interruption accompanying verbal expression. It is significant from this perspective that this specific hand shape is the most recurrent one, not only throughout the dataset of denials but also for denials specifically related to femicides.
Two particularly illustrative examples are provided below. The ‘open’ hand gesture, employed in both the palm-up and palm-down configurations, was used by the same suspect to explicitly disavow any involvement with the murder weapon. In
Figure 5, the suspect’s hands, with palms facing upwards, accompany the declaration of innocence concerning the allegation of firearm usage. The use of the hands creates a particular sense of surrender and innocence that follows the sentence, which not only denies the use of a firearm but also its possession (“I’ve never had a gun”).
Figure 6 illustrates the progression of denial. After discrediting the initial assertion regarding contact with a firearm, the police officer asks about the location of the firearms at the crime scene. The suspect then denies having used them, stating, “Please, don’t take it the wrong way. I really never used it. Neither one of them.” In this instance, the suspect’s use of the phrase “Neither one of them”, accompanied by the gesture of shaking his open hands with the palms down in front of him, serves to reinforce his denial and strengthen his assertion that he has no connection to the murder weapon.
Then, we observed the ‘relaxed’ gesture, which to some extent resembles the ‘open’ form, occurring mostly in conjunction with other gestures. It was indeed the hand shape most often adopted individually by the right or left hand and not in double configurations.
As previously stated, the examples confirm the assumption of the ‘relaxed’ hand shape within a gesture made with only one of the two hands. In the case of
Figure 7, the relaxed right hand is accompanied by a gesture of nervousness (the act of scratching the back or face) that O.P. often performs when he is in a recumbent position in relation to his actions on the day of the crime. In
Figure 8, on the other hand, the suspect’s right hand, in a relaxed position, is accompanied by the left hand that instead expresses denial in an open position with palm facing upward. The ‘open’ form is once more employed to convey innocence and detachment from the facts being verbally denied, as illustrated in
Figure 5 and
Figure 6.
Having presented and discussed our results, it is necessary to acknowledge that this study is preliminary in nature, and it has several limitations, leaving numerous areas for further exploration and refinement. Some of the limitations may, in fact, represent new opportunities for development, which could be implemented in the relatively near future.
First of all, for our findings to be supported quantitatively, the study would require a larger and more balanced corpus (e.g., the same amount of data for police interviews and cross-examinations in courtroom proceedings). At the same time, a greater amount of audio–visual material from suspects of femicide, as well as from people accused of other crimes, is needed to compare the features assessed in this study with those in cases involving different types of criminal suspects. This would provide a better picture of the pragmatic, multimodal, and prosodic behavior of suspects, providing a more representative and generalizable overview of how denial is expressed verbally and/or nonverbally.
From a multimodal perspective, while the annotation scheme used in this study was sufficiently rich and complex, we believe it could be further enhanced by adding a few new parameters. In particular, it could be useful to add cues regarding the lower body as well as ‘shoulder shrugs’ as a gestural feature. As reported by previous research, lower body cues (e.g., feet and legs) are less studied than those of the upper body, but they are still relevant for the organization of social interaction (
Mondada 2014). Equally interesting are the features regarding the shoulders: including them in the annotation of denial could be useful because shoulder shrugging has been reported to signal a detachment attempt since “[they] can work as markers of ‘dis-stance’ or disengagement, in which case they take on an epistemic-evidential dimension” (
Debras 2017, p. 24).
Since our study did not investigate the detailed shape and execution of gestures, future research could benefit from decomposing these gestures into finer traits, configurations, and subtler movements. This would offer a more granular picture of the multimodality of denial and, in more general terms, it would represent a multi-level analysis of gestures, investigating not only the syntactic–semantic or pragmatic aspect but also the “morphological” composition of gestures. At this stage, as outlined in the methodological section, the inclusion of an additional annotator and the calculation of inter-annotator agreement would be necessary for validation purposes.
As regards prosody and focusing particularly on the fundamental frequency parameter, another possibility to enhance this kind of study is to carry out precise annotation of the intonational contour of denials, to assess the possible presence of denial-specific pitch characteristics. In this regard,
Mertens’ (
2014,
2020) Prosogram and Polytonia tools could be used to automatically obtain a stylization of pitch contour as well as an automatic labelling of pitch movements. In parallel, following the line drawn by classical works of
Beckman and Pierrehumbert (
1986) as well as
Ladd (
2008) on the intonational aspects of the English language (particularly the study of pitch accents and pitch contours based on the autosegmental-metrical theory), specific configurations could be observed, which may also be useful for an analysis of the pragmatic use of intonational features, both in production and perception. Moreover, once the necessary conditions for the retrieval of the aforementioned useful data are satisfied, comparisons could be drawn with studies on intonational contours of denials in other languages, such as Italian and other Romance languages (
D’Imperio 2002;
Prieto et al. 2005). Additionally, considering both prosodic and pragmatic aspects together, it would be interesting to complete the data on the mean length of utterances, adding speech rate (e.g., in the form of a count of syllables produced per second by each suspect), both overall and separately in the ‘pre’ and ‘post’ turning point phases.
Despite the limitations discussed above, the results of our paper outline a recognizable profile for multimodal denial. Recurrent features include a predominant gestural component characterized by head positioning (either neutral or lowered) and head shaking. The head shaking feature is frequently repeated and can serve independently as a nonverbal marker of denial. We also observed that denial is often accompanied by open-hand gestures and a sitting (erect) posture; this posture is frequently observed in conjunction with denial but is also common in non-denial instances throughout the corpus. A certain degree of vagueness in speech patterns was reported as regards the way suspects refer to the victims (i.e., they are not named explicitly). As regards prosody, our findings indicate that expressions of denial tend to show a reduction in intensity following a confession or indictment (in four of the five suspects), whereas pitch does not vary in a consistent direction. Finally, the analysis of pauses reveals that a greater number of pauses typically occurs after incrimination.
Overall, we believe that this research may contribute to future studies on the multimodality of denial in legal settings, and to the limited literature in forensic linguistics and the broader academic discourse on this topic.