The present work evaluates the possibility of quantifying the interaction of each participant in clinical conversations, measuring each individual’s contributions and the evolution of group work. In network discussions, mechanisms that support cooperative diagnoses are also considered [11,12]. To successfully evaluate collaborative conversations, several aspects from different areas must be taken into account [13,14].
2.1. Indicators for Measuring Contributions in Clinical Diagnosis Cases
The goal of this work is to obtain a numerical representation of the value of each contribution in a clinical conversation. A relevant amount of valuable information has to be extracted from the words written in a conversation and represented numerically.
Besides using some classical indicators from collaborative environments, such as the number of words in a contribution or the effort each participant needs to construct a message, we have defined additional indicators that measure contextual information about contributions in a conversation, which may be available in some scenarios or tools. These additional indicators aim to measure the research done by a participant before writing a message, e.g., whether the user has inspected multimedia or images loaded in the discussion, or whether he or she has studied any research paper related to the case. We have also defined new indicators strongly related to the specific area of the discussions: health. Thus, in addition to common collaborative aspects, the specific biomedical knowledge appearing in discussions should be measured. Natural language processing (NLP) tools that extract medical knowledge based on standard terminologies are used to obtain the amount of clinical information provided in a contribution, taking into account the number of concepts that appear in it.
The indicators that we have defined to quantify the interaction in clinical conversations are: user participation, object manipulation, interaction variation, and previous research. Additionally, one indicator and two ratios for measuring participants’ activity have been defined: the diagnostic relevance indicator, the new concepts by contribution ratio, and the diagnostic correctness ratio. Each of these indicators and ratios is explained in the following subsections.
2.1.1. User Participation Indicator
The participation of users in textual communication or implementations on an artifact can be collected without variations derived from the environment or the domain [15,16]. Written text, for example, is a way of getting information from an environment. This is an important factor for knowing the level of performance of the individual in the construction of an artifact, in this case, the diagnosis or solution of the conversation.
The level of participation of the user is measured until the moment he or she stops talking or is interrupted by another person, as in a turn-taking system. A turn begins when a person starts talking alone. In summary, the most appropriate measure is the number of messages; that is, each message sent can be considered one episode of speaking. Counting the number of words is another option for quantifying participation, since words are separated by spaces. Thus:
Group conversation is the sum of all individual participations. If the goal is to obtain the amount of each participant, an average can be calculated as
In the context of our work, this aspect is reflected directly through the comments on each separate case. Obviously, the influence of the user in case A would not lead to a greater participation in case B.
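The message and word counts described above can be sketched as follows. This is a minimal illustration, not the formulas of the methodology: the `(author, text)` input format, the function name `participation`, and the per-participant share (messages by one participant divided by all messages in the case) are assumptions made for the example.

```python
from collections import Counter


def participation(messages):
    """Count messages and words per participant for a single case.

    `messages` is a list of (author, text) tuples (hypothetical format).
    Returns per-author message counts, word counts, and the share of
    all messages in the case written by each author.
    """
    msg_count = Counter()
    word_count = Counter()
    for author, text in messages:
        msg_count[author] += 1
        # Words are separated by whitespace, as noted in the text.
        word_count[author] += len(text.split())
    total_msgs = sum(msg_count.values())
    share = {a: n / total_msgs for a, n in msg_count.items()} if total_msgs else {}
    return msg_count, word_count, share


# Invented example conversation for one case.
example_case = [
    ("dr_a", "Patient presents fever and cough"),
    ("dr_b", "Any chest X-ray available?"),
    ("dr_a", "Yes, see attached image"),
]
messages_per_user, words_per_user, share_per_user = participation(example_case)
```

Counts are kept per case, in line with the remark that participation in one case does not carry over to another.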
2.1.2. Object Manipulation Indicator
The amount of work performed by a user within an online conversation application is by itself a good indicator of his or her interest in completing the task [17]. As with the participation indicator described above, this measure allows us to establish whether users are following the case and in what manner. We define the manipulation of objects by
In practical terms with a discussion tool, this manipulation of the objects is given by the use of the attached media and links in the cases up for discussion.
In a wider domain, it would also be possible to measure the level of user interaction with other elements of the collaborative application that do not refer exclusively to clinical cases, e.g., changes in the user profile, documentation, and searches in cases.
2.1.3. Interaction Variation Indicator
Another factor that defines the richness of a discussion is the variation of the interaction [18]. A user can actively participate in a specific discussion even if he or she has never discussed other cases.
This can occur for a variety of reasons, such as the number of cases related to a patient’s illness or the professional’s specialty. On the other hand, it may be a warning about the quality of the remaining cases. If they are more complete, they can discourage other users from participating.
A derived measurement of this aspect would be the user’s participation during a certain period. This could indicate the user’s level of loyalty to the application.
2.1.4. Previous Research Indicator
Adequate preliminary research can accelerate clinical case discussions and make them more efficient. When preliminary research is neglected, participants can be discouraged and the quality of the final result compromised. The preliminary investigation process can benefit both the author of a case and the participants in the interaction.
In the context of the methodology, the following indicator quantifies the number of cases that the user and the group access in order to calculate the manipulation of clinical cases in the public database repository
As discussed, the interaction occurs through the action of consulting previous cases of the repository.
The number of related cases investigated by participants can be difficult to measure, and it is limited to the work done inside the health social network or collaborative platform used. Thus, for the implementation of this indicator, the inclusion of a contextualization method is recommended. Contextualization would provide links to public repositories such as PubMed, which provides free access to MEDLINE [19,20], as well as magazines and e-books. MEDLINE is the National Library of Medicine (NLM) journal citation database. It includes more than 26 million citations and abstracts from the biomedical literature. Contextualization of clinical cases implies using natural language processing techniques on the case title and description in order to extract the medical concepts that appear and finally provide links to publications containing those concepts [10,20]. The implementation of this indicator could be made even more complete, e.g., by measuring the time spent by a participant on each previous case through supervision of the web session. This would increase the degree of certainty about the participant’s work in reading each previous related case.
If no contextualization methods are available in the collaborative environment, the indicator would need an activity report, filled in by the participant, containing citations of the research performed to support his or her contribution.
2.1.5. Diagnostic Relevance Indicator
Current indicators can measure the interaction of a user within a collaborative decision. The usefulness of the suggestions or interventions of participants, however, is usually not only due to the number of words or interventions performed. For this reason, we have defined a new specific indicator within the collaborative scope among health professionals: the diagnostic relevance indicator.
Through the use of automatic annotations of the contributions and diagnostic proposals of the participants using natural language processing techniques, we can quantify the clinical contributions (diagnoses and symptoms) that appear in an interaction that have not appeared previously in the case
The moderation of clinical cases requires experience related to the specialty being discussed. In cases where the area of action is still undefined, or when managing a multidisciplinary case, this aspect is even more critical, since more than one moderator may be needed.
Each contribution to a case should be annotated using standard vocabularies and the concepts that have not appeared before in the case, either in questions or affirmations, should also be associated with each contribution. A possible variation would be to divide these contributions by the type of clinical concept used, whether it is a diagnosis (or suggestion thereof), or rather concerns diagnostic results or symptoms. As a complement, the indicator could culminate in a ranking.
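The association of previously unseen concepts with each contribution can be sketched as follows. This is an illustration under stated assumptions: the concept sets are taken as already produced by an annotator based on standard vocabularies (plain strings stand in for vocabulary codes here), and the function name and input format are hypothetical.

```python
def new_concepts_per_contribution(case_concepts, contributions):
    """Attach to each contribution the concepts not yet seen in the case.

    `case_concepts`: concepts annotated in the case title and description.
    `contributions`: chronological list of (author, concepts) pairs.
    Both formats are assumptions made for this sketch.
    """
    seen = set(case_concepts)
    result = []
    for author, concepts in contributions:
        fresh = set(concepts) - seen   # concepts appearing for the first time
        result.append((author, sorted(fresh)))
        seen |= fresh                  # later repetitions no longer count as new
    return result


# Invented example: two participants, three contributions.
case = {"fever", "cough"}
thread = [
    ("dr_a", {"fever", "pneumonia"}),
    ("dr_b", {"pneumonia", "dyspnea", "bronchitis"}),
    ("dr_a", {"cough"}),
]
new_by_contribution = new_concepts_per_contribution(case, thread)
```

Splitting the result by concept type (diagnosis, diagnostic result, symptom), as suggested in the text, would only require the annotator to tag each concept with its type.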
2.1.6. New Concepts by Contribution Ratio
Based on the indicators explained above, derived indicators can be obtained, such as the percentage of new diagnostics by contribution to the total of an intervention.
In the context of the presented methodology, this factor reflects the diagnostic suggestions or contributions. This ratio could help to characterize the participants, e.g., whether they are brief in their messages but provide information or, on the contrary, whether their contributions are long but add no new clinical suggestions or information.
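A possible reading of this ratio is sketched below. The definition used here (new concepts introduced by a participant divided by the number of his or her contributions) is an assumption made for illustration, as are the function name and the input format.

```python
from collections import defaultdict


def new_concepts_ratio(new_counts):
    """Average number of new concepts per contribution, by participant.

    `new_counts` is a list of (author, number_of_new_concepts) pairs,
    one pair per contribution (hypothetical format).
    """
    total_new = defaultdict(int)
    n_contributions = defaultdict(int)
    for author, k in new_counts:
        total_new[author] += k
        n_contributions[author] += 1
    return {a: total_new[a] / n_contributions[a] for a in n_contributions}


# Invented example: dr_a wrote two messages, dr_b one.
ratios = new_concepts_ratio([("dr_a", 1), ("dr_b", 2), ("dr_a", 0)])
```

A high value with few words suggests a brief but informative participant; a low value with many words suggests the opposite profile.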
2.1.7. Diagnostic Correctness Ratio
Because cases, as discussed in the previous subsections, must be concluded by a moderator, the participant who suggested the final diagnosis in the case can be marked.
This ratio, similarly to the previous one, could be used to characterize the participants of the discussions by creating a rank of expertise in the discussion application.
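As a sketch of how such a ratio might be computed, the example below divides the number of cases in which a participant was marked as having suggested the final diagnosis by the number of cases in which he or she participated. This definition, the dictionary keys, and the function name are assumptions for illustration only.

```python
from collections import Counter


def diagnostic_correctness_ratio(cases):
    """Share of a participant's cases in which they were credited.

    Each case is a dict with `participants` (a set of names) and
    `credited` (the participant marked by the moderator, or None).
    These keys are hypothetical.
    """
    participated = Counter()
    credited = Counter()
    for case in cases:
        for name in case["participants"]:
            participated[name] += 1
        if case["credited"] is not None:
            credited[case["credited"]] += 1
    # Counter returns 0 for participants who were never credited.
    return {name: credited[name] / participated[name] for name in participated}


# Invented example with three closed cases.
closed_cases = [
    {"participants": {"dr_a", "dr_b"}, "credited": "dr_a"},
    {"participants": {"dr_a", "dr_c"}, "credited": "dr_c"},
    {"participants": {"dr_b"}, "credited": None},
]
correctness = diagnostic_correctness_ratio(closed_cases)
```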
Table 1 summarizes indicators and ratios that have been defined in the methodology.
2.2. Application of Indicators for Measuring Relevant Information in Clinical Diagnosis Cases and Conversations
Once the characteristics to be measured have been concretized into a set of indicators, it is necessary to define how these indicators must be used in clinical discussions for different phases of the interaction evaluation. Those different stages are depicted in Figure 1.
2.2.1. Data Collection
The initial step in measuring the interaction of participants in a collaborative diagnosis is to obtain the data needed to calculate the indicators. Theoretically, data collection could be applied during a face-to-face discussion, using recordings and a supervisor who manually gathers the defined indicators. The defined indicators could also be applied partially to an ‘a posteriori’ discussion, or in those cases in which not all defined indicators can be collected, by using only the texts of the interactions.
2.2.2. Calculation of Indicators
To validate the relevance of the defined indicators, experiments were performed considering two discussion tools. The expected result is a similar pattern for both.
In the construction of the methodology, two forms of calculation were tested. Even when compensated with standard deviations and variances, they made measurement difficult. To make the measurements more reliable in the context of a discussion, we used coefficients based on two types of comparison. In the first comparison, the basis is the sum of the measurement values, which is commonly used for group measurements. For example, the group interaction variation indicator can use this basis for the coefficient. In the second comparison, the basis is the highest value, which is normally used for individual measurements. For example, the individual object manipulation indicator can use this basis for the coefficient.
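The two bases for the coefficients can be illustrated with a short sketch. The function names and the dictionary input are hypothetical, and the handling of an all-zero indicator (returning 0.0) is an assumption added so the example never divides by zero.

```python
def coefficient_by_sum(values):
    """Divide each value by the sum of all values (group-oriented basis)."""
    total = sum(values.values())
    return {k: (v / total if total else 0.0) for k, v in values.items()}


def coefficient_by_max(values):
    """Divide each value by the highest value (individual-oriented basis)."""
    top = max(values.values(), default=0)
    return {k: (v / top if top else 0.0) for k, v in values.items()}


# Invented raw indicator values for three participants.
raw = {"dr_a": 6, "dr_b": 3, "dr_c": 1}
by_sum = coefficient_by_sum(raw)
by_max = coefficient_by_max(raw)
```

With the sum as basis the coefficients add up to 1 and read as shares of the group total; with the maximum as basis the most active participant gets 1 and the others are expressed relative to him or her.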
2.2.3. Characterization of Discussion
The proposed methodology can be applied to both synchronous and asynchronous discussions, although the latter format favors a greater inflow of new participants and contributes to a more lasting and effective interaction.
To contribute to the automation of the process, the discussion occurs practically without the intervention of a moderator. In specific experimental situations in which the discussion may stray far from the expected objective, the moderator may arbitrate, e.g., by suggesting that the reasoning and the discussion be directed in a different way. Despite this concern, the aim of the experiment is to let the discussion develop naturally, without intervention. Ultimately, the discussion functions similarly to a forum; however, a number of mechanisms are available to evaluate more precisely the particularities inherent in the interaction between the participants.
2.2.4. Evaluation of Participants and Discussion
The evaluation of a participant is obtained from the measurement indicators and should be presented based on the results of each coefficient. To quantitatively characterize the conversational style of participants in a discussion, tables of indicators are used by ordering the participants’ values for each indicator. The conversational style is evaluated using three aspects:
- (i) Participatory style, based on three indicators: (i.1) user participation (using the individual participation rate in the case), (i.2) object manipulation, and (i.3) interaction variation.
- (ii) The interest of a participant in reviewing literature about the case, as measured by the previous research indicator.
- (iii) Clinical contribution, which estimates participant interactions specifically related to clinical knowledge, using the diagnostic relevance indicator.
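The ordering of participants in the tables of indicators can be sketched as follows. The table layout (indicator name mapped to per-participant coefficients), the function name, and the alphabetical tie-break are assumptions for illustration; the values are invented.

```python
def rank_participants(table):
    """Order participants from highest to lowest value for each indicator.

    `table` maps an indicator name to {participant: coefficient}.
    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    return {
        indicator: sorted(values, key=lambda p: (-values[p], p))
        for indicator, values in table.items()
    }


# Invented coefficients for two of the indicators.
indicator_table = {
    "user participation": {"dr_a": 0.6, "dr_b": 0.3, "dr_c": 0.1},
    "previous research": {"dr_a": 0.2, "dr_b": 0.5, "dr_c": 0.5},
}
rankings = rank_participants(indicator_table)
```

Reading the orderings side by side gives a quick profile of each participant, for example one who writes most of the messages but consults little literature.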
2.2.5. Conversation Transcription
The messages that compose the discussion are translated and printed so that the expert can read and interpret them on paper. In addition to the messages, the transcription contains all available information about manipulated images, cases accessed, and investigations accessed by participants in the conversation.
2.2.6. Expert Assessment
To evaluate the numerical results obtained using the methodology indicators, a clinical expert (external to the cases) was asked to describe the participation of the members involved in the case. Due to the difficulty of finding experts available and willing to contribute, it was decided to carry out the assessment with a single expert with broad experience in the illnesses treated in the cases. The expert is a clinical doctor in general medicine, 58 years old, and active in clinical practice.
In particular, the expert considered the work dynamics during the interactions without being shown the indicators collected from the conversations in the cases. To define the work dynamics, the expert provided information regarding user participation, the value and impact of the contributions of each user, and the relevance of his or her previous research (access to objects such as images, and literature review).
The expert reviews the transcription of the conversation (Section 2.2.5) and freely makes comments about the interventions of participants. Afterwards, the assessment of the expert is compared with the evaluation of participants (Section 2.2.4) for validation, as displayed in Figure 1.
2.3. Platforms Used in the Experiments
To test the validity of the proposed methodology, two clinical online platforms have been selected: Sanar [21] and PatientsLikeMe.
Sanar is a collaborative diagnosis web tool, where each member of the group can use a web browser to create and comment on health cases. It was designed to support the building of medical knowledge among health professionals in a global context. The main idea is to support health professionals in sharing knowledge about clinical cases, resulting in more informed opinions about the case and, consequently, allowing a more consistent diagnosis.
The Sanar platform was selected primarily because it could be modified to implement all the proposed indicators. Experiments using Sanar were the basis for analyzing the relevance of the indicators proposed in the methodology, with a focus on arriving at a common diagnosis in a real environment among clinical professionals. The experiments are also important to identify what helps and what hinders the development of a discussion, which can serve as a basis for reinforcing the relevance of each indicator for a more efficient interaction.
PatientsLikeMe is a well-known health social platform designed specifically for interactions among patients with a common disease. In this system, patients report their health information, which is presented as a coherent graphic sample in their profile. Member profiles are posted where other members can access them, providing a basis for the exchange of passive information and active dialogue among patients. Internal implementation and databases of PatientsLikeMe were not available; thus, only some of the proposed indicators can be applied.