1. Introduction
In the fast-evolving field of healthcare research [1, 2], the complexity of data, particularly within healthcare claims, mirrors the intricacies of biological systems. The ability to model and analyze this vast, multifaceted data is crucial for making informed decisions about patient care, diagnostics, and treatment pathways. Soft sets and their numerous extensions provide a valuable toolkit for addressing the uncertainty and variability prevalent in healthcare claims data, which encompasses details about treatments, providers, costs, and prescriptions [3].
These mathematical constructs, first introduced by Molodtsov in 1999 [4], offer a flexible framework for tackling imprecision. Healthcare claims data, where uncertainty is intrinsic, benefits from soft set theory, which models this uncertainty more effectively than classical statistical methods [5, 6]. Since their inception, soft sets have evolved significantly. Extensions like HyperSoft Sets, introduced by Smarandache in 2018 [7], and more recent advancements such as SuperHyperSoft Sets, IndetermSoft Sets, IndetermHyperSoft Sets, and TreeSoft Sets [8, 9, 10, 11] have been developed to address specific challenges in handling intricate relationships within healthcare data. In addition, the MultiSoft Sets contributed by Alkhazaleh in 2010 have further enriched these mathematical tools [12].
While the evolution of soft sets has been robust, with applications spreading across diverse fields, including bioinformatics, chemistry, and public health, healthcare remains a relatively underexplored area for these models. The paucity of studies applying soft set theory to healthcare claims data presents an opportunity for significant advancements.
A legitimate question arises: How can the application of soft set theory and its recent extensions in analyzing and modeling healthcare claims data contribute to improving diagnostics and personalized treatments?
Recent works have examined the fusion of soft set theory with fuzzy logic, yielding combinations like neutrosophic, picture fuzzy, and plithogenic soft sets, each contributing unique perspectives on handling uncertainty. TreeSoft Sets, for instance, offer promise for improving healthcare analytics in the era of Industry 4.0 [13], while IndetermSoft Sets are increasingly applied to real-world challenges in healthcare [14, 15]. However, further research is needed to explore how these methodologies can be combined with modern computational techniques to address complex real-world problems.
Future research should focus on refining these applications and addressing existing limitations, ensuring that soft set methodologies can be fully leveraged to enhance healthcare decision-making and improve patient outcomes.
2. Evolving Impact of Soft Sets in Healthcare Data Analysis
The ongoing exploration and application of soft sets and their extensions represent a significant advancement in the realm of data analysis, particularly for addressing real-world challenges within healthcare. Soft sets, with their ability to handle uncertainty, imprecision, and indeterminacy, offer a versatile framework for analyzing complex healthcare claims data. This evolving methodology has the potential to transform how we interpret and utilize healthcare information for improved diagnostics, treatment decisions, and resource allocation.
The fusion of soft set theory with complementary mathematical frameworks, such as fuzzy logic, paves the way for deeper insights into healthcare datasets. This convergence not only enhances our understanding of complex data patterns but also provides a foundation for the development of innovative tools and techniques that address the specific needs of healthcare researchers and practitioners.
Current Survey Mission
This paper seeks to explore and assess the evolution and application of soft sets and their extensions within the domain of healthcare claims data analysis. Our study addresses the inherent complexities, uncertainties, and interrelationships present in such datasets. The key contributions of our research are outlined as follows:
We provide a thorough examination of the development of soft sets and their extensions—such as HyperSoft Sets, SuperHyperSoft Sets, IndetermSoft Sets, IndetermHyperSoft Sets, and TreeSoft Sets—highlighting their relevance and utility in healthcare claims analysis. This review offers an in-depth understanding of how these extensions have evolved to handle complex, multi-attribute data in healthcare scenarios.
We present practical examples and case studies from healthcare claims data to demonstrate the real-world applicability of these soft set frameworks. These examples illustrate how soft set-based methodologies can be leveraged to improve decision-making, optimize treatment strategies, and enhance the analysis of healthcare claims by capturing uncertainties often overlooked by traditional statistical methods.
Our review emphasizes key methodological advancements made possible through the use of soft sets and their extensions. We show how these tools can improve the accuracy and efficiency of healthcare data analysis by addressing challenges such as missing information, imprecise relationships, and multi-dimensional dependencies. This study contrasts soft set-based methods with classical approaches to highlight their benefits.
Building on the advancements reviewed in this paper, we propose future research directions aimed at further enhancing data analysis in healthcare. Specifically, we suggest exploring the integration of soft sets with fuzzy logic and other computational techniques to improve predictive accuracy and develop personalized treatment models. We also identify opportunities for expanding soft set applications to other complex, data-intensive domains beyond healthcare.
3. Related Work
In this section, we provide a comprehensive overview of the contributions in the field, emphasizing their impact and relevance to the application of soft set theory in healthcare claims data analysis. In fact, soft set theory, applied to healthcare claims data, provides a flexible framework for analyzing the uncertainty and imprecision inherent in medical records. Consider a scenario where a patient’s diagnosis is uncertain due to incomplete information or conflicting test results. Traditional methods may struggle to handle such ambiguity, leading to inaccurate assessments or diagnoses.
However, by employing soft set theory, we can represent the uncertainty associated with each diagnosis or treatment option using membership functions. These membership functions assign degrees of certainty to various outcomes based on available evidence, allowing healthcare practitioners to make informed decisions despite incomplete or conflicting data.
For example, a soft set approach could be used to determine the likelihood of a patient having a particular condition based on their symptoms, medical history, and test results, even when some information is missing or contradictory. This flexibility makes soft set theory a valuable tool for analyzing healthcare claims data, improving diagnostic accuracy, and ultimately enhancing patient care.
The most notable contributions in this field are mentioned below.
1. Molodtsov’s seminal work laid the foundation for soft set theory, offering a novel approach to handling uncertainty and vagueness in data analysis [3]. This foundational work has been pivotal in subsequent research exploring various extensions and applications of soft sets in different domains, including healthcare claims data analysis.
2. In 2018, Smarandache introduced HyperSoft Sets, an extension designed to better handle multi-attribute decision-making processes. This extension has shown promise in dealing with the complex and multi-dimensional nature of healthcare claims datasets, providing a more nuanced framework for analysis [7].
3. The MultiSoft Set, introduced by Alkhazaleh and his team, expanded the versatility of soft sets by accommodating multiple parameters, making it particularly useful for applications in healthcare claims data where multiple factors need to be considered simultaneously. This work has significantly enriched the toolkit available for researchers and specialists in healthcare [10].
4. In 2022, Smarandache introduced IndetermSoft Sets and IndetermHyperSoft Sets, which address indeterminacy in data analysis. These extensions have been applied to real-world scenarios in healthcare, demonstrating their utility in dealing with uncertain and incomplete healthcare claims data [6, 9]. The next year, Smarandache proposed SuperHyperSoft Sets [15].
5. Convergence with Fuzzy Logic and its Extensions
The integration of soft set theory with fuzzy logic and its various extensions has formed a robust framework for managing the inherent fuzziness and uncertainty in healthcare claims data. P. K. Maji’s seminal work, exemplified by “Intuitionistic Fuzzy Soft Sets”, has played a pivotal role in this domain [16].
Furthermore, the foundational contributions of Lotfi A. Zadeh and other collaborators in fuzzy logic have paved the way for the amalgamation of fuzzy logic with soft set theory, notably documented in fuzzy set applications to pattern classification and clustering analysis [17] or decision analysis [18].
S. K. Samanta’s research on neutrosophic soft sets and their applications has significantly bolstered this convergence, offering invaluable insights into managing uncertainty in biomedical data analysis [19, 20].
Additionally, Florentin Smarandache’s exploration of neutrosophic sets, particularly showcased in 2020 [21], alongside collaborative endeavors with K. Atanassov on intuitionistic fuzzy sets [22], has greatly propelled the methodologies for extracting actionable insights from complex datasets [23]. The research conducted by M. Shabir and M. Naz on bipolar soft sets [24] and their fusion with fuzzy logic has contributed substantial insights into multi-criteria decision-making problems, further enhancing the analytical capabilities in healthcare contexts.
These advancements underscore the potential of integrating soft set theory and its extensions into healthcare data analysis, offering avenues for enhancing diagnostics and personalized treatments.
The adeptness of these mathematical constructs in handling uncertainty, multi-dimensionality, and indeterminacy aligns seamlessly with the intricacies inherent in healthcare claims datasets.
Consequently, delving into systematic applications of these tools to improve medical outcomes stands as an imperative avenue for future research.
Collectively, these studies underscore the dynamic evolution of soft set theory and its extensions, emphasizing their growing significance and versatility in the domain of healthcare claims data analysis. The ongoing research and development in this sphere hold the promise of unlocking novel possibilities for advancing diagnostics, therapeutics, and personalized medicine.
6. Recent Applications in Medical Image Analysis and Preventive Practices
Recent studies have highlighted the practical applications of soft set theory in medical image analysis. For instance, Dhanalakshmi and Bhaskaran explore the application of soft set methodologies to evaluate the degree of evidence in medical recommendations and assess factors influencing preventive practices in clinical images with indeterminate features [25].
Similarly, Yang and Zhao provide insights into the advantages and specific methods used in employing soft set theory for similar purposes [26]. Additionally, Khan and Gupta offer a detailed examination of soft set-based approaches in medical image analysis, focusing on their role in evaluating evidence in medical recommendations and analyzing factors influencing preventive practices in clinical images [27].
These applications underscore the relevance and adaptability of soft sets in contemporary healthcare research, particularly in the domain of medical image analysis and preventive practices.
7. The innovative work by Alqazzaz and Sallam explored the use of TreeSoft Sets combined with interval-valued neutrosophic sets, providing novel insights into data analysis within the context of Industry 4.0 [13]. This study demonstrates the evolving nature of soft set applications and their potential to address modern data challenges.
Given these advancements, it becomes evident that the integration of soft set theory and its extensions into healthcare claims data analysis holds significant potential for enhancing diagnostics and personalized treatments. The ability of these mathematical constructs to handle uncertainty, multi-dimensionality, and indeterminacy aligns well with the complexities inherent in healthcare claims datasets. Therefore, exploring how these tools can be systematically applied to improve medical outcomes is a compelling avenue for future research.
These studies collectively highlight the dynamic evolution of soft set theory and its extensions, showcasing their growing importance and versatility in the realm of healthcare claims data analysis.
The ongoing research and development in this field promise to unlock new possibilities for improving diagnostics, therapeutics, and personalized medicine.
4. Soft Sets Extensions
In this section, we delve into the various extensions of soft sets, each offering unique capabilities and applications within the realm of healthcare claims data analysis.
These extensions include the HyperSoft Set, SuperHyperSoft Set, Fuzzy-Extension-SuperHyperSoft Set, IndetermSoft Set, IndetermHyperSoft Set, and TreeSoft Set.
Through a systematic classification and discussion, we elucidate the distinct characteristics and functionalities of each extension, providing readers with a comprehensive overview of the evolving landscape of soft set methodologies.
We recall the definitions of soft set, HyperSoft Set, IndetermSoft Set, IndetermHyperSoft Set, and TreeSoft Set, including a few suggestive examples applied to healthcare claims data.
4.1. Soft Set
A soft set provides a flexible framework for modeling uncertain or imprecise information by associating each attribute with a set of possible elements from the universe of discourse. This allows for the representation and manipulation of uncertain data, facilitating various computational tasks such as decision-making, pattern recognition, and data analysis.
4.1.1. Definition
A soft set is a mathematical abstraction designed to encapsulate uncertainty and fuzziness inherent in data within a specific domain of discourse. Let us break down this definition:
Firstly, we define a universe of discourse, denoted as U, which encompasses all conceivable elements or entities relevant to the context under consideration. The power set of U, represented as P(U), comprises all possible subsets derived from the elements within the universe of discourse. Essentially, it represents the complete range of potential combinations or groupings of elements from U.
Next, we introduce a set of attributes, denoted as A, which serves to characterize the properties or features associated with the elements within the universe U. These attributes could represent any discernible traits, qualities, or characteristics relevant to the domain being studied.
Now, a soft set is formally defined as a pair (F, U), where F: A → P(U).
F represents a mapping function that associates each attribute in A with a subset of elements from the universe U. In other words, for every attribute within set A, there exists a corresponding subset of elements from the universe of discourse U, as determined by the mapping function F.
In summary, a soft set provides a structured framework for capturing and managing uncertainty by linking attributes to subsets of elements within a given universe of discourse. This enables the representation and manipulation of imprecise or indeterminate data, facilitating various computational tasks such as decision-making, pattern recognition, and data analysis within the specified domain.
4.1.2. Example
Let us define the universe of discourse U as a set of patients,
U = {Patient1, Patient2, Patient3, Patient4},
of which a subset represents the patients with specific conditions.
Now, let us consider an attribute a = medical condition, whose set A of attribute values consists of different medical conditions and includes hypertension and asthma. The soft set is then given by a mapping
F: A → P(U),
where P(U) represents the power set of U. For example,
F(asthma) = {Patient2, Patient3}.
This means that both Patient2 and Patient3 have been diagnosed with asthma.
This representation (Figure 1) allows us to capture complex relationships between patients and their conditions. It is particularly useful in healthcare claims data analysis because:
It can handle uncertainty: if a patient’s diagnosis is uncertain, we could represent it by associating the attribute with multiple patients or using fuzzy sets within the mapping.
It accommodates missing data: if we do not know whether a patient has a particular condition, we simply would not include them in the corresponding subset.
It facilitates pattern recognition: by looking at the mappings, we can easily see patterns like comorbidities (e.g., Patient 2 has both hypertension and asthma).
This soft set representation provides a flexible framework for analyzing healthcare claims data, allowing us to capture and manipulate uncertain or imprecise information effectively.
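The example above can be sketched in a few lines of Python, treating the soft set as a dictionary from attribute values to subsets of U. Only F(asthma) = {Patient2, Patient3} and the hypertension diagnosis of Patient2 come from the text; the remaining memberships and the helper names `conditions_of` and `comorbid` are assumptions made for illustration.

```python
# Universe of discourse: the four patients from the example.
U = {"Patient1", "Patient2", "Patient3", "Patient4"}

# Soft set F: attribute value -> subset of U.
# Only F("asthma") and Patient2's hypertension are taken from the text;
# every other membership below is hypothetical filler.
F = {
    "diabetes": {"Patient1"},                   # assumed
    "hypertension": {"Patient2", "Patient4"},   # Patient4 is assumed
    "asthma": {"Patient2", "Patient3"},
}

# Every image must be a subset of U, i.e. an element of P(U).
assert all(patients <= U for patients in F.values())


def conditions_of(patient):
    """Return the attribute values whose image contains the patient."""
    return {value for value, patients in F.items() if patient in patients}


def comorbid(value_a, value_b):
    """Return the patients that appear under both attribute values."""
    return F[value_a] & F[value_b]


print(sorted(conditions_of("Patient2")))    # ['asthma', 'hypertension']
print(comorbid("hypertension", "asthma"))   # {'Patient2'}
```

A patient whose status for a condition is unknown is simply absent from the corresponding subset, which is how this sketch mirrors the treatment of missing data described above.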
4.2. IndetermSoft Set
An IndetermSoft Set provides a flexible framework for modeling uncertain or imprecise information by associating each attribute with a set of possible elements from the universe of discourse. This enables the representation and manipulation of uncertain data, facilitating various computational tasks such as decision-making, pattern recognition, and data analysis.
4.2.1. Definition
An IndetermSoft Set expands upon the foundational principles of the classical soft set by accommodating indeterminate data, reflecting the inherent uncertainty and ambiguity prevalent in real-world scenarios. Let us dissect this definition:
We begin with the establishment of a universe of discourse, denoted as U, which encompasses all relevant elements or entities under consideration. Additionally, we identify a non-empty subset of U, denoted as H, and its corresponding powerset, P(H), which comprises all possible subsets derived from the elements within H.
Furthermore, we introduce an attribute, denoted as ‘a’, and a set of attribute-values, denoted as A.
The mapping function F: A → P(H) is designated as an IndetermSoft Set if one or more of the following conditions are met:
- (i) The set A exhibits some level of indeterminacy.
- (ii) The sets H or P(H) demonstrate indeterminacy.
- (iii) The function F itself contains elements of indeterminacy, indicating the presence of attribute-values for which the mapping is unclear, incomplete, conflicting, or non-unique.
IndetermSoft Sets, characterized by their capacity to handle indeterminate data, arise from real-world situations where information sources may provide approximate, uncertain, incomplete, or conflicting data. Rather than introducing indeterminacy artificially, such as in the classical soft set framework, the indeterminacy is identified within the data itself, reflecting the limitations and nuances of our world.
The term “Indeterm” signifies “Indeterminate”, encompassing attributes of uncertainty, conflict, incompleteness, or lack of uniqueness within the outcomes. This distinction prompts the consideration of determinate versus indeterminate operators, leading to the development of an IndetermSoft Algebra.
Smarandache’s contributions extend the concept further with the introduction of HyperSoft Sets, which involve multi-attribute functions, and subsequently, the hybridization of various soft set variants. These hybrids incorporate elements from crisp, fuzzy, intuitionistic fuzzy, neutrosophic, and other fuzzy extensions, as well as the plithogenic HyperSoft Set.
While the classical soft set relies on determinate functions with certain and unique values, the reality of our world often involves sources that provide indeterminate information due to a lack of knowledge or precision. Consequently, operators with varying degrees of indeterminacy are utilized to model such scenarios, acknowledging the inherent imprecision of our environment.
4.2.2. Example
Consider a dataset comprising healthcare claims from various patients.
- I. Indeterminacy with respect to the function F:
(1a) You inquire from a source:
—“Which patients have been diagnosed with diabetes?”
The source responds:
—“I’m uncertain; it could be patients Patient1 or Patient2”.
Thus, F(diabetes) = Patient1 or Patient2 (an indeterminate/uncertain response).
(1b) Another query:
—“And which patients have undergone surgery?”
The source replies:
—“I’m not certain; all I can confirm is that Patient5 has not had surgery because I have their records”.
Thus, F(surgery) = not Patient5 (again, an indeterminate/uncertain response).
(1c) Further inquiry:
—“Then, which patients have high blood pressure?”
The source asserts:
—“It’s either Patient8 or Patient9 for sure”.
Thus, F(high blood pressure) = either Patient8 or Patient9 (yet another indeterminate/uncertain response).
- II. Indeterminacy with respect to the set P of patients:
You ask the source:
—“How many patients are included in the dataset?”
The source replies:
—“I haven’t counted them, but I estimate the number to be between 100–120 patients”.
- III. Indeterminacy with respect to the set C of medical conditions:
You inquire:
—“What are all the medical conditions diagnosed in the patients?”
The source states:
—“I’m certain there are patients diagnosed with diabetes, high blood pressure, and heart disease, but I’m unsure if there are patients with other conditions”.
The IndetermSoft Set addresses the inherent indeterminacy present in healthcare claims data by introducing a flexible framework that accommodates varying degrees of uncertainty. Through the incorporation of indeterminacy measures, the IndetermSoft Set offers researchers the ability to effectively manage and quantify uncertainty, facilitating more robust decision-making processes and knowledge discovery.
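One possible way to record the indeterminate answers of this example is sketched below. The class `IndetermValue`, its three fields, and the four status labels are hypothetical modelling choices and not part of the IndetermSoft Set definition; the sketch only shows that "Patient1 or Patient2", "not Patient5", and "between 100 and 120 patients" can be stored without forcing a determinate subset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndetermValue:
    """What a source could (and could not) say about one image F(value)."""
    certain: frozenset = frozenset()    # patients known to belong
    excluded: frozenset = frozenset()   # patients known not to belong
    one_of: frozenset = frozenset()     # candidates named by the source, unresolved

    def status(self, patient):
        """Return 'yes', 'no', 'candidate' or 'unknown' for a patient."""
        if patient in self.certain:
            return "yes"
        if patient in self.excluded:
            return "no"
        if patient in self.one_of:
            return "candidate"
        return "unknown"


# The three answers (1a)-(1c) from the example.
F = {
    "diabetes": IndetermValue(one_of=frozenset({"Patient1", "Patient2"})),
    "surgery": IndetermValue(excluded=frozenset({"Patient5"})),
    "high blood pressure": IndetermValue(one_of=frozenset({"Patient8", "Patient9"})),
}

# Indeterminacy about the set of patients: only a range for its size is known.
patient_count_range = (100, 120)

print(F["surgery"].status("Patient5"))    # no
print(F["diabetes"].status("Patient1"))   # candidate
print(F["diabetes"].status("Patient7"))   # unknown
```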
4.3. HyperSoft Set
A HyperSoft Set provides a robust framework for modeling uncertain or imprecise information by associating each attribute with a collection of potential elements from the universe of discourse. This framework is designed to handle a wide range of data uncertainties, enabling effective decision-making, pattern recognition, and comprehensive data analysis.
4.3.1. Definition
The extension from soft sets to HyperSoft Sets (HS Sets) marks a significant advancement in modeling complex relationships by expanding the mapping function to accommodate multiple attributes.
Here is a breakdown.
Initially, the soft set concept is broadened into the realm of HyperSoft Sets by transitioning the mapping function F into a multi-attribute function. This transformation enables the representation of intricate relationships between elements within the universe of discourse.
Let us delve into the formal definition.
We begin with the universe of discourse, denoted as U, which encompasses all conceivable elements or entities, along with its powerset, P(U).
Next, we introduce n distinct attributes, denoted as a1, a2, …, an, for n ≥ 1. Each attribute is associated with a set of attribute values, denoted, respectively, as A1, A2, …, An, with Ai ∩ Aj = ∅, for i ≠ j, and i, j ∈ {1, 2, …, n}.
Notably, these attribute sets are pairwise disjoint, ensuring no overlap between them.
The pair (F, A1 × A2 × … × An) represents a HyperSoft Set over U, where F is a mapping defined on the Cartesian product A1 × A2 × … × An of the attribute-value sets.
Formally,
F: A1 × A2 × … × An → P(U)
signifies that for each combination of attribute values, there exists a corresponding subset of elements from U.
The introduction of HyperSoft Sets facilitates the exploration of complex relationships and interactions among multiple attributes within the universe of discourse. This extension opens avenues for the comprehensive analysis and modeling of intricate systems, spanning various domains and applications.
Moreover, Smarandache’s contributions have led to the hybridization of HyperSoft Sets with diverse frameworks, including crisp, fuzzy, intuitionistic fuzzy, neutrosophic, and other fuzzy extensions, as well as the plithogenic set. These hybrid models integrate elements from different mathematical paradigms, enhancing their adaptability and utility in addressing real-world complexities.
In essence, HyperSoft Sets offer a versatile and robust framework for modeling and analyzing complex systems characterized by multiple attributes, thereby facilitating informed decision-making and knowledge discovery across diverse domains.
4.3.2. Example
Let the attributes be
a1 = diagnosis,
a2 = treatment,
a3 = cost,
a4 = duration,
and their attributes’ values, respectively,
Diagnosis = A1 = {diabetes, heart condition, respiratory issue},
Treatment = A2 = {medication, surgery, therapy},
Cost = A3 = {low, medium, high},
Duration = A4 = {short-term, medium-term, long-term}.
Let the function be F: A1 × A2 × A3 × A4 → P(U).
Then, for example, consider a healthcare claims dataset with the following attributes:
Diagnosis: {Diabetes, Hypertension}
Treatment: {Medication, Therapy}
Cost: {Low, High}
Duration: {Short-term, Long-term}
We want to analyze claims that involve a diagnosis of diabetes, treatment with medication, low cost, and short-term duration.
Soft Set Representation:
In a soft set, we might represent the data as follows: F(Diagnosis, Treatment, Cost, Duration) where F(Diabetes, Medication, Low, Short-term) = {Claim1,Claim2}
This means that both Claim1 and Claim2 involve
A diagnosis of diabetes,
Medication as treatment,
Low cost,
Short-term duration.
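The multi-attribute mapping just described can be sketched as a dictionary keyed by tuples of attribute values. Only the value {Claim1, Claim2} is taken from the text; Claim3 and the helper `select` are hypothetical additions used to show how partial queries over the Cartesian product might look.

```python
from itertools import product

ATTRS = ("diagnosis", "treatment", "cost", "duration")
A1 = ("Diabetes", "Hypertension")
A2 = ("Medication", "Therapy")
A3 = ("Low", "High")
A4 = ("Short-term", "Long-term")

# F: A1 x A2 x A3 x A4 -> P(U); start with the empty set of claims everywhere.
F = {combo: set() for combo in product(A1, A2, A3, A4)}
F[("Diabetes", "Medication", "Low", "Short-term")] = {"Claim1", "Claim2"}  # from the text
F[("Diabetes", "Therapy", "High", "Long-term")] = {"Claim3"}               # hypothetical


def select(**fixed):
    """Union of F over every combination that matches the fixed attribute values."""
    result = set()
    for combo, claims in F.items():
        if all(combo[ATTRS.index(name)] == value for name, value in fixed.items()):
            result |= claims
    return result


print(len(F))                                              # 16
print(sorted(select(diagnosis="Diabetes")))                # ['Claim1', 'Claim2', 'Claim3']
print(sorted(select(cost="Low", duration="Short-term")))   # ['Claim1', 'Claim2']
```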
HyperSoft Set Extension:
The HyperSoft Set extends this by incorporating hyperparameters to refine the analysis. Let us introduce two hyperparameters, the weights w_{Diabetes,Medication} and w_{Low,Short-term}, attached to the (diagnosis, treatment) pair and to the (cost, duration) pair, respectively.
Enhanced Representation:
With these hyperparameters, the HyperSoft Set can be expressed as: F_Hyper(Diagnosis, Treatment, Cost, Duration)
Using the hyperparameters, we refine our dataset representation: EnhancedScore_{Diabetes,Medication,Low,Short-term} = w_{Diabetes,Medication} × Frequency_{Diabetes,Medication} + w_{Low,Short-term} × Frequency_{Low,Short-term}
Where
Frequency is the count of claims matching the respective attributes.
Enhanced Score combines these weights to provide a more nuanced view of how often and significantly these attributes co-occur in the dataset.
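As a rough illustration of the weighted score above, the following sketch counts matching claims and combines the two frequencies. The four claim records, the weight values 0.75 and 0.25, and the function names `frequency` and `enhanced_score` are all assumptions chosen for the example; the text does not prescribe how the weights should be set.

```python
# Hypothetical claim records; only the attribute names follow the text.
claims = [
    {"id": "Claim1", "diagnosis": "Diabetes", "treatment": "Medication",
     "cost": "Low", "duration": "Short-term"},
    {"id": "Claim2", "diagnosis": "Diabetes", "treatment": "Medication",
     "cost": "Low", "duration": "Short-term"},
    {"id": "Claim3", "diagnosis": "Diabetes", "treatment": "Medication",
     "cost": "High", "duration": "Long-term"},
    {"id": "Claim4", "diagnosis": "Hypertension", "treatment": "Therapy",
     "cost": "High", "duration": "Short-term"},
]


def frequency(records, **criteria):
    """Count the records whose fields equal all the given criteria."""
    return sum(all(r[k] == v for k, v in criteria.items()) for r in records)


def enhanced_score(records, w_diag_treat, w_cost_dur):
    """Weighted sum of the two pairwise frequencies, as in the formula above."""
    freq_diag_treat = frequency(records, diagnosis="Diabetes", treatment="Medication")
    freq_cost_dur = frequency(records, cost="Low", duration="Short-term")
    return w_diag_treat * freq_diag_treat + w_cost_dur * freq_cost_dur


# Assumed weights: 0.75 for (Diabetes, Medication), 0.25 for (Low, Short-term).
score = enhanced_score(claims, w_diag_treat=0.75, w_cost_dur=0.25)
print(score)  # 0.75 * 3 + 0.25 * 2 = 2.75
```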
Practical Implication:
By integrating hyperparameters, the HyperSoft Set allows for a more detailed and flexible analysis of healthcare claims data:
It captures complex relationships between attributes.
It adjusts the influence of these relationships based on predefined weights, leading to a more accurate and reliable representation of uncertainty.
It improves the decision-making process by providing insights into the significance of various attribute combinations.
Comparison to Classical Methods:
In classical statistical analysis, relationships are often considered in isolation or through basic frequency counts, which may not capture nuanced interactions. The HyperSoft Set, with its hyperparameters, offers a more sophisticated approach by incorporating these interactions into the analysis, enhancing the overall accuracy and interpretability of the results.
In essence, this example extends the earlier soft set example. The HyperSoft Set builds on the foundational principles of soft sets by incorporating hyperparameters that capture complex relationships and interactions within healthcare claims datasets.
This enables a more nuanced representation of uncertainty and thereby enhances the accuracy and reliability of data analysis and interpretation within the healthcare domain.
4.4. SuperHyperSoft Set
A SuperHyperSoft Set introduces an innovative framework for modeling complex and uncertain information, where each attribute is associated with an expansive set of potential elements from the universe of discourse. This advanced approach enables the comprehensive representation and manipulation of intricate data, facilitating advanced computational tasks including decision-making, pattern recognition, and data analysis at a highly refined level.
4.4.1. Definition
The SuperHyperSoft Set (SHS Set) is an extension of the HyperSoft Set. As for the SuperHyperAlgebra, SuperHyperGraph, SuperHyperTopology, and, in general, for SuperHyperStructure and neutrosophic SuperHyperStructure (that includes indeterminacy) in any field of knowledge, “Super” stands for working on the powersets (instead of sets) of the attribute value sets.
Let U be a universe of discourse, P(U) the powerset of U.
Let a1, a2, …, an, for n ≥ 1, be n distinct attributes, whose corresponding attribute values are, respectively, the sets A1, A2, …, An, with Ai ∩ Aj = ∅, for i ≠ j, and i, j ∈ {1, 2, …, n}.
Let P(A1), P(A2), …, P(An) be the powersets of the sets A1, A2, …, An, respectively. Then, the pair
(F, P(A1) × P(A2) × … × P(An)), where × denotes the Cartesian product, or
F: P(A1) × P(A2) × … × P(An) → P(U),
is called a SuperHyperSoft Set.
4.4.2. Example
If we define the function
F: P(A1) × P(A2) × P(A3) × P(A4) → P(U),
we get a SuperHyperSoft Set.
Let us consider a scenario involving healthcare claim data, extending the previous examples. Assume we have a dataset comprising healthcare claims, and we want to categorize them based on various attributes.
Let us define the attributes and their possible values as follows:
Attribute A1: Type of Treatment (e.g., Surgery, Medication, Therapy)
A1: {Surgery, Medication, Therapy}
Attribute A2: Diagnosis Code (e.g., Injury, Illness, Chronic Condition)
A2: {Injury, Illness, Chronic Condition}
Attribute A3: Patient Age Group (e.g., Child, Adult, Senior)
A3: {Child, Adult, Senior}
Attribute A4: Insurance Provider (e.g., Company A, Company B, Company C)
A4: {Company A, Company B, Company C}
Let the function
F: P(A1) × P(A2) × P(A3) × P(A4) → P(U)
map combinations of subsets of these attribute values to subsets of the set of healthcare claims U. For example,
F({Surgery, Medication}, {Injury, Illness}, {Adult}, {Company A, Company B}) = {claim1, claim2}
means that claims claim1 and claim2 involve either surgery or medication, are related to either injury or illness, are for adult patients, and are covered by either Company A or Company B insurance providers.
This SuperHyperSoft Set approach allows for a flexible categorization of healthcare claims, accommodating various combinations of treatment types, diagnoses, patient age groups, and insurance providers, reflecting the complexity and diversity of real-world healthcare scenarios.
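A minimal sketch of this example, assuming the mapping is stored as a dictionary whose keys are tuples of frozensets (one subset per attribute), is given below. The helper `key` and its non-empty check are illustrative assumptions; the claim identifiers follow the description above.

```python
# Attribute-value sets A1..A4 from the example.
A = (
    {"Surgery", "Medication", "Therapy"},          # A1: type of treatment
    {"Injury", "Illness", "Chronic Condition"},    # A2: diagnosis code
    {"Child", "Adult", "Senior"},                  # A3: patient age group
    {"Company A", "Company B", "Company C"},       # A4: insurance provider
)


def key(*subsets):
    """Build an element of P(A1) x ... x P(A4).

    Assumption: each chosen subset must be a non-empty subset of its Ai.
    """
    if len(subsets) != len(A):
        raise ValueError("one subset per attribute is required")
    for allowed, chosen in zip(A, subsets):
        if not chosen or not set(chosen) <= allowed:
            raise ValueError(f"{chosen!r} is not a non-empty subset of {allowed!r}")
    return tuple(frozenset(s) for s in subsets)


# F maps tuples of attribute-value subsets to sets of claims.
F = {
    key({"Surgery", "Medication"}, {"Injury", "Illness"}, {"Adult"},
        {"Company A", "Company B"}): {"claim1", "claim2"},
}

query = key({"Medication", "Surgery"}, {"Illness", "Injury"}, {"Adult"},
            {"Company B", "Company A"})
print(sorted(F[query]))  # ['claim1', 'claim2']
```

Using frozensets makes the lookup independent of the order in which the attribute values are listed, which matches the set-valued arguments of F.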
In fact, we state the following theorem: the SuperHyperSoft Set is equivalent to a union of HyperSoft Sets.
4.4.3. Demonstration
Let us consider the SuperHyperSoft Set:
F: P(A1) × P(A2) × … × P(An) → P(U).
Assume the non-empty sets
B1 ⊆ A1, B2 ⊆ A2, …, Bn ⊆ An and
F(B1, B2, …, Bn) ∈ P(U), with
B1 = {b11, b12, …}, B2 = {b21, b22, …}, …, Bn = {bn1, bn2, …}. Therefore,
F({b11, b12, …}, {b21, b22, …}, …, {bn1, bn2, …}) can be decomposed into many values F(b1k1, b2k2, …, bnkn), one for each choice of a single element from every Bi,
which are actually HS Sets.
Considering the attributes diagnosis, treatment, cost, and duration, we can derive the following 12 possibilities:
Diagnosis: diabetes, Treatment: medication, Cost: low, Duration: short-term;
Diagnosis: diabetes, Treatment: medication, Cost: low, Duration: medium-term;
Diagnosis: diabetes, Treatment: medication, Cost: low, Duration: long-term;
Diagnosis: diabetes, Treatment: medication, Cost: medium, Duration: short-term;
Diagnosis: diabetes, Treatment: medication, Cost: medium, Duration: medium-term;
Diagnosis: diabetes, Treatment: medication, Cost: medium, Duration: long-term;
Diagnosis: diabetes, Treatment: medication, Cost: high, Duration: short-term;
Diagnosis: diabetes, Treatment: medication, Cost: high, Duration: medium-term;
Diagnosis: diabetes, Treatment: medication, Cost: high, Duration: long-term;
Diagnosis: diabetes, Treatment: surgery, Cost: low, Duration: short-term;
Diagnosis: diabetes, Treatment: surgery, Cost: low, Duration: medium-term;
Diagnosis: diabetes, Treatment: surgery, Cost: low, Duration: long-term.
For each of these combinations, the function F yields the set of patients who meet these criteria, here represented by {x1, x2}. In total, the 12 combinations give 12 HyperSoft Sets.
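The decomposition used in the demonstration can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the attribute values are taken from the healthcare claims example above, while the dictionary `hypersoft`, the claim assignments in it, and the helper names `decompose` and `superhypersoft` are hypothetical and not part of the original formulation.

```python
from itertools import product

# Hypothetical HyperSoft mapping: one value per attribute -> set of claims.
# The claim assignments are invented for illustration only.
hypersoft = {
    ("surgery", "injury", "adult", "CompanyA"): {"claim1"},
    ("medication", "illness", "adult", "CompanyB"): {"claim2"},
    ("medication", "injury", "adult", "CompanyA"): set(),
}

def decompose(subsets):
    """List every single-valued tuple (b1, ..., bn) with bi taken from Bi."""
    return list(product(*(sorted(b) for b in subsets)))

def superhypersoft(subsets, hs_map):
    """Evaluate F(B1, ..., Bn) as the union of the HyperSoft values."""
    result = set()
    for combo in decompose(subsets):
        # Tuples missing from the mapping contribute no claims.
        result |= hs_map.get(combo, set())
    return result

# B1 x B2 x B3 x B4 from the claims example.
B = (
    {"surgery", "medication"},
    {"injury", "illness"},
    {"adult"},
    {"CompanyA", "CompanyB"},
)

combos = decompose(B)
print(len(combos))                           # 2 * 2 * 1 * 2 = 8 tuples
print(sorted(superhypersoft(B, hypersoft)))  # ['claim1', 'claim2']
```

Each tuple returned by `decompose` plays the role of one HyperSoft argument, so the number of HyperSoft Sets in the union is the product of the sizes of the chosen subsets.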
4.5. Fuzzy-Extension-SuperHyperSoft Set
A Fuzzy-Extension-SuperHyperSoft Set combines fuzzy logic (or any of its extensions) with SuperHyperSoft Set theory, providing a robust approach for modeling highly complex and uncertain information. Each combination of attribute-value subsets is associated with a set of elements from the universe of discourse, and each element carries a degree of appurtenance, allowing for a nuanced representation and manipulation of uncertain data. This supports computational tasks such as decision-making, pattern recognition, and data analysis, with the ability to handle fuzzy boundaries effectively.
4.5.1. Definition
F: P(A1) × P(A2) × … × P(An) → P(U(x(d0))), where x(d0) is the fuzzy or any fuzzy extension degree of appurtenance of the element x to the set U.
Fuzzy-Extensions mean all types of fuzzy sets [14], such as: fuzzy sets, intuitionistic fuzzy sets, inconsistent intuitionistic fuzzy sets (picture fuzzy sets, ternary fuzzy sets), Pythagorean fuzzy sets (Atanassov’s intuitionistic fuzzy set of second type), Fermatean fuzzy sets, q-Rung Orthopair fuzzy sets, spherical fuzzy sets, n-HyperSpherical fuzzy sets, neutrosophic sets, spherical neutrosophic sets, refined fuzzy/intuitionistic fuzzy/neutrosophic/other fuzzy extension sets, plithogenic sets, etc.
4.5.2. Example
In the previous example, considering the attributes diagnosis, treatment, cost, and duration, we can envision a neutrosophic SuperHyperSoft Set.
Let us assume
F({diabetes}, {medication}, {low}, {short-term}) = x1(0.7, 0.4, 0.1) and
F({diabetes}, {medication}, {low}, {medium-term}) = x2(0.9, 0.2, 0.3).
This would mean that x1, corresponding to the values ({diabetes}, {medication}, {low}, {short-term}), holds an appurtenance degree of 0.7, an indeterminacy degree of 0.4, and a non-appurtenance degree of 0.1.
Similarly, x2, associated with the values ({diabetes}, {medication}, {low}, {medium-term}), exhibits an appurtenance degree of 0.9, an indeterminacy degree of 0.2, and a non-appurtenance degree of 0.3.
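One possible way to store such a neutrosophic SuperHyperSoft Set is sketched below in Python. The two entries reproduce the values of the example; the data layout, the type aliases, and the helpers `key`, `is_valid`, and `members_above` are hypothetical choices made for illustration, not a prescribed implementation.

```python
from typing import Dict, FrozenSet, Tuple

# (appurtenance, indeterminacy, non-appurtenance) degrees of one element.
Degrees = Tuple[float, float, float]
# One subset of attribute values per attribute.
Key = Tuple[FrozenSet[str], ...]

def key(*subsets) -> Key:
    """Build a hashable argument (B1, ..., Bn) from plain sets."""
    return tuple(frozenset(s) for s in subsets)

# Values taken from the example above.
neutro_f: Dict[Key, Dict[str, Degrees]] = {
    key({"diabetes"}, {"medication"}, {"low"}, {"short-term"}): {"x1": (0.7, 0.4, 0.1)},
    key({"diabetes"}, {"medication"}, {"low"}, {"medium-term"}): {"x2": (0.9, 0.2, 0.3)},
}

def is_valid(degrees: Degrees) -> bool:
    """Each neutrosophic component is assumed to lie in [0, 1]."""
    return all(0.0 <= c <= 1.0 for c in degrees)

def members_above(f, k: Key, threshold: float):
    """Elements whose appurtenance degree is at least the threshold."""
    return sorted(x for x, (t, _, _) in f[k].items() if t >= threshold)

k2 = key({"diabetes"}, {"medication"}, {"low"}, {"medium-term"})
print(members_above(neutro_f, k2, 0.8))  # ['x2']
```

Using frozen sets as keys keeps the argument of F a tuple of subsets, as in the definition, while the triple attached to each element records its three degrees.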
4.6. IndetermHyperSoft Set
An IndetermHyperSoft Set builds upon the HyperSoft Set framework by incorporating advanced mechanisms for dealing with indeterminacy in data. Each attribute in this model is linked to a set of potential elements, similar to HyperSoft Sets, but with enhanced capabilities to manage and represent varying degrees of uncertainty. This extension facilitates more nuanced decision-making, pattern recognition, and data analysis, providing greater adaptability and precision in complex scenarios.
4.6.1. Definition
The IndetermHyperSoft Set is an extension of the HyperSoft Set that accommodates indeterminate data, functions, or sets. It is defined as follows:
We start with the universe of discourse, denoted as U, along with a non-empty subset H of U, and its powerset, P(H), which encompasses all possible subsets of H.
Next, we introduce n distinct attributes, denoted as a1, a2, …, an, for n ≥ 1.
Each attribute is associated with a set of attribute values, denoted, respectively, as A1, A2, …, An, with Ai ∩ Aj = ∅ for i ≠ j and i, j ∈ {1, 2, …, n}; that is, the attribute-value sets are pairwise disjoint.
Then, the pair (F, A1 × A2 × … × An), where F: A1 × A2 × … × An → P(H) represents an IndetermHyperSoft Set over U if at least one of the following conditions holds true:
- (i).
At least one of the attribute sets A1, A2, …, An has some indeterminacy;
- (ii).
The sets H or P(H) exhibit indeterminacy;
- (iii).
There exists at least one n-tuple (e1, e2, …, en) ∈ A1 × A2 × … × An such that F(e1, e2, …, en) = indeterminate (unclear, uncertain, conflicting, or not unique). In other words, F yields an indeterminate outcome for that tuple.
In essence, the IndetermHyperSoft Set extends the HyperSoft Set framework to accommodate situations where uncertainty or vagueness is present in the attribute sets, subsets, or the mapping function itself.
Moreover, the IndetermHyperSoft Set provides a flexible and adaptable approach for modeling and analyzing complex systems in which precise information may be lacking or uncertain. By incorporating indeterminate elements, functions, or sets, this extension enhances the applicability of the HyperSoft Set framework in real-world scenarios characterized by inherent uncertainty or ambiguity.
4.6.2. Example
Assume there are many patients in a hospital database.
- 1.
Indeterminacy with respect to the function F.
(1a) You ask a source:
—What patients have been diagnosed with diabetes and prescribed medication?
The source:
—I am not sure, I think it is either Patient1 or Patient2.
Therefore, F(diabetes, medication) = Patient1 or Patient2 (indeterminate/uncertain answer).
(1b) You ask again:
—But what patients have hypertension and are undergoing surgery?
The source:
—I do not know, the only thing I know is that Patient5 does not have hypertension and did not undergo surgery because I have checked their records.
Therefore, F(hypertension, surgery) = not Patient5 (again indeterminate/uncertain answer).
(1c) Another question you ask:
—Then what patients have asthma and are being treated with therapy?
The source:
—For sure, either Patient8 or Patient9.
Therefore, F(asthma, therapy) = either Patient8 or Patient9 (again indeterminate/uncertain answer).
- 2.
Indeterminacy with respect to the set P of patients.
You ask the source:
—How many patients are in the database?
The source:
—I never counted them, but I estimate their number to be between 100 and 120 patients.
- 3.
Indeterminacy with respect to the product set A1 × A2 × … × An of attributes.
You ask the source:
—What are all diagnoses and treatments of the patients?
The source:
—I know for sure that there are patients diagnosed with diabetes, hypertension, and asthma, but I do not know if there are patients with other diagnoses (?) About the treatments, I recall seeing many patients receiving medication, but I do not remember seeing patients undergoing surgery or therapy.
Combining the strengths of both the IndetermSoft Set and the HyperSoft Set, the IndetermHyperSoft Set provides a comprehensive framework for analyzing complex healthcare claims datasets characterized by both uncertainty and multiple attributes.
By integrating indeterminacy with multi-attribute mappings, this extension helps researchers uncover relationships and patterns within such data even when the attribute values, the set of patients, or the values of F are only partially known.
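The three answers (1a) to (1c) of the example can be encoded so that the indeterminacy is kept explicit rather than forced into an ordinary set. The following Python sketch is one possible encoding under our own assumptions: the classes `OneOf` and `NotIn`, the dictionary `answers`, and the helper functions are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class OneOf:
    """The answer is among the candidates, but it is unknown which (1a, 1c)."""
    candidates: FrozenSet[str]

@dataclass(frozen=True)
class NotIn:
    """Only an exclusion is known about the answer (1b)."""
    excluded: FrozenSet[str]

# Values of F for the three questions in the example.
answers = {
    ("diabetes", "medication"): OneOf(frozenset({"Patient1", "Patient2"})),
    ("hypertension", "surgery"): NotIn(frozenset({"Patient5"})),
    ("asthma", "therapy"): OneOf(frozenset({"Patient8", "Patient9"})),
}

# Indeterminate size of the patient set (item 2): between 100 and 120.
patient_count = range(100, 121)

def is_indeterminate(value) -> bool:
    """True when the stored value is not an ordinary (determinate) set."""
    return isinstance(value, (OneOf, NotIn))

def could_contain(value, patient: str) -> bool:
    """True if the recorded answer does not rule the patient out."""
    if isinstance(value, OneOf):
        return patient in value.candidates
    if isinstance(value, NotIn):
        return patient not in value.excluded
    return patient in value  # a plain set is a determinate answer

print(could_contain(answers[("hypertension", "surgery")], "Patient5"))  # False
```

Under this encoding, a mapping qualifies as indeterminate in the sense of condition (iii) as soon as one stored value is a `OneOf` or `NotIn` instance.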
4.7. TreeSoft Set
A TreeSoft Set introduces a structured framework for modeling uncertain or imprecise information, where each attribute is organized in a hierarchical tree-like structure, associating each node with a set of potential elements from the universe of discourse. This hierarchical approach enables the systematic representation and manipulation of uncertain data, facilitating various computational tasks such as decision-making, pattern recognition, and data analysis with a focus on hierarchical relationships and dependencies.
Definition
The TreeSoft Set is an extension that introduces a hierarchical structure to soft sets, providing a framework for modeling complex systems with multiple levels of attributes. It is defined as follows:
We begin with a universe of discourse, denoted as U, and a non-empty subset H of U, along with its powerset, P(H), which encompasses all possible subsets of H.
Next, we define a set of attributes, denoted as A, which consists of parameters, factors, and other relevant characteristics. This set is organized hierarchically into levels: first-level attributes A = {A1, A2, …, An}, for integer n ≥ 1, where A1, A2, …, An are considered attributes of first level (since they have one-digit indexes).
Each attribute Ai, 1 ≤ i ≤ n, is formed by sub-attributes:
A1 = {A1,1, A1,2, …},
A2 = {A2,1, A2,2, …},
…,
An = {An,1, An,2, …},
where the above Ai,j are sub-attributes (or attributes of second level), since they have two-digit indexes.
Again, each sub-attribute Ai,j is formed by sub-sub-attributes (or attributes of third level):
Ai,j,k.
And so on, with as much refinement as each application needs, up to sub-sub-…-sub-attributes (attributes of level m, having m digits in their indexes):
Ai1,i2,…,im.
This hierarchical structure forms a graph-tree, denoted as Tree(A), with A as the root node (level zero), followed by nodes at levels 1 to m, where m represents the maximum level of refinement. The leaves of this graph-tree are terminal nodes that have no descendants.
The TreeSoft Set, denoted as
F: P(Tree(A)) → P(H),
maps subsets of the graph-tree Tree(A) to subsets of H. The powerset P(Tree(A)) encompasses all possible subsets of the graph-tree.
All node sets of the TreeSoft Set of level m are
{Ai1 | i1 = 1, 2, …}, {Ai1,i2 | i1, i2 = 1, 2, …}, …, {Ai1,i2,…,im | i1, i2, …, im = 1, 2, …}.
The sets within the TreeSoft Set correspond to nodes at each level of the graph-tree: the first set consists of nodes at level 1, the second set consists of nodes at level 2, and so on, up to the last set comprising nodes at level m. If the graph-tree has only two levels (m = 2), then the TreeSoft Set simplifies to a MultiSoft Set [7].
In summary, the TreeSoft Set provides a structured approach for representing and analyzing complex systems with hierarchical attributes.
By incorporating a hierarchical organization, it enhances the flexibility and expressiveness of soft set-based methodologies, enabling more nuanced modeling and analysis of multi-level systems across various domains.
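To make the level structure of Tree(A) concrete, the sketch below stores a small attribute tree as a nested Python dictionary and collects its node sets level by level. The tree itself, with its index-style node names, and the helpers `nodes_by_level` and `leaves` are hypothetical and serve only to illustrate the definition.

```python
# Each key is a node; its value is a dictionary holding the node's children.
# The root A (level 0) is implicit: its children are the level-1 attributes.
tree = {
    "A1": {"A1,1": {}, "A1,2": {}},
    "A2": {"A2,1": {"A2,1,1": {}, "A2,1,2": {}}, "A2,2": {}},
}

def nodes_by_level(subtree, level=1, acc=None):
    """Group node names by their level in the graph-tree."""
    acc = {} if acc is None else acc
    for name, children in subtree.items():
        acc.setdefault(level, []).append(name)
        nodes_by_level(children, level + 1, acc)
    return acc

def leaves(subtree):
    """Terminal nodes, i.e. nodes without descendants."""
    out = []
    for name, children in subtree.items():
        out.extend(leaves(children) if children else [name])
    return out

levels = nodes_by_level(tree)
print(max(levels))   # 3, the maximum level of refinement m
print(levels[2])     # ['A1,1', 'A1,2', 'A2,1', 'A2,2']
print(leaves(tree))  # ['A1,1', 'A1,2', 'A2,1,1', 'A2,1,2', 'A2,2']
```

The number of indexes in a node name equals its level, which mirrors the indexing convention of the definition.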
An illustrative example of a classical tree is shown in Figure 2.
This tree has a root and three levels, as follows:
Level 0 (the root) is the node Attributes;
Level 1 is formed by the nodes: Diagnosis, Treatment;
Level 2 is formed by the nodes Diabetes, Cancer, Medication, and Surgery;
Level 3 is formed by the nodes Pills, Injections.
Let us consider p = {patient1, patient2, …, patient10}, abbreviated below as p1, p2, …, p10, to be a set of patients, and P(p) to be the power set of p.
The attributes are defined as follows: A = {A1, A2}, where A1 = Diagnosis and A2 = Treatment. Then,
A1 = {A11, A12} = {Diabetes, Cancer}
and
A2 = {A21, A22} = {Medication, Surgery}.
Let us further break down the second-level treatment nodes into specific treatments:
A21 = {A211, A212} = {Pills, Injections} for medication and A22 = {A221, A222} = {Chemotherapy, Radiation} for surgery.
Now, let us assume the function F has the following values:
F(Diabetes,Medication,Pills) = {p1,p2,p3,p4};
F(Diabetes,Medication,Injections) = {p5,p6};
F(Diabetes,Surgery,Chemotherapy) = {p7,p8};
F(Cancer,Surgery,Radiation) = {p9,p10}.
The TreeSoft Set introduces a hierarchical structure to soft set methodologies, enabling the representation and analysis of complex biological data in a hierarchical manner.
By organizing data into hierarchical trees, the TreeSoft Set facilitates the exploration of nested relationships and dependencies within healthcare claims datasets, offering insights into the hierarchical organization of biological systems.
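The values of F assumed in the example can be stored and queried as in the following Python sketch. The four entries come from the example; the aggregation rule in `query`, which returns the union of F over all listed paths containing the requested nodes, is our own assumption about how a query on a higher-level node might be answered, and the names `tree_f` and `query` are hypothetical.

```python
# Values of F from the example: (diagnosis, treatment, specific treatment).
tree_f = {
    ("Diabetes", "Medication", "Pills"): {"p1", "p2", "p3", "p4"},
    ("Diabetes", "Medication", "Injections"): {"p5", "p6"},
    ("Diabetes", "Surgery", "Chemotherapy"): {"p7", "p8"},
    ("Cancer", "Surgery", "Radiation"): {"p9", "p10"},
}

def query(f, *nodes):
    """Union of F over every stored path that contains all given nodes.

    Assumption: a query on a higher-level node (e.g. "Surgery") is
    answered by aggregating the values attached to its descendants.
    """
    result = set()
    for path, patients in f.items():
        if all(node in path for node in nodes):
            result |= patients
    return result

print(sorted(query(tree_f, "Diabetes", "Medication")))
# ['p1', 'p2', 'p3', 'p4', 'p5', 'p6']
print(sorted(query(tree_f, "Cancer", "Pills")))  # []
```

A query that names nodes from different levels, such as a diagnosis together with a treatment, therefore narrows the result to the patients found along the matching branches of the tree.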
6. Conclusions
The evolution and adoption of soft sets, along with their extensions—such as HyperSoft Sets, IndetermSoft Sets, IndetermHyperSoft Sets, and TreeSoft Sets—represent a significant advancement in computational methodologies, especially in healthcare claims data analysis. These extensions offer innovative ways to model and analyze complex datasets characterized by uncertainty, imprecision, and indeterminacy, which are prevalent in healthcare data.
In the context of bioinformatics, where data is diverse and frequently noisy or incomplete [
28,
29], the adaptability of soft sets proves invaluable. They offer researchers a structured way to manage inherent uncertainties in biological data, such as those arising from gene expression profiles, protein interactions, and metabolic pathways [
30,
31]. By embracing fuzziness and imprecision, soft sets empower researchers to perform more accurate and robust analyses, revealing deeper insights into biological systems.
In healthcare, soft sets enable a nuanced representation of relationships within datasets, capturing complexities that traditional statistical methods may overlook. Given the often incomplete and ambiguous nature of healthcare claims data, soft sets provide a flexible framework for handling such uncertainty, improving the accuracy of data interpretation and decision-making processes.
Additionally, soft sets integrate seamlessly with other computational techniques, such as fuzzy logic, further enhancing their utility in data analysis across various fields. This flexibility makes them essential tools not only in healthcare but also in broader domains where managing uncertainty is critical.
In conclusion, soft sets and their extensions present a powerful framework for addressing the intricacies of healthcare claims data. Their ability to manage uncertainty, imprecision, and complexity holds great promise for improving diagnostics, personalized treatments, and overall decision-making in healthcare. As future research continues to explore the integration of soft sets with emerging technologies, these methodologies will play an increasingly pivotal role in healthcare, bioinformatics, and beyond.