A Lexicon-Based Framework for Mining and Analysis of Arabic Comparative Sentences
Abstract
1. Introduction
- A comprehensive lexicon-based framework for the detailed mining and analysis of Arabic comparative sentences is proposed and evaluated.
- An algorithm for the identification of Arabic comparative sentences is proposed. The algorithm is referred to as the Arabic Comparative Sentence Identification (ACSI) algorithm.
- An algorithm for the identification of Arabic comparative sentence types is proposed. The algorithm is referred to as the Arabic Comparative Sentence Type Identification (ACSTI) algorithm. All types of Arabic comparative sentences are considered: the identified comparative sentences are classified into four types, namely non-equal gradable, equative, superlative, and non-gradable.
- An algorithm for the extraction of the relation between the entities in an Arabic comparative sentence is proposed. The algorithm is referred to as the Relation Extraction from Arabic Comparative Sentence (REACS) algorithm. REACS considers all the elements that form a relation: a relation word, a comparison feature, a first entity (entity 1), and a second entity (entity 2).
- Two algorithms for the extraction of the preferred entity are proposed. These algorithms are referred to as Preferred Entity Extraction from Arabic Non-Equal Comparative Sentence (PEEANCS) and Preferred Entity Extraction from Arabic Superlative Comparative Sentence (PEEASCS) algorithms.
- An Arabic comparative keyword lexicon is specifically developed to evaluate the proposed algorithms. This lexicon contains 649 Arabic comparative keywords that cover all the Arabic comparative sentence types.
2. Related Work
3. Proposed Framework
3.1. Arabic Comparative Keywords Lexicon
- A non-equal gradable relation expresses a greater-than or less-than relation, i.e., an ordering of two entities with respect to some of their features. An example Arabic sentence of this type is دراسة أوراكل أعمق من مايكروسوفت (studying Oracle is deeper than Microsoft).
- An equative relation states that two objects are equal with respect to some of their features. An example Arabic sentence of this type is الجامعتان نفس المستوى في التعليم (the two universities have the same level of education).
- A superlative relation ranks one object above or below all others. In Arabic, such a relation adds ال (the) to the comparison word, as in النادى الاهلى المصري الافضل في التاريخ (the Egyptian club Al-Ahly is the best in history).
- A non-gradable relation compares features of two or more objects but does not grade them. An example Arabic sentence of this type is تدريس الدكتور رشدي يختلف عن تدريس الدكتور عيسى (the teaching style of Doctor Roshdi differs from the teaching style of Doctor Esaa).
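The four relation types above suggest a simple keyword-to-type mapping. The following is a minimal sketch of such a lexicon, assuming whitespace tokenization and exact string matching; the names `ComparisonType`, `LEXICON`, and `keyword_types` are illustrative and do not come from the paper, and only one keyword per type is included.

```python
from enum import Enum
from typing import List


class ComparisonType(Enum):
    NON_EQUAL_GRADABLE = "non-equal gradable"
    EQUATIVE = "equative"
    SUPERLATIVE = "superlative"
    NON_GRADABLE = "non-gradable"


# Illustrative subset only: one keyword per type, taken from the examples above.
LEXICON = {
    "أعمق": ComparisonType.NON_EQUAL_GRADABLE,
    "نفس": ComparisonType.EQUATIVE,
    "الافضل": ComparisonType.SUPERLATIVE,
    "يختلف": ComparisonType.NON_GRADABLE,
}


def keyword_types(sentence: str) -> List[ComparisonType]:
    """Return the type of every lexicon keyword found in the sentence."""
    return [LEXICON[tok] for tok in sentence.split() if tok in LEXICON]


# The four example sentences from this section, paired with their types.
EXAMPLES = [
    ("دراسة أوراكل أعمق من مايكروسوفت", ComparisonType.NON_EQUAL_GRADABLE),
    ("الجامعتان نفس المستوى في التعليم", ComparisonType.EQUATIVE),
    ("النادى الاهلى المصري الافضل في التاريخ", ComparisonType.SUPERLATIVE),
    ("تدريس الدكتور رشدي يختلف عن تدريس الدكتور عيسى", ComparisonType.NON_GRADABLE),
]

for sentence, expected in EXAMPLES:
    print(expected.value, keyword_types(sentence) == [expected])
```

Keeping the type as the dictionary value means one lookup answers both "is this a comparison keyword?" and "which kind?", at the cost of ignoring context entirely.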
3.2. Data Collection
3.3. Data Preprocessing
3.4. Arabic Comparative Sentence Identification
Algorithm 1. Pseudo-code of the ACSI algorithm
Input: dataset of preprocessed sentences. Output: dataset of comparative sentences.
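The input and output stated for ACSI can be illustrated with a small sketch. It assumes that identification reduces to checking whether any token of a preprocessed sentence appears in the comparative keyword lexicon; this is an assumption about a lexicon-based approach in general, not a reproduction of the ACSI pseudo-code, and the function names and the tiny keyword set are hypothetical.

```python
from typing import Iterable, List, Set

# Hypothetical miniature lexicon; the real one holds several hundred keywords.
KEYWORDS: Set[str] = {"أعمق", "أفضل", "نفس", "الأفضل", "يختلف"}


def is_comparative(sentence: str, keywords: Set[str]) -> bool:
    """A sentence counts as comparative if any whitespace token is a keyword."""
    return any(token in keywords for token in sentence.split())


def identify_comparative_sentences(
    sentences: Iterable[str], keywords: Set[str]
) -> List[str]:
    """Filter a dataset of preprocessed sentences down to the comparative ones."""
    return [s for s in sentences if is_comparative(s, keywords)]


dataset = [
    "دراسة أوراكل أعمق من مايكروسوفت",  # contains the keyword أعمق
    "ذهب الولد إلى المدرسة",  # no comparative keyword
]
comparative = identify_comparative_sentences(dataset, KEYWORDS)
print(len(comparative))
```

A pure lookup of this kind is fast and transparent, but it inherits every ambiguity of the keyword list, which is why context-dependent words need separate handling.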
3.5. Arabic Comparative Sentence Type Identification
- Non-Equal Gradable Comparison Type: Relations of this type express an ordering of objects with regard to some of their features. A typical sentence of this type contains an Arabic comparative keyword in the form أفعل, which is derived from a verb of three letters. An example is دراسة أوراكل أعمق من مايكروسوفت (studying Oracle is deeper than Microsoft), where the comparative keyword appears directly in the sentence. If the verb contains more than three letters, the sentence instead contains the Arabic word اقل او اكثر (less or more), as in سعيد أكثر إجتهادا من أخيه (Said is more diligent than his brother).
- Equative Comparison Type: Relations of this type state that two objects are equal with respect to some of their features. An example of such a sentence is الجامعتان نفس المستوى فى التعليم (the two universities have the same level of education).
- Superlative Comparison Type: Relations of this type rank one object over all other objects. In Arabic, this type may or may not add ال to the comparison word. Examples of this type are الأهلى المصرى الأفضل فى العالم (the Egyptian club Al-Ahly is the best in the world) and رونالدو أفضل لاعب فى العالم (Ronaldo is the best player in the world).
- Non-gradable Comparison Type: Non-gradable comparative sentences compare features of two or more objects but do not grade them. There are three subtypes:
- Object A is similar to or different from object B with regard to some features. An example of this type is تدريس الدكتور رشدى يختلف عن تدريس الدكتور عيسى (the teaching style of Doctor Roshdi differs from the teaching style of Doctor Esaa).
- Object A has a feature f1, and object B has another feature f2 where f1 and f2 can substitute each other. An example of this type is الكمبيوتر المكتبى يستخدم سماعات خارجية أما اللاب توب يستخدم سماعات داخلية (the desktop computer uses external speakers while the laptop uses internal speakers).
- Object A has a certain feature, but object B does not have it. An example of this type is جوال أ يستخدم سماعات أذن وجوال ب لا يستخدم (mobile A uses headphones while mobile B does not).
Algorithm 2. Pseudo-code of the ACSTI algorithm
Input: dataset of comparative sentences. Output: dataset of comparative sentences with identified sentence types.
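The type descriptions in this section can be turned into a rough rule set. The sketch below is one possible reading, not the ACSTI pseudo-code: it assumes that an أفعل keyword followed later by من (than) signals a non-equal gradable sentence, that the same keyword without من signals a superlative, and that the keyword sets are small hypothetical subsets of the lexicon.

```python
from typing import Optional

# Hypothetical keyword subsets for illustration.
NON_EQUAL = {"أفضل", "أعمق", "أكثر"}
SUPERLATIVE = {"الأفضل", "الافضل"}
EQUATIVE = {"نفس"}
NON_GRADABLE = {"يختلف", "أما"}
THAN = "من"


def classify(sentence: str) -> Optional[str]:
    """Return a comparison type for the first keyword found, or None."""
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok in EQUATIVE:
            return "equative"
        if tok in NON_GRADABLE:
            return "non-gradable"
        if tok in SUPERLATIVE:
            return "superlative"
        if tok in NON_EQUAL:
            # Assumed rule: "من" after the keyword marks a two-entity comparison.
            if THAN in tokens[i + 1:]:
                return "non-equal gradable"
            return "superlative"
    return None


EXAMPLES = [
    ("دراسة أوراكل أعمق من مايكروسوفت", "non-equal gradable"),
    ("سعيد أكثر إجتهادا من أخيه", "non-equal gradable"),
    ("الجامعتان نفس المستوى فى التعليم", "equative"),
    ("الأهلى المصرى الأفضل فى العالم", "superlative"),
    ("رونالدو أفضل لاعب فى العالم", "superlative"),
    ("تدريس الدكتور رشدى يختلف عن تدريس الدكتور عيسى", "non-gradable"),
]

for sentence, expected in EXAMPLES:
    print(expected, classify(sentence) == expected)
```

Under these assumed rules, a superlative such as أحمد أفضل من عرفت فى حياتى, in which من means "whom" rather than "than", would be labeled non-equal gradable, so any rule keyed on من needs a disambiguation step.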
3.6. Relation Extraction
Algorithm 3. Pseudo-code of the REACS algorithm
Input: dataset of NonEqual comparative sentences. Output: extracted relations from the input dataset.
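To make the four relation elements concrete, the sketch below applies a naive positional heuristic to non-equal gradable sentences. It is an assumption for illustration and not the REACS rules: everything before the relation word is taken as entity 1, everything after من (than) as entity 2, and any words between the two as the feature. The `Relation` class and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Relation:
    relation_word: str
    entity1: str
    entity2: str
    feature: Optional[str]


def extract_non_equal_relation(
    sentence: str, keywords: Set[str], than: str = "من"
) -> Optional[Relation]:
    """Split a sentence around the first keyword that is followed by `than`."""
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok in keywords and than in tokens[i + 1:]:
            j = tokens.index(than, i + 1)
            # Words between the keyword and "من" are read as the feature.
            feature = " ".join(tokens[i + 1:j]) or None
            return Relation(
                relation_word=tok,
                entity1=" ".join(tokens[:i]),
                entity2=" ".join(tokens[j + 1:]),
                feature=feature,
            )
    return None


KEYWORDS = {"أعمق", "أكثر"}
with_feature = extract_non_equal_relation("سعيد أكثر إجتهادا من أخيه", KEYWORDS)
without_feature = extract_non_equal_relation("دراسة أوراكل أعمق من مايكروسوفت", KEYWORDS)
print(with_feature)
print(without_feature)
```

A purely positional split breaks as soon as word order varies, for instance when the relation word opens the sentence or when an entity is itself a clause.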
3.7. Preferred Entity Extraction
3.7.1. Non-Equal Gradable Type
Algorithm 4. Pseudo-code of the PEEANCS algorithm
Input: dataset of NonEqual comparative sentences. Output: extracted preferred entities from the input set.
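Because the lexicon separates non-equal gradable keywords by positive and negative sentiment, one plausible way to pick the preferred entity is by keyword polarity. The sketch below assumes exactly that: a positive keyword favors entity 1 and a negative keyword favors entity 2. This rule is an assumption made for illustration, not the PEEANCS pseudo-code, and it ignores negation; the keyword subsets and function name are hypothetical.

```python
from typing import Optional

# Hypothetical subsets of the positive and negative non-equal gradable keywords.
POSITIVE = {"أفضل", "أحسن", "أسرع"}
NEGATIVE = {"أقل", "أسوء", "أبطأ"}


def preferred_entity(relation_word: str, entity1: str, entity2: str) -> Optional[str]:
    """Pick the preferred entity from the polarity of the relation word."""
    if relation_word in POSITIVE:
        return entity1  # "A is better than B" prefers A
    if relation_word in NEGATIVE:
        return entity2  # "A is slower than B" prefers B
    return None  # unknown polarity: make no decision


print(preferred_entity("أحسن", "موبايل أ", "موبايل ب"))
print(preferred_entity("أبطأ", "موبايل أ", "موبايل ب"))
```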
3.7.2. Superlative Type
Algorithm 5. Pseudo-code of the PEEASCS algorithm
Input: dataset of Superlative comparative sentences. Output: extracted preferred entities from the input dataset.
4. Evaluation Metrics
5. Results and Discussion
5.1. Evaluation of ACSI Algorithm
Discussion
- The ACSI algorithm only considers words in Modern Standard Arabic, which differs in many of its words and meanings from Egyptian colloquial Arabic and from other Arabic dialects such as the Gulf dialect.
- Diacritics in Modern Standard Arabic can change the meaning of a word completely. For example, the word نَعم means "yes" in English, whereas نِعْمَ means "the best". If the correct diacritics were considered in the analysis, a sentence containing the latter would be identified as a superlative comparative sentence. The ACSI algorithm does not take the diacritics of words into account.
- The Arabic word لكن (but or however) may complicate some Arabic sentences and give them two different sentiments at the same time. An example is the Arabic sentence موبايل أ أحسن من ب لكن موبيل ب أحسن فى البطارية (mobile A is better than mobile B, but mobile B has a better battery). This sentence combines two comparative sentences, i.e., موبايل أ أحسن من موبيل ب and موبايل ب أحسن فى البطارية. The ACSI algorithm can identify each of these two sentences on its own; however, it cannot detect the relation between them.
- Some Arabic comparison keywords do not indicate a comparison at all, depending on the context of the sentence. For example, in the sentence لا يوجد أى أحد بالمكان (there is nobody in the place), the word أحد carries the sense of "nobody", not "the sharpest" as in the sentence هذا السيف من أحد السيوف (this sword is one of the sharpest swords). In the latter sentence, the word أحد (sharpest) is a comparison keyword, while in the former it is not. The ACSI algorithm cannot distinguish between the two uses.
- Some Arabic comparative keywords cannot be detected by the ACSI algorithm because extra characters are attached to the keyword. For example, the superlative comparative sentence ستم إعطاء جائزة لأفضل طالب فى المدرسة (a prize will be given to the best student in the school) is not identified by the ACSI algorithm, because the comparison keyword appears as لأفضل (to the best), with the Arabic character ل (to) attached to it. This limitation can be resolved by adding prefixed comparison keywords such as لأفضل (to the best) to the developed lexicon.
- An Arabic exclamatory sentence, such as !ما أسرع النزول (how quick the descent is!), can be falsely identified as a comparative sentence. The ACSI algorithm does not take into account the exclamation mark at the end of the sentence.
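Two of the limitations above lend themselves to small preprocessing rules. The sketch below shows one possible approach under stated assumptions: instead of adding prefixed forms to the lexicon, it also tries each token with a single leading proclitic removed, and it flags sentences that end with an exclamation mark so that they can be excluded. The proclitic list, the length threshold, and the function names are illustrative choices, not part of the ACSI algorithm.

```python
from typing import Iterator, Optional, Set

# Common single-letter Arabic proclitics (assumed list for illustration).
PROCLITICS = ("و", "ف", "ب", "ل", "ك")


def candidate_forms(token: str) -> Iterator[str]:
    """Yield the token itself, then the token without one leading proclitic."""
    yield token
    if len(token) > 3 and token[0] in PROCLITICS:
        yield token[1:]


def find_keyword(sentence: str, keywords: Set[str]) -> Optional[str]:
    """Return the first lexicon keyword matched by any candidate form."""
    for token in sentence.split():
        for form in candidate_forms(token):
            if form in keywords:
                return form
    return None


def looks_exclamatory(sentence: str) -> bool:
    """Flag sentences whose last character is an exclamation mark."""
    return sentence.strip().endswith("!")


KEYWORDS = {"أفضل", "أسرع"}
prefixed = "سيتم إعطاء جائزة لأفضل طالب فى المدرسة"
exclamation = "ما أسرع النزول" + "!"
print(find_keyword(prefixed, KEYWORDS))
print(looks_exclamatory(exclamation))
```

Stripping a letter can also create false matches when that letter belongs to the word stem, so the rule trades missed keywords for some extra noise compared with listing prefixed forms explicitly.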
5.2. Evaluation of ACSTI Algorithm
5.2.1. Discussion of Non-Gradable Sentence Type Results
5.2.2. Discussion of Superlative Sentence Type Results
5.2.3. Discussion of Equative Sentence Type Results
5.2.4. Discussion of Non-Equal Gradable Sentence Type Results
- Some words coincide with comparison keywords only because of misspelling. For example, in the sentence اللهم أرحم من رحلو عنآ دون ودآع (may God have mercy on those who left us without saying goodbye), the intended word is إرحم (have mercy), but it was written as أرحم (more merciful). As a result, the sentence is identified as a non-equal comparison sentence although it is not one.
- Some Arabic comparison keywords can carry a meaning that is completely different from the comparison meaning, such as the keyword أقل (tell) in the sentence قل لى من أصحابك، أقل لك من أنت (tell me who your friends are, and I will tell you who you are). To resolve this issue, the context of the sentence should be considered in the identification process.
- In some Arabic superlative sentences, such as أحمد أفضل من عرفت فى حياتى (Ahmed is the best person I have known in my life), the Arabic word من may cause the sentence to be falsely identified as a non-equal sentence. In such a sentence, the word من means "a person" (whom), not "than". Diacritics should be considered to obtain correct identification results.
5.3. Evaluation of REACS, PEEANCS and PEEASCS Algorithms
5.3.1. Discussion of REACS Algorithm Evaluation Results
- In some Arabic sentences, entity 1 and entity 2 each consist of more than one word. Although such an entity spans several words, it is treated as a single element in the relation extraction process. This leads to incorrectly extracted relation elements, i.e., entity 1, entity 2, and feature.
- Differences in sentence order can affect the correct extraction of relation components. In superlative sentences, for example, the relation word can appear at the beginning of the sentence: in أول رخصة (the first license), the relation word is أول (the first) and entity 1 is رخصة (license). In other sentences, the relation word appears in the middle and entity 1 at the beginning: in البلاد فى أضعف حال (the countries are in the weakest situation), the relation word is أضعف (the weakest) and entity 1 is البلاد (the countries). In a third case, أصعب ما فيها الطموح (the hardest thing about it is ambition), the relation word is أصعب (the hardest) and entity 1 is الطموح (ambition), which appears at the end of the sentence.
- Entity 1 can itself be a comparative expression and can appear at the end of the main comparative sentence, as in أن الأقرب للواقع يكون الأدرى بالمصلحة (the closest to reality is the most knowledgeable of the benefits). In this sentence, the relation word of the main comparative sentence is الأقرب (the closest) and entity 1 is الأدرى بالمصلحة (the most knowledgeable of the benefits).
- Correct relation extraction from some Arabic sentences depends on a precise understanding of the meaning and context of the sentence, as in أغنية أجمل سنين عمرنا (the song named "the most wonderful years in our life"). In this sentence, the relation word is أجمل (the most wonderful) and entity 1 is سنين عمرنا (the years of our life), not أغنية (the song).
5.3.2. Discussion of PEEANCS and PEEASCS Evaluation Results
6. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
ACSI | Arabic Comparative Sentence Identification |
ACSTI | Arabic Comparative Sentence Type Identification |
REACS | Relation Extraction from Arabic Comparative Sentence |
PEEANCS | Preferred Entity Extraction from Arabic Non-Equal Comparative Sentence |
PEEASCS | Preferred Entity Extraction from Arabic Superlative Comparative Sentence |
References
- El-Halees, A.M. Opinion mining from Arabic comparative sentences. In Proceedings of the 13th International Arab Conference on Information Technology ACIT, Balamand, Lebanon, 11–13 December 2012.
- Sakr, A.M.; Keshk, A.; Youssef, A. Analysis and Mining of Arabic Comparative Sentences: A Literature Review. IJCI Int. J. Comput. Inf. 2024, 11, 66–78.
- Alharbi, F.R.; Khan, M.B. Identifying comparative opinions in Arabic text in social media using machine learning techniques. SN Appl. Sci. 2019, 1, 1–13.
- El Defrawi, M.; Salah, M.; Abd Al-Aziz, A.; Eldin, A.S. Comparative relation extraction from Arabic opinions. Int. J. Comput. Sci. Inf Secur. 2017, 15, 230–235.
- Bach, N.X.; Van Pham, D.; Tai, N.D.; Phuong, T.M. Mining Vietnamese comparative sentences for sentiment analysis. In Proceedings of the 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE), Ho Chi Minh City, Vietnam, 8–10 October 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 162–167.
- Eldefrawi, M.M.; Elzanfaly, D.S.; Farhan, M.S.; Eldin, A.S. Sentiment analysis of Arabic comparative opinions. SN Appl. Sci. 2019, 1, 1–11.
- Al-Sawi, L.; Saad, I. Al-Murshid: A Guide to Modern Standard Arabic Grammar for the Intermediate Level; Amer Univ in Cairo Press: Cairo, Egypt, 2012.
- DoniaGamal, M.A.; El-Horbaty, E.S.M.; Salem, A.B. Opinion mining for Arabic dialects on twitter. Egypt. Comput. Sci. J. 2018, 42.
- Younis, U.; Asghar, M.Z.; Khan, A.; Khan, A.; Iqbal, J.; Jillani, N. Applying machine learning techniques for performing comparative opinion mining. Open Comput. Sci. 2020, 10, 461–477.
- Khan, A.; Younis, U.; Kundi, A.S.; Asghar, M.Z.; Ullah, I.; Aslam, N.; Ahmed, I. Sentiment classification of user reviews using supervised learning techniques with comparative opinion mining perspective. In Proceedings of the Advances in Computer Vision: Proceedings of the 2019 Computer Vision Conference (CVC), Las Vegas, NV, USA, 2–3 May 2019; Springer: Berlin/Heidelberg, Germany, 2020; Volume 21, pp. 23–29.
- Yang, S.; Ko, Y. Classifying Korean comparative sentences for comparison analysis. Nat. Lang. Eng. 2014, 20, 557–581.
- Liu, Q.; Huang, H.; Zhang, C.; Chen, Z.; Chen, J. Chinese comparative sentence identification based on the combination of rules and statistics. In Proceedings of the Advanced Data Mining and Applications: 9th International Conference, ADMA 2013, Hangzhou, China, 14–16 December 2013; Proceedings, Part II 9. Springer: Berlin/Heidelberg, Germany, 2013; pp. 300–310.
- Alotaibi, N.; Al-onazi, B.B.; Nour, M.K.; Mohamed, A.; Motwakel, A.; Mohammed, G.P.; Yaseen, I.; Rizwanullah, M. Political Optimizer with Probabilistic Neural Network-Based Arabic Comparative Opinion Mining. Intell. Autom. Soft Comput. 2023, 36, 3121–3137.
- Nabil, M.; Aly, M.; Atiya, A. Astd: Arabic sentiment tweets dataset. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 2515–2519.
- Martinez, A.R. Part-of-speech tagging. Wiley Interdiscip. Rev. Comput. Stat. 2012, 4, 107–113.
- Berrar, D. Bayes’ theorem and naive Bayes classifier. In Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics; Elsevier: Amsterdam, The Netherlands, 2018; Volume 403.
- Guo, G.; Wang, H.; Bell, D.; Bi, Y.; Greer, K. KNN model-based approach in classification. In Proceedings of the OTM Confederated International Conferences “On the Move to Meaningful Internet Systems”, Sicily, Italy, 3–7 November 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 986–996.
- Bonaccorso, G. Machine Learning Algorithms; Packt Publishing Ltd: Birmingham, UK, 2017.
- Sutton, C.; McCallum, A. An introduction to conditional random fields. Found. Trends® Mach. Learn. 2012, 4, 267–373.
- Wallach, H.M. Conditional Random Fields: An Introduction. Technical Reports (CIS). 2004; p. 22. Available online: https://www.inference.org.uk/hmw26/papers/crf_intro.pdf (accessed on 5 January 2025).
- Dalianis, H. Evaluation Metrics and Evaluation. In Clinical Text Mining: Secondary Use of Electronic Patient Records; Springer International Publishing: Cham, Switzerland, 2018; pp. 45–53.
- Bayazed, A.; Almagrabi, H.; Alahmadi, D.; Alghamdi, H. ACOM: Arabic Comparative Opinion Mining in Social Media Utilizing Word Embedding, Deep Learning Model & LLM-GPT. IEEE Access 2024, 12, 148741–148755.
- Setyanto, A.; Laksito, A.; Alarfaj, F.; Alreshoodi, M.; Oyong, I.; Hayaty, M.; Alomair, A.; Almusallam, N.; Kurniasari, L. Arabic language opinion mining based on long short-term memory (LSTM). Appl. Sci. 2022, 12, 4140.
- Antoun, W.; Baly, F.; Hajj, H. Arabert: Transformer-based model for arabic language understanding. arXiv 2020, arXiv:2003.00104.
- Abdul-Mageed, M.; Elmadany, A.; Nagoudi, E.M.B. ARBERT & MARBERT: Deep bidirectional transformers for Arabic. arXiv 2020, arXiv:2101.01785.
Reference | Approach Type | Comparative Sentence Identification | Comparative Sentence Type Identification | Relation Extraction | Preferred Entity Extraction |
---|---|---|---|---|---|
El-Halees [1] | Machine Learning | ✓ | x | x | x |
Alharbi and Khan [3] | Lexicon-Based | ✓ | x | x | x |
Eldefrawi et al. [4] | Machine Learning | x | x | ✓ | x |
Eldefrawi et al. [6] | Machine Learning | x | x | x | ✓ |
Alotaibi et al. [13] | Deep Learning | ✓ | x | x | x |
Proposed Work | Lexicon-Based | ✓ | ✓ | ✓ | ✓ |
Sentence Type | No. of Keywords | Keywords Considered |
---|---|---|
Non-equal gradable (positive sentiment) | 230 | أفضل أعمق أجمل أكثر أحسن أعرق أسرع أنعم أسعد أجدى أصدق أبقى أعدل أنور أحلى أكفأ أمجد أشرف أخلص أرق أصفى أقرأ أقرب أشجع أمتع أكتم أحفظ أودع أصبى أعرض أوسع أقبل أبر أرحم أقصى أنصف أطوع أجمد أخضع أزهد أكف أسهل أحب أسمع أفصح أقسط أقوم أمثل أصلح أزيد أولى ألطف أنفع أهدى أخف أثبت أول أمنع أقوى أميز أزهى أشبه أعسل أكرم أعلم أرفع أثمن أعظم أحق أكمل أقدم أحدث أعلى أكبر أغلى أطول أقصر أوضح ألين أقول أهيم أسير أذكى أسلم أعرف أسمى أروع أصح أجمع أظهر أصبر أهدء أحد أرجل أرشد أغنى أوفى أفتح أعز أطهر أحصى أحرص أنظف أبيض أخير أزهر أذخر أطيب أغور أسبق افضل اعمق اجمل اكثر احسن اعرق اسرع انعم اسعد اجدى اصدق ابقى اعدل انور احلى اكفأ امجد اشرف اخلص ارق اصفى اقرأ اقرب اشجع امتع اكتم احفظ اودع اصبى اعرض اوسع اقبل ابر ارحم اقصى انصف اطوع اجمد ازهد اكف اسهل احب اسمع افصح اقسط اقوم امثل اصلح ازيد اولى الطف انفع اهدى اخف اثبت اول امنع اقوى اميز ازهى اشبه اعسل اكرم اعلم ارفع اثمن اعظم احق اكمل اقدم احدث اعلى اكبر اغلى اطول اقصر اوضح الين اقول اهيم اسير اذكى اسلم اعرف اسمى اروع اصح اجمع اصبر اعجل اهدء احد ارجل ارشد اغنى اوفى افتح اعز اطهر احصى احرص ابيض اخير ازهر اذخر اطيب اغور اسبق أشيك اشيك انضف أنضف امرح أمرح أعجب اعجب أشهر اشهر أجدع اجدع |
Non-equal gradable (negative sentiment) | 67 | أقل أسوء أبطأ أخشن أبئس أظلم أسفل أقدر أبعد أبرد أحر أضيق أجحد أقسى أعصى أجمد أغمض أصعب أكفر أنقص أجهل أضر أثقل أخر أضعف أشذ أهزل أهبل أهون أمر أحمض أسود أموت أسمن أحقر أقبح أصغر أرخص أقصر أشد أبيع أغبى أبشع أدنى أشتى أشقى أبخل أفقر أصقع أذل أنكر أنجس أضل أقذر أشر أجشع اقل اسوء ابطأ اخشن ابئس اظلم اسفل اقدر ابعد ابرد احر |
Equative | 6 | نفس, متساوى, متساويين, متساوين, مطابق, نفسه |
Superlative (positive sentiment) | 236 | الأفضل الافضل الافضلان الأفضلان الأفضلون الافضلون الفضلى الفضليات الفضليان الأعمق الاعمق الأجمل الاجمل الأكثر الاكثر الأحسن الاحسن الأعرق الاعرق الأسرع الاسرع الأنعم الانعم الأسعد الاسعد الأجدى الاجدى الأصدق الاصدق الأبقى الابقى الأعدل الاعدل الأنور الانور الأحلى الاحلى الأكفأ الاكفأ الأمجد الامجد الأشرف الاشرف الأخلص الاخلص الأرق الارق الأصفى الاصفى الأقرب الاقرب الأشجع الاشجع الأدرى الادرى الأعلى الاعلى الأكبر الاكبر الأقصى الاقصى الأهم الاهم الأمتع الأمتع الأكتم الاكتم الأحفظ الاحفظ الأودع الاودع الأصبى الأعرض الأوسع الأبر الأرحم الأنصف الأطوع الأجمد الأخضع الأزهد الأكف الأغمض الأسهل الأحب الأفصح الأقسط الأمثل الأصلح الأزيد الأولى الألطف الأنفع الأهدى الأثقل الأول الأمنع الأقوى الاميز الازهى الاشبه الاعسل الاعوم الاكرم الاعلم الارفع الاثمن الاعظم الاحق الاكمل الاقدم الاحدث الاغلى الاطول الاوضح الاشد الالين الاقول الاهيم الاذكى الاسمى الاروع الاصح الاشتى الاظهر الاصبر الاهدء الارجل الارشد الاغنى الاوفى الافتح الاعز الاطهر الاحصى الاحرص الانظف الازهر الاذخر الاطيب الاغور الاسبق الاصبى الاعرض الاوسع الابر الارحم الانصف الاطوع الاجمد الاخضع الازهد الاكف الاغمض الاسهل الاحب الافصح الاقسط الامثل الاصلح الازيد الاولى الالطف الانفع الاهدى الاثقل الاول الاخر الامنع الاقوى الأميز الأزهى الأشبه الأعسل الأعوم الأكرم الأعلم الأرفع الأثمن الأعظم الأحق الأكمل الأقدم الأحدث الأغلى الأطول الأوضح الأشد الألين الأقول الأهيم الأذكى الأسمى الأروع الأصح الأشتى الأظهر الأصبر الأهدء الأرجل الأرشد الأغنى الأوفى الأفتح الأغمق الأعز الأطهر الأحصى الأحرص الأنظف الأزهر الأذخر الأطيب الأغور الأسبق الآخر الاشيك الاشيك الامرح الأمرح الأضخم الاضخم الاعجب الأعجب الأكثر الاكثر الاكتر الأكتر الاشهر الأشهر الأجدع الاجدع الاعجب الأعجب الانضف الأنضف |
Superlative (negative sentiment) | 113 | الأقل الاقل الأسوء الاسوء الأبطأ الابطأ الأخشن الاخشن الأبئس الابئس الأظلم الاظلم الأبعد الابعد الأبرد الابرد الأحر الاحر الأضيق الأجحد الأقسى الأعصى الأصعب الأكفر الأنقص الأجهل الأضر الأخف الأخر الأضعف الاشذ الاهزل الاخسر الاهبل الاهون الامر الاحمض الاسمن الاحقر الاقبح الأصغر الارخص الاقصر الاشد الابيع الاغبى الابشع الاعجل الاشقى الابخل الافقر الاصقع الاغمق الاذل الانكر الانجس الاضل الاخفى الاقذر الاشر الاجشع الاضيق الاجحد الاقسى الاعصى الاصعب الاكفر الانقص الاجهل الاضر الاخف الاخر الاضعف الأشذ الأهزل الأخسر الأهبل الأهون الأمر الأحمض الأسمن الأحقر الأقبح الأصغر الأرخص الأقصر الأبيع الأغبى الأبشع الأعجل الأشقى الأبخل الأفقر الأصقع الأغمق الأذل الأنكر الأنجس الأضل الأخفى الأقذر الأشر الأجشع الأصغر الآخر الاهطل الأهطل الاخطر الأخطر الاسوء الأسوء الاسوأ الأسوأ |
Non-gradable | 3 | يختلف, أما, اما |
Total | 649 |
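The lexicon lists many keywords twice, once with a hamza on the initial alef (for example أفضل) and once with a bare alef (افضل). A common alternative, shown here only as a sketch, is to normalize alef variants in both the lexicon and the input so that a single entry covers both spellings. The function name is hypothetical, and the choice of which characters to fold is an assumption.

```python
import re

# Alef with hamza above (U+0623), hamza below (U+0625), and madda (U+0622).
_ALEF_VARIANTS = re.compile("[\u0623\u0625\u0622]")
_BARE_ALEF = "\u0627"


def normalize_alef(text: str) -> str:
    """Fold hamza and madda alef variants into the bare alef."""
    return _ALEF_VARIANTS.sub(_BARE_ALEF, text)


# Both spellings of "better" collapse to a single normalized key.
with_hamza = "\u0623\u0641\u0636\u0644"  # أفضل
bare = "\u0627\u0641\u0636\u0644"  # افضل
normalized = {normalize_alef(with_hamza), normalize_alef(bare)}
print(len(normalized))
```

The trade-off is that folding also erases distinctions that matter elsewhere, such as أرحم (more merciful) versus إرحم (have mercy), so normalization can increase false matches for those pairs.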
Dataset | Total Number of Sentences | Number of Comparative Sentences | Number of Non-Comparative Sentences |
---|---|---|---|
Twitter(ASTD) [14] | 10,005 | 1345 | 8660 |
MSA | 100 | 70 | 30 |
Social Media | 501 | 217 | 284 |
Dataset | Number of Non-Equal Gradable Sentences | Number of Equative Sentences | Number of Superlative Sentences | Number of Non-Gradable Sentences | Sentences Not Classified |
---|---|---|---|---|---|
Twitter (ASTD) | 36 | 36 | 523 | 29 | 721 |
MSA | 24 | 6 | 29 | 3 | 8 |
Social Media | 20 | 10 | 166 | 20 | 1 |
Dataset | Precision | Recall | F-Score | Accuracy |
---|---|---|---|---|
Twitter (ASTD) | 84.5 | 83.49 | 83.99 | 95.72 |
MSA | 100 | 100 | 100 | 100 |
Social Media | 96.17 | 92.63 | 94.37 | 95.21 |
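The F-score column in the table above is consistent with the standard definition of the F-score as the harmonic mean of precision and recall, F = 2PR / (P + R). The short sketch below recomputes it from the reported precision and recall; the function name is illustrative, and the assumption is that the standard balanced F-score is the one used.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall (same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as reported for the ACSI algorithm.
reported = {
    "Twitter (ASTD)": (84.5, 83.49),
    "MSA": (100.0, 100.0),
    "Social Media": (96.17, 92.63),
}

for dataset, (p, r) in reported.items():
    print(dataset, round(f_score(p, r), 2))
```

Rounded to two decimals, the recomputed values match the reported 83.99, 100, and 94.37.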
Dataset | Sentence Type | Precision | Recall | F-Score | Accuracy |
---|---|---|---|---|---|
Twitter (ASTD) | Non-equal Gradable | 88.89 | 37.65 | 52.89 | 90.69 |
Twitter (ASTD) | Equative | 100 | 100 | 100 | 100 |
Twitter (ASTD) | Superlative | 87.57 | 99.13 | 92.99 | 88.94 |
Twitter (ASTD) | Non-gradable | 100 | 100 | 100 | 100 |
MSA | Non-equal Gradable | 100 | 64.86 | 78.69 | 82.67 |
MSA | Equative | 100 | 100 | 100 | 100 |
MSA | Superlative | 78.38 | 100 | 87.88 | 88.57 |
MSA | Non-gradable | 100 | 100 | 100 | 100 |
Social Media | Non-equal Gradable | 100 | 90.91 | 95.24 | 99.05 |
Social Media | Equative | 100 | 100 | 100 | 100 |
Social Media | Superlative | 95.78 | 100 | 97.85 | 96.76 |
Social Media | Non-gradable | 100 | 100 | 100 | 100 |
Dataset | Sentence Type | Relation Word | Feature | Entity1 | Entity2 | Preferred Entity |
---|---|---|---|---|---|---|
Twitter (ASTD) | Non-equal Gradable | 88.89 | 8.33 | 52.78 | 77.78 | 58.33 |
Twitter (ASTD) | Equative | 100 | 88.89 | 47.22 | 25 | N/A |
Twitter (ASTD) | Superlative | 82.79 | 31.36 | 52.01 | N/A | 50.67 |
Twitter (ASTD) | Non-gradable | 100 | N/A | 62.07 | 96.55 | N/A |
MSA | Non-equal Gradable | 100 | 95.83 | 100 | 100 | 87.5 |
MSA | Equative | 100 | 100 | 100 | 83.33 | N/A |
MSA | Superlative | 100 | 68.97 | 86.21 | N/A | 86.21 |
MSA | Non-gradable | 100 | N/A | 100 | 100 | N/A |
Social Media | Non-equal Gradable | 100 | 90 | 85 | 95 | 55 |
Social Media | Equative | 100 | 90 | 70 | 60 | N/A |
Social Media | Superlative | 95.6 | 82.5 | 41.9 | N/A | 50.6 |
Social Media | Non-gradable | 100 | N/A | 100 | 100 | N/A |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hamed, A.; Keshk, A.; Youssef, A. A Lexicon-Based Framework for Mining and Analysis of Arabic Comparative Sentences. Algorithms 2025, 18, 44. https://doi.org/10.3390/a18010044