Structured Life Narratives: Building Life Story Hierarchies with Graph-Enhanced Event Feature Refinement
Abstract
1. Introduction
- LSH-GEFR builds life story hierarchies that amalgamate identical events into one event node and classify related events into one topic node. The root node, the topic node, and the event node are three different types of nodes that LSH-GEFR uses to organize life stories into hierarchies. These nodes signify the beginning of the tree, the topic of the events on the same branch, and the events under one topic branch, respectively.
- We propose a bilayer graph to enhance event feature refinement. The first layer is the event element map, and the second layer is the event map established on the event element map. The two maps combine to capture the potential correlation between events to extract additional event features and optimize event feature representation. This leads to an increased similarity among relevant event features while distinguishing irrelevant event features. Consequently, it enhances the identification of event relationships when dealing with limited sample sizes.
- We validated LSH-GEFR’s performance on three distinct datasets. The experimental results show that LSH-GEFR excels in path coherence, branch reasonableness, and overall readability when generating structured life narratives. In the readability evaluation, 84.91% of the structured life narratives are rated readable, 5.96 percentage points higher than the best-performing baseline.
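The three node types described in the first contribution can be pictured as a small tree data structure. The following is a minimal sketch only: the class names, the `add_event` helper, and the exact-match merge rule are illustrative assumptions, whereas the paper merges events by feature similarity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventNode:
    """One event; identical events from several life stories are merged here."""
    description: str
    story_ids: List[str] = field(default_factory=list)


@dataclass
class TopicNode:
    """The topic shared by all events on one branch."""
    topic: str
    events: List[EventNode] = field(default_factory=list)


@dataclass
class RootNode:
    """The beginning of the tree; its children are topic branches."""
    topics: List[TopicNode] = field(default_factory=list)


def add_event(root: RootNode, topic: str, description: str, story_id: str) -> EventNode:
    """Attach a story's event to the tree.

    Assumption: events are merged on an exact description match, which is
    only a stand-in for similarity-based merging.
    """
    branch = next((t for t in root.topics if t.topic == topic), None)
    if branch is None:
        branch = TopicNode(topic)
        root.topics.append(branch)
    node = next((e for e in branch.events if e.description == description), None)
    if node is None:
        node = EventNode(description)
        branch.events.append(node)
    node.story_ids.append(story_id)
    return node


root = RootNode()
add_event(root, "education", "entered university", "story-1")
add_event(root, "education", "entered university", "story-2")
add_event(root, "work", "first job", "story-1")
```

In this toy run, the two stories that mention the same event end up in one event node under the "education" topic, while the unrelated event opens a second topic branch.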
2. Related Works
2.1. Life Narratives
2.2. Event Feature Refinement with Graph-Enhanced Methodologies
2.3. Structured Organization of Events
3. Method
3.1. Problem Definition
3.2. LSH-GEFR Architecture
3.3. Event Extraction
- The trigger word exists within the original life story text.
- The trigger word is a verb or a gerund.
- The trigger word directly prompts the occurrence of an event.
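The three trigger-word criteria above can be read as a simple filter. The sketch below is an illustration under stated assumptions: the Penn Treebank verb tags (with `VBG` covering gerunds), the function name `is_valid_trigger`, and the caller-supplied `prompts_event` flag are not from the paper, and the third criterion is treated as a judgement made elsewhere (for example by an annotator or a model).

```python
import re

# Assumption: Penn Treebank part-of-speech tags; VBG covers gerunds.
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}


def is_valid_trigger(word: str, pos_tag: str, story_text: str, prompts_event: bool) -> bool:
    """Check a candidate trigger word against the three criteria."""
    tokens = re.findall(r"\w+", story_text.lower())
    appears_in_text = word.lower() in tokens    # criterion 1
    is_verb_or_gerund = pos_tag in VERB_TAGS    # criterion 2
    return appears_in_text and is_verb_or_gerund and prompts_event  # criterion 3


text = "I graduated in 1965 and started teaching."
accepted = is_valid_trigger("graduated", "VBD", text, prompts_event=True)
rejected = is_valid_trigger("retired", "VBD", text, prompts_event=True)
```

Here `accepted` is true because the word occurs in the text and is tagged as a verb, while `rejected` is false because "retired" never appears in the story.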
3.4. Graph-Enhanced Structure for Event Fusion and Clustering
3.4.1. Bilayer Graph-Enhanced Structure
3.4.2. DBN-Enhanced DBSCAN for Event Clustering
3.5. Event Tree Generation
3.5.1. Event Tree Construction
3.5.2. Event Tree Updating
Algorithm 1: Event Tree Update Process
Input: the newly inserted life story and the current event tree, which consists of multiple events and multiple branches. The threshold for event similarity is 0.88, and the threshold for events belonging to the same cluster is 0.65.
Output: the latest event tree after inserting the new life story.
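The steps of Algorithm 1 are not reproduced here, so the following is only one plausible reading of how the two thresholds could interact during an update. Cosine similarity, the nested-list tree layout, and the names `cosine` and `update_tree` are assumptions made for illustration; only the values 0.88 and 0.65 come from the text.

```python
import math
from typing import List, Sequence

EVENT_SIM_THRESHOLD = 0.88    # from the text: two events count as the same event
CLUSTER_SIM_THRESHOLD = 0.65  # from the text: two events belong to the same cluster

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def update_tree(tree: List[List[List[Vector]]], new_event: Vector) -> str:
    """Insert one event feature vector.

    Assumed layout: tree -> branches -> event nodes -> merged event vectors.
    The first vector of a node acts as its representative.
    """
    best_sim, best_branch, best_node = float("-inf"), None, None
    for branch in tree:
        for node in branch:
            sim = cosine(node[0], new_event)
            if sim > best_sim:
                best_sim, best_branch, best_node = sim, branch, node

    if best_node is not None and best_sim >= EVENT_SIM_THRESHOLD:
        best_node.append(new_event)        # same event: merge into the node
        return "merged"
    if best_branch is not None and best_sim >= CLUSTER_SIM_THRESHOLD:
        best_branch.append([new_event])    # same cluster: new node on the branch
        return "added_to_branch"
    tree.append([[new_event]])             # otherwise: open a new branch
    return "new_branch"


tree: List[List[List[Vector]]] = []
steps = [update_tree(tree, v) for v in ([1.0, 0.0], [1.0, 0.05], [1.0, 1.0], [0.0, -1.0])]
```

With these toy vectors, the second event merges into the first node, the third joins the same branch as a new node, and the fourth opens a second branch.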
4. Experiments
4.1. Datasets
4.2. Baselines
- LDA + Temporal Ordering (LDA + TO) [46]: This method builds a single LDA topic model over the datasets and temporally orders the events under the same topic into a timeline chain. This method exemplifies the naive approach to solving the timeline structure summarization problem.
- Story Forest [8]: Story Forest adopts a bilayer keyword graph to capture event relationships. The goal is to extract and cluster events from breaking news and generate event lines based on clustering results.
- EventKG [16]: EventKG is a multilingual event-centric temporal knowledge graph. The effectiveness of the biographical timeline generation is demonstrated based on the EventKG.
- EventNET [47]: EventNET is an automatic event tree generation method based on a single-layer event network.
- LSH-GEFR (No GE): We remove the graph enhancement part of the method and directly input the original BERT-based event features into the integrated D2E-EC clustering framework.
- LSH-GEFR (No D2E-EC): After obtaining the optimized event feature vectors from the graph enhancement model, we group the events by event similarity. If the similarity of two events is greater than a threshold, they are grouped into one topic. The first cluster centroid is selected at random; each subsequent centroid is the event most dissimilar to the current cluster centroids.
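The centroid selection described for the LSH-GEFR (No D2E-EC) variant resembles a farthest-first traversal. The sketch below illustrates that idea under assumptions: cosine similarity as the similarity measure, a fixed number of centroids `k`, and the function name `pick_centroids` are all hypothetical choices, not details taken from the paper.

```python
import math
import random
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_centroids(events: List[Sequence[float]], k: int, seed: int = 0) -> List[int]:
    """Return the indices of k centroids.

    The first centroid is random; each later one is the event whose highest
    similarity to the already chosen centroids is the lowest.
    """
    rng = random.Random(seed)
    chosen = [rng.randrange(len(events))]
    while len(chosen) < min(k, len(events)):
        candidates = [i for i in range(len(events)) if i not in chosen]

        def closeness(i: int) -> float:
            return max(cosine(events[i], events[c]) for c in chosen)

        chosen.append(min(candidates, key=closeness))
    return chosen


events = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
centroids = pick_centroids(events, k=2)
```

Whichever event is drawn first, the second centroid in this toy example comes from the other pair of near-duplicate vectors, which is the behaviour the description aims for.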
4.3. Path Coherence Evaluation
4.4. Branch Reasonableness Evaluation
4.5. Readability Evaluation
- Question 1: Do all the events in each topic truly talk about the same topic (yes or no)?
- Question 2: Do all the life stories in each event node truly talk about the same event (yes or no)?
- Question 3: Do you agree that the event branches are logically coherent for each event tree generated by different methods?
- Question 4: Do you agree that the event tree is overall comprehensible for each event tree generated by different methods?
4.6. Influence of Parameters
5. Discussion
- LDA + TO builds a single LDA topic model over the datasets and temporally orders the events into a tree structure based on time clues. This method exemplifies the naive approach to solving the topic structured events problem. Compared with LDA + TO, LSH-GEFR is optimized in framework design, event representation, and event relationship handling.
- The Story Forest algorithm selects event features based on word features, TF-IDF-derived structural traits, and semantic features derived from LDA. LSH-GEFR relies on BERT-processed textual attributes as its foundation. Utilizing a dual-layered graph structure, it amalgamates multidimensional information such as event elements, element relations, and event connections to represent event features, enhancing the similarity of relevant event features.
- Both LSH-GEFR and Story Forest adopt graph-enhancement technology. Compared with Story Forest, which focuses on word co-occurrence in event text, the LSH-GEFR method concentrates on elements like time, location, people, and types within the event element graph. It is more advantageous in the structured event representation that emphasizes “people”, “time”, “location”, and “what happened”.
- LSH-GEFR, akin to EventKG’s approach, draws on knowledge of major national events to refine and supplement ambiguous temporal elements within events. In the hierarchical construction of life narratives, this improves the accuracy of path branching.
- The LSH-GEFR method combines the relationship between event elements and events to build a two-layer graph enhancement structure and optimize event features. The EventNET algorithm focuses on event relationships using a one-layer event graph, but overlooks the importance of event elements.
- In the event clustering task, EventKG and EventNET group events solely based on relevance, resulting in weaker performance in organizing events thematically. The LSH-GEFR method uses a specifically designed D2E-EC event clustering framework based on DBN to reduce the influence of high-dimensional features on the clustering effect, and it demonstrates superior performance in event clustering tasks.
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Béjot, Y.; Yaffe, K. Ageing population: A neurological challenge. Neuroepidemiology 2019, 52, 76–77.
- Paradis, S.; Roussel, J.; Bosson, J.L.; Kern, J.B. Use of smartphone health apps among patients aged 18 to 69 years in primary care: Population-based cross-sectional survey. JMIR Form. Res. 2022, 6, e34882.
- Stargatt, J.; Bhar, S.; Bhowmik, J.; Al Mahmud, A. Digital storytelling for health-related outcomes in older adults: Systematic review. J. Med. Internet Res. 2022, 24, e28113.
- Köber, C.; Habermas, T. Development of temporal macrostructure in life narratives across the lifespan. Discourse Process. 2017, 54, 143–162.
- Sun, W.; Wang, Y.; Gao, Y.; Li, Z.; Sang, J.; Yu, J. Comprehensive event storyline generation from microblogs. In Proceedings of the ACM Multimedia Asia, Beijing, China, 15–18 December 2019; pp. 1–7.
- Li, J.; Cardie, C. Timeline generation: Tracking individuals on Twitter. In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Republic of Korea, 7–11 April 2014; pp. 643–652.
- Ansah, J.; Liu, L.; Kang, W.; Kwashie, S.; Li, J.; Li, J. A graph is worth a thousand words: Telling event stories using timeline summarization graphs. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 2565–2571.
- Liu, B.; Niu, D.; Lai, K.; Kong, L.; Xu, Y. Growing story forest online from massive breaking news. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 777–785.
- Browne-Yung, K.; Walker, R.B.; Luszcz, M.A. An examination of resilience and coping in the oldest old using life narrative method. Gerontologist 2017, 57, 282–291.
- Thompson, R. Using life story work to enhance care. Nurs. Older People 2011, 23, 16–21.
- Subramaniam, P.; Woods, B. Digital life storybooks for people with dementia living in care homes: An evaluation. Clin. Interv. Aging 2016, 11, 1263–1276.
- Fiddian-Green, A.; Kim, S.; Gubrium, A.C.; Larkey, L.K.; Peterson, J.C. Restor(y)ing health: A conceptual model of the effects of digital storytelling. Health Promot. Pract. 2019, 20, 502–512.
- Keown, K.; Tkatch, R.; Martin, D.; Duffy, M.; Wu, L.; Schaeffer, J.; Wicker, E. Lifebio: Life stories of older adults to reduce loneliness and improve social connectedness. Innov. Aging 2018, 2, 241.
- Liu, J.; Zhang, R.; Liu, W.; Zhang, Y.; Gu, D.; Tong, M.; Wang, X.; Xue, J.; Wang, H. Context2Vector: Accelerating security event triage via context representation learning. Inf. Softw. Technol. 2022, 146, 106856.
- Yang, Z.; Li, Q.; Xie, H.; Wang, Q.; Liu, W. Learning representation from multiple media domains for enhanced event discovery. Pattern Recognit. 2021, 110, 107640.
- Gottschalk, S.; Demidova, E. EventKG: A multilingual event-centric temporal knowledge graph. In Proceedings of the European Semantic Web Conference, Heraklion, Greece, 3–7 June 2018; Springer: Cham, Switzerland, 2018; pp. 272–287.
- Martz, C.J.; Powell, R.L.; Wee, B.S.C. Engaging children to voice their sense of place through location-based story making with photo-story maps. Child. Geogr. 2020, 18, 148–161.
- Yang, C.C.; Shi, X.; Wei, C.P. Discovering event evolution graphs from news corpora. IEEE Trans. Syst. Man Cybern.-Part A Syst. Humans 2009, 39, 850–863.
- Franklin, N.T.; Norman, K.A.; Ranganath, C.; Zacks, J.M.; Gershman, S.J. Structured Event Memory: A neuro-symbolic model of event cognition. Psychol. Rev. 2020, 127, 327.
- Keith Norambuena, B.F.; Mitra, T. Narrative maps: An algorithmic approach to represent and extract information narratives. Proc. ACM Hum.-Comput. Interact. 2021, 4, 1–33.
- El-Kassas, W.S.; Salama, C.R.; Rafea, A.A.; Mohamed, H.K. Automatic text summarization: A comprehensive survey. Expert Syst. Appl. 2021, 165, 113679.
- Hua, T.; Zhang, X.; Wang, W.; Lu, C.T.; Ramakrishnan, N. Automatical storyline generation with help from Twitter. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Indianapolis, IN, USA, 24–28 October 2016; pp. 2383–2388.
- Lin, C.; Lin, C.; Li, J.; Wang, D.; Chen, Y.; Li, T. Generating event storylines from microblogs. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, Maui, HI, USA, 29 October–2 November 2012; pp. 175–184.
- Yan, Z.; Tang, X. Hierarchical storyline generation based on event-centric temporal knowledge graph. In Proceedings of the International Symposium on Knowledge and Systems Sciences, Beijing, China, 11–12 June 2022; Springer: Singapore, 2022; pp. 149–159.
- Li, D.; Yan, L.; Zhang, X.; Jia, W.; Ma, Z. EventKGE: Event knowledge graph embedding with event causal transfer. Knowl.-Based Syst. 2023, 278, 110917.
- Keith Norambuena, B.F.; Mitra, T.; North, C. A survey on event-based news narrative extraction. ACM Comput. Surv. 2023, 55, 1–39.
- Liu, B.; Han, F.X.; Niu, D.; Kong, L.; Lai, K.; Xu, Y. Story forest: Extracting events and telling stories from breaking news. ACM Trans. Knowl. Discov. Data 2020, 14, 1–28.
- Yan, Z.; Tang, X. Narrative Graph: Telling Evolving Stories Based on Event-centric Temporal Knowledge Graph. J. Syst. Sci. Syst. Eng. 2023, 32, 206–221.
- Kunimitsu, T.; Pacchetti, M.B.; Ciullo, A.; Sillmann, J.; Shepherd, T.G.; Taner, M.Ü.; van den Hurk, B. Representing storylines with causal networks to support decision making: Framework and example. Clim. Risk Manag. 2023, 40, 100496.
- Zhang, C.; Lyu, J.; Xu, K. A storytree-based model for inter-document causal relation extraction from news articles. Knowl. Inf. Syst. 2023, 65, 827–853.
- Shahaf, D.; Yang, J.; Suen, C.; Jacobs, J.; Wang, H.; Leskovec, J. Information cartography: Creating zoomable, large-scale maps of information. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, USA, 11–14 August 2013; pp. 1097–1105.
- Yan, R.; Wan, X.; Otterbacher, J.; Kong, L.; Li, X.; Zhang, Y. Evolutionary timeline summarization: A balanced optimization framework via iterative substitution. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, Beijing, China, 24–28 July 2011; pp. 745–754.
- An, N.; Gui, F.; Jin, L.; Ming, H.; Yang, J. Toward better understanding older adults: A biography brief timeline extraction approach. Int. J. Hum.-Interact. 2023, 39, 1084–1095.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Liu, X.; Huang, H.Y.; Zhang, Y. Open Domain Event Extraction Using Neural Latent Variable Models. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 2860–2871.
- Che, W.; Li, Z.; Liu, T. LTP: A Chinese language technology platform. In Proceedings of the Coling 2010: Demonstrations, Beijing, China, 23–27 August 2010; pp. 13–16.
- Nusser, L.; Wolf, T.; Zimprich, D. How do we recall the story of our lives? Evidence for a temporal order in the recall of important life story events. Memory 2022, 30, 806–822.
- Bluck, S.; Habermas, T. The life story schema. Motiv. Emot. 2000, 24, 121–147.
- Tang, Y.; Huang, J.; Pedrycz, W.; Li, B.; Ren, F. A Fuzzy Clustering Validity Index Induced by Triple Center Relation. IEEE Trans. Cybern. 2023, 53, 5024–5036.
- Tang, Y.; Chen, R.; Xia, B. VSFCM: A Novel Viewpoint-Driven Subspace Fuzzy C-Means Algorithm. Appl. Sci. 2023, 13, 6342.
- Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554.
- Schubert, E.; Sander, J.; Ester, M.; Kriegel, H.P.; Xu, X. DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN. ACM Trans. Database Syst. 2017, 42, 1–21.
- Hua, Y.; Guo, J.; Zhao, H. Deep belief networks and deep learning. In Proceedings of the 2015 International Conference on Intelligent Computing and Internet of Things, Harbin, China, 17–18 January 2015; pp. 1–4.
- Atkinson, R. The life story interview as a bridge in narrative inquiry. In Handbook of Narrative Inquiry: Mapping a Methodology; Sage: New York, NY, USA, 2007; pp. 224–245.
- Mihalcea, R.; Tarau, P. TextRank: Bringing order into text. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, 25–26 July 2004; pp. 404–411.
- Chen, L.C. An effective LDA-based time topic model to improve blog search performance. Inf. Process. Manag. 2017, 53, 1299–1319.
- Gui, F.; Wu, X.; Hu, M.; Yang, J. Automatic Life Event Tree Generation for Older Adults. In Proceedings of the International Conference on Human-Computer Interaction, Virtual, 26 June–1 July 2022; Springer: Cham, Switzerland, 2022; pp. 366–377.
- Řezanková, H. Different approaches to the silhouette coefficient calculation in cluster evaluation. In Proceedings of the 21st International Scientific Conference AMSE Applications of Mathematics and Statistics in Economics, Kutná Hora, Czech Republic, 29 August–2 September 2018; pp. 1–10.
- Singh, A.K.; Mittal, S.; Malhotra, P.; Srivastava, Y.V. Clustering Evaluation by Davies-Bouldin Index (DBI) in Cereal data using K-Means. In Proceedings of the 2020 Fourth International Conference on Computing Methodologies and Communication, Erode, India, 11–13 March 2020; pp. 306–310.
- Wang, X.; Xu, Y. An improved index for clustering validation based on Silhouette index and Calinski-Harabasz index. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Beijing, China, 2019; Volume 569, p. 052024.
- Zhou, H.; Xiong, F.; Chen, H. A Comprehensive Survey of Recommender Systems Based on Deep Learning. Appl. Sci. 2023, 13, 11378.
Dataset | Data Volume | Number of Topics |
---|---|---|
OALS2.0 | 15,890 | 195 |
Twitter dataset | 17,420 | 80 |
CNN dataset | 46,855 | 61 |
Method | Interview Data | Internet Data | Twitter Data | CNN Data |
---|---|---|---|---|
LDA + TO | 39.87% | 44.26% | 54.09% | 53.72% |
Story Forest | 54.42% | 57.62% | 62.07% | 61.90% |
EventKG | 58.87% | 60.93% | 68.07% | 67.61% |
EventNET | 63.26% | 65.46% | 71.86% | 70.45% |
LSH-GEFR | 70.14% | 73.27% | 79.75% | 78.34% |
Method | Interview Data: SC | Interview Data: DB | Interview Data: CH | Internet Data: SC | Internet Data: DB | Internet Data: CH |
---|---|---|---|---|---|---|
LDA + TO | 0.3056 | 0.4014 | 1.9256 | 0.3793 | 0.3873 | 1.9902 |
Story Forest | 0.3323 | 0.3616 | 2.5717 | 0.4881 | 0.3566 | 2.6262 |
EventKG | 0.3386 | 0.3590 | 2.7419 | 0.4930 | 0.3401 | 2.2414 |
EventNET | 0.3392 | 0.3596 | 2.6082 | 0.4980 | 0.3487 | 2.2796 |
LSH-GEFR (No GE) | 0.4189 | 0.3115 | 3.2829 | 0.5277 | 0.3634 | 3.4698 |
LSH-GEFR (No D2E-EC) | 0.3586 | 0.3501 | 2.7924 | 0.5100 | 0.3375 | 2.3340 |
LSH-GEFR | 0.4497 | 0.3096 | 3.5551 | 0.5754 | 0.3120 | 3.7159 |
Method | Twitter Data: SC | Twitter Data: DB | Twitter Data: CH |
---|---|---|---|
LDA + TO | 0.4377 | 0.3537 | 3.3915 |
Story Forest | 0.5276 | 0.3112 | 4.2580 |
EventKG | 0.5540 | 0.3273 | 3.8780 |
EventNET | 0.5559 | 0.3132 | 3.7226 |
LSH-GEFR (No GE) | 0.5458 | 0.3346 | 4.5008 |
LSH-GEFR (No D2E-EC) | 0.5392 | 0.3177 | 3.8639 |
LSH-GEFR | 0.5937 | 0.2911 | 4.8479 |
Method | CNN Data: SC | CNN Data: DB | CNN Data: CH |
---|---|---|---|
LDA + TO | 0.4373 | 0.3511 | 3.2814 |
Story Forest | 0.5178 | 0.2915 | 4.1373 |
EventKG | 0.5464 | 0.3172 | 3.7431 |
EventNET | 0.5534 | 0.3052 | 3.5871 |
LSH-GEFR (No GE) | 0.5440 | 0.3199 | 4.3913 |
LSH-GEFR (No D2E-EC) | 0.5325 | 0.3048 | 3.7700 |
LSH-GEFR | 0.5962 | 0.2810 | 4.7417 |
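SC, DB, and CH in the clustering tables presumably denote the silhouette coefficient, the Davies-Bouldin index, and the Calinski-Harabasz index, judging by the clustering-validity entries in the reference list. Higher SC and CH values and lower DB values indicate better-separated clusters. As a reading aid, the sketch below computes a mean silhouette coefficient in plain Python; the Euclidean distance, the convention of scoring singleton clusters as 0, and the function name `silhouette` are assumptions, and the paper's exact evaluation setup is not known from this text.

```python
import math
from typing import List, Sequence


def silhouette(points: List[Sequence[float]], labels: List[int]) -> float:
    """Mean silhouette coefficient; needs at least two distinct labels."""
    scores = []
    for i, p in enumerate(points):
        # a: mean distance to the other members of the point's own cluster
        same = [math.dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette(points, [0, 0, 1, 1])  # clusters match the two tight pairs
bad = silhouette(points, [0, 1, 0, 1])   # clusters cut across the pairs
```

In this toy example, `good` is close to 1 and `bad` is negative, which mirrors how a higher SC in the tables signals tighter, better-separated event clusters.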
Method | 1 (%) | 2 (%) | 3 (%) | 4 (%) | 5 (%) | 6 (%) | 7 (%) | M | SD |
---|---|---|---|---|---|---|---|---|---|
LDA + TO | 9.33 | 11.69 | 12.32 | 24.47 | 19.19 | 15.28 | 7.72 | 4.41 | 1.43 |
Story Forest | 7.68 | 10.71 | 11.27 | 19.68 | 20.08 | 17.15 | 13.43 | 4.02 | 1.77 |
EventKG | 3.12 | 6.67 | 13.54 | 27.33 | 27.38 | 14.57 | 7.39 | 4.01 | 1.72 |
EventNET | 4.33 | 6.67 | 10.05 | 20.31 | 21.64 | 23.12 | 13.88 | 4.74 | 1.79 |
LSH-GEFR | 3.35 | 3.51 | 8.23 | 16.41 | 23.65 | 27.59 | 17.26 | 4.88 | 1.65 |
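The readability figures quoted in the Introduction can be traced back to the rating distribution above. The sketch below assumes that a narrative counts as readable when it receives a rating of 4 or higher on the 7-point scale; that cutoff is an inference, not something stated in the text, but it reproduces both the 84.91% share and the 5.96-point margin. The names `RATINGS` and `readable_share` are illustrative.

```python
# Percentage of responses per rating (1 to 7), copied from the table above.
RATINGS = {
    "LDA + TO":     [9.33, 11.69, 12.32, 24.47, 19.19, 15.28, 7.72],
    "Story Forest": [7.68, 10.71, 11.27, 19.68, 20.08, 17.15, 13.43],
    "EventKG":      [3.12, 6.67, 13.54, 27.33, 27.38, 14.57, 7.39],
    "EventNET":     [4.33, 6.67, 10.05, 20.31, 21.64, 23.12, 13.88],
    "LSH-GEFR":     [3.35, 3.51, 8.23, 16.41, 23.65, 27.59, 17.26],
}


def readable_share(percentages, cutoff=4):
    """Sum the percentages for ratings >= cutoff (assumed readability cutoff)."""
    return round(sum(percentages[cutoff - 1:]), 2)


shares = {method: readable_share(p) for method, p in RATINGS.items()}
best_baseline = max(v for m, v in shares.items() if m != "LSH-GEFR")
margin = round(shares["LSH-GEFR"] - best_baseline, 2)
```

Under this assumption, LSH-GEFR reaches 84.91%, the strongest baseline (EventNET) reaches 78.95%, and the margin is 5.96 percentage points.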
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gui, F.; Yang, J.; Tang, Y.; Chen, H.; An, N. Structured Life Narratives: Building Life Story Hierarchies with Graph-Enhanced Event Feature Refinement. Appl. Sci. 2024, 14, 918. https://doi.org/10.3390/app14020918