Information Extraction and Language Discourse Processing
A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".
Deadline for manuscript submissions: 31 January 2025
Special Issue Editors
Interests: information extraction; text mining; natural language processing; knowledge graphs
Special Issue Information
Dear Colleagues,
Information extraction (IE) plays an increasingly important and pervasive role in today’s era of digitalized communication media based on the Semantic Web. For example, search engine snippets are gradually being replaced by “rich snippets”, and there is growing interest in converting scholarly publications into structured records for downstream IT applications such as leaderboards. IE is the task of automatically extracting structured information from unstructured and/or semi-structured electronic documents. In most cases, this involves processing human language texts by means of natural language processing (NLP). The automatic extraction of information from unstructured sources has opened up new avenues for querying, organizing, and analyzing data by drawing upon both the clean semantics of structured databases and the abundance of unstructured data.
Apart from extrinsic models of IE, research in linguistics and computational linguistics has long pointed out that a text is not just a simple sequence of clauses and sentences but rather follows a highly elaborate structure formalized as discourse. The framework used for discourse analysis has long been rhetorical structure theory (RST). Within a well-written text, no unit of the text is completely isolated; interpreting a unit requires understanding its relation to the context. Research in discourse analysis aims to uncover such relations in the text, which is helpful for many downstream applications such as summarization, information retrieval, and question answering.
This Special Issue seeks novel research reports across the spectrum that blends information extraction and language discourse processing research in diverse communities. The editors welcome submissions along various dimensions derived from the nature of the extraction task, the advanced neural techniques used for extraction, the variety of input resources exploited, and the type of output produced. Quantitative, qualitative, and mixed-methods studies are welcome, as are case studies and experience reports if they describe an impactful application at a scale that delivers useful lessons to the journal readership.
Topics of interest include (but are not limited to):
- Knowledge base population with discourse-centric information extraction (IE)
- Coreference resolution and its impact on discourse-centric IE
- Relationship extraction leveraging linguistic discourse
- Template filling
- Impact of pragmatics or rhetoric on information extraction
- Discourse-centric IE at scale
- Intelligent and novel assessment models of discourse-centric IE
- Survey of discourse-centric IE in natural language processing (NLP)
- Challenges implementing discourse-centric IE in real-world scenarios
- Modeling domains using discourse-centric IE
- Human–AI hybrid systems for learning discourse and IE
- Application of discourse-centric IE
Dr. Jennifer D'Souza
Prof. Dr. Chengzhi Zhang
Guest Editors
Manuscript Submission Information
Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.
Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.
Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.
Keywords
- coherence
- topic focus
- information structure
- conversation structure
- discourse processing
- scholarly discourse processing
- anaphora resolution
Benefits of Publishing in a Special Issue
- Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
- Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
- Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
- External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
- e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.
Further information on MDPI's Special Issue policies can be found here.
Planned Papers
The list below represents only planned manuscripts. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.
Title: Astro-NER --- Astronomy Named Entity Recognition: Is GPT a Good Domain Expert Annotator?
Authors: Julia Evans, Sameer Sadruddin and Jennifer D'Souza
Affiliation: --
Abstract: This study explores whether the state-of-the-art GPT large language model (LLM) is ready to help non-experts annotate scientific entities in astronomy literature, aiming to see if this method can mimic domain expertise. Results show moderate to fair agreement between the domain expert and two different LLM-assisted non-experts. Additionally, the study evaluates fine-tuned versus default LLM performance on the task, introduces a scientific entity annotation scheme for astronomy validated by an expert, and releases a dataset of 5,000 annotated astronomy article titles for further research.