New Trends in Computer Vision, Deep Learning and Artificial Intelligence

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 31 January 2025 | Viewed by 10932

Special Issue Editors


Guest Editor
College of Big Data and Internet, Shenzhen Technology University, Shenzhen 518118, China
Interests: computer vision; deep learning

Guest Editor
School of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Interests: medical image analysis; deep learning

Guest Editor
School of Computing, National University of Singapore, Singapore 119077, Singapore
Interests: machine learning; high performance computing; parallel and distributed systems; AI applications

Special Issue Information

Dear Colleagues,

In the past decade, deep learning algorithms have come to dominate speech, computer vision, and natural language processing, and AI applications have become ubiquitous in our daily lives. Given sufficient labeled data, a well-trained AI system can perform much better than humans on easy, repetitive, or well-defined tasks such as image recognition, face recognition, and translation. Identifying ways to extend AI capabilities to tasks with limited data and to other complex tasks will be particularly important for the next decade.

The purpose of this Special Issue is to gather a collection of articles reflecting new trends in computer vision, deep learning, and artificial intelligence. Topics include but are not limited to the following:

  1. Deep learning with limited data;
  2. Unsupervised deep learning technology;
  3. Deep learning for 3D vision;
  4. Neural rendering and its applications;
  5. Deep learning for efficient detection and segmentation;
  6. Deep learning for video understanding;
  7. Deep learning for language-vision tasks;
  8. Deep learning for visual affective computing;
  9. Deep learning for medical image analysis;
  10. Deep learning training acceleration;
  11. Big AI models;
  12. Industrial AI applications.

Dr. Xiaojiang Peng
Prof. Dr. Linlin Shen
Prof. Dr. Yang You
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • computer vision
  • video understanding
  • 3D vision
  • neural rendering
  • visual affective computing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (8 papers)

Research

17 pages, 1483 KiB  
Article
Data Quality-Aware Client Selection in Heterogeneous Federated Learning
by Shinan Song, Yaxin Li, Jin Wan, Xianghua Fu and Jingyan Jiang
Mathematics 2024, 12(20), 3229; https://doi.org/10.3390/math12203229 - 15 Oct 2024
Viewed by 729
Abstract
Federated Learning (FL) enables decentralized data utilization while maintaining edge user privacy, but it faces challenges due to statistical heterogeneity. Existing approaches address client drift and data heterogeneity issues. However, real-world settings often involve low-quality data with noisy features, such as covariate drift or adversarial samples, which are usually ignored. Noisy samples significantly impact the global model’s accuracy and convergence rate. Assessing data quality and selectively aggregating updates from high-quality clients is crucial, but dynamically perceiving data quality without additional computations or data exchanges is challenging. In this paper, we introduce the Federated learning via Data Quality-Aware (FedDQA) framework. We discover that increased data noise leads to slower loss reduction during local model training. We propose a loss sharpness-based Data-Quality-Awareness (DQA) metric to differentiate between high-quality and low-quality data. Based on the DQA, we design a client selection algorithm that strategically selects participant clients to reduce the negative impact of noisy clients. Experiment results indicate that FedDQA significantly outperforms the baselines. Notably, it achieves up to a 4% increase in global model accuracy and demonstrates faster convergence rates.

13 pages, 539 KiB  
Article
Implicit Stance Detection with Hashtag Semantic Enrichment
by Li Dong, Zinao Su, Xianghua Fu, Bowen Zhang and Genan Dai
Mathematics 2024, 12(11), 1663; https://doi.org/10.3390/math12111663 - 26 May 2024
Viewed by 929
Abstract
Stance detection is a crucial task in natural language processing and social computing, focusing on classifying expressed attitudes towards specific targets based on the input text. Conventional methods predominantly view stance detection as a task of target-oriented, sentence-level text classification. On popular social media platforms like Twitter, users often express their opinions through hashtags in addition to textual content within tweets. However, current methods primarily treat hashtags as data retrieval labels, neglecting to effectively utilize the semantic information they carry. In this paper, we propose a large language model knowledge-enhanced stance detection framework (LKESD). LKESD contains three main components: an instruction-prompted background knowledge acquisition module (IPBKA) that retrieves background knowledge of hashtags by providing handcrafted prompts to large language models (LLMs); a graph convolutional feature-enhancement module (GCFEM) that extracts the semantic representations of words that frequently co-occur with hashtags in the dataset by leveraging textual associations; and a knowledge fusion network (KFN) that selectively integrates graph representations and LLM features using a prompt-tuning framework. Extensive experimental results on three benchmark datasets demonstrate that our LKESD method outperforms the compared methods by 2.7% on all setups, validating its effectiveness in stance detection tasks.

14 pages, 1881 KiB  
Article
An Examination of Mental Stress in College Students: Utilizing Intelligent Perception Data and the Mental Stress Scale
by Zhixuan Liao, Xiaomao Fan, Wenjun Ma and Yingshan Shen
Mathematics 2024, 12(10), 1501; https://doi.org/10.3390/math12101501 - 11 May 2024
Viewed by 846
Abstract
To address the time-consuming, random, and subjective nature of traditional mental stress detection in college students, this paper proposes an intelligent perception-driven mental stress assessment method for college students. First, we analyze the factors in SRQ and SCL-90 that can be measured by intelligent sensing methods, including sleep, exercise, social interaction, and environment, and then perform feature extraction. Second, we use machine learning methods to build a mental stress assessment model. The Shapley additive explanations (SHAP) model is used to explain the training results. Experimental results show that the model proposed in this article can effectively assess the mental stress state of college students. This means that the collection of intelligent perception data based on the mental stress scale can effectively evaluate the mental stress state of college students and provide a new research idea for further developing a non-intrusive and real-time mental stress assessment for college students.

18 pages, 1556 KiB  
Article
Boundary-Match U-Shaped Temporal Convolutional Network for Vulgar Action Segmentation
by Zhengwei Shen, Ran Xu, Yongquan Zhang, Feiwei Qin, Ruiquan Ge, Changmiao Wang and Masahiro Toyoura
Mathematics 2024, 12(6), 899; https://doi.org/10.3390/math12060899 - 18 Mar 2024
Viewed by 878
Abstract
The advent of deep learning has provided solutions to many challenges posed by the Internet. However, efficient localization and recognition of vulgar segments within videos remain formidable tasks. This difficulty arises from the blurring of spatial features in vulgar actions, which can render them indistinguishable from general actions. Furthermore, issues of boundary ambiguity and over-segmentation complicate the segmentation of vulgar actions. To address these issues, we present the Boundary-Match U-shaped Temporal Convolutional Network (BMUTCN), a novel approach for the segmentation of vulgar actions. The BMUTCN employs a U-shaped architecture within an encoder–decoder temporal convolutional network to bolster feature recognition by leveraging the context of the video. Additionally, we introduce a boundary-match map that fuses action boundary information with greater precision for frames that exhibit ambiguous boundaries. Moreover, we propose an adaptive internal block suppression technique, which substantially mitigates over-segmentation errors while preserving accuracy. Our methodology, tested across several public datasets as well as a bespoke vulgar dataset, has demonstrated state-of-the-art performance on the latter.

16 pages, 708 KiB  
Article
Leveraging Chain-of-Thought to Enhance Stance Detection with Prompt-Tuning
by Daijun Ding, Xianghua Fu, Xiaojiang Peng, Xiaomao Fan, Hu Huang and Bowen Zhang
Mathematics 2024, 12(4), 568; https://doi.org/10.3390/math12040568 - 13 Feb 2024
Cited by 1 | Viewed by 1511
Abstract
Investigating public attitudes towards social media is crucial for opinion mining systems to gain valuable insights. Stance detection, which aims to discern the attitude expressed in an opinionated text towards a specific target, is a fundamental task in opinion mining. Conventional approaches mainly focus on sentence-level classification techniques. Recent research has shown that the integration of background knowledge can significantly improve stance detection performance. Despite the significant improvement achieved by knowledge-enhanced methods, applying these techniques in real-world scenarios remains challenging for several reasons. Firstly, existing methods often require the use of complex attention mechanisms to filter out noise and extract relevant background knowledge, which involves significant annotation efforts. Secondly, knowledge fusion mechanisms typically rely on fine-tuning, which can introduce a gap between the pre-training phase of pre-trained language models (PLMs) and the downstream stance detection tasks, leading to the poor prediction accuracy of the PLMs. To address these limitations, we propose a novel prompt-based stance detection method that leverages the knowledge acquired using the chain-of-thought method, which we refer to as PSDCOT. The proposed approach consists of two stages. The first stage is knowledge extraction, where instruction questions are constructed to elicit background knowledge from a VLPLM. The second stage is the multi-prompt learning network (M-PLN) for knowledge fusion, which learns model performance based on the background knowledge and the prompt learning framework. We evaluated the performance of PSDCOT on publicly available benchmark datasets to assess its effectiveness in improving stance detection performance. The results demonstrate that the proposed method achieves state-of-the-art results in in-domain, cross-target, and zero-shot learning settings.

17 pages, 2693 KiB  
Article
SVSeq2Seq: An Efficient Computational Method for State Vectors in Sequence-to-Sequence Architecture Forecasting
by Guoqiang Sun, Xiaoyan Qi, Qiang Zhao, Wei Wang and Yujun Li
Mathematics 2024, 12(2), 265; https://doi.org/10.3390/math12020265 - 13 Jan 2024
Viewed by 903
Abstract
This study proposes an efficient method for computing State Vectors in Sequence-to-Sequence (SVSeq2Seq) architecture to improve the performance of sequence data forecasting, which associates each element with other elements instead of relying only on nearby elements. First, the dependency between two elements is adaptively captured by calculating the relative importance between hidden layers. Second, tensor train decomposition is used to address the issue of dimensionality catastrophe. Third, we further select seven instantiated baseline models for data prediction and compare them with our proposed model on six real-world datasets. The results show that the Mean Square Error (MSE) and Mean Absolute Error (MAE) of our SVSeq2Seq model exhibit significant advantages over the other seven baseline models in predicting the three datasets, i.e., weather, electricity, and PEMS, with MSE/MAE values as low as 0.259/0.260, 0.186/0.285 and 0.113/0.222, respectively. Furthermore, the ablation study demonstrates that the SVSeq2Seq model possesses distinct advantages in sequential forecasting tasks. It is observed that replacing SVSeq2Seq with LPRcode and NMTcode resulted in MSE increases of 18.05 and 10.11 times and MAE increases of 16.54 and 9.8 times, respectively. In comparative experiments with support vector machines (SVM) and random forest (RF), the performance of the SVSeq2Seq model is improved by 56.88 times on the weather dataset and 73.78 times on the electricity dataset under the MSE metric, respectively. The above experimental results demonstrate both the exceptional rationality and versatility of the SVSeq2Seq model for data forecasting.

14 pages, 960 KiB  
Article
OL-JCMSR: A Joint Coding Monitoring Strategy Recommendation Model Based on Operation Log
by Guoqiang Sun, Peng Xu, Man Guo, Hao Sun, Zhaochen Du, Yujun Li and Bin Zhou
Mathematics 2022, 10(13), 2292; https://doi.org/10.3390/math10132292 - 30 Jun 2022
Cited by 1 | Viewed by 1302
Abstract
A surveillance system with hundreds of cameras and far fewer monitors strongly relies on manual scheduling and inspections from monitoring personnel. A monitoring method which improves the surveillance performance by analyzing and learning from a large amount of manual operation logs is proposed in this paper. Compared to fixed rules or existing computer-vision methods, the proposed method can more effectively learn from the operators’ behaviors and incorporate their intentions into the monitoring strategy. To the best of our knowledge, this method is the first to apply a monitoring-strategy recommendation model containing a global encoder and a local encoder in monitoring systems. The local encoder can adaptively select important items in the operating sequence to capture the main purpose of the operator, while the global encoder is used to summarize the behavior of the entire sequence. Two experiments are conducted on two data sets. Compared with att-RNN and att-GRU, the joint coding model in experiment 1 improves the Recall@20 by 9.4% and 4.6%, respectively, and improves the MRR@20 by 5.49% and 3.86%, respectively. In experiment 2, compared with att-RNN and att-GRU, the joint coding model improves by 11.8% and 6.2% on Recall@20, and improves by 7.02% and 5.16% on MRR@20, respectively. The results illustrate the effectiveness of our model in monitoring systems.

20 pages, 3586 KiB  
Article
A Joint Learning Model to Extract Entities and Relations for Chinese Literature Based on Self-Attention
by Li-Xin Liang, Lin Lin, E Lin, Wu-Shao Wen and Guo-Yan Huang
Mathematics 2022, 10(13), 2216; https://doi.org/10.3390/math10132216 - 24 Jun 2022
Cited by 5 | Viewed by 1743
Abstract
Extracting structured information from massive and heterogeneous text is a hot research topic in the field of natural language processing. It includes two key technologies: named entity recognition (NER) and relation extraction (RE). However, previous NER models pay little attention to the influence of mutual attention between words in the text on the prediction of entity labels, and there is less research on how to more fully extract sentence information for relational classification. In addition, previous research treats NER and RE as a pipeline of two separated tasks, which neglects the connection between them, and is mainly focused on the English corpus. In this paper, based on the self-attention mechanism, bidirectional long short-term memory (BiLSTM) neural network and conditional random field (CRF) model, we put forth a Chinese NER method based on BiLSTM-Self-Attention-CRF and an RE method based on BiLSTM-Multilevel-Attention in the field of Chinese literature. In particular, considering the relationship between these two tasks in terms of word vector and context feature representation in the neural network model, we put forth a joint learning method for NER and RE tasks based on the same underlying module, which jointly updates the parameters of the shared module during the training of these two tasks. For performance evaluation, we make use of the largest Chinese data set containing these two tasks. Experimental results show that the proposed independently trained NER and RE models achieve better performance than all previous methods, and our joint NER-RE training model outperforms the independently trained NER and RE models.