Topic Editors

Prof. Dr. Hamad Naeem
School of Computer Science and Technology, Zhoukou Normal University, Zhoukou 466001, China
Prof. Dr. Hong Su
School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
Prof. Dr. Amjad Alsirhani
College of Computer and Information Sciences, Jouf University, Sakaka 72388, Saudi Arabia
Prof. Dr. Muhammad Shoaib Bhutta
School of Automobile Engineering, Guilin University of Aerospace Technology, Guilin 541004, China

Research on Deep Neural Networks for Video Motion Recognition

Abstract submission deadline
30 November 2024
Manuscript submission deadline
31 January 2025

Topic Information

Dear Colleagues,

Deep neural networks have been widely used for video motion recognition tasks, such as action recognition, activity recognition, and gesture recognition. This Topic aims to bridge the gap between theoretical research and practical applications in the field of video motion recognition using deep neural networks. The articles are expected to provide insights into the latest trends and advancements in the field and their potential to address real-world problems in various domains. The Topic will also highlight the limitations and open research problems in the area, paving the way for future research directions. The contributions are expected to provide a comprehensive and detailed understanding of the underlying principles and techniques of deep-neural-network-based video motion recognition, facilitating the development of innovative solutions and techniques to overcome the existing challenges in the field. Topics of interest include but are not limited to:

  • Novel deep neural network architectures for video motion recognition;
  • Learning spatiotemporal features for video motion recognition (see the illustrative sketch after this list);
  • Transfer learning and domain adaptation for video motion recognition;
  • Large-scale video datasets and benchmarking for video motion recognition;
  • Applications of deep neural networks for video motion recognition, such as human–computer interaction, surveillance, and sports analysis;
  • Applications of explainable artificial intelligence for video motion recognition.
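
As a toy illustration of the "learning spatiotemporal features" theme above, the sketch below shows a minimal 3D-convolutional classifier over short video clips. It is written in PyTorch purely for exposition; the module names, layer sizes, and clip shape are assumptions for this sketch, not a reference to any submitted work.

```python
# Illustrative only: a minimal 3D-CNN for learning spatiotemporal features
# from short video clips of shape (batch, channels, frames, height, width).
import torch
import torch.nn as nn

class TinySpatioTemporalNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),   # joint space-time convolution
            nn.BatchNorm3d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),          # downsample space, keep time
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm3d(64),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),                      # global space-time pooling
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, 3, T, H, W) -> logits: (batch, num_classes)
        x = self.features(clips).flatten(1)
        return self.classifier(x)

# Example: 8 clips of 16 RGB frames at 112x112 resolution.
logits = TinySpatioTemporalNet(num_classes=10)(torch.randn(8, 3, 16, 112, 112))
print(logits.shape)  # torch.Size([8, 10])
```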

Prof. Dr. Hamad Naeem
Prof. Dr. Hong Su
Prof. Dr. Amjad Alsirhani
Prof. Dr. Muhammad Shoaib Bhutta
Topic Editors

Keywords

  • deep learning
  • video analysis
  • motion detection
  • computer vision
  • neural networks

Participating Journals

Journal Name | Impact Factor | CiteScore | Launched Year | First Decision (median) | APC
Future Internet | 2.8 | 7.1 | 2009 | 13.1 Days | CHF 1600
Information | 2.4 | 6.9 | 2010 | 14.9 Days | CHF 1600
Journal of Imaging | 2.7 | 5.9 | 2015 | 20.9 Days | CHF 1800
Mathematics | 2.3 | 4.0 | 2013 | 17.1 Days | CHF 2600
Symmetry | 2.2 | 5.4 | 2009 | 16.8 Days | CHF 2400

Preprints.org is a multidisciplinary platform providing a preprint service dedicated to sharing your research from the start and empowering your research journey.

MDPI Topics is cooperating with Preprints.org and has built a direct connection between MDPI journals and Preprints.org. Authors are encouraged to take advantage of these benefits by posting a preprint at Preprints.org prior to publication:

  1. Immediately share your ideas ahead of publication and establish your research priority;
  2. Protect your idea with a time-stamped preprint;
  3. Enhance the exposure and impact of your research;
  4. Receive feedback from your peers in advance;
  5. Have it indexed in Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit and Europe PMC.

Published Papers (2 papers)

16 pages, 5429 KiB  
Article
Video WeAther RecoGnition (VARG): An Intensity-Labeled Video Weather Recognition Dataset
by Himanshu Gupta, Oleksandr Kotlyar, Henrik Andreasson and Achim J. Lilienthal
J. Imaging 2024, 10(11), 281; https://doi.org/10.3390/jimaging10110281 - 5 Nov 2024
Viewed by 557
Abstract
Adverse weather (rain, snow, and fog) can negatively impact computer vision tasks by introducing noise in sensor data; therefore, it is essential to recognize weather conditions for building safe and robust autonomous systems in the agricultural and autonomous driving/drone sectors. The performance degradation in computer vision tasks due to adverse weather depends on the type of weather and the intensity, which influences the amount of noise in sensor data. However, existing weather recognition datasets often lack intensity labels, limiting their effectiveness. To address this limitation, we present VARG, a novel video-based weather recognition dataset with weather intensity labels. The dataset comprises a diverse set of short video sequences collected from various social media platforms and videos recorded by the authors, processed into usable clips, and categorized into three major weather categories, rain, fog, and snow, with three intensity classes: absent/no, moderate, and high. The dataset contains 6742 annotated clips from 1079 videos, with the training set containing 5159 clips and the test set containing 1583 clips. Two sets of annotations are provided for training, the first set to train the models as a multi-label weather intensity classifier and the second set to train the models as a multi-class classifier for three weather scenarios. This paper describes the dataset characteristics and presents an evaluation study using several deep learning-based video recognition approaches for weather intensity prediction.
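
The abstract above describes two annotation schemes: multi-label weather-intensity labels (rain, fog, and snow can each carry an intensity) and multi-class weather-scenario labels. The following is a hedged sketch, not the authors' code, of what the two corresponding classifier heads could look like on top of an arbitrary video backbone; the feature dimension, class names, and module names are placeholders chosen for illustration.

```python
# Hedged sketch: two classifier heads matching the two annotation schemes
# described in the VARG abstract. A shared video backbone is assumed; its
# output size (feat_dim) is a placeholder.
import torch
import torch.nn as nn

WEATHERS = ["rain", "fog", "snow"]
INTENSITIES = ["absent", "moderate", "high"]

class WeatherHeads(nn.Module):
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        # Multi-label head: one 3-way intensity prediction per weather type,
        # since several weather conditions may appear in the same clip.
        self.intensity_head = nn.Linear(feat_dim, len(WEATHERS) * len(INTENSITIES))
        # Multi-class head: one of the three weather scenarios per clip.
        self.scenario_head = nn.Linear(feat_dim, len(WEATHERS))

    def forward(self, clip_features: torch.Tensor):
        b = clip_features.shape[0]
        intensity_logits = self.intensity_head(clip_features).view(b, len(WEATHERS), len(INTENSITIES))
        scenario_logits = self.scenario_head(clip_features)
        return intensity_logits, scenario_logits

# Example with random backbone features for a batch of 4 clips.
intensity_logits, scenario_logits = WeatherHeads()(torch.randn(4, 512))
print(intensity_logits.shape, scenario_logits.shape)  # (4, 3, 3) and (4, 3)
```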

16 pages, 1535 KiB  
Article
Temporal–Semantic Aligning and Reasoning Transformer for Audio-Visual Zero-Shot Learning
by Kaiwen Zhang, Kunchen Zhao and Yunong Tian
Mathematics 2024, 12(14), 2200; https://doi.org/10.3390/math12142200 - 13 Jul 2024
Cited by 1 | Viewed by 655
Abstract
Zero-shot learning (ZSL) enables models to recognize categories not encountered during training, which is crucial for categories with limited data. Existing methods overlook efficient temporal modeling in multimodal data. This paper proposes a Temporal–Semantic Aligning and Reasoning Transformer (TSART) for spatio-temporal modeling. TSART uses the pre-trained SeLaVi network to extract audio and visual features and explores the semantic information of these modalities through audio and visual encoders. It incorporates a temporal information reasoning module to enhance the capture of temporal features in audio, and a cross-modal reasoning module to effectively integrate audio and visual information, establishing a robust joint embedding representation. Our experimental results validate the effectiveness of this approach, demonstrating outstanding Generalized Zero-Shot Learning (GZSL) performance on the UCF101 Generalized Zero-Shot Learning (UCF-GZSL), VGGSound-GZSL, and ActivityNet-GZSL datasets, with notable improvements in the Harmonic Mean (HM) evaluation. These results indicate that TSART has great potential in handling complex spatio-temporal information and multimodal fusion.
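
For readers unfamiliar with the Harmonic Mean (HM) evaluation mentioned above: in GZSL it balances accuracy on seen classes (S) and unseen classes (U), HM = 2·S·U/(S + U), so a model cannot score well by excelling on only one of the two. The helper below reflects this standard formula from the GZSL literature, not code from the paper; the example numbers are made up.

```python
# Standard GZSL harmonic-mean metric over seen- and unseen-class accuracies.
def gzsl_harmonic_mean(seen_acc: float, unseen_acc: float) -> float:
    if seen_acc + unseen_acc == 0:
        return 0.0
    return 2 * seen_acc * unseen_acc / (seen_acc + unseen_acc)

print(gzsl_harmonic_mean(0.60, 0.40))  # 0.48 (illustrative values)
```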
