Multimedia Systems and Signal Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (25 November 2021) | Viewed by 39862

Special Issue Editor


Prof. Dr. Jaeyoung Choi
Guest Editor
School of Computing, Gachon University, Seongnam-si 13120, Republic of Korea
Interests: big data analytics; multimedia networking; signal processing; statistical inference; resource allocation and optimization in VANETs

Special Issue Information

Dear Colleagues,

The main concerns in multimedia systems and signal processing involve understanding fundamental tradeoffs between competing resource requirements, developing practical techniques and heuristics for realizing complicated optimization and allocation strategies, and demonstrating innovative mechanisms and frameworks for large-scale multimedia applications. To this end, both theory and practice are drawn from various heterogeneous and interrelated fields, including image, video, and audio data processing, as well as new sources of multimodal data (text, social, health, etc.). In this Special Issue, we focus on the analysis, design, and implementation of multimedia systems and signal processing, including communications, databases, coding, and several machine learning methods.

The topics of interest include, but are not limited to:

  • Multimedia communications and networking
  • Multimedia systems and applications for Internet of Things
  • Multimedia databases and digital libraries
  • Big data analytics for multimedia
  • Machine learning for multimedia
  • Multimedia indexing and retrieval
  • Sensors and multimedia
  • Social/health multimedia
  • Image and video processing, compression, and segmentation
  • Image and video coding
  • Music, speech, and audio processing
  • Multidimensional signal processing
  • Deep learning for signal processing

Prof. Dr. Jaeyoung Choi
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (12 papers)


Research

18 pages, 949 KiB  
Article
Inferring the Hidden Cascade Infection over Erdös-Rényi (ER) Random Graph
by Jaeyoung Choi
Electronics 2021, 10(16), 1894; https://doi.org/10.3390/electronics10161894 - 6 Aug 2021
Cited by 1 | Viewed by 1959
Abstract
Finding hidden infected nodes is extremely important when information or diseases spread rapidly in a network because hints regarding the global properties of the diffusion dynamics can be provided, and effective control strategies for mitigating such spread can be derived. In this study, to understand the impact of the structure of the underlying network, a cascade infection-recovery problem is considered over an Erdös-Rényi (ER) random graph when a subset of infected nodes is partially observed. The goal is to reconstruct the underlying cascade that is likely to generate these observations. To address this, two algorithms are proposed: (i) a neighbor-based recovery algorithm (NBRA(α)), where 0 ≤ α ≤ 1 is a control parameter, and (ii) a BFS tree-source-based recovery algorithm (BSRA). The first simply counts the number of infected neighbors of candidate hidden cascade nodes and computes the possibility of infection from the neighbors by controlling the parameter α. The latter estimates the cascade sources first and then computes the infection probability from the sources. A BFS tree approximation is used for the underlying ER random graph with respect to the sources when computing the infection probability, owing to the computational complexity in general loopy graphs. We then conducted various simulations to obtain the recovery performance of the two proposed algorithms. As a result, although NBRA(α) uses only local information on the neighboring infection status, it recovers the hidden cascade infection well and is not significantly affected by the average degree of the ER random graph, whereas the BSRA works well on a local tree-like structure.
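
As a rough illustration of the neighbor-counting idea behind NBRA(α), the sketch below scores candidate hidden nodes by the fraction of their neighbors observed as infected, raised to the power α. The scoring rule, graph representation, and function names are illustrative assumptions, not the paper's exact formulation.

    def nbra_scores(adj, observed_infected, candidates, alpha=0.7):
        """Score candidates by their observed-infected neighborhood.

        adj: dict mapping node -> set of neighbors
        alpha: control parameter in [0, 1] (the fractional-power weighting
               here is an assumed stand-in for the paper's rule).
        """
        scores = {}
        for v in candidates:
            neighbors = adj[v]
            frac = len(neighbors & observed_infected) / max(len(neighbors), 1)
            scores[v] = frac ** alpha if frac > 0 else 0.0
        return scores

    # Toy usage: a 5-node graph with two observed infections.
    adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 4}, 3: {1}, 4: {2}}
    print(nbra_scores(adj, observed_infected={0, 1}, candidates={2, 3, 4}))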

13 pages, 2010 KiB  
Article
State Estimation Using a Randomized Unscented Kalman Filter for 3D Skeleton Posture
by Yogendra Rao Musunuri and Oh-Seol Kwon
Electronics 2021, 10(8), 971; https://doi.org/10.3390/electronics10080971 - 19 Apr 2021
Cited by 9 | Viewed by 2635
Abstract
In this study, we propose a method for minimizing the noise of Kinect sensors for 3D skeleton estimation. Notably, it is difficult to effectively remove nonlinear noise when estimating 3D skeleton posture; however, the proposed randomized unscented Kalman filter reduces nonlinear temporal noise effectively through the state estimation process. The 3D skeleton data can then be estimated at each step by iteratively passing the posterior state through the propagation and update processes. Experimental results show that the proposed method achieves superior 3D skeleton estimation performance compared with conventional methods.
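
To make the propagate-and-update idea concrete, here is a minimal unscented transform in Python: sigma points are pushed through a nonlinear function and re-summarized as a mean and covariance. This is the standard (non-randomized) transform; the paper's randomized variant, which additionally draws random sigma points, is not reproduced here, and the example function is invented.

    import numpy as np

    def unscented_transform(mu, P, f, kappa=1.0):
        """Propagate mean mu and covariance P through a nonlinear f using
        the standard 2n+1 sigma points."""
        n = mu.size
        S = np.linalg.cholesky((n + kappa) * P)      # columns are spread vectors
        sigma = np.vstack([mu, mu + S.T, mu - S.T])  # (2n+1, n) sigma points
        w = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
        w[0] = kappa / (n + kappa)
        Y = np.array([f(s) for s in sigma])          # push each point through f
        mu_y = w @ Y
        P_y = (Y - mu_y).T @ np.diag(w) @ (Y - mu_y)
        return mu_y, P_y

    # Toy usage: propagate an uncertain 3D joint position through a
    # perspective-like nonlinearity (scaling by depth).
    mu = np.array([0.1, 0.2, 1.5])
    P = 0.01 * np.eye(3)
    f = lambda x: np.array([x[0] / x[2], x[1] / x[2], x[2]])
    print(unscented_transform(mu, P, f))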

21 pages, 5444 KiB  
Article
Stereo Matching with Spatiotemporal Disparity Refinement Using Simple Linear Iterative Clustering Segmentation
by Hui-Yu Huang and Zhe-Hao Liu
Electronics 2021, 10(6), 717; https://doi.org/10.3390/electronics10060717 - 18 Mar 2021
Cited by 2 | Viewed by 2076
Abstract
Stereo matching is a challenging problem in computer vision, e.g., for three-dimensional television (3DTV) or 3D visualization, where disparity maps must be estimated from the video streams. However, the estimated disparity sequences may contain undesirable flickering errors, which degrade the visual quality of the synthesized video and reduce video coding efficiency. To solve this problem, we propose a spatiotemporal disparity refinement method for local stereo matching using the simple linear iterative clustering (SLIC) segmentation strategy, outlier detection, and refinement in the temporal and spatial domains. In the outlier detection, the segmented region in the initial disparity map is used to distinguish errors in the binocular disparity. Based on color similarity and disparity difference, we recalculate the aggregated cost to determine adaptive disparities that recover the disparity errors in disparity sequences. The flickering errors are also effectively removed, and object boundaries are well preserved. Experiments using public datasets demonstrated that our proposed method creates high-quality disparity maps and obtains a high peak signal-to-noise ratio compared with state-of-the-art methods.
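
A minimal sketch of the spatial, segment-guided part of such a refinement: run SLIC on the guide image, then pull disparities that deviate strongly from their superpixel median back to that median. The paper's full method also re-aggregates matching cost using color similarity and refines the temporal domain; both are omitted here, and the tolerance value is an assumption.

    import numpy as np
    from skimage.segmentation import slic

    def refine_disparity(rgb_left, disparity, n_segments=400, tol=3.0):
        """Segment-wise outlier cleanup of a disparity map (spatial step only)."""
        labels = slic(rgb_left, n_segments=n_segments, compactness=10)
        refined = disparity.astype(np.float32).copy()
        for seg in np.unique(labels):
            mask = labels == seg
            med = np.median(refined[mask])
            # Snap strong deviations back to the superpixel median.
            refined[mask & (np.abs(refined - med) > tol)] = med
        return refined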

16 pages, 2967 KiB  
Article
CNN-Based Acoustic Scene Classification System
by Yerin Lee, Soyoung Lim and Il-Youp Kwak
Electronics 2021, 10(4), 371; https://doi.org/10.3390/electronics10040371 - 3 Feb 2021
Cited by 20 | Viewed by 4704
Abstract
Acoustic scene classification (ASC) categorizes an audio file based on the environment in which it was recorded. This has long been studied in the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge. This paper presents the solution to Task 1 of the DCASE 2020 challenge submitted by the Chung-Ang University team. Task 1 addressed two challenges that ASC faces in real-world applications: audio recorded using different recording devices should be classified correctly, and the model used should have low complexity. We proposed two models to overcome these problems. First, a more general classification model was built by combining harmonic-percussive source separation (HPSS) and delta/delta-delta features with four different models. Second, using the same features, depthwise separable convolution was applied to the convolutional layers to develop a low-complexity model. Moreover, using gradient-weighted class activation mapping (Grad-CAM), we investigated which parts of the features our model attends to. Our proposed systems ranked 9th and 7th in the competition for these two subtasks, respectively.
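
A sketch of the kind of input pipeline the abstract describes, using librosa: harmonic and percussive log-mel spectrograms, each stacked with its delta and delta-delta channels. The exact feature layout used by the team is assumed, not taken from the paper.

    import numpy as np
    import librosa

    def hpss_delta_features(wav_path):
        """Build a (6, n_mels, n_frames) feature tensor for a CNN classifier."""
        y, sr = librosa.load(wav_path, sr=None)
        y_harm, y_perc = librosa.effects.hpss(y)   # harmonic/percussive split
        channels = []
        for src in (y_harm, y_perc):
            logmel = librosa.power_to_db(
                librosa.feature.melspectrogram(y=src, sr=sr, n_mels=128))
            channels += [logmel,
                         librosa.feature.delta(logmel),
                         librosa.feature.delta(logmel, order=2)]
        return np.stack(channels)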

14 pages, 4195 KiB  
Article
Median Filtering Using First-Order and Second-Order Neighborhood Pixels to Reduce Fixed Value Impulse Noise from Grayscale Digital Images
by Ali Salim Nasar Mursal and Haidi Ibrahim
Electronics 2020, 9(12), 2034; https://doi.org/10.3390/electronics9122034 - 1 Dec 2020
Cited by 7 | Viewed by 3673
Abstract
It is essential to restore digital images corrupted by noise to make them more useful. Many approaches have been proposed to restore images affected by fixed-value impulse noise, but they still do not perform well at high noise densities. This paper presents a new method to improve the detection and removal of fixed-value impulse noise from digital images. The proposed method consists of two stages. The first is the noise detection stage, where the difference values between pixels and their surrounding pixels are computed to decide whether they are noisy. The second is the image denoising stage, in which the original intensity of each noisy pixel is estimated using only its first-order and second-order neighborhood pixels. These neighborhood orders are based on the Euclidean distance between the noisy pixel and its neighbors. The proposed method was evaluated against several recent methods using 50 images at 18 noise densities. The experimental results confirm that the proposed method outperforms the existing filters, excelling in noise removal while preserving structure and edge information.
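
The two-ring neighborhood lookup can be sketched directly. In the sketch below, first-order neighbors are the four pixels at distance 1 and second-order neighbors are the four diagonals at distance sqrt(2); detecting noise purely by the extreme values 0/255 is a simplification of the paper's difference-based detector.

    import numpy as np

    def denoise_fixed_impulse(img, low=0, high=255):
        """Replace fixed-value impulse pixels with the median of their
        non-noisy first-order neighbors, falling back to second-order ones."""
        noisy = (img == low) | (img == high)
        out = img.astype(np.float32).copy()
        h, w = img.shape
        first = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # distance 1
        second = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # distance sqrt(2)
        for y, x in zip(*np.nonzero(noisy)):
            for ring in (first, second):
                vals = [img[y + dy, x + dx]
                        for dy, dx in ring
                        if 0 <= y + dy < h and 0 <= x + dx < w
                        and not noisy[y + dy, x + dx]]
                if vals:
                    out[y, x] = np.median(vals)
                    break
        return out.astype(img.dtype)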

21 pages, 3341 KiB  
Article
Simplification on Cross-Component Linear Model in Versatile Video Coding
by Sung-Chang Lim, Dae-Yeon Kim and Jungwon Kang
Electronics 2020, 9(11), 1885; https://doi.org/10.3390/electronics9111885 - 9 Nov 2020
Cited by 3 | Viewed by 3093
Abstract
To improve coding efficiency by exploiting the local inter-component redundancy between the luma and chroma components, the cross-component linear model (CCLM) is included in the versatile video coding (VVC) standard. In the CCLM mode, linear model parameters are derived from the neighboring luma and chroma samples of the current block. Furthermore, chroma samples are predicted by the reconstructed samples in the collocated luma block with the derived parameters. However, as the CCLM design in the VVC test model (VTM)-6.0 has many conditional branches in its processes to use only available neighboring samples, the CCLM implementation in parallel processing is limited. To address this implementation issue, this paper proposes including the neighboring sample generation as the first process of the CCLM, so as to simplify the succeeding CCLM processes. As unavailable neighboring samples are replaced with the adjacent available samples by the proposed CCLM, the neighboring sample availability checks can be removed. This results in simplified downsampling filter shapes for the luma sample. Therefore, the proposed CCLM can be efficiently implemented by employing parallel processing in both hardware and software implementations, owing to the removal of the neighboring sample availability checks and the simplification of the luma downsampling filters. The experimental results demonstrate that the proposed CCLM reduces the decoding runtime complexity of the CCLM mode, with negligible impact on the Bjøntegaard delta (BD)-rate.
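
The two-step structure the abstract proposes can be sketched in a few lines: first complete the neighboring-sample line so no availability checks remain downstream, then derive the linear model. A least-squares fit stands in for VVC's actual min/max parameter derivation here, and all names and array shapes are illustrative.

    import numpy as np

    def cclm_predict(luma_rec, neigh_luma, neigh_chroma, avail):
        """Sketch of CCLM with up-front neighboring sample generation."""
        # Step 1: pad unavailable neighbors with the adjacent available sample,
        # so the following steps need no availability checks.
        luma = neigh_luma.astype(np.float64).copy()
        chroma = neigh_chroma.astype(np.float64).copy()
        last = np.flatnonzero(avail)[0]   # assume at least one available sample
        for i in range(luma.size):
            if avail[i]:
                last = i
            else:
                luma[i], chroma[i] = luma[last], chroma[last]
        # Step 2: fit chroma = a * luma + b and predict chroma from the
        # collocated reconstructed luma samples.
        a, b = np.polyfit(luma, chroma, 1)
        return a * luma_rec + b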

10 pages, 432 KiB  
Article
Methodology for Modeling and Comparing Video Codecs: HEVC, EVC, and VVC
by Stefano Battista, Massimo Conti and Simone Orcioni
Electronics 2020, 9(10), 1579; https://doi.org/10.3390/electronics9101579 - 27 Sep 2020
Cited by 10 | Viewed by 4239
Abstract
Online videos are the major source of internet traffic and are about to become the vast majority of it. Increasing effort is aimed at developing more efficient video codecs. In order to compare existing and novel video codecs, this paper presents a simple but effective methodology to model their performance in terms of rate distortion (RD). A linear RD model in the dB variables, peak signal-to-noise ratio (PSNR) and bitrate (BR), easily allows us to estimate the difference in PSNR or BR between two sets of encoding conditions. Six sequences from the MPEG test set with the same resolution, encoded at different BRs and different quantization parameters, were used to create the data set for estimating each RD model. Three codecs (HEVC, EVC, and VVC) were compared with this methodology after estimating their models. Fitting properties of each model and a performance comparison between the models are finally shown and discussed.
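
A linear model in dB variables can be fitted in a few lines. The sketch below regresses PSNR against bitrate expressed in dB (10·log10 BR) and compares two fitted lines at a common bitrate; treating "dB variables" as 10·log10 of the bitrate is our reading of the abstract, and the operating points are invented for the demo, not data from the paper.

    import numpy as np

    def fit_rd_model(bitrates_kbps, psnrs_db):
        """Fit PSNR ~ m * 10*log10(BR) + q; return (m, q)."""
        br_db = 10.0 * np.log10(np.asarray(bitrates_kbps, dtype=float))
        m, q = np.polyfit(br_db, np.asarray(psnrs_db, dtype=float), 1)
        return m, q

    def delta_psnr(model_a, model_b, bitrate_kbps):
        """PSNR gap between two codecs' fitted lines at a given bitrate."""
        br_db = 10.0 * np.log10(bitrate_kbps)
        (ma, qa), (mb, qb) = model_a, model_b
        return (ma * br_db + qa) - (mb * br_db + qb)

    # Toy usage with made-up operating points:
    hevc = fit_rd_model([1000, 2000, 4000, 8000], [34.1, 36.5, 38.8, 41.0])
    vvc = fit_rd_model([1000, 2000, 4000, 8000], [35.6, 38.0, 40.2, 42.3])
    print(delta_psnr(vvc, hevc, 3000))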

28 pages, 3350 KiB  
Article
Bagged Tree Based Frame-Wise Beforehand Prediction Approach for HEVC Intra-Coding Unit Partitioning
by Yixiao Li, Lixiang Li, Yuan Fang, Haipeng Peng and Yixian Yang
Electronics 2020, 9(9), 1523; https://doi.org/10.3390/electronics9091523 - 17 Sep 2020
Cited by 5 | Viewed by 2538
Abstract
High Efficiency Video Coding (HEVC) achieves about 50% bit-rate savings compared with its predecessor, the H.264 standard, while its encoding complexity increases dramatically. Due to the introduction of more flexible partition structures and more optional prediction directions, HEVC takes a brute-force approach to find the optimal partitioning result, which is much more time-consuming. Therefore, this paper proposes a bagged-trees-based fast approach (BTFA), focusing on the coding unit (CU) size decision for HEVC intra-coding. First, several key features of a target CU are extracted for three-output classifiers. Then, to avoid feature extraction and prediction time overhead, our approach is designed frame-wise, and the procedure runs in parallel with the encoding process. Using the adaptive threshold determination algorithm, our approach achieves 42.04% time saving with a negligible 0.92% Bjøntegaard delta (BD)-rate loss. Furthermore, in order to calculate the optimal thresholds that balance BD-rate loss and complexity reduction, neural-network-based mathematical fitting is added to BTFA, yielding the advanced bagged-trees-based fast approach (ABTFA). Finally, experimental results show that ABTFA achieves 47.87% time saving with only 0.96% BD-rate loss, outperforming other state-of-the-art approaches.
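
A sketch of the three-output classifier idea with scikit-learn bagged trees. The labels here are 0 = don't split, 1 = split, 2 = uncertain (fall back to the normal RDO search), and the features and training data are dummies; the paper's actual feature set, labels, and threshold algorithm differ.

    import numpy as np
    from sklearn.ensemble import BaggingClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 3))    # e.g., [block variance, grad energy, QP]
    y = rng.integers(0, 3, size=5000)  # dummy labels for the demo

    clf = BaggingClassifier(estimator=DecisionTreeClassifier(max_depth=8),
                            n_estimators=50, random_state=0).fit(X, y)

    # Thresholded decision: only trust the classifier when it is confident,
    # loosely mirroring the adaptive-threshold idea; 0.6 is an assumed value.
    proba = clf.predict_proba(X[:4])
    decision = np.where(proba.max(axis=1) > 0.6, proba.argmax(axis=1), 2)
    print(decision)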

15 pages, 2717 KiB  
Article
Audio Fingerprint Extraction Based on Locally Linear Embedding for Audio Retrieval System
by Maoshen Jia, Tianhao Li and Jing Wang
Electronics 2020, 9(9), 1483; https://doi.org/10.3390/electronics9091483 - 10 Sep 2020
Cited by 10 | Viewed by 3677
Abstract
With the appearance of large amounts of audio data, people have a higher demand for audio retrieval that can quickly and accurately find the required information. Audio fingerprint retrieval is a popular choice because of its excellent performance. However, existing audio fingerprint retrieval methods produce large amounts of fingerprint data, which takes up storage space and slows retrieval. To address this problem, this paper presents a novel audio fingerprinting method based on locally linear embedding (LLE) that yields smaller fingerprints and more efficient retrieval. The proposed audio fingerprint extraction divides the bands around each peak in the frequency domain into four groups of sub-regions, and the energy of every sub-region is computed. The LLE is then performed for each group, and the audio fingerprint is encoded by comparing adjacent energies. To handle the distortion caused by linear speed changes, a matching strategy based on dynamic time warping (DTW) is adopted in the retrieval part, which can compare two audio segments of different lengths. To evaluate the retrieval performance of the proposed method, experiments were carried out under different conditions of single- and multiple-group dimensionality reduction. Both achieve a high recall and precision rate and better retrieval efficiency with less data compared with some state-of-the-art methods.
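
The two building blocks can be sketched with scikit-learn's LLE and a naive DTW. The per-frame sub-band energy matrix, the binarization rule (comparing adjacent embedded values), and all parameter values are illustrative assumptions; the paper's grouping of the four peak sub-regions is folded into the feature matrix here.

    import numpy as np
    from sklearn.manifold import LocallyLinearEmbedding

    def lle_fingerprint(subband_energies, n_components=8):
        """Reduce (n_frames, n_bands) energies with LLE, then binarize by
        comparing adjacent embedded values to get a compact code."""
        emb = LocallyLinearEmbedding(
            n_neighbors=12, n_components=n_components).fit_transform(subband_energies)
        return (np.diff(emb, axis=1) > 0).astype(np.uint8)

    def dtw_distance(a, b):
        """Plain O(len(a)*len(b)) DTW over Hamming frame distances, which
        tolerates the linear speed changes mentioned in the abstract."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.count_nonzero(a[i - 1] != b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]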

14 pages, 464 KiB  
Article
Epidemic Source Detection over Dynamic Networks
by Jaeyoung Choi
Electronics 2020, 9(6), 1018; https://doi.org/10.3390/electronics9061018 - 19 Jun 2020
Cited by 2 | Viewed by 3273
Abstract
Epidemic source detection is one of the most crucial problems in statistical inference. For example, the debate continues over when and where the first spread of COVID-19 occurred. Most work on this problem has assumed a static network topology, that is, that the connections between nodes do not change over time. This is often impractical because many nodes are mobile, or the connections can change. In this paper, we focus on dynamic networks, in the sense that node connectivity varies over time. We first introduce a simple dynamic model, named the k-flip dynamic, in which k > 0 connections in the network may change with some probability at each time step. Next, we design an estimation algorithm that investigates the contact information between infected nodes, named dynamic network source estimation, DNSE(k), for the dynamic model. We perform various simulations comparing the algorithm to several existing source estimation methods. Our results show that the proposed algorithm outperforms the other methods and is efficient at finding the epidemic source. Further, the detection probability of our proposed algorithm can exceed 45% when a budget is used to investigate the contact information from infected nodes under some practical settings of k.
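
A minimal reading of the k-flip dynamic, under our interpretation that a "flip" toggles a node pair's connection (an existing edge is removed, a missing one is added); the paper's precise flip rule may differ.

    import random

    def k_flip(edges, all_pairs, k, p=1.0):
        """One step of an assumed k-flip dynamic: with probability p,
        toggle k uniformly chosen node pairs."""
        edges = set(edges)
        if random.random() < p:
            for pair in random.sample(all_pairs, k):
                edges.symmetric_difference_update({pair})
        return edges

    # Toy usage on a 4-node graph.
    nodes = range(4)
    all_pairs = [(i, j) for i in nodes for j in nodes if i < j]
    g = {(0, 1), (1, 2), (2, 3)}
    for _ in range(3):
        g = k_flip(g, all_pairs, k=2)
        print(sorted(g))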

9 pages, 1006 KiB  
Article
Performance Comparison of Weak Filtering in HEVC and VVC
by Junghyun Lee and Jechang Jeong
Electronics 2020, 9(6), 960; https://doi.org/10.3390/electronics9060960 - 9 Jun 2020
Viewed by 2965
Abstract
This study describes the need to improve the weak filtering method in the in-loop filter process used identically in versatile video coding (VVC) and high efficiency video coding (HEVC). The weak filtering process used by VVC has been adopted and maintained since Draft Four of the H.265/HEVC standardization. Because the encoding process in these video codecs operates on block structural units, deblocking filters are essential. However, as many deblocking filters require a complex calculation process, it is necessary to ensure that they have a reasonable effect. This study evaluated the performance of the weak filtering portion of VVC and confirmed that it does not function effectively, unlike in HEVC. A method that excludes weak filtering from VVC entirely, i.e., a non-weak filtering method, should therefore be considered in VVC standardization. In the experimental results of this study, the non-weak filtering method brings a 0.40% Y-Bjøntegaard delta bit-rate (BDBR) gain over the VVC Test Model (VTM) 6.0.
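
For reference, the weak (normal-mode) filter shared by HEVC and VVC modifies only the two samples nearest the block edge. The sketch below follows the widely documented HEVC delta formula; the standard's full set of decision conditions is reduced here to the single |delta| < 10*tc test, and the secondary p1/q1 filtering is omitted.

    import numpy as np

    def weak_filter(p1, p0, q0, q1, tc, bit_depth=8):
        """One-dimensional HEVC-style weak deblocking across an edge p0|q0."""
        delta = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4
        if abs(delta) >= 10 * tc:
            return p0, q0                       # edge judged natural: skip
        delta = int(np.clip(delta, -tc, tc))    # clip correction to +/- tc
        maxv = (1 << bit_depth) - 1
        return (int(np.clip(p0 + delta, 0, maxv)),
                int(np.clip(q0 - delta, 0, maxv)))

    print(weak_filter(p1=60, p0=62, q0=80, q1=82, tc=4))  # smooths the step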

25 pages, 9089 KiB  
Article
Fast Hole Filling for View Synthesis in Free Viewpoint Video
by Hui-Yu Huang and Shao-Yu Huang
Electronics 2020, 9(6), 906; https://doi.org/10.3390/electronics9060906 - 29 May 2020
Cited by 7 | Viewed by 3455
Abstract
The recent emergence of three-dimensional (3D) movies and 3D television (TV) indicates an increasing interest in 3D content. Stereoscopic displays have enhanced visual experiences, allowing the world to be viewed in 3D. Virtual view synthesis is the key technology for presenting 3D content, and depth image-based rendering (DIBR) is a classic virtual view synthesis method. With a texture image and its corresponding depth map, a virtual view can be generated using the DIBR technique. The depth and camera parameters are used to project every pixel in the image into the 3D world coordinate system. The results in world coordinates are then reprojected into the virtual view based on 3D warping. However, these projections result in cracks (holes). Hence, we propose a new DIBR method for free viewpoint videos that solves the hole problem caused by these projection processes. First, the depth map is preprocessed to reduce the number of holes without producing large-scale geometric distortions; subsequently, improved 3D warping projection is performed to create the virtual view. A median filter is used to filter the hole regions in the virtual view, followed by 3D inverse warping blending to remove the remaining holes. Next, brightness adjustment and adaptive image blending are performed. Finally, the synthesized virtual view is obtained using an inpainting method. Experimental results verify that our proposed method produces pleasing visual quality in the synthesized virtual view, maintains a high peak signal-to-noise ratio (PSNR), and efficiently decreases execution time compared with state-of-the-art methods.
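
To illustrate why warping produces holes and how filling works, here is a deliberately simplified 1D-parallax warp on a grayscale image with nearest-valid row filling. The paper's actual pipeline (depth preprocessing, median filtering, 3D inverse warping blending, brightness adjustment, inpainting) is richer than this, and the shift model is an assumption.

    import numpy as np

    def warp_and_fill(texture, depth, shift_scale=0.05):
        """Shift each pixel horizontally in proportion to its depth, mark
        unwritten pixels as holes, then fill holes from the nearest valid
        pixel on the same row."""
        h, w = depth.shape
        virt = np.zeros_like(texture)
        written = np.zeros((h, w), dtype=bool)
        shifts = (shift_scale * depth).astype(int)
        for y in range(h):
            for x in range(w):
                nx = x + shifts[y, x]
                if 0 <= nx < w:
                    virt[y, nx] = texture[y, x]
                    written[y, nx] = True
        # Hole filling: copy from the nearest valid sample along each row.
        for y in range(h):
            valid = np.flatnonzero(written[y])
            if valid.size == 0:
                continue
            for x in range(w):
                if not written[y, x]:
                    nearest = valid[np.argmin(np.abs(valid - x))]
                    virt[y, x] = virt[y, nearest]
        return virt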
