Deep Learning Applied to Image Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 March 2022) | Viewed by 40901

Special Issue Editors


Prof. Dr. Jose Santamaria Lopez
Guest Editor

Prof. Dr. Zong Woo Geem
Co-Guest Editor

Special Issue Information

Dear Colleagues,

Deep Learning (DL) is enabling many new applications across broad areas of science, particularly in the domain of Image Processing (IP). Innovative applications of DL to complex IP systems have grown rapidly in recent years. This Special Issue focuses on research that addresses IP problems using novel DL approaches and tools.

The purpose of this Special Issue is therefore to bring the DL and IP communities together and to provide a forum in which researchers and practitioners in this rapidly developing field can share novel and original research on Deep Learning applied to Image Processing. Survey papers addressing relevant DL&IP topics are also welcome. Topics of interest include, but are not limited to:

  • Medical imaging;
  • Image restoration;
  • Deep adversarial learning for IP;
  • Image registration;
  • Image segmentation;
  • Nature-inspired and metaheuristic algorithms for DL&IP;
  • Theoretical analysis of DL models for DL&IP.

Prof. Dr. Jose Santamaria Lopez
Prof. Dr. Zong Woo Geem
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (8 papers)

Research

20 pages, 3087 KiB  
Article
LogoNet: A Robust Layer-Aggregated Dual-Attention Anchorfree Logo Detection Framework with an Adversarial Domain Adaptation Approach
by Rahul Kumar Jain, Taro Watasue, Tomohiro Nakagawa, Takahiro Sato, Yutaro Iwamoto, Xiang Ruan and Yen-Wei Chen
Appl. Sci. 2021, 11(20), 9622; https://doi.org/10.3390/app11209622 - 15 Oct 2021
Cited by 6 | Viewed by 2660
Abstract
Logo detection is important for various fields. However, identifying logos in complex scenarios is challenging, as a logo can appear in different styles and on different platforms. Logo images include diverse contexts, sizes, projective transformations, resolutions, illumination and fonts, which makes detection more difficult. To address these issues, we previously presented a deep learning-based algorithm for logo detection called LogoNet. It includes an hourglass-like top-down bottom-up feature extraction network, a spatial attention module and an anchorfree detection head similar to CenterNet. To improve performance, in this paper an extended version of LogoNet, called Dual-Attention LogoNet, is proposed that exploits different attention mechanisms more efficiently. The incorporated channel-wise and spatial attention modules refine and generate robust and balanced feature maps to predict visual and semantic information more accurately. In addition, we propose a lightweight architecture for both LogoNet and Dual-Attention LogoNet for practical applications. The proposed lightweight architecture significantly reduces the number of network parameters and improves the inference time, addressing real-time performance while maintaining accuracy. Furthermore, to address the domain-shift problem in practical applications, we also propose an adversarial-learning-based domain adaptation approach, which is easily adaptable to any anchorfree detector. Our attention-based method shows a 1.8% improvement in accuracy compared with the state-of-the-art detection network on the FlickrLogos-32 dataset. Our proposed domain adaptation approach significantly improves performance by 1.3% mAP compared with direct transfer on the target domain, without increasing labeling cost or network parameters.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
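
The abstract describes refining feature maps with channel-wise and spatial attention. The following is a minimal CBAM-style sketch of that general idea, not the authors' implementation; the layer sizes, reduction ratio and kernel size are assumptions.

```python
# Minimal sketch: combining channel-wise and spatial attention to refine a
# feature map. Not the Dual-Attention LogoNet code; sizes are illustrative.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        # Squeeze spatial dimensions, weight each channel, re-scale the input.
        w = self.mlp(x.mean(dim=(2, 3)))                 # (N, C)
        return x * torch.sigmoid(w)[:, :, None, None]

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        # Pool across channels, then learn a per-pixel attention map.
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn

class DualAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))

feats = torch.randn(1, 256, 64, 64)     # e.g. a backbone feature map
refined = DualAttention(256)(feats)     # same shape, attention-refined
```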

21 pages, 16696 KiB  
Article
Automatic Processing of Historical Japanese Mathematics (Wasan) Documents
by Yago Diez, Toya Suzuki, Marius Vila and Katsushi Waki
Appl. Sci. 2021, 11(17), 8050; https://doi.org/10.3390/app11178050 - 30 Aug 2021
Viewed by 3322
Abstract
“Wasan” is the collective name given to a set of mathematical texts written in Japan in the Edo period (1603–1867). These documents represent a unique type of mathematics and amalgamate the mathematical knowledge of a time and place where major advances were reached. For these reasons, Wasan documents are considered to be of great historical and cultural significance. This paper presents a fully automatic algorithmic process to first detect the kanji characters in Wasan documents and subsequently classify them using deep learning networks. We pay special attention to the results concerning one particular kanji character, the “ima” kanji, as it is of special importance for the interpretation of Wasan documents. As our database is made up of manual scans of real historical documents, it presents scanning artifacts in the form of image noise and page misalignment. First, we use two preprocessing steps to ameliorate these artifacts. Then we use three different blob detector algorithms to determine which parts of each image belong to kanji characters. Finally, we use five deep learning networks to classify the detected kanji. All the steps of the pipeline are thoroughly evaluated, and several options are compared for the kanji detection and classification steps. As ancient kanji databases are rare and often include relatively few images, we explore the possibility of using modern kanji databases for kanji classification. Experiments are run on a dataset containing 100 Wasan book pages. We compare the performance of three blob detector algorithms for kanji detection, obtaining a 79.60% success rate with 7.88% false positive detections. Furthermore, we study the performance of five well-known deep learning networks and obtain 99.75% classification accuracy for modern kanji and 90.4% for classical kanji. Finally, our full pipeline obtains 95% correct detection and classification of the “ima” kanji with 3% false positives.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
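
The pipeline above is detect-then-classify: blob detection proposes candidate kanji regions, and a network classifies each crop. The sketch below is one possible instantiation of that idea, not the paper's code; the file path, patch size, blob parameters and the placeholder classifier are all assumptions.

```python
# Detect candidate kanji with OpenCV's SimpleBlobDetector, then classify each
# cropped patch with a placeholder CNN. Illustrative only.
import cv2
import numpy as np
import torch
import torch.nn as nn

def detect_kanji_blobs(gray_page):
    """Return (x, y, size) candidates on an 8-bit grayscale page scan."""
    params = cv2.SimpleBlobDetector_Params()
    params.filterByArea = True
    params.minArea = 50                    # tune for scan resolution
    detector = cv2.SimpleBlobDetector_create(params)
    return [(int(k.pt[0]), int(k.pt[1]), int(k.size))
            for k in detector.detect(gray_page)]

def crop_patch(gray_page, x, y, half=32):
    h, w = gray_page.shape
    patch = gray_page[max(y - half, 0):min(y + half, h),
                      max(x - half, 0):min(x + half, w)]
    return cv2.resize(patch, (64, 64))

# Placeholder classifier standing in for the five networks compared in the paper.
classifier = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 10),                     # 10 = illustrative number of classes
)

page = cv2.imread("wasan_page.png", cv2.IMREAD_GRAYSCALE)  # hypothetical scan
if page is not None:
    for x, y, _ in detect_kanji_blobs(page):
        patch = crop_patch(page, x, y).astype(np.float32) / 255.0
        logits = classifier(torch.from_numpy(patch)[None, None])
        print((x, y), int(logits.argmax()))
```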

13 pages, 1889 KiB  
Article
PICCOLO White-Light and Narrow-Band Imaging Colonoscopic Dataset: A Performance Comparative of Models and Datasets
by Luisa F. Sánchez-Peralta, J. Blas Pagador, Artzai Picón, Ángel José Calderón, Francisco Polo, Nagore Andraka, Roberto Bilbao, Ben Glover, Cristina L. Saratxaga and Francisco M. Sánchez-Margallo
Appl. Sci. 2020, 10(23), 8501; https://doi.org/10.3390/app10238501 - 28 Nov 2020
Cited by 44 | Viewed by 5976
Abstract
Colorectal cancer is one of the world's leading causes of death. Fortunately, an early diagnosis allows for effective treatment, increasing the survival rate. Deep learning techniques have shown their utility for increasing the adenoma detection rate at colonoscopy, but a dataset is usually required so the model can automatically learn features that characterize the polyps. In this work, we present the PICCOLO dataset, which comprises 3433 manually annotated images (2131 white-light images and 1302 narrow-band images) originating from 76 lesions in 40 patients, distributed into training (2203), validation (897) and test (333) sets while assuring patient independence between sets. Furthermore, clinical metadata are also provided for each lesion. Four different models, obtained by combining two backbones and two encoder–decoder architectures, are trained with the PICCOLO dataset and two other publicly available datasets for comparison. Results are provided for the test set of each dataset. Models trained with the PICCOLO dataset have better generalization capacity, as they perform more uniformly across the test sets of all datasets, rather than obtaining the best results only on their own test set. The dataset is available on the website of the Basque Biobank, and it is expected to contribute to the further development of deep learning methods for polyp detection, localisation and classification, which would eventually result in a better and earlier diagnosis of colorectal cancer, hence improving patient outcomes.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
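
A key detail above is patient independence between the training, validation and test sets. The sketch below illustrates one common way to achieve such a split (grouped splitting with scikit-learn); it is not the dataset's actual split code, and the image list, patient IDs and split fractions are made up.

```python
# Patient-independent train/val/test split: every image from a given patient
# lands in exactly one subset. Illustrative data only.
from sklearn.model_selection import GroupShuffleSplit

images = [f"img_{i:04d}.png" for i in range(1000)]   # hypothetical image list
patients = [i % 40 for i in range(1000)]             # hypothetical patient ID per image

# First carve out the test set by patient, then split the rest into train/val.
outer = GroupShuffleSplit(n_splits=1, test_size=0.15, random_state=0)
trainval_idx, test_idx = next(outer.split(images, groups=patients))

inner = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
rel_train, rel_val = next(inner.split(trainval_idx,
                                      groups=[patients[i] for i in trainval_idx]))
train_idx = [trainval_idx[i] for i in rel_train]
val_idx = [trainval_idx[i] for i in rel_val]

# No patient appears in more than one subset.
assert not ({patients[i] for i in train_idx} & {patients[i] for i in test_idx})
assert not ({patients[i] for i in val_idx} & {patients[i] for i in test_idx})
```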

16 pages, 17244 KiB  
Article
Single-Shot Object Detection with Split and Combine Blocks
by Hongwei Wang, Dahua Li, Yu Song, Qiang Gao, Zhaoyang Wang and Chunping Liu
Appl. Sci. 2020, 10(18), 6382; https://doi.org/10.3390/app10186382 - 13 Sep 2020
Cited by 2 | Viewed by 3329
Abstract
Feature fusion is widely used in various neural network-based visual recognition tasks, such as object detection, to enhance the quality of feature representation. It is common practice for both one-stage and two-stage object detectors to implement feature fusion in feature pyramid networks (FPN) to enhance the capacity to detect objects of different scales. In this work, we propose a novel and efficient feature fusion unit, referred to as the Split and Combine (SC) block, which splits the input feature maps into several parts, processes these sub-feature maps with different emphasis, and finally concatenates the outputs gradually, one by one. The SC block implicitly encourages the network to focus on features that are more important to the task, thus improving network efficiency and reducing inference computations. To support our analysis and conclusions, a backbone network and an FPN employing this technique are assembled into a one-stage detector and evaluated on the MS COCO dataset. With the newly introduced SC block and other novel training tricks, our detector achieves a good speed-accuracy trade-off on the COCO test-dev set, with 37.1% AP (average precision) at 51 FPS and 38.9% AP at 40 FPS.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
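
The sketch below is one reading of the split-process-combine idea described above, not the authors' code: the input is split channel-wise, each part is processed with a different emphasis (here, increasing dilation), and the branch outputs are combined progressively before being concatenated. The number of splits and the dilation rates are assumptions.

```python
# Minimal split-and-combine style fusion block. Illustrative only.
import torch
import torch.nn as nn

class SCBlock(nn.Module):
    def __init__(self, channels, splits=4):
        super().__init__()
        part = channels // splits
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(part, part, 3, padding=d, dilation=d),
                nn.BatchNorm2d(part),
                nn.ReLU(inplace=True),
            )
            for d in range(1, splits + 1)
        )

    def forward(self, x):
        parts = torch.chunk(x, len(self.branches), dim=1)
        outs, prev = [], None
        for part, branch in zip(parts, self.branches):
            # Gradual combination: each branch also sees the previous output.
            prev = branch(part if prev is None else part + prev)
            outs.append(prev)
        return torch.cat(outs, dim=1)

x = torch.randn(1, 256, 32, 32)
print(SCBlock(256)(x).shape)   # torch.Size([1, 256, 32, 32])
```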

12 pages, 1919 KiB  
Article
A Convolutional Neural Network for Anterior Intra-Arterial Thrombus Detection and Segmentation on Non-Contrast Computed Tomography of Patients with Acute Ischemic Stroke
by Manon L. Tolhuisen, Elena Ponomareva, Anne M. M. Boers, Ivo G. H. Jansen, Miou S. Koopman, Renan Sales Barros, Olvert A. Berkhemer, Wim H. van Zwam, Aad van der Lugt, Charles B. L. M. Majoie and Henk A. Marquering
Appl. Sci. 2020, 10(14), 4861; https://doi.org/10.3390/app10144861 - 15 Jul 2020
Cited by 12 | Viewed by 3894
Abstract
The aim of this study was to develop a convolutional neural network (CNN) that automatically detects and segments intra-arterial thrombi on baseline non-contrast computed tomography (NCCT) scans. We retrospectively collected computed tomography (CT) scans of patients with an anterior circulation large vessel occlusion (LVO) from the Multicenter Randomized Clinical Trial of Endovascular Treatment for Acute Ischemic Stroke in the Netherlands trial, both for training (n = 86) and validation (n = 43). For testing, we included patients with (n = 58) and without (n = 45) an LVO from our comprehensive stroke center. Ground truth was established by consensus between two experts using both CT angiography and NCCT. We evaluated the CNN for correct identification of a thrombus, its location and thrombus segmentation, and compared these with the results of a neurologist in training and an expert neuroradiologist. The sensitivity of CNN thrombus detection was 0.86, vs. 0.95 and 0.79 for the neuroradiologists. Specificity was 0.65 for the network, vs. 0.58 and 0.82 for the neuroradiologists. The CNN correctly identified the location of the thrombus in 79% of the cases, compared to 81% and 77% for the neuroradiologists. The sensitivity and specificity for thrombus identification and the rate of correct thrombus location assessment by the CNN were similar to those of expert neuroradiologists.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
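
For reference, the two detection metrics reported above are computed as follows; the sketch is not from the paper, and the example labels and predictions are made up.

```python
# Sensitivity and specificity over per-scan thrombus present/absent labels.
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, bool), np.asarray(y_pred, bool)
    tp = np.sum(y_true & y_pred)    # thrombus present, detected
    tn = np.sum(~y_true & ~y_pred)  # thrombus absent, not detected
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return tp / (tp + fn), tn / (tn + fp)

y_true = [1, 1, 1, 1, 0, 0, 0, 1]   # hypothetical ground truth
y_pred = [1, 1, 0, 1, 0, 1, 0, 1]   # hypothetical CNN detections
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```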

15 pages, 1444 KiB  
Article
Improvement of Learning Stability of Generative Adversarial Network Using Variational Learning
by Je-Yeol Lee and Sang-Il Choi 
Appl. Sci. 2020, 10(13), 4528; https://doi.org/10.3390/app10134528 - 30 Jun 2020
Cited by 4 | Viewed by 3091
Abstract
In this paper, we propose a new network model using variational learning to improve the learning stability of generative adversarial networks (GAN). The proposed method can be easily applied to improve the learning stability of GAN-based models that were developed for various purposes, given that the variational autoencoder (VAE) is used as a secondary network while the basic GAN structure is maintained. When the gradient of the generator vanishes during GAN training, the proposed method receives gradient information from the decoder of the VAE, which maintains stable gradients, so that the learning processes of the generator and discriminator are not halted. Experimental results on the MNIST and CelebA datasets verify that the proposed method improves the learning stability of the networks by overcoming the vanishing-gradient problem of the generator, while maintaining the excellent data quality of conventional GAN-based generative models.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
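
The sketch below is a loose illustration of the general idea of coupling a GAN generator to an auxiliary VAE so that the generator update receives a gradient signal beyond the adversarial loss; it is an interpretation, not the authors' exact model, and the network sizes, data shapes and weighting factor are all assumptions.

```python
# Generator step with an adversarial loss plus an auxiliary signal routed
# through a (frozen) VAE. Illustrative only.
import torch
import torch.nn as nn

latent, data_dim = 32, 784
G = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Sigmoid())
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
vae_enc = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, latent))
vae_dec = nn.Sequential(nn.Linear(latent, 128), nn.ReLU(), nn.Linear(128, data_dim), nn.Sigmoid())

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

def generator_step(batch_size=64, aux_weight=0.1):
    z = torch.randn(batch_size, latent)
    fake = G(z)
    # Standard non-saturating adversarial loss.
    adv = bce(D(fake), torch.ones(batch_size, 1))
    # Auxiliary signal: pull generated samples toward what the (assumed
    # pretrained, frozen) VAE reconstructs, so the generator still gets a
    # gradient even when the adversarial gradient becomes very small.
    with torch.no_grad():
        target = vae_dec(vae_enc(fake))
    aux = nn.functional.mse_loss(fake, target)
    loss = adv + aux_weight * aux
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()

print(generator_step())
```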

21 pages, 47435 KiB  
Article
Towards a Better Understanding of Transfer Learning for Medical Imaging: A Case Study
by Laith Alzubaidi, Mohammed A. Fadhel, Omran Al-Shamma, Jinglan Zhang, J. Santamaría, Ye Duan and Sameer R. Oleiwi
Appl. Sci. 2020, 10(13), 4523; https://doi.org/10.3390/app10134523 - 29 Jun 2020
Cited by 163 | Viewed by 11983
Abstract
One of the main challenges of employing deep learning models in the field of medicine is the lack of training data, owing to the difficulty of collecting and labeling data, which needs to be performed by experts. To overcome this drawback, transfer learning (TL) has been utilized to solve several medical imaging tasks using pre-trained state-of-the-art models from the ImageNet dataset. However, there are primary divergences in data features, sizes, and task characteristics between natural image classification and the targeted medical imaging tasks. Therefore, TL can only slightly improve performance if the source domain is completely different from the target domain. In this paper, we explore the benefit of TL from the same and different domains of the target tasks. To do so, we designed a deep convolutional neural network (DCNN) model that integrates three ideas: traditional and parallel convolutional layers and residual connections, along with global average pooling. We trained the proposed model under several scenarios, utilizing same- and different-domain TL with the diabetic foot ulcer (DFU) classification task and with an animal classification task. We show empirically that TL from the same domain can significantly improve performance, given a reduced number of images in the same domain as the target dataset. The proposed model with the DFU dataset achieved an F1-score of 86.6% when trained from scratch, 89.4% with TL from a different domain than the target dataset, and 97.6% with TL from the same domain as the target dataset.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
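
The generic transfer-learning setup the abstract contrasts (pre-train on a source domain, then adapt to the target task) is sketched below. This is not the paper's DCNN: it uses a standard torchvision ResNet-18 as a stand-in, assumes a recent torchvision release, and the two-class head and frozen layers are illustrative choices.

```python
# Fine-tuning a source-domain pre-trained model on a new target task.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # source-domain pre-training

# Freeze the early, generic feature extractors; fine-tune the rest.
for name, param in model.named_parameters():
    if name.startswith(("conv1", "bn1", "layer1", "layer2")):
        param.requires_grad = False

# Replace the 1000-class ImageNet head with a head for the target task
# (two classes as a stand-in for a binary medical classification task).
model.fc = nn.Linear(model.fc.in_features, 2)

trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable parameters")
```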

13 pages, 2716 KiB  
Article
Defect Detection on Rolling Element Surface Scans Using Neural Image Segmentation
by Nico Prappacher, Markus Bullmann, Gunther Bohn, Frank Deinzer and Andreas Linke
Appl. Sci. 2020, 10(9), 3290; https://doi.org/10.3390/app10093290 - 9 May 2020
Cited by 29 | Viewed by 3679
Abstract
The surface inspection of steel parts such as rolling elements for roller bearings is an essential component of the quality assurance process in their production. Existing inspection systems require high maintenance costs and allow little flexibility. In this paper, we propose the use of a rapidly retrainable convolutional neural network. Our approach reduces the development and maintenance cost compared to a manually programmed classification system for steel surface defect detection. One of the main disadvantages of neural network approaches is their high demand for labeled training data. To bypass this, we propose the use of simulated defects. In the production of rolling elements, real defects are a rarity; collecting a balanced dataset thus costs a lot of time and resources. Simulating defects reduces the time required for data collection and also allows us to label the dataset automatically, which further eases the data collection process compared to existing approaches. Combined, this allows us to train our system faster and more cheaply than existing systems. We show that our system can be retrained in a matter of minutes, minimizing production downtime, while still achieving high accuracy in defect detection.
(This article belongs to the Special Issue Deep Learning Applied to Image Processing)
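
The core idea above, painting synthetic defects onto defect-free surface scans so that the segmentation label comes for free, is sketched below. This is a toy illustration, not the authors' simulation; the elliptical defect shape, intensity change and image sizes are simple assumptions.

```python
# Generate a defective image plus its pixel-wise label from a clean scan.
import numpy as np

rng = np.random.default_rng(0)

def add_synthetic_defect(scan, max_radius=8):
    """Darken an elliptical region of a grayscale scan; return (image, mask)."""
    h, w = scan.shape
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    ry, rx = rng.integers(2, max_radius), rng.integers(2, max_radius)
    yy, xx = np.ogrid[:h, :w]
    mask = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    defective = scan.copy()
    defective[mask] = defective[mask] * 0.4      # simulated dark scratch/pit
    return defective, mask.astype(np.uint8)      # mask doubles as the label

clean_scan = rng.normal(0.8, 0.02, size=(128, 128)).clip(0, 1)  # fake clean surface
image, label = add_synthetic_defect(clean_scan)
print(label.sum(), "defect pixels labeled automatically")
```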
