This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Enhancing Food Image Recognition by Multi-Level Fusion and the Attention Mechanism

1 School of Information, Beijing Forestry University, Beijing 100083, China
2 Risk Assessment Division 1, China National Center for Food Safety Risk Assessment, Beijing 100022, China
* Authors to whom correspondence should be addressed.
Foods 2025, 14(3), 461; https://doi.org/10.3390/foods14030461
Submission received: 28 November 2024 / Revised: 23 January 2025 / Accepted: 29 January 2025 / Published: 31 January 2025

Abstract

Food recognition is an active research area in computer vision and has become indispensable in domains such as dietary nutrition monitoring, intelligent restaurant services, and quality control in the food industry. However, food image recognition is a Fine-Grained Visual Classification (FGVC) task, which presents challenges such as inter-class similarity, intra-class variability, and the difficulty of capturing intricate local features. Prior work on fine-grained visual classification has focused primarily on the deep-layer features of deep convolutional neural networks, often neglecting shallow, detailed information. Taking these factors into account, we propose a Multi-level Attention Feature Fusion Network (MAF-Net). Specifically, we use the feature maps generated at different stages of a convolutional neural network (CNN) backbone as inputs. We apply a self-attention mechanism to identify local features on these feature maps and then stack them together. The feature vectors obtained through the attention mechanism are then integrated with the original input for data augmentation. Simultaneously, to capture as many local features as possible, we encourage the multi-scale features at each stage to concentrate on distinct local regions by maximizing the Kullback-Leibler divergence (KL divergence) between the different stages. Additionally, we present a novel subclass center loss (SCloss) that implements label smoothing, minimizes intra-class feature distribution differences, and enhances the model's generalization capability. Experiments on three food image datasets (CETH Food-101, Vireo Food-172, and UEC Food-100) demonstrated the superiority of the proposed model, which achieved Top-1 accuracies of 90.22%, 89.86%, and 90.61%, respectively. Notably, our method achieved the highest Top-5 accuracy on Vireo Food-172 and the highest Top-1 accuracy on UEC Food-100 among the compared methods.
Keywords: convolutional neural network; food recognition; self-attention mechanism; feature fusion; food science and technology
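
The abstract's idea of pushing different backbone stages to attend to distinct local regions by maximizing the KL divergence between them can be illustrated with a small sketch. This is not the authors' implementation. It assumes that each stage yields one vector of raw attention scores over the same flattened spatial grid (in practice, maps of different resolution would first need resampling), and the function names `softmax`, `kl_divergence`, and `stage_diversity_loss` are hypothetical. The loss is written as the negative mean pairwise KL divergence, so that minimizing it maximizes the divergence.

```python
import math


def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def stage_diversity_loss(stage_scores):
    """Negative mean pairwise KL divergence between per-stage attention maps.

    stage_scores: one list of raw attention scores per stage, all over the
    same flattened grid (an assumption made for this sketch).
    """
    if len(stage_scores) < 2:
        raise ValueError("need at least two stages to compare")
    maps = [softmax(s) for s in stage_scores]
    total, pairs = 0.0, 0
    for i, p in enumerate(maps):
        for j, q in enumerate(maps):
            if i != j:  # KL is asymmetric, so both orderings are counted
                total += kl_divergence(p, q)
                pairs += 1
    return -total / pairs


# Two stages attending to the same region versus to different regions.
same = stage_diversity_loss([[2.0, 0.1, 0.1, 0.1], [2.0, 0.1, 0.1, 0.1]])
different = stage_diversity_loss([[2.0, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 2.0]])
print(same, different)
```

In this toy comparison the loss is lower when the two stages focus on different positions, which is the behavior such a term is meant to reward. In a training setup it would typically be added to the classification loss with a weighting factor, which is again an assumption rather than a detail given in the abstract.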

Share and Cite

MDPI and ACS Style

Chen, Z.; Wang, J.; Wang, Y. Enhancing Food Image Recognition by Multi-Level Fusion and the Attention Mechanism. Foods 2025, 14, 461. https://doi.org/10.3390/foods14030461

AMA Style

Chen Z, Wang J, Wang Y. Enhancing Food Image Recognition by Multi-Level Fusion and the Attention Mechanism. Foods. 2025; 14(3):461. https://doi.org/10.3390/foods14030461

Chicago/Turabian Style

Chen, Zengzheng, Jianxin Wang, and Yeru Wang. 2025. "Enhancing Food Image Recognition by Multi-Level Fusion and the Attention Mechanism" Foods 14, no. 3: 461. https://doi.org/10.3390/foods14030461

APA Style

Chen, Z., Wang, J., & Wang, Y. (2025). Enhancing Food Image Recognition by Multi-Level Fusion and the Attention Mechanism. Foods, 14(3), 461. https://doi.org/10.3390/foods14030461
