A High-Density Crowd Counting Method Based on Convolutional Feature Fusion
Abstract
1. Introduction
2. Feature Fusion CNN for Crowd Counting
2.1. Density Map Based Crowd Counting
2.2. Ground Truth Density Map
2.3. Network Architecture
2.4. Network Loss
3. Experiments
3.1. Dataset
3.2. Implementation Details
3.3. Evaluation Metrics
3.4. Results and Analysis
4. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
1. Fruin, J.J. Pedestrian Planning and Design; Metropolitan Association of Urban Designers & Environmental Planners: New York, NY, USA, 1971.
2. Zhan, B.; Monekosso, D.; Remagnino, P.; Velastin, S.; Xu, L.-Q. Crowd analysis: A survey. Mach. Vis. Appl. 2008, 19, 345–357.
3. Zeng, L.; Xu, X.; Cai, B.; Qiu, S.; Zhang, T. Multi-scale convolutional neural networks for crowd counting. In Proceedings of the IEEE International Conference on Image Processing, Beijing, China, 17–20 September 2017; pp. 465–469.
4. Zhang, C.; Li, H.; Wang, X.; Yang, X. Cross-scene crowd counting via deep convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 8–10 June 2015; pp. 833–841.
5. Leibe, B.; Seemann, E.; Schiele, B. Pedestrian detection in crowded scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–26 June 2005; pp. 878–885.
6. Zhao, T.; Nevatia, R.; Wu, B. Segmentation and tracking of multiple humans in crowded environments. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 1198–1211.
7. Ge, W.; Collins, R.T. Marked point processes for crowd counting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 2913–2920.
8. Chan, A.B.; Liang, Z.S.J.; Vasconcelos, N. Privacy preserving crowd monitoring: Counting people without people models or tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 24–26 June 2008; pp. 1–7.
9. Ryan, D.; Denman, S.; Fookes, C.; Sridharan, S. Crowd counting using multiple local features. In Proceedings of the Digital Image Computing: Techniques and Applications, Melbourne, Australia, 1–3 December 2009; pp. 81–88.
10. Chan, A.B.; Vasconcelos, N. Bayesian Poisson regression for crowd counting. In Proceedings of the IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 27 September–4 October 2009; pp. 545–551.
11. Lempitsky, V.; Zisserman, A. Learning to count objects in images. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 6–9 December 2010; pp. 1324–1332.
12. Pham, V.Q.; Kozakaya, T.; Yamaguchi, O.; Okada, R. Count forest: Co-voting uncertain number of targets using random forest for crowd density estimation. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3253–3261.
13. Wang, C.; Zhang, H.; Yang, L.; Liu, S.; Cao, X. Deep people counting in extremely dense crowds. In Proceedings of the 23rd ACM International Conference on Multimedia, Brisbane, Australia, 26–30 October 2015; pp. 1299–1302.
14. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105.
15. Schmidhuber, J.; Meier, U.; Ciresan, D. Multi-column deep neural networks for image classification. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3642–3649.
16. Zhang, Y.; Zhou, D.; Chen, S.; Gao, S.; Ma, Y. Single-image crowd counting via multi-column convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 589–597.
17. Sam, D.B.; Surya, S.; Babu, R.V. Switching convolutional neural network for crowd counting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5744–5752.
18. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv, 2014; arXiv:1409.1556.
19. Kumagai, S.; Hotta, K.; Kurita, T. Mixture of Counting CNNs: Adaptive Integration of CNNs Specialized to Specific Appearance for Crowd Counting. arXiv, 2017; arXiv:1703.09393.
20. Zhang, L.; Shi, M.; Chen, Q. Crowd counting via scale-adaptive convolutional neural network. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Lake Tahoe, NV, USA, 12–14 March 2018; pp. 1113–1121.
21. Tang, S.; Pan, Z.; Zhou, X. Low-Rank and Sparse Based Deep-Fusion Convolutional Neural Network for Crowd Counting. Math. Probl. Eng. 2017, 2017, 1–11.
22. Han, K.; Wan, W.; Yao, H.; Hou, L. Image Crowd Counting Using Convolutional Neural Network and Markov Random Field. arXiv, 2017; arXiv:1706.03686.
23. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
24. Zeiler, M.D.; Krishnan, D.; Taylor, G.W.; Fergus, R. Deconvolutional networks. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 2528–2535.
25. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 818–833.
26. Yang, B.; Yan, J.; Lei, Z.; Li, S.Z. Convolutional channel features. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December 2015; pp. 82–90.
27. Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 315–323.
28. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 8–10 June 2015; pp. 3431–3440.
29. Marsden, M.; McGuinness, K.; Little, S.; O’Connor, N.E. Fully convolutional crowd counting on highly congested scenes. In Proceedings of the 12th International Conference on Computer Vision Theory and Applications, Porto, Portugal, 27 February–1 March 2017; pp. 27–33.
30. Sindagi, V.A.; Patel, V.M. CNN-based cascaded multi-task learning of high-level prior and density estimation for crowd counting. In Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance, Lecce, Italy, 29 August–1 September 2017; pp. 1–6.
Dataset | Scenes | Images | Resolution | Min count | Max count | Total count |
---|---|---|---|---|---|---|
Part A | varied | 482 | varied | 33 | 3139 | 241,677 |
Part B | varied | 716 | 1024 × 768 | 9 | 578 | 88,488 |
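As a quick reading aid for the dataset table, the average number of annotated people per image can be derived from the total count and the number of images. The sketch below only redoes that arithmetic; the dictionary and function names are hypothetical, and the figures are copied from the table.

```python
# Totals and image counts are copied from the dataset table above.
# DATASET_STATS and mean_count_per_image are illustrative names, not from the paper.
DATASET_STATS = {
    "Part A": {"images": 482, "total": 241_677},
    "Part B": {"images": 716, "total": 88_488},
}


def mean_count_per_image(stats):
    """Return the average number of annotated people per image for each subset."""
    return {name: s["total"] / s["images"] for name, s in stats.items()}


averages = mean_count_per_image(DATASET_STATS)
for name, avg in averages.items():
    print(f"{name}: {avg:.1f} people per image on average")
```

The roughly fourfold difference in average count between the two parts suggests that Part A is the denser subset, which is consistent with the larger errors reported for it in the results table.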
Method | Part A MAE | Part A MSE | Part B MAE | Part B MSE |
---|---|---|---|---|
LBP + RR [4] | 303.2 | 371.0 | 59.1 | 81.7 |
Cross-scene [4] | 181.8 | 277.7 | 32.0 | 49.8 |
MCNN [16] | 110.2 | 173.2 | 26.4 | 41.3 |
FCN [29] | 126.5 | 173.5 | 23.76 | 33.12 |
Cascaded-MTL [30] | 101.3 | 152.4 | 20.0 | 31.1 |
Switching-CNN [17] | 90.4 | 135.0 | 21.6 | 33.4 |
LFCNN [21] | 89.2 | 141.9 | 14.7 | 25.4 |
SaCNN [20] | 86.8 | 139.2 | 16.2 | 25.8 |
CNN-MRF [22] | 79.1 | 130.1 | 17.8 | 26.0 |
FF-CNN | 81.75 | 138.80 | 16.45 | 26.19 |
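The MAE and MSE columns can be illustrated with a minimal sketch. The text of Section 3.3 is not reproduced here, so the definitions below are an assumption based on the usual crowd-counting convention: MAE is the mean absolute difference between predicted and true per-image counts, and "MSE" is the square root of the mean squared difference. The function names and sample counts are hypothetical and are not results from the paper.

```python
import math


def mean_absolute_error(true_counts, predicted_counts):
    """Average absolute difference between true and predicted per-image counts."""
    errors = [abs(t - p) for t, p in zip(true_counts, predicted_counts)]
    return sum(errors) / len(errors)


def root_mean_squared_error(true_counts, predicted_counts):
    """Square root of the mean squared count error.

    Assumption: this is what the 'MSE' column reports, as is common
    in crowd-counting papers.
    """
    squared = [(t - p) ** 2 for t, p in zip(true_counts, predicted_counts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical counts for three test images (not data from the paper).
true_counts = [100, 200, 300]
predicted_counts = [110, 190, 330]

mae_value = mean_absolute_error(true_counts, predicted_counts)
mse_value = root_mean_squared_error(true_counts, predicted_counts)
print(f"MAE = {mae_value:.2f}, MSE = {mse_value:.2f}")
```

Under these definitions the squared term penalizes large per-image errors more heavily, so a method can rank differently on the two columns, as FF-CNN and Switching-CNN [17] do on Part A.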
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Luo, H.; Sang, J.; Wu, W.; Xiang, H.; Xiang, Z.; Zhang, Q.; Wu, Z. A High-Density Crowd Counting Method Based on Convolutional Feature Fusion. Appl. Sci. 2018, 8, 2367. https://doi.org/10.3390/app8122367