Headgear Accessories Classification Using an Overhead Depth Sensor
Abstract
1. Introduction
2. Accessories Classification Algorithm
2.1. Preprocessing of Depth Maps
2.2. Detection of a Person ROI
2.3. Feature Extraction and Normalization
- Calculate the histogram of pixels (depth measurements) along the heights of interest (between the maximum height of the person, $z_{max}$, and $z_{max} - 40$ cm), with a high enough number of intervals. In this work, 20 intervals were used, as proposed in [21], taking into account geometrical considerations (Figure 2 shows this division in 20 slices). That is, we count the number of pixels in 20 slices of 2 cm of height, obtaining a vector $\mathbf{h} = [h_1, h_2, \ldots, h_{20}]$, where $h_s$ is the number of pixels in section $s$ (being $s = 1, 2, \ldots, 20$). The highest section of a given person corresponds to $s = 1$, as shown in Figure 2.
- Estimate the first top section of the head, $s_h$. To do this, we find the section with the maximum number of pixels among the first three elements of $\mathbf{h}$, and assume that the head starts there: $s_h = \arg\max_{s \in \{1, 2, 3\}} h_s$.
- Find the five components of the feature vector ($f_1, \ldots, f_5$) that describe the head structure. Each component is composed of two consecutive elements of $\mathbf{h}$, organized as follows: $f_i = h_{s_h + 2(i-1)} + h_{s_h + 2i - 1}$, for $i = 1, \ldots, 5$.
- Estimate the maximum section of the shoulders, $s_{sh}$. Taking into account the human morphology, we assume that $s_{sh}$ can be found starting at 22 cm (11 sections) from the top of the head, i.e.: $s_{sh} = \arg\max_{s \geq s_h + 11} h_s$.
- Find the three components of the feature vector ($f_6, f_7, f_8$) that describe the shoulders' structure. Each component is composed of two consecutive elements of $\mathbf{h}$, organized as follows: $f_{5+j} = h_{s_{sh} + 2(j-1)} + h_{s_{sh} + 2j - 1}$, for $j = 1, 2, 3$. Each component corresponds to the number of pixels in a section of 4 cm, so it depends on the person's height; thus it is necessary to normalize the components.
- Determine the normalized feature vector ($\bar{\mathbf{f}}$), obtained by dividing the components of the feature vector ($\mathbf{f} = [f_1, \ldots, f_8]$) by the estimated value of its first component ($\hat{f}_1$): $\bar{f}_i = f_i / \hat{f}_1$, for $i = 1, \ldots, 8$. A code sketch of these steps is given after this list.
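To make the indexing concrete, the following NumPy sketch walks through the steps above. It is illustrative only: the function and variable names are ours, the pixels of the person ROI are assumed to be already converted to heights in cm, indices are 0-based, and the normalization coefficient $\hat{f}_1$ (Section 2.4) is taken as a precomputed input.

```python
import numpy as np

SLICE_CM = 2.0   # each histogram section spans 2 cm of height
N_SLICES = 20    # 20 sections cover 40 cm below the highest point

def extract_features(heights_cm, f1_hat):
    """Sketch of the Section 2.3 feature extraction (names assumed)."""
    top = heights_cm.max()
    # Histogram of pixels in 20 slices of 2 cm, counted downwards
    # from the highest point (slice index 0 is the highest section).
    edges = top - SLICE_CM * np.arange(N_SLICES + 1)
    h = np.array([np.sum((heights_cm <= edges[s]) & (heights_cm > edges[s + 1]))
                  for s in range(N_SLICES)])

    # First top section of the head: the slice with the most pixels
    # among the first three sections.
    s_h = int(np.argmax(h[:3]))

    # Five head components, each the sum of two consecutive slices
    # (4 cm per component).
    f_head = [h[s_h + 2 * i] + h[s_h + 2 * i + 1] for i in range(5)]

    # Shoulder search starts 22 cm (11 slices) below the head top;
    # the upper bound is a guard added for this sketch so that three
    # full 4-cm components still fit inside the 40 cm window.
    s_sh = s_h + 11 + int(np.argmax(h[s_h + 11:N_SLICES - 5]))

    # Three shoulder components, again pairs of consecutive slices.
    f_sh = [h[s_sh + 2 * j] + h[s_sh + 2 * j + 1] for j in range(3)]

    # Normalize by the estimated value of the first component.
    return np.array(f_head + f_sh, dtype=float) / f1_hat
```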
2.4. Calculation of Normalization Coefficients
2.5. Attribute Model Estimation
2.6. Classification
Algorithm 1 Classification
1: Input: feature vector $\bar{\mathbf{f}}$, average vectors $\mathbf{m}_c$, scatter matrices $\mathbf{S}_c$, transformation matrices $\mathbf{W}_c$, number of classes $C$; $\epsilon_{\min}$ is initialized to $\infty$.
2: For $c = 1$ to $C$ do
3:  $\mathbf{d}_c = \bar{\mathbf{f}} - \mathbf{m}_c$ // Difference between the feature vector and the average vector //
4:  $\mathbf{y}_c = \mathbf{W}_c^{T} \mathbf{d}_c$ // Projection of $\mathbf{d}_c$ in the transformed space //
5:  $\hat{\mathbf{d}}_c = \mathbf{W}_c \mathbf{y}_c$ // Recovery of $\mathbf{d}_c$ in the original space //
6:  $\mathbf{e}_c = \mathbf{d}_c - \hat{\mathbf{d}}_c$ // Difference between the vector and its recovery //
7:  $\epsilon_c = \mathbf{e}_c^{T} \mathbf{S}_c^{-1} \mathbf{e}_c$ // Recovery error = Mahalanobis distance between $\mathbf{d}_c$ and $\hat{\mathbf{d}}_c$ //
8:  If $\epsilon_c < \epsilon_{\min}$ then
9:   $\epsilon_{\min} = \epsilon_c$ and $c^{*} = c$ // Class $c$ has now the smallest recovery error //
10:  End If
11: End For
12: Output: $c^{*}$ // Class with the smallest recovery error //
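For reference, a compact NumPy version of Algorithm 1 could look as follows; the argument names and the use of np.linalg.solve (rather than an explicit inverse of $\mathbf{S}_c$) are choices made for this sketch, not the authors' implementation.

```python
import numpy as np

def classify(f_bar, means, scatters, transforms):
    """Sketch of Algorithm 1: pick the class whose subspace model
    reconstructs the feature vector with the smallest error."""
    best_class, best_err = -1, np.inf
    for c, (m, S, W) in enumerate(zip(means, scatters, transforms)):
        d = f_bar - m              # difference with the class average
        y = W.T @ d                # projection into the transformed space
        d_rec = W @ y              # recovery in the original space
        e = d - d_rec              # difference between d and its recovery
        err = e @ np.linalg.solve(S, e)   # Mahalanobis recovery error
        if err < best_err:         # class c now has the smallest error
            best_class, best_err = c, err
    return best_class              # class with the smallest recovery error
```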
3. Experimental Work
3.1. Experimental Setup
- Scenarios with a single person, since the target is the classification of the head style/accessory configuration; a total of 25 different persons were involved.
- People without accessories (neither hat nor cap) and with different hairstyle configurations (short or long, with/without ponytail).
3.2. Performance Metrics
3.3. Results and Discussion
- Level 2 classification grouping: in which five classes were considered, i.e., large size hat, medium size hat, small size hat, cap and no hat/cap.
- Level 1 classification grouping: in which three classes were considered, i.e., hat, cap and no hat/cap.
- Accessory–no accessory classification: in which two classes were considered, i.e., hat/cap and no hat/cap (a sketch of how the three groupings nest is given after this list).
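Because the two coarser groupings are obtained by collapsing the Level 2 decisions, all three evaluations can be derived from a single classifier output; a minimal sketch of this relabeling (the label strings are illustrative, not identifiers from the paper) is:

```python
# Level 2 classes collapse into Level 1 classes...
LEVEL2_TO_LEVEL1 = {
    "large size hat": "hat",
    "medium size hat": "hat",
    "small size hat": "hat",
    "cap": "cap",
    "no hat/cap": "no hat/cap",
}

# ...and Level 1 classes collapse into the binary grouping.
LEVEL1_TO_BINARY = {
    "hat": "hat/cap",
    "cap": "hat/cap",
    "no hat/cap": "no hat/cap",
}

def regroup(level2_label):
    """Map one Level 2 decision to its Level 1 and binary labels."""
    level1 = LEVEL2_TO_LEVEL1[level2_label]
    return level1, LEVEL1_TO_BINARY[level1]
```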
3.4. Computational Cost
4. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
References
- Lange, R.; Seitz, P. Solid-state Time-of-Flight range camera. IEEE J. Quantum Electron. 2001, 37, 390–397. [Google Scholar] [CrossRef]
- Sell, J.; O’Connor, P. The Xbox one system on a chip and kinect sensor. IEEE Micro 2014, 34, 44–53. [Google Scholar] [CrossRef]
- Kerker, D.; Jenkins, M.P.; Gross, G.A.; Bisantz, A.M.; Nagi, R. Visual estimation of human attributes: An empirical study of context-dependent human observation capabilities. In Proceedings of the 2014 IEEE International Inter-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), San Antonio, TX, USA, 3–6 March 2014; Volume 1, pp. 40–46. [Google Scholar]
- Saranya, M.; Cyril, G.L.I.; Santhosh, R.R. An approach towards ear feature extraction for human identification. In Proceedings of the 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), Chennai, India, 3–5 March 2016; Volume 86, pp. 4824–4828. [Google Scholar]
- Kim, S.T.; Kim, D.H.; Ro, Y.M. Spatio-temporal representation for face authentication by using multi-task learning with human attributes. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 2996–3000. [Google Scholar]
- Wang, H.-J.; Lin, Y.-L.; Huang, C.-Y.; Hou, Y.-L.; Hsu, W. Full body human attribute detection in indoor surveillance environment using color-depth information. In Proceedings of the 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance, Krakow, Poland, 27–30 August 2013; pp. 383–388. [Google Scholar]
- Linder, T.; Arras, K.O. Real-time full-body human attribute classification in RGB-D using a tessellation boosting approach. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 1335–1341. [Google Scholar]
- Chen, H.; Gallagher, A.; Girod, B. Describing clothing by semantic attributes. In Proceedings of the European Conference on Computer Vision, Florence, Italy, 7–13 October 2012; pp. 609–623. [Google Scholar]
- Bourdev, L.; Maji, S.; Malik, J. Describing people: A poselet-based approach to attribute classification. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 1543–1550. [Google Scholar]
- Wang, N.; Ai, H. Hair style retrieval by semantic mapping on informative patches. In Proceedings of the First Asian Conference on Pattern Recognition, Beijing, China, 28 November 2011; pp. 110–114. [Google Scholar]
- Linder, T.; Wehner, S.; Arras, K.O. Real-time full-body human gender recognition in (RGB)-D data. In Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 3039–3045. [Google Scholar]
- Yamasaki, T.; Matsunami, T.; Chen, T. Human attribute analysis using a top-view camera based on two-stage classification. IEICE Trans. Inf. Syst. 2013, E96-D, 993–996. [Google Scholar] [CrossRef]
- Bevilacqua, A.; Di Stefano, L.; Azzari, P. People tracking using a time-of-flight depth sensor. In Proceedings of the 2006 IEEE International Conference on Video and Signal Based Surveillance, Sydney, Australia, 22–24 November 2006; p. 89. [Google Scholar]
- Stahlschmidt, C.; Gavriilidis, A.; Velten, J.; Kummert, A. People detection and tracking from a top-view position using a Time-of-Flight camera. In Communications in Computer and Information Science, Proceedings of the International Conference on Multimedia Communications, Services and Security, Kraków, Poland, 6–7 June 2013; Dziech, A., Czyżewski, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; Volume 368, pp. 213–223. [Google Scholar]
- Jia, L.; Radke, R.J. Using Time-of-Flight measurements for privacy-preserving tracking in a smart room. IEEE Trans. Ind. Inform. 2014, 10, 689–696. [Google Scholar] [CrossRef]
- Cai, Z.; Yu, Z.L.; Liu, H.; Zhang, K. Counting people in crowded scenes by video analyzing. In Proceedings of the 2014 9th IEEE Conference on Industrial Electronics and Applications, Hangzhou, China, 9–11 June 2014; pp. 1841–1845. [Google Scholar]
- Galčík, F.; Gargalík, R. Real-time depth map based people counting. In Proceedings of the International Conference on Advanced Concepts for Intelligent Vision Systems, Poznań, Poland, 28–31 October 2013; pp. 330–341. [Google Scholar]
- Rauter, M. Reliable human detection and tracking in top-view depth images. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Portland, OR, USA, 23–28 June 2013; pp. 529–534. [Google Scholar]
- Del Pizzo, L.; Foggia, P.; Greco, A.; Percannella, G.; Vento, M. Counting people by RGB or depth overhead cameras. Pattern Recognit. Lett. 2016, 81, 41–50. [Google Scholar] [CrossRef]
- Vera, P.; Monjaraz, S.; Salas, J. Counting pedestrians with a zenithal arrangement of depth cameras. Mach. Vis. Appl. 2016, 27, 303–315. [Google Scholar] [CrossRef]
- Luna, C.A.; Losada, C.; Fuentes-Jimenez, D.; Fernandez-Rincon, A.; Mazo, M.; Macias-Guarasa, J. Robust people detection using depth information from an overhead Time-of-Flight camera. Expert Syst. Appl. 2016, 71, 240–256. [Google Scholar] [CrossRef]
- Jimenez, D.; Pizarro, D.; Mazo, M.; Palazuelos, S. Modelling and correction of multipath interference in Time of Flight cameras. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 893–900. [Google Scholar]
- Jimenez, D.; Pizarro, D.; Mazo, M. Single frame correction of motion artifacts in PMD-based Time of Flight cameras. Image Vis. Comput. 2014, 32, 1127–1143. [Google Scholar] [CrossRef]
- He, Y.; Liang, B.; Zou, Y.; He, J.; Yang, J. Depth errors analysis and correction for Time-of-Flight (ToF) cameras. Sensors 2017, 17, 92. [Google Scholar] [CrossRef] [PubMed]
- Zhu, L.; Wong, K.-H. Human tracking and counting using the kinect range sensor based on adaboost and kalman filter. In Proceedings of the International Symposium on Visual Computing, Rethymnon, Greece, 29–31 July 2013; pp. 582–591. [Google Scholar]
- Matzner, S.; Heredia-Langner, A.; Amidan, B.; Boettcher, E.J.; Lochtefeld, D.; Webb, T. Standoff human identification using body shape. In Proceedings of the 2015 IEEE International Symposium on Technologies for Homeland Security (HST), Waltham, MA, USA, 14–16 April 2015; pp. 1–6. [Google Scholar]
- Bushby, K.M.; Cole, T.; Matthews, J.N.; Goodship, J.A. Centiles for adult head circumference. Arch. Dis. Child. 1992, 67, 1286–1287. [Google Scholar] [CrossRef] [PubMed]
- Seidenari, L.; Varano, V.; Berretti, S.; Del Bimbo, A.; Pala, P. Recognizing actions from depth cameras as weakly aligned multi-part bag-of-poses. In Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Portland, OR, USA, 23–28 June 2013; pp. 479–485. [Google Scholar]
- Wang, L.; Tan, T.; Ning, H.; Hu, W. Silhouette analysis-based gait recognition for human identification. IEEE Trans. Pattern Anal. Mach. Intell. 2003, 25, 1505–1518. [Google Scholar] [CrossRef]
- Luna, C.A.; Jiménez, J.A.; Pizarro, D.; Losada, C.; Rodriguez, J.M. GPCA vs. PCA in recognition and 3-D localization of ultrasound reflectors. Sensors 2010, 10, 4825–4837. [Google Scholar] [CrossRef] [PubMed]
- Fernandez-Rincon, A.; Fuentes-Jimenez, D.; Losada-Gutierrez, C.; Marron-Romera, M.; Luna, C.A.; Macias-Guarasa, J.; Mazo, M. Robust people detection and tracking from an overhead Time-of-Flight camera. In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Porto, Portugal, 27 February–1 March 2017; pp. 556–564. [Google Scholar]
- Macias-Guarasa, J.; Losada-Gutierrez, C.; Fuentes-Jimenez, D.; Garcia-Jimenez, R.; Luna, C.A.; Fernandez-Rincon, A.; Mazo, M. GEINTRA Overhead ToF People Detection (GOTPD1) Database. 2016. Available online: http://www.geintra-uah.org/datasets/gotpd1 (accessed on 31 July 2017).
- Khoshelham, K.; Elberink, S.O. Accuracy and resolution of kinect depth data for indoor mapping applications. Sensors 2012, 12, 1437–1454. [Google Scholar] [CrossRef] [PubMed]
- Sahani, M.; Nanda, C.; Sahu, A.K.; Pattnaik, B. Web-based online embedded door access control and home security system based on face recognition. In Proceedings of the 2015 International Conference on Circuits, Power and Computing Technologies [ICCPCT], Nagercoil, India, 19–20 March 2015. [Google Scholar]
| Class | Training Accessories | Testing Accessories | | | |
|---|---|---|---|---|---|
| Caps | 19 × 18, 27 × 18 ² | 18 × 15, 25 × 15 | 18 × 18, 27 × 18 | 20 × 17, 27 × 17 | |
| Large size hats ¹ | 16 × 16, 38 × 38, 11 | 18 × 13, 38 × 35, 12 | 14 × 14, 34 × 34, 13 | 18 × 13, 36 × 31, 14 | 18 × 13, 35 × 32, 13 |
| Medium size hats | 18 × 13, 29 × 26, 12 | 18 × 12, 32 × 26, 12 | 13 × 13, 24 × 31, 9 | 17 × 13, 32 × 27, 8 | 18 × 13, 32 × 30, 11 |
| Small size hats | 17 × 12, 27 × 23, 11 | 17 × 13, 26 × 24, 13 | 16 × 11, 15 × 20, 10 | 16 × 14, 26 × 23, 9 | 17 × 14, 28 × 24, 13 |
| | 18 × 15, 28 × 25, 10 | 18 × 13, 27 × 25, 14 | | | |
| #Sequences | #Frames | Level 2 Grouping | Level 1 Grouping | #Frames (Level 1 Grouping) |
|---|---|---|---|---|
| 14 | 492 | Short hair | No hat/cap | 684 |
| 6 | 192 | Long hair | | |
| 6 | 399 | Cap | Cap | 399 |
| 3 | 384 | Large size hat | Hat | 1064 |
| 6 | 289 | Medium size hat | | |
| 3 | 391 | Small size hat | | |
| 38 | 2147 | | | 2147 |
| #Sequences (T0 + T1) | #Frames (T0 + T1) | Level 2 Grouping | Level 1 Grouping | #Frames (Level 1 Grouping, T0 + T1) |
|---|---|---|---|---|
| 17 + 16 = 33 | 678 + 460 = 1138 | No hat/cap | No hat/cap | 678 + 460 = 1138 |
| 34 + 37 = 71 | 1369 + 752 = 2121 | Cap | Cap | 1369 + 752 = 2121 |
| 31 + 37 = 68 | 1205 + 1320 = 2525 | Large size hat | Hat | 4442 + 4708 = 9150 |
| 48 + 38 = 96 | 2563 + 1829 = 4392 | Medium size hat | | |
| 34 + 49 = 83 | 674 + 1559 = 2233 | Small size hat | | |
| 164 + 177 = 351 | 6489 + 5920 = 12,409 | | | 6489 + 5920 = 12,409 |
| Type | Large Hat | Medium Hat | Small Hat | Cap | No Hat/Cap |
|---|---|---|---|---|---|
| Large hat | 93.74% ± 0.94% | 6.14% ± 0.94% | 0.12% ± 0.13% | 0.00% ± 0.00% | 0.00% ± 0.00% |
| Medium hat | 14.73% ± 1.05% | 69.49% ± 1.36% | 12.68% ± 0.98% | 0.00% ± 0.00% | 3.10% ± 0.51% |
| Small hat | 0.00% ± 0.00% | 0.63% ± 0.33% | 82.98% ± 1.56% | 9.85% ± 1.24% | 6.54% ± 1.03% |
| Cap | 0.00% ± 0.00% | 0.00% ± 0.00% | 6.69% ± 1.06% | 81.19% ± 1.66% | 12.12% ± 1.39% |
| No hat/cap | 0.00% ± 0.00% | 0.00% ± 0.00% | 0.09% ± 0.17% | 4.31% ± 1.18% | 95.61% ± 1.19% |
| Type | #Frames | Hat | Cap | No Hat/Cap |
|---|---|---|---|---|
| Hat | 9150 | 94.51% ± 0.47% | 2.40% ± 0.31% | 3.08% ± 0.35% |
| Cap | 2121 | 6.69% ± 1.06% | 81.19% ± 1.66% | 12.12% ± 1.39% |
| No hat/cap | 1138 | 0.09% ± 0.17% | 4.31% ± 1.18% | 95.61% ± 1.19% |
| #Frames | 12,409 | | | |
| Type | #Frames | Hat/Cap | No Hat/Cap |
|---|---|---|---|
| Hat/cap | 11,271 | 95.22% ± 0.39% | 4.78% ± 0.39% |
| No hat/cap | 1138 | 4.39% ± 1.19% | 95.61% ± 1.19% |
| #Frames | 12,409 | | |
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).