Three-Dimensional Mesh Character Pose Transfer with Neural Sparse-Softmax Skinning Blending
Abstract
1. Introduction
- (1) We propose a pose transfer method that uses neural networks in place of linear blend skinning (LBS). The latent features of two 3D models that have no correspondence are mapped to the latent features of skeletons that are isomorphic to each other. This turns pose transfer between models into the problem of learning a latent transformation between corresponding skeleton nodes of the source model and the reference model. The approach yields accurate pose features, which a neural network then uses, following the principle of linear blend skinning, to generate a target model that preserves surface detail.
- (2) We improve the neural weight binding method for mesh skinning proposed by Yang et al. [11]. Sparse-Softmax is used to smooth the skinning weights, and KL divergence is employed for supervision, enabling the network to predict skinning weights that are sparse, accurate, and smooth.
- (3) Experimental results show that our method outperforms other deep learning-based pose transfer methods on human and animal meshes from unseen datasets.
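The ideas behind contributions (1) and (2) can be illustrated with a toy sketch. It is not the authors' network. It assumes a top-k form of Sparse-Softmax (softmax over the k largest logits, zero elsewhere), which may differ from the formulation used in the paper, and it applies classical LBS with hand-written 3×4 transforms instead of learned ones. All function names are hypothetical.

```python
import math


def sparse_softmax(logits, k):
    """Assumed top-k variant: softmax over the k largest logits, zero elsewhere."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return [exps[i] / z if i in exps else 0.0 for i in range(len(logits))]


def kl_divergence(target, pred, eps=1e-8):
    """KL(target || pred); entries with target == 0 contribute nothing."""
    return sum(t * math.log(t / max(p, eps))
               for t, p in zip(target, pred) if t > 0)


def apply_transform(T, v):
    """Apply a 3x4 transform [R | t], given as nested lists, to a 3D point."""
    return [sum(T[r][c] * v[c] for c in range(3)) + T[r][3] for r in range(3)]


def lbs_vertex(v, weights, transforms):
    """Classical linear blend skinning: v' = sum_j w_j * (R_j v + t_j)."""
    out = [0.0, 0.0, 0.0]
    for w_j, T in zip(weights, transforms):
        if w_j == 0.0:
            continue  # sparse weights let us skip joints with no influence
        p = apply_transform(T, v)
        out = [o + w_j * x for o, x in zip(out, p)]
    return out


# Toy data: one vertex, four joints, made-up logits.
logits = [2.0, 1.0, -1.0, 0.5]
w = sparse_softmax(logits, k=2)        # sparse weights: two non-zero entries
dense = sparse_softmax(logits, k=4)    # ordinary softmax for comparison

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
shift_x = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]  # translate by +1 in x
transforms = [identity, shift_x, identity, identity]

v_new = lbs_vertex([0.0, 0.0, 0.0], w, transforms)
loss = kl_divergence(w, dense)  # supervision signal: sparse target vs. dense prediction
print(w, v_new, loss)
```

With k=2, only the two joints with the largest logits receive non-zero weight, so the vertex moves along x by exactly the weight of the translated joint. The KL term is zero when prediction and target agree and positive otherwise, which is what makes it usable as a supervision signal for predicted weights.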
2. Related Work
2.1. Deformation Transfer
2.2. Pose Transfer
3. Method
3.1. Rigging
3.1.1. Skinning Weights
3.1.2. Joint Regression
3.2. Neural Pose Transfer
3.2.1. Skinning-Based Pooling
3.2.2. Skeleton Edge Convolution
3.2.3. Skinning-Based Unpooling
4. Training
5. Experiments
5.1. Experimental Details
5.2. Experimental Result
5.3. Limitations and Future Perspectives
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Botsch, M.; Sumner, R.; Pauly, M.; Gross, M. Deformation transfer for detail-preserving surface editing. In Proceedings of the Vision, Modeling & Visualization, Citeseer, Aachen, Germany, 22–24 November 2006; pp. 357–364.
- Chen, H.; Tang, H.; Shi, H.; Peng, W.; Sebe, N.; Zhao, G. Intrinsic-extrinsic preserved gans for unsupervised 3d pose transfer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 8630–8639.
- Shi, X.; Zhou, K.; Tong, Y.; Desbrun, M.; Bao, H.; Guo, B. Mesh puppetry: Cascading optimization of mesh deformation with inverse kinematics. In ACM SIGGRAPH 2007 Papers; Association for Computing Machinery: New York, NY, USA, 2007; pp. 81–es.
- Sumner, R.W.; Popović, J. Deformation transfer for triangle meshes. ACM Trans. Graph. (TOG) 2004, 23, 399–405.
- Zou, X.; Li, G.; Yin, M.; Liu, Y.; Wang, Y. Deformation-Graph-Driven and Deformation Aware Spectral Pose Transfer. J. Comput.-Aided Des. Comput. Graph./Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao 2021, 33, 1234–1245.
- Chen, H.; Tang, H.; Yu, Z.; Sebe, N.; Zhao, G. Geometry-contrastive transformer for generalized 3d pose transfer. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; Volume 36, pp. 258–266.
- Song, C.; Wei, J.; Li, R.; Liu, F.; Lin, G. 3d pose transfer with correspondence learning and mesh refinement. Adv. Neural Inf. Process. Syst. 2021, 34, 3108–3120.
- Wang, J.; Wen, C.; Fu, Y.; Lin, H.; Zou, T.; Xue, X.; Zhang, Y. Neural pose transfer by spatially adaptive instance normalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2020; pp. 5831–5839.
- Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013, arXiv:1312.6114.
- Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep learning for 3D point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338–4364.
- Yang, S.; Yin, M.; Li, M.; Li, G.; Chang, K.; Yang, F. 3D mesh pose transfer based on skeletal deformation. Comput. Animat. Virtual Worlds 2023, 34, e2156.
- Magnenat, T.; Laperrière, R.; Thalmann, D. Joint-dependent local deformations for hand animation and object grasping. In Proceedings of the Graphics Interface’88, Edmonton, AB, Canada, 6–10 June 1988.
- Baran, I.; Popović, J. Automatic rigging and animation of 3d characters. ACM Trans. Graph. (TOG) 2007, 26, 72–es.
- Aouaidjia, K.; Sheng, B.; Li, P.; Kim, J.; Feng, D.D. Efficient body motion quantification and similarity evaluation using 3-D joints skeleton coordinates. IEEE Trans. Syst. Man Cybern. Syst. 2019, 51, 2774–2788.
- Kamel, A.; Liu, B.; Li, P.; Sheng, B. An investigation of 3D human pose estimation for learning Tai Chi: A human factor perspective. Int. J. Hum. Comput. Interact. 2019, 35, 427–439.
- Karambakhsh, A.; Sheng, B.; Li, P.; Li, H.; Kim, J.; Jung, Y.; Chen, C.P. SparseVoxNet: 3-D object recognition with sparsely aggregation of 3-D dense blocks. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 532–546.
- Li, M.; Yin, M.; Li, G.; Zhao, M.; Yang, F. Point-Cloud Self-Adaptive Pose Transfer Based on Skinning Deformation. J. Comput.-Aided Des. Comput. Graph./Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao 2022, 34, 1673–1683.
- Deng, J.; Lu, J.; Zhang, T. Unsupervised Template-assisted Point Cloud Shape Correspondence Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville TN, USA, 11–15 June 2024; pp. 5250–5259.
- Li, W.; Liu, M.; Liu, H.; Wang, P.; Cai, J.; Sebe, N. Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville TN, USA, 11–15 June 2024; pp. 604–613.
- Gao, L.; Yang, J.; Qiao, Y.L.; Lai, Y.K.; Rosin, P.L.; Xu, W.; Xia, S. Automatic unpaired shape deformation transfer. ACM Trans. Graph. (TOG) 2018, 37, 237.
- Kovnatsky, A.; Bronstein, M.M.; Bronstein, A.M.; Glashoff, K.; Kimmel, R. Coupled quasi-harmonic bases. In Computer Graphics Forum; Wiley Online Library: Hoboken, NJ, USA, 2013; Volume 32, pp. 439–448.
- Lévy, B. Laplace-beltrami eigenfunctions towards an algorithm that “understands” geometry. In Proceedings of the IEEE International Conference on Shape Modeling and Applications 2006 (SMI’06), Washington, DC, USA, 14–16 June 2006; IEEE: Piscataway, NJ, USA; 2006; p. 13.
- Yin, M.; Li, G.; Lu, H.; Ouyang, Y.; Zhang, Z.; Xian, C. Spectral pose transfer. Comput. Aided Geom. Des. 2015, 35, 82–94.
- Cosmo, L.; Norelli, A.; Halimi, O.; Kimmel, R.; Rodola, E. Limp: Learning latent shape representations with metric preservation priors. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part III 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 19–35.
- Zhou, K.; Bhatnagar, B.L.; Pons-Moll, G. Unsupervised shape and pose disentanglement for 3d meshes. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XXII 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 341–357.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
- Song, C.; Wei, J.; Li, R.; Liu, F.; Lin, G. Unsupervised 3d pose transfer with cross consistency and dual reconstruction. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 10488–10499.
- Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic graph cnn for learning on point clouds. ACM Trans. Graph. (TOG) 2019, 38, 146.
- Wang, A.; Chen, H.; Lin, Z.; Han, J.; Ding, G. Repvit: Revisiting mobile cnn from vit perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville TN, USA, 11–15 June 2024; pp. 15909–15920.
- Chen, J.; Li, C.; Lee, G.H. Weakly-supervised 3D Pose Transfer with Keypoints. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Vancouver, BC, Canada, 17–24 June 2023; pp. 15156–15165.
- Sun, J.; Chen, Z.; Kim, T.K. MAPConNet: Self-supervised 3D Pose Transfer with Mesh and Point Contrastive Learning. arXiv 2023, arXiv:2304.13819.
- Liu, S.; Gai, S.; Da, F.; Waris, F. Geometry-aware 3D pose transfer using transformer autoencoder. Comput. Vis. Media 2024, 10, 1063–1078.
- Zhao, T.; Zeng, H.; Zhang, B.; Fan, B.; Li, C. Point-voxel dual stream transformer for 3d point cloud learning. Vis. Comput. 2023, 40, 5323–5339.
- Li, P.; Aberman, K.; Hanocka, R.; Liu, L.; Sorkine-Hornung, O.; Chen, B. Learning skeletal articulations with neural blend shapes. ACM Trans. Graph. (TOG) 2021, 40, 130.
- Loper, M.; Mahmood, N.; Romero, J.; Pons-Moll, G.; Black, M.J. SMPL: A Skinned Multi-Person Linear Model. ACM Trans. Graph. 2015, 34, 851–866.
- Zuffi, S.; Kanazawa, A.; Jacobs, D.W.; Black, M.J. 3D menagerie: Modeling the 3D shape and pose of animals. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6365–6373.
- Bogo, F.; Romero, J.; Pons-Moll, G.; Black, M.J. Dynamic FAUST: Registering human bodies in motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6233–6242.
- Pons-Moll, G.; Romero, J.; Mahmood, N.; Black, M.J. Dyna: A model of dynamic human shape in motion. ACM Trans. Graph. (TOG) 2015, 34, 120.
- Bhatnagar, B.L.; Tiwari, G.; Theobalt, C.; Pons-Moll, G. Multi-garment net: Learning to dress 3d people from images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Long Beach, CA, USA, 15–20 June 2019; pp. 5420–5430.
- Xu, Z.; Zhou, Y.; Kalogerakis, E.; Landreth, C.; Singh, K. Rignet: Neural rigging for articulated characters. arXiv 2020, arXiv:2005.00559.
- Ding, P.; Cui, Q.; Wang, H.; Zhang, M.; Liu, M.; Wang, D. Expressive forecasting of 3d whole-body human motions. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 1537–1545.

| Method | PMD ↓ | EL ↓ | CD ↓ | MaxDist ↓ |
|---|---|---|---|---|
| NPT | 8.01 | 16.63 | 2.63 | 8.89 |
| Yang et al. [11] | 0.62 | 1.03 | 0.32 | 0.52 |
| 3D-CoreNet | 0.58 | 0.92 | 0.53 | 1.02 |
| MAPConNet | 0.53 | 0.88 | 0.15 | 0.36 |
| Ours | 0.44 | 0.83 | 0.14 | 0.32 |

| Module | PMD ↓ | EL ↓ |
|---|---|---|
| | 9.99 | 3.86 |
| | 0.61 | 0.92 |
| | 0.51 | 0.89 |
| | 0.46 | 0.84 |
| FULL | 0.44 | 0.83 |

| Test | SWL ↓ | PMD ↓ |
|---|---|---|
| GTidx | 1.22 | 0.56 |
| Original | 2.09 | 0.44 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, S.; Yin, M.; Li, M.; Zhan, F.; Hua, B. Three-Dimensional Mesh Character Pose Transfer with Neural Sparse-Softmax Skinning Blending. Electronics 2025, 14, 589. https://doi.org/10.3390/electronics14030589