Graduate student: | 李謙 Li, Chian |
---|---|
Thesis title: | 基於深度度量學習的太極拳相似性分析 Tai-Chi Chan Similarity Analysis with Deep Metric Learning |
Advisor: | 藍崑展 Lan, Kun-Chan |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Institute of Medical Informatics |
Year of publication: | 2024 |
Academic year of graduation: | 112 |
Language: | English |
Number of pages: | 127 |
Keywords (Chinese): | 動作捕捉, 深度學習, 度量學習, 太極 |
Keywords (English): | Motion Capture, Deep Learning, Metric Learning, Tai Chi |
Tai Chi is a traditional Chinese martial art with significant health benefits, particularly in improving chronic conditions and enhancing balance in the elderly. However, traditional training is both challenging and time-consuming, and it requires an experienced teacher to correct movements. To address these issues, we propose a computer-assisted Tai Chi training method based on metric learning. The method provides a similarity measure for Tai Chi movements, so that users wearing only a few inertial measurement unit (IMU) devices can assess how closely their movements match those of the teacher. This reduces dependence on the teacher's presence and makes the training schedule more flexible, making it easier and more effective for practitioners to learn Tai Chi.
Existing work primarily uses keypoint-based pose estimation to analyze Tai Chi movements. However, these methods focus mainly on the movements themselves and neglect the forces involved. We therefore use IMU devices for motion capture. For movement similarity, the widely used dynamic time warping (DTW) captures temporal features but overlooks more complex features such as body coordination and movement rhythm. In contrast, our method employs metric learning to capture these complex features, providing a more comprehensive analysis of Tai Chi movements.
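To make the remark about DTW concrete, the following is a minimal sketch of the classic DTW distance on one-dimensional sequences. It is not the thesis's implementation: the function name `dtw_distance`, the absolute-difference local cost, and the toy `teacher` and `student_*` sequences are all assumptions chosen for illustration, and real IMU data would be multi-channel.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    Illustrative only: uses absolute difference as the local cost and
    no warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical single-channel signals, made up for this example.
teacher = [0.0, 1.0, 2.0, 1.0, 0.0]
student_slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]  # same shape, half speed
student_off = [0.0, 2.0, 0.0, 2.0, 0.0]                            # different shape

print(dtw_distance(teacher, student_slow))
print(dtw_distance(teacher, student_off))
```

In this toy case the half-speed copy gets a distance of zero, because warping absorbs the tempo difference entirely. That is one way to read the claim that DTW overlooks movement rhythm: two performances with different pacing can look identical to it.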