
Graduate student: 李謙 (Li, Chian)
Thesis title: 基於深度度量學習的太極拳相似性分析 (Tai-Chi Chan Similarity Analysis with Deep Metric Learning)
Advisor: 藍崑展 (Lan, Kun-Chan)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Institute of Medical Informatics
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: English
Number of pages: 127
Keywords: Motion Capture, Deep Learning, Metric Learning, Tai Chi
    Tai Chi is a traditional Chinese martial art with significant health benefits, particularly in improving chronic conditions and enhancing balance in the elderly. However, traditional training is both challenging and time-consuming, and it requires an experienced teacher to correct the learner's movements. To address these issues, we propose a computer-assisted Tai Chi training method based on metric learning. The method computes a similarity measure for Tai Chi movements, so that users wearing only a few IMU devices can see how closely their movements match those of the teacher. This reduces the dependency on the teacher's presence and makes the training schedule more flexible, so practitioners can learn Tai Chi more easily and effectively.

    Existing work primarily uses keypoint-based pose estimation to analyze Tai Chi movements. However, these methods focus mainly on the movements themselves and neglect the forces involved. We therefore use IMU devices for motion capture. For movement similarity calculation, the widely used dynamic time warping (DTW) captures temporal features but overlooks more complex features such as coordination between body parts and movement rhythm. In contrast, our method employs metric learning to capture these complex features, providing a more comprehensive analysis of Tai Chi movements.
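
As a rough illustration of the two similarity notions contrasted in the abstract, the sketch below shows a textbook DTW distance between two 1-D sequences and a generic triplet loss on embedding vectors. It is not the thesis's implementation: the function names, the absolute-difference cost, the squared Euclidean distance, and the margin value are all assumptions chosen for illustration, and the thesis's own loss (which the table of contents indicates also involves MMD) is not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences.

    Assumption: the per-step cost is the absolute difference of samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss on embedding vectors (hypothetical margin).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`, measured in squared Euclidean distance.
    """
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)
```

DTW compares raw signals after aligning them in time, while a triplet loss is used to train an embedding model so that distances between embeddings reflect similarity. In a metric-learning setup the anchor, positive, and negative vectors would be produced by that trained model rather than supplied by hand.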

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Related work
      2.1 Computer-aided Tai-Chi Chun training
        2.1.1 Keypoints-based Pose Estimation
        2.1.2 IMU-based motion capture
      2.2 Metric Learning
        2.2.1 Embedding Method
        2.2.2 Triplet Loss
      2.3 Deep Learning for time series data
      2.4 Similarity of time series data
    Chapter 3 Method
      3.1 System Architecture
      3.2 Embedding Model
        3.2.1 Channel-independence
        3.2.2 Transformer Backbone
      3.3 Triplet Loss
        3.3.1 MMD
        3.3.2 Our Loss Function
    Chapter 4 Experiment Result
      4.1 Experiment design
        4.1.1 Device Design
        4.1.2 Body Sensor Network Configuration
        4.1.3 Tai-Chi Chun Dataset
      4.2 Data Preprocessing
        4.2.1 Remove the data without labeled form
        4.2.2 Remove unreasonable data
        4.2.3 Create two different datasets for two tasks
      4.3 Model Training
        4.3.1 Data splitting
        4.3.2 Training
      4.4 Results
        4.4.1 Action Classification
        4.4.2 Similarity definition
        4.4.3 CDF
      4.5 Ablation experiment
        4.5.1 Action Classification
    Chapter 5 Discussion
      5.1 Data Splitting
      5.2 Model Training
        5.2.1 Task A
        5.2.2 Task B
      5.3 Result
        5.3.1 Action Classification
        5.3.2 CDF
      5.4 Impact of interpolation parameters
        5.4.1 Action Classification
    Chapter 6 Conclusion and Future work
      6.1 Limitation of this work
      6.2 Conclusion
      6.3 Future work
    References


    Full-text availability: on campus, open immediately; off campus, open immediately.