
Author: 江宇軒 (Chiang, Yu-Hsuan)
Title: 運用MoveNet建置肢體動作分類預測模型 (Using MoveNet to Construct the Body Movements Prediction Model)
Advisor: 陳牧言 (Chen, Mu-Yen)
Degree: Master
Department: Department of Engineering Science (on-the-job master's program), College of Engineering
Year of Publication: 2022
Graduation Academic Year: 110
Language: Chinese
Pages: 56
Keywords: Action Classification, MoveNet, Limb Detection Model, Keypoints Classification, One-Dimensional Convolutional Neural Network
Hits: 234 / Downloads: 2
    Motion analysis reveals a subject's current working or exercise state. It can be used to flag dangerous movements before accidents occur, or to predict and analyze actions in competitive sports. There are two main approaches: motion analysis with wearable devices and motion analysis with computer vision. Because wearable devices have inherent limitations, this study adopts computer vision techniques combined with neural networks for movement classification.
    This study uses publicly available online fitness instruction videos to define and label 23 groups of decomposed movements, each represented by more than ten photos. The neural network is a two-stage model: the first stage uses MoveNet to detect keypoints, and the second stage feeds the keypoint coordinates into a Conv1D model for class prediction. To prevent the subject's position in the frame from affecting the prediction, the keypoint coordinates are normalized (convergence processing). The loss function is the Triplet-Center loss and the optimizer is Adam.
    For the MoveNet + Conv1D model, the training loss is 0.2 with 100% accuracy, and the test loss is 0.62 with 92% accuracy. With the Triplet-Center loss, most classes are clearly separated in the t-SNE scatter plots of both the training and test samples.
    The MoveNet + Conv1D model achieves high accuracy. The neural-network approach is efficient, and it can readily be extended to various sports domains, action-recognition scenarios, or workspaces to help prevent dangerous movements before accidents happen.
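    The convergence (normalization) step mentioned above is what makes the classifier insensitive to where the subject stands in the frame. The thesis does not spell out its exact formula here; the sketch below shows one common position- and scale-invariant normalization for MoveNet's 17 COCO-ordered keypoints, recentering the pose on the hip midpoint and rescaling by the pose extent (the hip-midpoint and max-distance choices are illustrative assumptions):

    ```python
    import numpy as np

    # MoveNet returns 17 keypoints in the COCO ordering;
    # indices 11 and 12 are the left and right hips.
    LEFT_HIP, RIGHT_HIP = 11, 12

    def normalize_keypoints(kpts):
        """Make a (17, 2) keypoint array invariant to the subject's
        position and distance from the camera (a sketch; the thesis's
        exact convergence formula may differ)."""
        kpts = np.asarray(kpts, dtype=np.float64)
        # 1. Recenter on the midpoint of the two hips.
        hip_center = (kpts[LEFT_HIP] + kpts[RIGHT_HIP]) / 2.0
        centered = kpts - hip_center
        # 2. Rescale so the farthest keypoint lies at distance 1.
        scale = np.linalg.norm(centered, axis=1).max()
        return centered / scale if scale > 0 else centered
    ```

    With this step, two identical poses captured at different screen positions or camera distances produce identical inputs to the second-stage classifier.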

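    In the second stage, the normalized coordinates (17 keypoints x 2 = 34 values) are treated as a one-dimensional sequence and classified with a Conv1D network. The thesis abstract does not list the layer configuration; the NumPy sketch below shows only the core mechanics, with illustrative shapes (8 filters of width 3, global max pooling, softmax over the 23 movement classes) that are assumptions, not the thesis's actual architecture:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def conv1d_valid(x, kernels, bias):
        # x: (L,) sequence; kernels: (F, K); returns (L-K+1, F) feature maps.
        windows = np.lib.stride_tricks.sliding_window_view(x, kernels.shape[1])
        return windows @ kernels.T + bias

    def softmax(z):
        z = z - z.max()          # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    # Illustrative shapes: 34 inputs, 8 filters of width 3, 23 classes.
    x = rng.standard_normal(34)                  # flattened, normalized keypoints
    kernels = rng.standard_normal((8, 3)) * 0.1  # untrained weights, for shape only
    bias = np.zeros(8)
    w_out = rng.standard_normal((23, 8)) * 0.1

    feat = np.maximum(conv1d_valid(x, kernels, bias), 0.0)  # ReLU, shape (32, 8)
    pooled = feat.max(axis=0)                               # global max pool, (8,)
    probs = softmax(w_out @ pooled)                         # (23,) class probabilities
    ```

    In the thesis these weights are trained with the Adam optimizer; the point of the sketch is how a 1-D convolution slides filters along the flattened coordinate vector before pooling and classification.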
    Abstract (Chinese)
    Extended Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Background
      1.2 Motivation
      1.3 Objectives
      1.4 Chapter Overview
    Chapter 2 Literature Review
      2.1 Convolutional Neural Network
      2.2 Loss Function
        2.2.1 Triplet loss
        2.2.2 Center loss
        2.2.3 Triplet-Center loss
      2.3 Pose Estimation
        2.3.1 OpenPose
        2.3.2 MoveNet
      2.4 YOLOv4
      2.5 VGGNet
      2.6 t-SNE
    Chapter 3 Methodology
      3.1 Research Process
      3.2 Limb Detection Model
      3.3 Coordinate Convergence Processing
      3.4 Class Labeling
      3.5 Body Movement Classification
      3.6 Model Comparison
    Chapter 4 Experimental Results and Discussion
      4.1 Experimental Design
        4.1.1 Experimental Environment
        4.1.2 Experimental Dataset
        4.1.3 Parameter Settings
      4.2 Experimental Results
        4.2.1 MoveNet + Conv1D
        4.2.2 YOLOv4
        4.2.3 VGG16
      4.3 Discussion
    Chapter 5 Conclusion
      5.1 Contributions
      5.2 Limitations
      5.3 Future Work
    References
    Appendix Tables
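    The Triplet-Center loss used to train the classifier (He et al., 2018) combines the pulling behaviour of center loss with the margin of triplet loss: each embedding is drawn toward its own class center and pushed at least a margin away from the nearest other class center. A minimal NumPy sketch, where the Euclidean distance and the margin value are illustrative assumptions rather than the thesis's settings:

    ```python
    import numpy as np

    def triplet_center_loss(features, labels, centers, margin=1.0):
        """Mean over samples of max(0, d(f_i, c_{y_i}) + m - min_{j != y_i} d(f_i, c_j)).

        features: (N, D) embeddings; labels: (N,) class ids;
        centers:  (C, D) one learnable center per class.
        """
        features = np.asarray(features, dtype=np.float64)
        centers = np.asarray(centers, dtype=np.float64)
        # Distance from every sample to every class center: shape (N, C).
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        n = len(labels)
        pos = d[np.arange(n), labels]             # distance to own center
        d_other = d.copy()
        d_other[np.arange(n), labels] = np.inf    # exclude own center
        neg = d_other.min(axis=1)                 # nearest other center
        return np.maximum(0.0, pos + margin - neg).mean()
    ```

    When every sample sits on its own class center and centers are farther apart than the margin, the loss is zero; this pull-and-push behaviour is what produces the well-separated clusters seen in the thesis's t-SNE plots.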


    Full text available on campus: 2025-08-03
    Full text available off campus: 2025-08-03