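Graduate student: 李俊穎
Li, Chun-Ying
Thesis title: 基於2D姿態預測之3D骨架重建
3D Skeleton Reconstruction Based on 2D Pose Estimation
Advisor: 蘇文鈺
Su, Wen-Yu Alvin
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 21
Keywords (Chinese): 深度神經網路, 3D骨架
Keywords (English): Deep neural network, 3D skeleton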
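  • With the growth of social media, many people learn new things from online videos, and exercise is one of the most popular subjects: people use videos to learn activities such as dance, yoga, and fitness training. Video, however, is flat, which makes it an imperfect medium for learning movements performed in three-dimensional space. In this research we therefore propose a method that identifies the person in a video and reconstructs their skeletal motion in a 3D environment.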

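    Most related studies first use capture equipment to record the coordinates of each joint while a person moves, build large motion and image databases from those recordings, and then train a deep neural network that takes images as input to reconstruct the 3D skeleton.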

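    In this research, we instead collect 3D humanoid animations and use them as the basis for a simulated motion database. The database contains both 2D and 3D data for each animation. The 2D data are the results of running an already trained network model for 2D pose estimation on the 3D animations; the 3D data are the positions of each marker point on the animated character during motion, obtained through a game engine.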

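    Whereas related studies feed images to the deep neural network, we use the estimated 2D skeleton as input. The aim is for the network to learn, through kinematics, to reconstruct a 3D skeleton from the 2D skeleton and thereby recover pose and motion more faithfully. To this end, the training loss function includes terms for constant bone length, the direction of the pose error, and the direction of the motion error, among others. These constraints help the network reconstruct a 3D skeleton that is faithful in both pose and motion.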
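
    As a rough illustration of a loss with a bone-length constraint, the NumPy sketch below combines a mean squared position error with a bone-length consistency penalty. The five-joint chain in `BONES`, the function names, and the weight `lam` are assumptions made for illustration only; this record does not give the thesis's actual formulation, and the pose-direction and motion-direction terms are left out.

```python
import numpy as np

# Hypothetical 5-joint chain; each pair is (parent, child).
# This is NOT the skeleton model used in the thesis.
BONES = [(0, 1), (1, 2), (2, 3), (3, 4)]

def bone_lengths(joints):
    """Length of every bone for poses shaped (..., n_joints, 3)."""
    parents = joints[..., [p for p, _ in BONES], :]
    children = joints[..., [c for _, c in BONES], :]
    return np.linalg.norm(children - parents, axis=-1)

def reconstruction_loss(pred, target, lam=0.1):
    """MSE on joint positions plus a penalty on bone-length mismatch.

    lam is an assumed weighting, not a value from the thesis.
    """
    mse = np.mean((pred - target) ** 2)
    bone = np.mean((bone_lengths(pred) - bone_lengths(target)) ** 2)
    return mse + lam * bone

# Usage: a straight chain along x, and a prediction stretched by 2x.
target = np.zeros((5, 3))
target[:, 0] = np.arange(5)
pred = 2.0 * target
loss = reconstruction_loss(pred, target)
```

    The second term penalizes predictions whose bones are longer or shorter than those of the ground-truth skeleton, even when individual joint errors are small.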

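    In testing, the quality of the reconstruction varies across motions, and we use these results to discuss how the accuracy of 3D skeleton reconstruction can be improved further.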

    The rise of online social media lets people learn new skills from online videos. Dance, yoga, and fitness training are among the activities people commonly learn this way. However, it is not easy to understand 3D motion from 2D video. In this thesis, we propose an approach that reconstructs the 3D skeleton of a target person in a video, which can help viewers understand the motion in 3D.

    Most related work first captures the positions of the joints during movement with multi-view cameras or depth cameras and builds a large motion and image dataset. It then performs 3D skeleton reconstruction with a deep neural network that takes images as input.

    In this thesis, we collect 3D humanoid animations and build a simulated dataset from them. The dataset contains both 2D and 3D data extracted from the animations. The 2D data are the output of a pre-trained model that estimates the 2D pose in the animation videos. The 3D data are the positions of the character's joints, extracted through a 3D game engine.

    Unlike related research, which uses image-based deep neural networks, we use the result of 2D pose estimation as the model input. We want the deep neural network to learn, in a kinematics-informed way, to reconstruct a 3D skeleton from the 2D skeleton, and thereby achieve better reconstruction of pose and motion.
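
    To make the input and output shapes of such a model concrete, here is a minimal NumPy sketch of a 2D-to-3D lifting step written as a small fully connected network. The joint count, hidden width, and random weights are placeholders chosen for illustration; this is not the architecture described in the thesis, which this record does not detail.

```python
import numpy as np

N_JOINTS = 15   # assumed joint count, not taken from the thesis
HIDDEN = 64     # assumed hidden-layer width

# Untrained placeholder weights; a real model would learn these.
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, size=(N_JOINTS * 2, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, size=(HIDDEN, N_JOINTS * 3))
b2 = np.zeros(N_JOINTS * 3)

def lift_2d_to_3d(pose_2d):
    """Map 2D poses (batch, N_JOINTS, 2) to 3D poses (batch, N_JOINTS, 3)."""
    x = pose_2d.reshape(len(pose_2d), -1)   # flatten joints into one vector
    h = np.maximum(x @ W1 + b1, 0.0)        # ReLU hidden layer
    y = h @ W2 + b2                         # linear output layer
    return y.reshape(len(pose_2d), N_JOINTS, 3)

# Usage: lift a batch of four 2D poses.
batch_2d = rng.uniform(-1.0, 1.0, size=(4, N_JOINTS, 2))
batch_3d = lift_2d_to_3d(batch_2d)
```

    Because the input is a short vector of joint coordinates and not an image, a model of this kind is far smaller than an image-based network.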

    In the experiments, reconstruction quality differs across motions. We use these results to explore how to further improve the accuracy of 3D skeleton reconstruction based on 2D pose.

    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Related Work
        2.1 OpenPose
        2.2 VNect
        2.3 3D Human Pose Estimation
        2.4 Comparison
    Chapter 3 Dataset
        3.1 3D animation resource
        3.2 3D position data
        3.3 2D estimation data
        3.4 Skeleton model
    Chapter 4 Method
        4.1 Architecture
        4.2 Learning objective
            4.2.1 Mean square error
            4.2.2 Fixed bone length
    Chapter 5 Experiment
        5.1 Results
        5.2 Unity data experiment
    Chapter 6 Discussion
    Chapter 7 Conclusion
    References

    [1] Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields," The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

    [2] D. Mehta, S. Sridhar, O. Sotnychenko, H. Rhodin, M. Shafiei, H.-P. Seidel, W. Xu, D. Casas, and C. Theobalt, "VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera," ACM Transactions on Graphics (TOG), vol. 36, no. 4, article 44, 2017.

    [3] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770-778, 2016.

    [4] C. Chen and D. Ramanan, "3D Human Pose Estimation = 2D Pose Estimation + Matching," The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7035-7043, 2017.

    [5] mixamo.com, 2008. [Online]. Available: https://www.mixamo.com. [Accessed: 03-Feb-2018].

    Full text: publicly available on campus from 2024-06-22; off campus from 2024-06-22.