
Author: Lee, Yuan-Yi (李原毅)
Thesis title: Research and development of immersive human-computer interaction system (沉浸式人機互動模型開發之研究)
Advisor: Hsiao, Shih-Wen (蕭世文)
Degree: Master
Department: Department of Industrial Design, College of Planning and Design
Year of publication: 2020
Academic year of graduation: 108
Language: English
Number of pages: 81
Keywords (Chinese): 人機介面、沉浸式互動、手勢辨識、手勢追蹤、卷積神經網路
Keywords (English): human-machine interface, immersive interaction, gesture recognition, gesture tracking, convolutional neural network
    In an era of rapidly advancing technology, human-computer interaction has remained one of the key focuses of research and development. Interaction paradigms have evolved from the graphical user interface (GUI) to today's natural user interface (NUI), and control methods have progressed from keyboard commands and touch input to natural-language control. If natural gesture language can replace keyboard-and-mouse commands, more natural human-computer interaction and a truly immersive experience can be achieved. Taking this as its starting point, this study builds an immersive human-computer interaction model that uses an ordinary webcam for data collection and system control. Captured images are preprocessed through skin-color extraction, denoising, morphological processing, and contour extraction. Six common natural hand gestures are defined (no gesture, fist, open palm, index finger extended, index and middle fingers extended, thumb and little finger extended) and fed into a 12-layer convolutional neural network for training; the trained network recognizes gestures with an accuracy above 98%. Gesture tracking built from the frame difference method and the Kernelized Correlation Filter algorithm further improves the overall feel of operation and completes the system. Finally, the system was validated with users on a self-written 3D model-manipulation program: users can import their own models and use different natural gestures to zoom in, zoom out, move, and rotate them. The results show that users experience a high degree of immersion when operating the system.
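    The preprocessing steps named in the abstract (skin-color extraction, denoising, and morphological processing) can be sketched in plain NumPy. This is a minimal illustration, not the thesis's implementation: the YCrCb conversion constants and the Cr/Cb skin thresholds below are common textbook values, and the 3×3 structuring element is an assumption.

```python
import numpy as np

def rgb_to_ycrcb(img):
    """Convert an RGB image (H, W, 3), values in [0, 255], to YCrCb."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128.0
    cb = (b - y) * 0.564 + 128.0
    return np.stack([y, cr, cb], axis=-1)

def skin_mask(img, cr_range=(133, 173), cb_range=(77, 127)):
    """Binary skin mask using commonly cited Cr/Cb thresholds."""
    ycrcb = rgb_to_ycrcb(img.astype(np.float64))
    cr, cb = ycrcb[..., 1], ycrcb[..., 2]
    return ((cr >= cr_range[0]) & (cr <= cr_range[1]) &
            (cb >= cb_range[0]) & (cb <= cb_range[1]))

def erode(mask, k=3):
    """Binary erosion with a k x k structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    out = np.ones_like(mask, dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def dilate(mask, k=3):
    """Binary dilation with a k x k structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def open_close(mask, k=3):
    """Opening (erode then dilate) removes speckle noise; closing
    (dilate then erode) fills small holes in the hand region."""
    opened = dilate(erode(mask, k), k)
    return erode(dilate(opened, k), k)
```

    In the full system, the cleaned mask would then feed the contour-extraction step that crops the hand region for the CNN.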

    With the development of science and technology, human-computer interaction has been one of the most important research topics. Interaction modes have evolved from the graphical user interface (GUI) to the current natural user interface (NUI), and operation modes have developed from keyboard control to natural gesture control. If natural gesture language can replace traditional commands, a more immersive experience can be achieved.
    On this basis, this study establishes an immersive human-computer interaction model. A common webcam serves as the data-collection and system-control instrument, and the captured images are preprocessed through skin-color acquisition, denoising, morphological processing, and contour extraction. Six common natural gesture languages are defined (none, fist, one, two, five, swing) and fed into a 12-layer convolutional neural network for training; the trained network achieves a high gesture-recognition accuracy of over 98%. Gesture tracking composed of the temporal difference method and Kernelized Correlation Filters then improves the overall sense of operation. Finally, a 3D manipulation program lets users import their own models and use different natural gestures to zoom in, zoom out, move, and rotate them, giving users a stronger sense of immersion than traditional operation.
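    As a minimal illustration of the temporal (frame) difference step mentioned above, the following NumPy sketch marks pixels whose gray-level change between consecutive frames exceeds a threshold and returns the centroid of the moving region. The threshold value and the centroid-seeding idea are illustrative assumptions, not the thesis's implementation; the full system combines this step with a Kernelized Correlation Filter tracker.

```python
import numpy as np

def frame_difference(prev_frame, curr_frame, threshold=25):
    """Temporal difference: mark pixels whose absolute gray-level
    change between two consecutive frames exceeds the threshold."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def motion_centroid(motion_mask):
    """Centroid of the detected motion region; in a tracker this
    could seed (or re-seed) the search window for the hand."""
    ys, xs = np.nonzero(motion_mask)
    if ys.size == 0:
        return None  # no motion detected
    return float(ys.mean()), float(xs.mean())
```

    Frame differencing alone loses the target when the hand pauses, which is one motivation for pairing it with a correlation-filter tracker as the abstract describes.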

    Abstract (Chinese)
    SUMMARY
    ACKNOWLEDGEMENTS
    Table of Contents
    List of Tables
    List of Figures
    CHAPTER 1 INTRODUCTION
        1.1 Background
        1.2 Motivation
        1.3 Purpose
        1.4 Research framework
        1.5 Research framework diagram
        1.6 Limitations of the Study
    CHAPTER 2 LITERATURE REVIEW
        2.1 Human-computer interaction
        2.2 Immersive experience and immersive technology
        2.3 Gesture capture
            2.3.1 Marked point recognition
            2.3.2 Single camera
            2.3.3 Multiple cameras
            2.3.4 Depth sensor
            2.3.5 Glove sensor
            2.3.6 Wrist sensor
            2.3.7 Non-wearable sensors
        2.4 Machine learning
            2.4.1 Segmentation based on image gestures
            2.4.2 Based on image feature extraction
            2.4.3 Identification based on image classification
            2.4.4 Gesture recognition based on deep learning
        2.5 Gesture tracking theory
    CHAPTER 3 THEORIES OF THE RESEARCH
        3.1 Skin color extraction
            3.1.1 Denoising
            3.1.2 Morphology
            3.1.3 Contour extraction
        3.2 Convolutional Neural Network
            3.2.1 Convolutional layer
            3.2.2 Pooling layer
            3.2.3 Fully connected layer
            3.2.4 Dropout
            3.2.5 Activation function
        3.3 Temporal Difference method
        3.4 Kernelized Correlation Filters
            3.4.1 Linear regression
            3.4.2 Learning
            3.4.3 Testing
            3.4.4 Updating
    CHAPTER 4 RESEARCH METHODS
        4.1 Experimental environment and instruments
        4.2 Hand feature extraction
            4.2.1 Image reading
            4.2.2 Skin color detection
            4.2.3 Denoising
            4.2.4 Threshold
            4.2.5 Contour extraction
            4.2.6 Convolutional Neural Network Training
        4.3 Gesture tracking
        4.4 Immersive operation
    CHAPTER 5 RESULT
    CHAPTER 6 CONCLUSION
        6.1 Research results and contributions
        6.2 Follow-up research and development
    REFERENCES
        Chinese References


    On-campus access: available from 2025-08-29
    Off-campus access: not available
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.