
Author: Ko, Chih-Ang (柯志昂)
Title: Recognition of Hand-over-Face TSL Gestures
Advisor: Hsieh, Pi-Fei (謝璧妃)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2009
Academic Year of Graduation: 97 (ROC calendar)
Language: English
Pages: 64
Keywords: level set, occlusion, sign language recognition
    Gestures are a natural way for humans to communicate, so gesture recognition has long been a goal of human-computer interface research. A sign language is essentially built from a set of gestures with specific meanings, which makes sign language a distinctive application of gesture recognition. The purpose of sign language recognition is to use gesture recognition as a bridge between the hearing- and speech-impaired and the hearing community. In Taiwanese Sign Language (TSL), many signs are performed over the face, and some identical hand gestures take on different meanings depending on the facial region they touch. Recognizing these hand-over-face signs therefore involves not only identifying the hand shape but also determining where the hand lands on the face. Because the hand and the face have similar skin colors, the resemblance between foreground and background during occlusion causes visual confusion: the complete hand contour cannot be extracted, and locating the hand on the face becomes unreliable. Our work addresses this hand-face occlusion problem.
    In our work, a level set model is used to track the changing hand shape and to recover it. Complete, unoccluded hand shapes are stored in advance as shape priors and are used to restore the defective hand contour when occlusion occurs. To determine where the hand lands on the face, we use a force field to describe regional image structure. The force-field approach avoids the foreground-background confusion that conventional pixel-based region analysis suffers when the two objects have similar appearance. A Gaussian mixture model captures how the regional structure at each image location changes over time, which allows occluding objects to be detected. By combining the force-field method with a facial segmentation, and exploiting the way test charges gather into basins within each segment, the exact position of the hand on the face can be obtained.
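
    To make the tracking step concrete, below is a minimal Python sketch of one explicit update of a Chan-Vese style region-based level set evolution, the kind of model such a tracker builds on. The function name chan_vese_step, the parameter defaults, and the arctan Heaviside approximation are illustrative choices, not the implementation described in the thesis, which further constrains the contour with stored shape priors and a recovery term.

```python
import numpy as np

def chan_vese_step(phi, img, mu=0.2, lam1=1.0, lam2=1.0, dt=0.5, eps=1.0):
    """One explicit update of a Chan-Vese style level set function phi
    (2-D array whose zero level set is the contour) on a grayscale image img."""
    # Smoothed Heaviside and Dirac approximations of the region indicator.
    H = 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / eps))
    delta = (eps / np.pi) / (eps ** 2 + phi ** 2)

    # Mean intensities inside (phi > 0) and outside the current contour.
    c_in = (img * H).sum() / (H.sum() + 1e-8)
    c_out = (img * (1.0 - H)).sum() / ((1.0 - H).sum() + 1e-8)

    # Curvature of the level sets: divergence of the normalized gradient.
    gy, gx = np.gradient(phi)
    mag = np.sqrt(gx ** 2 + gy ** 2) + 1e-8
    curvature = np.gradient(gx / mag, axis=1) + np.gradient(gy / mag, axis=0)

    # Region competition: smooth the front and pull it toward the boundary
    # that best separates the two mean intensities.
    speed = mu * curvature - lam1 * (img - c_in) ** 2 + lam2 * (img - c_out) ** 2
    return phi + dt * delta * speed
```

    Starting from the hand contour of the previous frame (for example, a signed distance function that is positive inside the hand), iterating this update re-fits the contour to the new frame; the thesis additionally uses the stored unoccluded hand shapes to restore the contour when part of the hand is missing.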
    In the experiments, we selected TSL signs that share the same hand gesture but land on different facial regions. The results show an accuracy of 62.4% for determining the hand position on the face, 79.5% for hand shape recognition, and an overall recognition rate of 73.3% for these hand-over-face signs.

    A sign language can be completely defined by a finite set of specific gestures. This characteristic has made sign recognition an appealing application of gesture recognition. In Taiwanese Sign Language (TSL), a hand-over-face gesture may convey different meanings depending on the position of the hand on the face. Therefore, recognition of these sign words involves determining the hand position in addition to the hand shape. A major difficulty in recognizing a sign in the facial region arises from the similar skin colors of the hand and the face. In the presence of occlusion, it is difficult to extract the complete contour of the hand and to locate the position of the hand on the face.
    In this work, we use level-set-based models to track the hands and recover their contours, and a force-field-based facial segmentation method to determine the position of a hand on the face. The force-field model measures regional structures in the image without the foreground-background confusion that pixel-based analyses suffer when the two objects look alike. Monitoring the changes in these regional structures over time provides a means to detect occlusion between the hand and the face. Combined with the proposed partition of the face into parts, the detected changes determine the position of the hand in the facial region.
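
    The force-field idea can be illustrated with a small sketch: every pixel attracts a unit test charge with a force proportional to its intensity and inversely proportional to the squared distance, and a charge released in the field drifts along a channel until it settles in a potential well. The helper names force_field and follow_test_charge, the brute-force double loop, and the unit-step charge update below are simplifying assumptions for illustration (in practice the field is usually computed by convolution), not the thesis implementation.

```python
import numpy as np

def force_field(img):
    """Gravity-like force field of a grayscale image: each pixel attracts a
    unit test charge in proportion to its intensity and the inverse square
    of its distance. Returns the y and x force components per pixel."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # pixel positions
    mass = img.ravel().astype(float)                                # pixel intensities

    fy = np.zeros((h, w))
    fx = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            d = src - np.array([y, x], dtype=float)   # vectors to all source pixels
            dist = np.sqrt((d ** 2).sum(axis=1))
            dist[dist == 0] = np.inf                  # a pixel exerts no force on itself
            f = (mass / dist ** 3)[:, None] * d       # inverse-square attraction
            fy[y, x], fx[y, x] = f.sum(axis=0)
    return fy, fx

def follow_test_charge(fy, fx, start, steps=200):
    """Move a test charge one pixel at a time along the force direction;
    it drifts along a channel and settles in a potential well."""
    y, x = float(start[0]), float(start[1])
    for _ in range(steps):
        iy, ix = int(round(y)), int(round(x))
        v = np.array([fy[iy, ix], fx[iy, ix]])
        n = np.linalg.norm(v)
        if n < 1e-9:
            break
        y = np.clip(y + v[0] / n, 0, fy.shape[0] - 1)
        x = np.clip(x + v[1] / n, 0, fx.shape[1] - 1)
    return int(round(y)), int(round(x))
```

    Test charges released across the face collect into a few wells that characterize the regional structure; changes in that structure over time indicate that another object, such as the hand, has entered the corresponding facial region.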
    In our experiments, we selected TSL sign words that share the same hand gesture but touch different facial regions. We obtained an identification accuracy of 62.4% for hand positions, a recognition accuracy of 79.5% for hand shapes, and an overall recognition accuracy of 73.3% for the sign words. The results show the feasibility of the proposed method for recognizing these sign words.

    1. Introduction
       1.1 Background
       1.2 Objective
    2. Related Work
       2.1 Object Tracking
       2.2 Active Contour Models
    3. Method
       3.1 Level Set Tracking
           3.1.1 Introduction
           3.1.2 Definition of Level Set Functions
           3.1.3 Chan-Vese based Tracking Model
           3.1.4 Level Set Model without Re-initialization
           3.1.5 Narrow Band
           3.1.6 Level Set Recovery Model
       3.2 Detection of Hand-Face Occlusion
           3.2.1 Potential Energy
           3.2.2 Force Field
           3.2.3 Formation of Channels and Wells
           3.2.4 Geodesic Distance Feature
           3.2.5 Determination of Hand Location
    4. Experimental Results
       4.1 Dataset Description
       4.2 Recovery of Hand Shapes
       4.3 Force Field Method
       4.4 Facial Partition and Hand Position Determination
       4.5 Recognition of Hand Shapes
       4.6 Discussion
    5. Conclusions
    References

