| Graduate student: | 林士豪 Lin, Shi-Hou |
|---|---|
| Thesis title: | 台灣手語之臉部特徵萃取與辨識 Facial Phoneme Extraction for Taiwanese Sign Language Recognition |
| Advisor: | 謝璧妃 Hsieh, Pi-Fuei |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2005 |
| Academic year of graduation: | 93 |
| Language: | English |
| Pages: | 51 |
| Keywords (Chinese): | 特徵點, 臉部表情 |
| Keywords (English): | feature points, facial expression |
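In Taiwan, Taiwanese Sign Language (TSL) is one of the basic communication tools for hearing-impaired people. For hearing people, however, having to learn TSL in order to communicate with hearing-impaired people is far from easy. A TSL recognition system that serves as a communication medium would therefore greatly help communication with hearing-impaired people, and it could also be used for learning TSL.
The TSL recognition system is designed as three subsystems that handle hand shape, trajectory, and facial expression, respectively. This thesis focuses on the research and development of the facial expression subsystem. Linguistic studies of articulation show that most facial expressions in TSL are composed of three phonemes: the eyebrows, the eyes, and the mouth. Our facial expression recognition subsystem therefore decomposes a facial expression into these three facial phonemes and processes and recognizes each one separately.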
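In the facial expression recognition subsystem, we assume that the first image of a video sequence shows a neutral expression. We first use skin color and anthropometry to find the rough positions of the three facial phonemes in the first image of the sequence. We then extract the contours of the three facial phonemes with the deformable template method and, based on the characteristics of TSL, define 19 feature points to describe these contours. Again based on the characteristics of TSL, a simplified deformable template method tracks the feature points of the eyebrows and eyes through the video sequence. Because the mouth phoneme undergoes non-rigid motion, a hybrid of optical flow and corner detection is better suited to tracking the mouth feature points. From the tracked feature points we obtain the final positions of the feature points of the three facial phonemes once an expression has been fully formed. After normalizing the displacements of these feature points and defining the features required for recognition, we use these features to recognize the facial expression.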
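The displacement normalization in the last step can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the abstract does not name the normalization reference, so the sketch divides each displacement by the distance between two reference points in the neutral first frame. The function name `normalized_displacements`, the reference indices, and the toy coordinates are all hypothetical.

```python
import math

def normalized_displacements(neutral, final, ref_a=0, ref_b=1):
    """Return per-point (dx, dy) displacements scaled by a reference length.

    neutral, final: lists of (x, y) feature points from the first (neutral)
    frame and from the frame where the expression is fully formed.
    ref_a, ref_b: indices of two points whose neutral-frame distance is used
    as the scale (an assumption; the abstract does not name the reference).
    """
    if len(neutral) != len(final):
        raise ValueError("both frames must have the same number of points")
    (ax, ay), (bx, by) = neutral[ref_a], neutral[ref_b]
    scale = math.hypot(bx - ax, by - ay)
    if scale == 0:
        raise ValueError("reference points coincide; cannot normalize")
    return [((fx - nx) / scale, (fy - ny) / scale)
            for (nx, ny), (fx, fy) in zip(neutral, final)]

# Toy example with three points; the thesis defines 19.
neutral = [(100.0, 120.0), (140.0, 120.0), (120.0, 100.0)]
final = [(100.0, 120.0), (140.0, 120.0), (120.0, 92.0)]
features = normalized_displacements(neutral, final)
```

Dividing by a length measured on the same face makes the features comparable across subjects and camera distances, which is presumably why the displacements are normalized before recognition.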
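For the experiments we selected seven facial expressions, each of which can independently convey a complete meaning, and had every subject record these seven expressions as our experimental samples. In each trial, one subject was randomly selected from the collected video samples for testing, and the videos of the other subjects served as training samples. Averaged over multiple trials, the facial expression recognition subsystem achieved a recognition rate of 76.2%.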
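The evaluation protocol described above (hold out one randomly chosen subject, train on the rest, average over trials) might look like the sketch below. The sample layout, the function names, and the `train_and_score` callback are assumptions made for illustration; the abstract does not describe the classifier interface or the number of trials.

```python
import random

def hold_out_one_subject(samples, rng):
    """Split samples by holding out one random subject.

    samples: list of (subject_id, expression_label, features) tuples.
    rng: a random.Random instance, so trials are reproducible.
    Returns (train, test, held_out_subject_id).
    """
    subjects = sorted({s[0] for s in samples})
    held_out = rng.choice(subjects)
    train = [s for s in samples if s[0] != held_out]
    test = [s for s in samples if s[0] == held_out]
    return train, test, held_out

def average_accuracy(samples, train_and_score, trials, seed=0):
    """Average the test accuracy over several random hold-out trials.

    train_and_score(train, test) -> accuracy in [0, 1]; a hypothetical
    callback standing in for whatever classifier is being evaluated.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        train, test, _ = hold_out_one_subject(samples, rng)
        scores.append(train_and_score(train, test))
    return sum(scores) / len(scores)
```

Splitting by subject rather than by clip keeps every video of the test subject out of training, so the reported rate reflects performance on an unseen person.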
We have developed a system that recognizes facial expressions in Taiwanese Sign Language (TSL) using a phoneme-based strategy. A facial expression is decomposed into three facial phonemes: eyebrow, eye, and mouth. A fast method is proposed for locating the areas of the facial phonemes. The shapes of the phonemes are then matched by the deformable template method, which yields feature points representing the corresponding phonemes. The trajectories of the feature points are tracked along the video image sequence and combined to recognize the type of facial expression. The tracking techniques and the feature points used have been tailored to the facial features in TSL, and different tracking techniques are applied to different facial phonemes according to their specific needs. We regard the eyebrows as rigid objects and assume that the inner and outer corners of an eye are two fixed points in an image sequence. The template matching methods can therefore be modified to speed up the tracking of eyebrows and eyes. The motion of the mouth, however, is not rigid, so the mouth is tracked using the optical flow method, treating the lips as homogeneous patches. In the experiment, we combined the recognition results of the facial phonemes based on the maximum likelihood decision rule to decide the type of facial expression. The average recognition rate was 76.2%.
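One way to read the combination step is sketched below. The abstract says only that the per-phoneme results were combined with the maximum likelihood decision rule; this sketch additionally assumes that the three phonemes are conditionally independent given the expression, so their log-likelihoods add. The function name, the expression labels, and the likelihood values are illustrative and do not come from the thesis.

```python
import math

def combine_by_max_likelihood(phoneme_likelihoods):
    """Pick the expression with the largest joint likelihood.

    phoneme_likelihoods: dict mapping a phoneme name to a dict
    {expression: p(observation | expression)}.
    Assumes conditional independence between phonemes (an assumption of
    this sketch), so the joint log-likelihood is a sum of per-phoneme logs.
    """
    if not phoneme_likelihoods:
        raise ValueError("at least one phoneme is required")
    # Only expressions scored by every phoneme can be compared.
    expressions = set.intersection(
        *(set(scores) for scores in phoneme_likelihoods.values()))
    if not expressions:
        raise ValueError("no expression is scored by every phoneme")

    def joint_log_likelihood(expr):
        return sum(math.log(scores[expr]) if scores[expr] > 0 else -math.inf
                   for scores in phoneme_likelihoods.values())

    # Sorting first makes tie-breaking deterministic.
    return max(sorted(expressions), key=joint_log_likelihood)

likelihoods = {
    "eyebrow": {"A": 0.6, "B": 0.3, "C": 0.1},
    "eye":     {"A": 0.2, "B": 0.5, "C": 0.3},
    "mouth":   {"A": 0.3, "B": 0.6, "C": 0.1},
}
decision = combine_by_max_likelihood(likelihoods)  # "B"
```

In this toy case the eyebrow alone would favor "A", but the joint likelihood (0.036 for "A" against 0.09 for "B") favors "B", which shows how combining phonemes can overrule a single phoneme's vote.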