| Author: | Liao, Yu-Yi (廖又儀) |
|---|---|
| Title: | The Application of the Cerebellar Model Articulation Controller with Clustering Memory Technique on Facial Expression Recognition |
| Advisor: | Tai, Shen-Chuan (戴顯權) |
| Degree: | Doctor |
| Department: | Institute of Computer & Communication Engineering, College of Electrical Engineering & Computer Science |
| Year of publication: | 2012 |
| Academic year of graduation: | 100 |
| Language: | English |
| Pages: | 98 |
| Keywords: | Cerebellar Model Articulation Controller with Clustering Memory, facial expression recognition system, Discrete Cosine Transform, contrast enhancement, Zone system, Mean Shift Clustering, adaptive logarithmic algorithm |
Humans are social animals, and facial expressions play a significant role in their communication. Facial expressions are facial-muscle movements that reflect a person's internal affective state, psychological state, and cognitive activity; facial expression recognition has therefore become an active research topic in recent years. This dissertation presents a facial expression recognition (FER) system based on the cerebellar model articulation controller with clustering memory (CMAC-CM). The Japanese Female Facial Expression (JAFFE) database, whose images are all grayscale frontal-face views, is used to study the FER problem. The proposed FER system comprises three phases: preprocessing, facial feature extraction, and facial expression recognition. Facial expression images are automatically preprocessed (noise removal, contrast enhancement, edge detection, etc.) to yield a cropped image in a "head-in-the-box" format. A difference image is then formed by subtracting the neutral image from a given expression image, a block of its lower-frequency 2D Discrete Cosine Transform (DCT) coefficients is extracted, and these coefficients are rearranged into input vectors fed into the CMAC-CM, which rapidly produces its output through nonlinear mapping with a look-up table in both the training and recognition phases. The experimental results report recognition rates for various block sizes of lower-frequency coefficients and various cluster sizes of weight memory. A mean recognition rate of 92.86% is achieved on the testing images, and the CMAC-CM takes 0.028 s to recognize one test image in the testing phase.
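The feature-extraction step described above (difference image → 2D DCT → top-left block of low-frequency coefficients) can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the `block` size is a tunable parameter as in the experiments, and the DCT is built directly from its definition so only NumPy is needed.

```python
import numpy as np

def dct2(img):
    """2D DCT-II of an image, applied row-wise then column-wise.

    Uses an orthonormal DCT matrix built from the definition, so the
    sketch depends only on NumPy.
    """
    n, m = img.shape

    def dct_matrix(k):
        c = np.array([[np.cos(np.pi * (2 * j + 1) * i / (2 * k))
                       for j in range(k)] for i in range(k)])
        c[0] *= 1 / np.sqrt(2)          # orthonormal scaling of the DC row
        return c * np.sqrt(2 / k)

    return dct_matrix(n) @ img @ dct_matrix(m).T

def expression_features(expr_img, neutral_img, block=8):
    """Difference image -> 2D DCT -> flattened top-left (low-frequency)
    block of coefficients, used as the input vector of the classifier."""
    diff = expr_img.astype(np.float64) - neutral_img.astype(np.float64)
    coeffs = dct2(diff)
    return coeffs[:block, :block].ravel()
```

With `block=8` each image pair yields a 64-dimensional vector; the experiments in the dissertation vary this block size together with the cluster size of the weight memory.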
Light-source luminance affects image quality. An insufficient light source produces a low-key or low-contrast color image and hides the boundaries and details of facial features. In the preprocessing phase of the proposed FER system, a log transformation was used for image enhancement. The log function compresses the dynamic range of an image with large variations in pixel values, but it cannot successfully enhance certain images, such as low-key images containing bright areas or high-key images with low contrast. This dissertation therefore proposes a contrast-enhancement method with two target-intensity strategies, calibration with Mean Shift Clustering in the HSV color space and calibration with an adaptive logarithmic algorithm in the RGB color space, to overcome this shortcoming of the log function. The basic notion behind the proposed algorithm is that an image consists of a reference intensity level and a characteristic intensity level. The reference level, obtained by a low-pass filter, represents the general or background intensity of the image. The characteristic level is calculated by subtracting the reference level from the given image. The target intensity level is the adjustment target, defined to approximate the human visual system, and can be obtained by either of two strategies. In the first strategy, following the Zone system, two-thirds of the maximum luminance range is taken as the target intensity level, and the illumination is compensated after calibration with Mean Shift Clustering. In the second strategy, the target intensity level is generated by an adaptive logarithmic function that approximates human vision, and the contrast is adjusted through a given algebraic definition applied to the characteristic value of each pixel. The proposed method has been applied to numerous facial, natural, and turbid images. Compared with other methods, it achieves better results in both subjective evaluations with contour plots and objective evaluations with the structural similarity (SSIM) index.
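The layer decomposition above can be sketched for a grayscale image as follows. This is a simplified illustration of the first (Zone-system) strategy only: the Mean Shift Clustering calibration is omitted, and `sigma` and `gain` are illustrative assumptions, not values from the dissertation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def enhance_zone_sketch(gray, sigma=15.0, gain=1.5):
    """Layer-decomposition contrast enhancement (simplified sketch).

    reference      : low-pass-filtered image (background intensity level)
    characteristic : image minus reference (detail level)
    target         : 2/3 of the maximum luminance range (Zone system)
    The Mean Shift calibration of the dissertation is omitted; the
    background is simply shifted toward the target and the details
    are amplified by `gain`.
    """
    img = gray.astype(np.float64)
    reference = gaussian_filter(img, sigma)        # reference intensity level
    characteristic = img - reference               # characteristic level
    target = (2.0 / 3.0) * 255.0                   # target intensity level
    enhanced = reference + (target - reference.mean()) + gain * characteristic
    return np.clip(enhanced, 0, 255).astype(np.uint8)
```

Applied to a low-key image, the sketch raises the background toward the two-thirds target while the amplified characteristic level keeps local detail visible, which is the intuition behind both strategies.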