| Graduate student: | 蘇奕中 Su, I-Chung |
|---|---|
| Thesis title: | 基於長短期記憶之卷積神經網路步態識別保全系統 Security System based on Gait Recognition using Convolutional LSTM |
| Advisor: | 王宗一 Wang, Tzone I |
| Degree: | Master |
| Department: | College of Engineering - Department of Engineering Science |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | Chinese |
| Number of pages: | 38 |
| Chinese keywords: | 步態識別, 深度學習, 長短期記憶之卷積神經網路, 身分辨識 |
| English keywords: | Gait Recognition, Deep Learning, Convolutional LSTM, Human Identification |
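Computer-vision-based gait identification has long received considerable attention. Unlike other biometric technologies such as face, fingerprint, and iris recognition, capturing gait features does not require the target to cooperate by providing a sample, so it can be done without the target noticing. This property gives gait recognition a considerable advantage in security, crime prevention, and suspect tracking. When applied to a security system, gait recognition avoids the risk of stolen ID cards or passwords found in traditional security systems, because a person's walking posture is hard to imitate. This study builds a gait recognition system that combines a convolutional neural network (CNN) with a convolutional long short-term memory network (Convolutional LSTM) to match the gait of a target person, and applies it to a security system to verify its usability.
Besides differing in gait, people also walk at different speeds. To extract spatial and temporal information at the same time, the system uses a CNN and a Convolutional LSTM to extract gait features. Video from a 30-frames-per-second camera is first processed with background subtraction to obtain the silhouette image of the target person at each time step, and the gait is represented in units of one walking cycle. The resulting image sequence is the input to the system: a CNN first extracts local spatial information from the image at each time step, and a ConvLSTM then extracts spatio-temporal correlations. Temporal pooling summarizes the resulting feature sequence into a single overall feature, which is compared by similarity against the features of each person stored in the database to determine the identity of the target person. The CNN and ConvLSTM are trained jointly in a Siamese neural network architecture, using the OU-ISIR Large Population dataset as training data. The identification rate exceeds 80% at a scale of a thousand people, which shows that the method adopted in this study is effective. In addition to validating the gait identification technique on a large dataset, this study also builds an experimental security system to verify the feasibility of putting the technique into practice.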
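The last two steps of the pipeline, temporal pooling followed by similarity matching against enrolled identities, can be illustrated with a small sketch. The abstract does not say which pooling operator or similarity measure the thesis uses, so mean pooling and cosine similarity are assumptions here, and the function names (`temporal_mean_pool`, `cosine_similarity`, `identify`) and the toy two-dimensional features are hypothetical.

```python
import math


def temporal_mean_pool(feature_sequence):
    """Average per-frame feature vectors into one fixed-size vector.

    Assumption: mean pooling; the thesis may use a different operator.
    """
    if not feature_sequence:
        raise ValueError("feature_sequence must contain at least one frame")
    n_frames = len(feature_sequence)
    dim = len(feature_sequence[0])
    return [sum(frame[i] for frame in feature_sequence) / n_frames
            for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(probe_sequence, gallery):
    """Return (person_id, score) of the enrolled feature closest to the probe.

    probe_sequence: list of per-frame feature vectors for one walking cycle.
    gallery: dict mapping a person id to that person's pooled feature vector.
    """
    probe = temporal_mean_pool(probe_sequence)
    scores = {pid: cosine_similarity(probe, feat)
              for pid, feat in gallery.items()}
    best_id = max(scores, key=scores.get)
    return best_id, scores[best_id]


# Toy example with made-up 2-D features for two enrolled people.
gallery = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}
probe_sequence = [[0.9, 0.1], [0.8, 0.3], [1.0, 0.2]]
best_id, score = identify(probe_sequence, gallery)
print(best_id, round(score, 3))
```

In a real system the per-frame vectors would come from the CNN and ConvLSTM layers rather than being written by hand, and a security application would likely also reject a probe whose best score falls below a threshold.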
Computer-vision-based human identification by gait has been receiving a lot of attention. Compared with other biometric methods such as face, fingerprint, and iris recognition, capturing gait features needs no cooperation from the target person. Because gait is difficult to imitate, using gait recognition in a security system avoids risks such as the stolen ID cards and passwords of traditional security systems. These properties make gait recognition highly advantageous in security, crime prevention, and suspect tracking. This study constructs a human identification system based on gait recognition using a Convolutional Neural Network (CNN) and a Convolutional Long Short-Term Memory (ConvLSTM) neural network. The feasibility of the system is verified with an experimental security system.
Besides gait differences, walking speed may differ between people. This study uses a CNN to extract spatial gait features and a ConvLSTM to capture temporal features. Using a 30 FPS camera to obtain raw video of a walking person, the system applies background subtraction to obtain the silhouette of the person at every time step. The silhouettes of one walking cycle represent the gait of a person and are the input to a neural network. In the network, convolutional layers first extract local spatial features, and the ConvLSTM layer then extracts spatio-temporal features. The output sequence of feature maps is converted to a fixed-size vector by temporal pooling and compared by similarity with all entries in the database to determine the person's identity. The CNN and ConvLSTM layers are jointly trained using a Siamese architecture. The OU-ISIR Large Population dataset is split into a training subset and a testing subset. The model achieves an identification accuracy of over 80% at a scale of a thousand people, which confirms that the method is effective. In addition to verifying the neural network model on the large dataset, this study also implements an experimental security system to verify the feasibility of practical application.
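To make the Siamese training idea concrete, the sketch below computes a contrastive loss on a pair of embeddings: pairs from the same person are pulled together, and pairs from different people are pushed apart until they are at least a margin away. The abstract does not state which loss the thesis uses, so the contrastive loss, the Euclidean distance, the `margin` value, and the function names are all assumptions for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_person, margin=1.0):
    """Contrastive loss for one pair of embeddings.

    same_person=True  -> penalize any distance between the two embeddings.
    same_person=False -> penalize only distances smaller than `margin`.
    Assumption: this loss and the default margin are illustrative choices.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_person:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical embeddings of two gait sequences.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_person=True))
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_person=False))
print(contrastive_loss([0.0], [0.5], same_person=False))
```

In a Siamese setup both embeddings are produced by the same network with shared weights, so minimizing this loss over many pairs shapes a feature space in which the similarity comparison at test time is meaningful.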