| Graduate student: | 彭志祥 Peng, Jr-Shiang |
|---|---|
| Thesis title: | 基於SMO快速演算法應用於語者辨識系統之軟硬體協同IP設計與實現 Hardware and Software Co-design of Silicon Intellectual Property Module Based on Sequential Minimal Optimization algorithm for Speaker Recognition |
| Advisor: | 王駿發 Wang, Jhing-Fa |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2010 |
| Academic year of graduation: | 98 (ROC calendar) |
| Language: | English |
| Number of pages: | 51 |
| Chinese keywords: | 語者辨識 (speaker recognition), 軟硬體協同 (hardware/software co-design) |
| English keywords: | SMO, Speaker Recognition, Hardware and Software Co-design |
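Among conventional classification methods for speaker identification, the support vector machine (SVM) is the most widely used, but its training time and computational load have long been the bottleneck of the technique. The main purpose of this thesis is therefore to accelerate a speaker identification system and deploy it on an embedded system. We use the Sequential Minimal Optimization (SMO) algorithm and implement it through hardware/software co-design.
The implementation is divided into hardware, software, and hardware/software communication parts. In the software part, fixed-point software acceleration is used to implement both the Linear Prediction Cepstral Coefficients (LPCC) algorithm and the one vs. one highest vote analysis algorithm, which perform speech feature extraction and recognition. In the hardware part, speaker models are trained with the SMO algorithm, with heuristics selection and efficient cache utilization as the design considerations for acceleration. In the communication part, a design that economizes on data bandwidth reduces the transfer volume by 5%. Finally, our experimental results show that the system reduces training time by 90% while the recognition rate still reaches 92.7%, so the system combines low computation time with high recognition accuracy.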
This thesis proposes a hardware/software co-design IP for an embedded text-independent speaker recognition system, with the aim of making daily life more convenient through portable speech applications. In the hardware part, the Sequential Minimal Optimization (SMO) algorithm is adopted to accelerate the SVM training that creates the speaker models. In the software part, we modify our lab's previous fixed-point arithmetic design for both the Linear Prediction Cepstral Coefficients (LPCC) and the one vs. one highest voting analysis algorithm.
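As a rough illustration of one vs. one highest voting, the sketch below runs every pair of enrolled speakers through a binary decision and picks the speaker with the most wins. The names `one_vs_one_vote` and `pairwise_decide` are hypothetical and do not come from the thesis. In a real system each pairwise decision would come from a trained binary SVM evaluated on LPCC features, and the thesis's fixed-point implementation is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_vote(speakers, pairwise_decide):
    """Return (winner, votes) after running every pairwise contest.

    pairwise_decide(a, b) is an assumed callback that returns either a or b,
    standing in for the output of a binary classifier trained on that pair.
    """
    votes = Counter({s: 0 for s in speakers})
    for a, b in combinations(speakers, 2):
        votes[pairwise_decide(a, b)] += 1
    # Ties go to the speaker listed first (an arbitrary choice for this sketch).
    winner = max(speakers, key=lambda s: votes[s])
    return winner, dict(votes)


# Toy stand-in for the classifiers: the higher made-up score wins each pair.
toy_scores = {"A": 0.2, "B": 0.9, "C": 0.5}


def toy_decide(a, b):
    return a if toy_scores[a] >= toy_scores[b] else b


winner, votes = one_vs_one_vote(["A", "B", "C"], toy_decide)
```

With N enrolled speakers this scheme needs N(N-1)/2 binary classifiers, which is one reason training cost matters for the embedded target.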
Two schemes, heuristics selection and an efficient cache utilization method, are proposed for implementing the SMO algorithm in hardware so as to decrease the training time. Moreover, a specific design is proposed to use the bus bandwidth efficiently and to reduce the transfer time between software and hardware by about 5%. Finally, our simulation/emulation results show that training time is reduced by 90% while the recognition accuracy reaches 92.7%.
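For readers unfamiliar with SMO, the following sketch shows the core analytic step on a single pair of Lagrange multipliers, following the textbook formulation by Platt. It is floating-point Python for illustration only and is not the thesis's hardware datapath. The function name and argument layout are assumptions, and the heuristic that chooses which pair to update, as well as any kernel caching, is left out.

```python
def smo_pair_update(a1, a2, y1, y2, e1, e2, k11, k12, k22, c):
    """Jointly optimize two multipliers while keeping y1*a1 + y2*a2 fixed.

    e1 and e2 are the prediction errors f(x_i) - y_i, k11/k12/k22 are kernel
    values for the two samples, and c is the SVM box constraint.
    Returns the updated (a1, a2); the inputs are returned unchanged when no
    progress is possible in this simplified sketch.
    """
    # Feasible segment for a2 under 0 <= a <= c and the equality constraint.
    if y1 != y2:
        low, high = max(0.0, a2 - a1), min(c, c + a2 - a1)
    else:
        low, high = max(0.0, a1 + a2 - c), min(c, a1 + a2)

    eta = k11 + k22 - 2.0 * k12  # curvature of the objective along the segment
    if low >= high or eta <= 0.0:
        return a1, a2  # degenerate cases are skipped here

    # Unconstrained optimum for a2, then clip it to the feasible segment.
    a2_new = a2 + y2 * (e1 - e2) / eta
    a2_new = min(high, max(low, a2_new))
    # Move a1 by the opposite amount so the equality constraint still holds.
    a1_new = a1 + y1 * y2 * (a2 - a2_new)
    return a1_new, a2_new


# Two orthogonal unit-norm samples of opposite class, starting from zero.
new_a1, new_a2 = smo_pair_update(
    0.0, 0.0, 1, -1, e1=-1.0, e2=1.0, k11=1.0, k12=0.0, k22=1.0, c=1.0
)
```

Because each step has a closed form and touches only two multipliers, the per-iteration work is small and regular, which is the property that makes SMO a natural candidate for a hardware implementation.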
On-campus access: open from 2020-12-31