
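Graduate student: 彭志祥
Peng, Jr-Shiang
Thesis title: 基於SMO快速演算法應用於語者辨識系統之軟硬體協同IP設計與實現
Hardware and Software Co-design of Silicon Intellectual Property Module Based on Sequential Minimal Optimization algorithm for Speaker Recognition
Advisor: 王駿發
Wang, Jhing-Fa
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar)
Language: English
Number of pages: 51
Keywords (Chinese, translated): speaker recognition, hardware/software co-design
Keywords (English): SMO, Speaker Recognition, Hardware and Software Co-design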
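  • Among traditional classification methods for speaker recognition, the most common is the Support Vector Machine (SVM), but training time and computational load have long been the bottleneck of this technique. The main goal of this thesis is therefore to accelerate a speaker recognition system and deploy it on an embedded system; we use the Sequential Minimal Optimization (SMO) algorithm and implement it through hardware/software co-design.
    The implementation is divided into hardware, software, and hardware/software communication parts. In the software part, fixed-point acceleration is used to implement both the Linear Prediction Cepstral Coefficients (LPCC) algorithm and the one vs. one highest vote analysis algorithm, which perform speech feature extraction and recognition. In the hardware part, speaker models are trained with the SMO algorithm, and heuristic selection and efficient cache usage are the design considerations for acceleration. In the communication part, a design that economizes on data bandwidth reduces the amount of data transferred by 5%. Finally, our experimental results show that the system reduces training time by 90% while the recognition rate still reaches 92.7%, so the system offers both low computation time and high recognition accuracy.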

    This thesis proposes a hardware/software co-design IP for an embedded text-independent speaker recognition system, with the aim of making everyday life more convenient through portable speech applications. In the hardware part, the Sequential Minimal Optimization (SMO) algorithm is adopted to accelerate SVM training when creating speaker models. In the software part, we modify our lab's previous fixed-point arithmetic design for both the Linear Prediction Cepstral Coefficients (LPCC) algorithm and the one vs. one highest vote analysis algorithm.
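    The one vs. one highest vote analysis named above can be illustrated with a minimal sketch. This record does not describe the thesis's own implementation, so the function name `one_vs_one_vote` and the `pairwise_decision` callable (a stand-in for a binary SVM trained on two speakers) are hypothetical, and the tie-breaking rule is an arbitrary assumption.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_vote(speakers, pairwise_decision):
    """Return (winner, votes) after every pair of speakers has been compared.

    speakers: list of speaker labels.
    pairwise_decision: hypothetical callable (a, b) -> a or b, standing in
        for a binary SVM trained to separate speakers a and b.
    """
    votes = Counter({s: 0 for s in speakers})
    for a, b in combinations(speakers, 2):
        # Each pairwise classifier casts exactly one vote.
        votes[pairwise_decision(a, b)] += 1
    # Assumption: ties are broken by the order of `speakers`.
    winner = max(speakers, key=lambda s: votes[s])
    return winner, dict(votes)
```

    With N enrolled speakers this scheme evaluates N(N-1)/2 pairwise classifiers per utterance, for example 36 for nine speakers.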
    Two schemes, heuristic selection and efficient cache utilization, are proposed for implementing the SMO algorithm in hardware to decrease the training time. Moreover, a dedicated design is proposed to use the bus bandwidth efficiently and reduce the transfer time between software and hardware by about 5%. Finally, our simulation/emulation results show that the training time is reduced by 90% while the recognition accuracy still reaches 92.7%.
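    As a rough software illustration of the two schemes, the sketch below follows the standard SMO formulation by Platt: a second-choice heuristic that maximizes |E_i - E_j|, the analytic update of one pair of multipliers, and a kernel cache that computes each kernel value only once. It is not the thesis's hardware design, which this record does not detail; all names (`make_cached_kernel`, `pick_second`, `smo_pair_update`) are hypothetical.

```python
def make_cached_kernel(X, kernel):
    """Wrap `kernel` so each K(i, j) is computed once and then reused."""
    cache = {}

    def K(i, j):
        key = (i, j) if i <= j else (j, i)  # symmetric kernel: one entry per pair
        if key not in cache:
            cache[key] = kernel(X[key[0]], X[key[1]])
        return cache[key]

    return K, cache


def pick_second(i, errors):
    """Second-choice heuristic: pick j != i that maximizes |E_i - E_j|."""
    candidates = (j for j in range(len(errors)) if j != i)
    return max(candidates, key=lambda j: abs(errors[i] - errors[j]))


def smo_pair_update(i, j, alpha, y, errors, K, C):
    """Analytic update of (alpha_i, alpha_j); returns the new pair or None.

    errors[k] is f(x_k) - y_k for the current model; C is the box constraint.
    """
    if y[i] != y[j]:
        lo = max(0.0, alpha[j] - alpha[i])
        hi = min(C, C + alpha[j] - alpha[i])
    else:
        lo = max(0.0, alpha[i] + alpha[j] - C)
        hi = min(C, alpha[i] + alpha[j])
    eta = K(i, i) + K(j, j) - 2.0 * K(i, j)
    if lo >= hi or eta <= 0.0:
        return None  # this simple sketch skips pairs that cannot make progress
    # Unconstrained optimum for alpha_j, clipped to the feasible segment [lo, hi].
    a_j = min(hi, max(lo, alpha[j] + y[j] * (errors[i] - errors[j]) / eta))
    # alpha_i moves so that the constraint sum(alpha_k * y_k) stays unchanged.
    a_i = alpha[i] + y[i] * y[j] * (alpha[j] - a_j)
    return a_i, a_j
```

    A full trainer would also update the threshold and the error values after each step; those parts are omitted here to keep the sketch short.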

    CHAPTER 1 INTRODUCTION
      1.1 BACKGROUND
      1.2 RELATED WORK
      1.3 MOTIVATION
      1.4 THESIS ORGANIZATION
    CHAPTER 2 TRAINING PHASE ALGORITHM
      2.1 SYSTEM OVERVIEW
      2.2 OVERVIEW OF SVM ALGORITHM
      2.3 OVERVIEW OF SMO ALGORITHM
    CHAPTER 3 TESTING PHASE ALGORITHM
      3.1 OVERVIEW OF FEATURE EXTRACTION ALGORITHM
        3.1.1 End Point Detection
        3.1.2 Pre-Emphasis
        3.1.3 Frame Blocking
        3.1.4 Hamming Windows
        3.1.5 Linear Predictive Cepstral Coefficients (LPCC)
      3.2 OVERVIEW OF SPEAKER RECOGNITION ALGORITHM
    CHAPTER 4 HW/SW CO-DESIGN AND IMPROVEMENT
      4.1 INTRODUCTION
        4.1.1 HW/SW Co-design
        4.1.2 HW/SW Partitioning
        4.1.3 AMBA Protocol
        4.1.4 EASY Platform
      4.2 ACCELERATION IMPLEMENTATION BY FIXED-POINT
        4.2.1 Fixed Point Format for Software Implementation
        4.2.2 Floating-system vs. Fixed-system
      4.3 IMPROVED HARDWARE ARCHITECTURE FOR SMO ALGORITHM
        4.3.1 Hardware Architecture for SMO Algorithm
        4.3.2 Cache table vs. Non-cache table
        4.3.3 Heuristic choice vs. Non-heuristic choice
      4.4 IMPROVED COMMUNICATION BETWEEN HW/SW
        4.4.1 Package/Unpack for HW/SW System
        4.4.2 Package System vs. Non-Package System
    CHAPTER 5 EXPERIMENTAL RESULTS AND COMPARISONS
      5.1 INTRODUCTION TO EXPERIMENTAL ENVIRONMENT
      5.2 INTRODUCTION TO CDK EMBEDDED SYSTEM
      5.3 FPGA IMPLEMENTATION
      5.4 SIMULATION & EMULATION RESULTS
      5.5 NINE PERSONS OF NIST EXPERIMENTAL RESULTS
    CHAPTER 6 CONCLUSION AND FUTURE WORK
      6.1 CONCLUSION
      6.2 FUTURE WORK
    REFERENCES


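    Full-text availability: on campus, public from 2020-12-31; off campus, not public.
    The electronic thesis has not been authorized for public release; for the print copy, consult the library catalog.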