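Graduate student: 黃國惟 (Huang, Kuo-Wei)
Thesis title: An Embedded System Design for Speech Recognition based on ELAN eSLZ000 16bit_DSP (基於 eSLZ000 16bit_DSP 之嵌入式語音辨識系統設計與實現)
Advisor: 王駿發 (Wang, Jhing-Fa)
Degree: Master
College and department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar, 2006-2007)
Language: English
Pages: 54
Keywords (Chinese): speech recognition, embedded system (語音辨識嵌入式系統)
Keywords (English): Autocorrelation, DTW, LPCC, eSLZ000, Speech recognition, Embedded system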
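  • Speech recognition is a technology that has been under development for many years and has done much to make daily life more convenient. Consumer electronics have recently been moving toward small, low-power embedded systems, and our laboratory, which has studied speech recognition algorithms for many years, wants to implement them on a highly portable embedded system. After comparing DSP development systems from several vendors, we found that the ELAN Microelectronics eSLZ000 ICEeSL-U development board performs well for speech signal recognition and is very competitively priced, so we chose the eSLZ000 as the development platform for our speech recognition system.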

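    The core of the recognition system uses the LPCC (Linear Prediction Cepstrum Coefficient) algorithm to extract speech features and the DTW (Dynamic Time Warping) algorithm to compare feature values and produce the final recognition result. Implementing the system on the eSLZ000 ICEeSL-U development board requires planning around its limited computing capability and memory space, and optimizing the program to achieve real-time operation.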
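    The LPCC feature extraction named above can be illustrated with a minimal sketch. It assumes the textbook route from autocorrelation values to predictor coefficients (Levinson-Durbin recursion) and then to cepstral coefficients, and it uses floating point for readability. The function names, the `LPC_MAX_ORDER` bound, and the array layout are hypothetical; this record does not show the thesis's own fixed-point implementation.

```c
#include <assert.h>

#define LPC_MAX_ORDER 16  /* hypothetical upper bound on the predictor order */

/* Levinson-Durbin recursion.
 * Input:  autocorrelation values r[0..p].
 * Output: predictor coefficients a[1..p] (a[0] is set to 0 and unused),
 *         for the model x[n] ~ sum_k a[k] * x[n-k].
 * Returns the final prediction error energy. */
double levinson_durbin(const double *r, int p, double *a)
{
    double tmp[LPC_MAX_ORDER + 1];
    double err = r[0];

    assert(p >= 1 && p <= LPC_MAX_ORDER);
    for (int i = 0; i <= p; i++)
        a[i] = 0.0;
    if (err <= 0.0)
        return 0.0;  /* silent frame: leave all coefficients at zero */

    for (int i = 1; i <= p; i++) {
        /* Reflection coefficient for order i. */
        double acc = r[i];
        for (int j = 1; j < i; j++)
            acc -= a[j] * r[i - j];
        double k = acc / err;

        /* Update the order-(i-1) predictor to order i. */
        tmp[i] = k;
        for (int j = 1; j < i; j++)
            tmp[j] = a[j] - k * a[i - j];
        for (int j = 1; j <= i; j++)
            a[j] = tmp[j];

        err *= (1.0 - k * k);
        if (err <= 0.0)
            break;  /* signal is perfectly predictable; stop early */
    }
    return err;
}

/* Convert predictor coefficients a[1..p] into cepstral coefficients c[1..q]
 * using the standard recursion
 *   c[m] = a[m] + sum_{k=1}^{m-1} (k/m) * c[k] * a[m-k]   for m <= p,
 *   c[m] =        sum_{k=m-p}^{m-1} (k/m) * c[k] * a[m-k] for m >  p.
 * c[0] is not written. */
void lpc_to_cepstrum(const double *a, int p, double *c, int q)
{
    for (int m = 1; m <= q; m++) {
        double sum = (m <= p) ? a[m] : 0.0;
        int k_start = (m > p) ? m - p : 1;
        for (int k = k_start; k < m; k++)
            sum += ((double)k / m) * c[k] * a[m - k];
        c[m] = sum;
    }
}
```

    For a frame whose autocorrelation is `{1.0, 0.5, 0.25}` with `p = 2`, the recursion yields `a[1] = 0.5` and `a[2] = 0`, and the first cepstral coefficients follow the pattern `0.5^m / m`.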

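    Overall analysis showed that the most time-consuming part is the LPCC transform, and more specifically the computation of the autocorrelation coefficients. We therefore used a lookup table, a hardware multiply-accumulate unit, and fixed-point arithmetic to cut the computation time substantially, so that the system's response time when recognizing against a four-word database is below 1.63 seconds.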

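    Finally, we hope that this embedded system will broaden the applications of speech recognition, serve the domestic consumer electronics industry, and make daily life more convenient.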

    Speech recognition has been developed over many years to make daily life more convenient, and our lab has worked for several years on improving the performance of speech recognition algorithms. Because consumer electronics are trending toward miniaturization and low power consumption, we want to implement a speech recognition algorithm on a portable embedded system. After comparing the efficiency of many development boards, we found the ELAN eSLZ000 ICEeSL-U development board suitable for speech recognition, so we chose the eSLZ000 as the target device for our recognition system.

    The kernel of our recognition algorithm is the Linear Prediction Cepstrum Coefficient (LPCC), which we use as the speech signal feature. After feature extraction, the Dynamic Time Warping (DTW) algorithm computes the distance between the test utterance and each reference pattern. The limited computing capability and memory capacity of the eSLZ000 had to be considered when implementing the system on the development board. Finally, we optimized our code in several ways to meet the real-time requirement.
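    The DTW matching step described above can be sketched as follows. This is a minimal illustration that assumes the classic symmetric step pattern, a squared Euclidean frame distance, and floating-point features stored frame after frame. The names `frame_dist`, `dtw_distance`, `recognize`, and the `DTW_MAX_FRAMES` cap are hypothetical, and the thesis may use different path constraints or a fixed-point distance.

```c
#include <assert.h>
#include <float.h>

#define DTW_MAX_FRAMES 64  /* hypothetical cap on frames per reference pattern */

/* Squared Euclidean distance between two feature vectors of length dim. */
double frame_dist(const double *x, const double *y, int dim)
{
    double d = 0.0;
    for (int i = 0; i < dim; i++) {
        double e = x[i] - y[i];
        d += e * e;
    }
    return d;
}

/* Accumulated DTW distance between sequence a (na frames) and sequence b
 * (nb frames). Frame t of a sequence starts at offset t * dim.
 * Allowed steps: (i-1, j), (i, j-1), (i-1, j-1).
 * Only two rows of the cost matrix are kept, which limits memory use. */
double dtw_distance(const double *a, int na, const double *b, int nb, int dim)
{
    double prev[DTW_MAX_FRAMES + 1];
    double curr[DTW_MAX_FRAMES + 1];

    assert(na >= 1 && nb >= 1 && nb <= DTW_MAX_FRAMES);

    prev[0] = 0.0;  /* empty prefix aligned with empty prefix */
    for (int j = 1; j <= nb; j++)
        prev[j] = DBL_MAX;  /* unreachable cells */

    for (int i = 1; i <= na; i++) {
        curr[0] = DBL_MAX;
        for (int j = 1; j <= nb; j++) {
            /* Cheapest predecessor; at least one is always reachable. */
            double best = prev[j];
            if (curr[j - 1] < best) best = curr[j - 1];
            if (prev[j - 1] < best) best = prev[j - 1];
            curr[j] = frame_dist(a + (i - 1) * dim, b + (j - 1) * dim, dim) + best;
        }
        for (int j = 0; j <= nb; j++)
            prev[j] = curr[j];
    }
    return prev[nb];
}

/* Return the index of the reference pattern closest to the test utterance,
 * or -1 if nrefs is 0. */
int recognize(const double *test, int nt, const double *refs[],
              const int *ref_len, int nrefs, int dim)
{
    int best = -1;
    double best_d = DBL_MAX;
    for (int k = 0; k < nrefs; k++) {
        double d = dtw_distance(test, nt, refs[k], ref_len[k], dim);
        if (d < best_d) {
            best_d = d;
            best = k;
        }
    }
    return best;
}
```

    With four reference patterns, as in the reported experiment, `recognize` would run `dtw_distance` four times per utterance and return the index of the smallest accumulated distance.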

    After analyzing the entire system, we found that LPCC extraction consumes the most clock cycles, especially the autocorrelation part. We used a look-up table, the assembly MAC instruction, and fixed-point conversion to greatly improve the response time. The final processing time of our system is 1.63 seconds for four reference patterns.
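    As a rough sketch of why these optimizations help, the code below computes the autocorrelation with integer multiply-accumulate operations on Q15 samples and applies a window taken from a precomputed table. Several points are assumptions made for illustration: the abstract does not say what the look-up table holds (a window table is one common choice), the function names are hypothetical, and the 64-bit accumulator is a portability convenience that stands in for whatever accumulator width the eSLZ000 MAC unit provides.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply each Q15 sample by a precomputed Q15 window value read from a
 * lookup table, instead of evaluating a cosine at run time.
 * Window values are assumed to lie in [0, 32767]. */
void apply_window_q15(const int16_t *x, const int16_t *win_table,
                      int16_t *out, int n)
{
    for (int i = 0; i < n; i++) {
        int32_t prod = (int32_t)x[i] * win_table[i];  /* Q15 * Q15 = Q30 */
        out[i] = (int16_t)(prod / 32768);  /* back to Q15, truncating toward zero */
    }
}

/* Autocorrelation r[0..max_lag] of an n-sample frame using integer
 * multiply-accumulate only. Each inner-loop iteration is one MAC, which is
 * the operation a DSP can typically issue in a single instruction. */
void autocorr_q15(const int16_t *x, int n, int max_lag, int64_t *r)
{
    assert(n > 0 && max_lag >= 0 && max_lag < n);
    for (int lag = 0; lag <= max_lag; lag++) {
        int64_t acc = 0;
        for (int i = 0; i + lag < n; i++)
            acc += (int32_t)x[i] * x[i + lag];
        r[lag] = acc;
    }
}
```

    The results in `r` can be scaled to a common range and passed to the LPC analysis. Only the ratios between autocorrelation values affect the predictor coefficients, so a uniform scale factor does not change the resulting features.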

    Finally, we hope the proposed system can be used in interactive devices and benefit the consumer electronics industry.

    Abstract (Chinese)
    ABSTRACT
    ACKNOWLEDGEMENTS
    CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    CHAPTER 1 Introduction
        1.1 Background
        1.2 Previous works
            1.2.1 ITRI Speech Recognition system (based on 8051 & DSP)
            1.2.2 Voice Me (HOTECK)
            1.2.3 Voicedex (HOTECK)
        1.3 Motivation
        1.4 Organization of Thesis
    CHAPTER 2 Related Works
        2.1 Speech recognition procedure
            2.1.1 Endpoint detection
            2.1.2 Feature extraction
            2.1.3 DTW (Dynamic Time Warping)
    CHAPTER 3 Embedded System Design based on eSLZ000
        3.1 Overview of eSLZ000
        3.2 Hardware architecture
            3.2.1 External SPI Flash
            3.2.2 Microphone
        3.3 Software Development Environment
            3.3.1 Integrated Development Environment
    CHAPTER 4 System Implementation and Verification
        4.1 Software Implementation
            4.1.1 System control flow
            4.1.2 Memory allocation
            4.1.3 Finite Impulse Response (FIR) Filter
        4.2 Architecture of the entire system
        4.3 System Verification
    CHAPTER 5 Conclusions and Future Works
    REFERENCES
    Author's biography

    [1] Pro-Chuan Lin, Jhing-Fa Wang, Shun-Chieh Lin, and Ming-Hua Mo, “An Embedded System Design for Ubiquitous Speech Interactive Applications Based on a Cost Effective SPCE061A Micro Controller”, accepted at the 3rd IFIP International Conference on Ubiquitous Intelligence and Computing (UIC-06) and published in Lecture Notes in Computer Science (LNCS).
    [2] Yung-Shing Kuo and Jhing-Fa Wang, “Embedded System Design based on SPCE061A for Interactive Spoken Dialogue Learning System with a Programmable Dialogue”.
    [3] Ming-Hua Mo and Jhing-Fa Wang, “An Embedded System Design for Ubiquitous and Anthropomorphic Speech Interaction based on a Cost Effective SPCE061A Micro Controller”.
    [4] K. Sukun, N. Sergiu, and P. Rabin K., “Hardware Speech Recognition in Low Cost, Low Power Devices”, Computer Science Division, University of California, Berkeley, CS252 class project, Spring 2003.
    [5] B. L. Zeigler and B. Mazor, “Dialog Design for a Speech-Interactive Automation System”, GTE Laboratories Incorporated, 1994.
    [6] L. Rabiner and B. H. Juang, “Fundamentals of Speech Recognition”, Prentice-Hall, Inc., 1993.
    [7] D. A. Patterson and J. L. Hennessy, “Computer Organization and Design - The Hardware/Software Interface”, Morgan Kaufmann, 2000.
    [8] ELAN, “eSL/eSLS series and eSLZ000 User manual”, version 0.5, ELAN MICROELECTRONICS CORP., Ltd., August 2006.
    [9] ELAN, “eSL/eSLS series and eSLZ000 Programming Guide”, version 0.4, ELAN MICROELECTRONICS CORP., Ltd., August 2006.
    [10] ELAN, “ICEeSL-U XA In-circuit Emulation Board User’s Handbook”, version 0.2, ELAN MICROELECTRONICS CORP., Ltd., August 2006.
    [11] ELAN, “eSL/eAM series ANSI C Reference Manual”, version 0.1, ELAN MICROELECTRONICS CORP., Ltd., August 2006.
    [12] ELAN, “eSL series C Marco Reference Manual”, version 0.17, ELAN MICROELECTRONICS CORP., Ltd., August 2006.
    [13] ELAN, “eSLZ000 Product Specification”, version 0.2, ELAN MICROELECTRONICS CORP., Ltd., August 2006.
    [14] PMC, “Pm25LV010 / 020 / 040, 1Mbit / 2Mbit / 4Mbit, 3.0 Volt-only, Serial Flash Memory With 33MHz SPI Bus Interface”, version 1.2, PROGRAMABLE MICROELECTRONICS CORP., Ltd., July 2005.
    [15] 陳明熒, “PC電腦語音辨認實作” (Implementing Speech Recognition on the PC), 旗標出版社, 1st edition, January 1994.
    [16] 謝依蘭, “語音訊號數位處理” (Digital Processing of Speech Signals), 松崗出版社, 1st edition, January 1992.
    [17] 王小川, “語音訊號處理” (Speech Signal Processing), 全華出版社, 1st edition, February 2005.

    On campus: open access from 2008-07-11
    Off campus: open access from 2008-07-11