| Graduate student: | 黃子軒 Huang, Tze-Hsuan |
|---|---|
| Thesis title: | 以MPEG-7特徵為基礎的居家環境聲音辨識器之超大型積體電路架構設計 VLSI Architectures for Home Environmental Sound Recognition Based on MPEG-7 Features |
| Advisor: | 王駿發 Wang, Jhing-Fa |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2003 |
| Academic year of graduation: | 91 |
| Language: | English |
| Number of pages: | 57 |
| Keywords (translated from Chinese): | sound recognition, hidden Markov model, spectral centroid, VLSI |
| Keywords (English): | MPEG-7, HMM, centroid, spread, flatness, VLSI, sound recognition |
This thesis proposes an environmental sound recognition system based on three MPEG-7 audio features (centroid, spread, and flatness [1]) together with the corresponding VLSI architectures. Traditional sound recognizers rely on decision-tree-based methods, whose parameters do not generalize well [2-5]. The HMM-based sound recognizer introduced in [8] resolves this drawback, but it uses spectrum parameters, which result in high-dimensional feature vectors. This thesis addresses that shortcoming by applying basis extraction. The recognition rate is about 82% when only the spectrogram is used as the parameter, and it improves to about 95% when the three MPEG-7 audio features mentioned above are used as the parameters of the environmental sound recognizer.
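To make the three features concrete, the following is a minimal Python sketch of centroid, spread, and flatness computed from one frame's power spectrum. It is a simplified linear-frequency illustration, not the normative MPEG-7 definition in [1] and not the thesis's implementation; the function name `spectral_features`, its arguments, and the small constant `eps` are assumptions made for this example.

```python
import math


def spectral_features(power, freqs, eps=1e-12):
    """Return (centroid, spread, flatness) for one frame.

    power: non-negative power-spectrum values, one per frequency bin
    freqs: centre frequency of each bin in Hz (hypothetical input layout)
    eps:   small floor so log(0) is never evaluated (an assumption)
    """
    if not power or len(power) != len(freqs):
        raise ValueError("power and freqs must be non-empty and equal length")
    total = sum(power)
    if total <= 0:
        raise ValueError("frame has no energy")

    # Centroid: power-weighted mean frequency.
    centroid = sum(f * p for f, p in zip(freqs, power)) / total

    # Spread: power-weighted standard deviation around the centroid.
    variance = sum(((f - centroid) ** 2) * p for f, p in zip(freqs, power)) / total
    spread = math.sqrt(variance)

    # Flatness: geometric mean divided by arithmetic mean.
    # The geometric mean is an nth root of a product; it is evaluated
    # in the log domain here to avoid overflow and underflow.
    n = len(power)
    log_geo_mean = sum(math.log(p + eps) for p in power) / n
    flatness = math.exp(log_geo_mean) / (total / n)

    return centroid, spread, flatness
```

A flat spectrum gives a flatness close to 1, while a single spectral peak gives a flatness close to 0. The flatness ratio needs both an nth root and a division, which are the two operations the abstract singles out as the most costly in hardware.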
Moreover, VLSI architectures for this sound recognition system are proposed. The first is the feature extraction module, whose most complicated computations are the division and nth-root operations. The divider is built with the CORDIC method, and a dedicated circuit for the nth-root operation is designed according to the Brahmagupta iteration algorithm. A dedicated hardware architecture is also presented for the Viterbi algorithm; it is based on the 4-step fully Viterbi algorithm. The speed-up of this module is also attributed to its fully pipelined systolic array architecture.
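As an illustration of how a CORDIC divider works, the sketch below implements the textbook linear vectoring mode in Python. The abstract does not state which CORDIC mode, word length, or iteration count the thesis uses, so the function name `cordic_divide`, the default of 24 iterations, and the use of floating point instead of fixed-point registers are all assumptions.

```python
def cordic_divide(y, x, iterations=24):
    """Approximate y / x with linear-mode CORDIC (vectoring).

    Requires x > 0 and |y / x| < 2, the convergence range when the
    step sequence starts at 2**0. Each iteration uses only an add or
    subtract and a multiply by 2**-i, which is a right shift in hardware.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    if abs(y) >= 2 * x:
        raise ValueError("|y / x| must be below 2 for convergence")

    z = 0.0      # accumulates the quotient
    step = 1.0   # 2**-i for the current iteration
    for _ in range(iterations):
        if y >= 0:
            y -= x * step   # drive the remainder toward zero
            z += step
        else:
            y += x * step
            z -= step
        step *= 0.5
    return z
```

For example, `cordic_divide(3, 4)` converges toward 0.75, with an error bounded by roughly `2 ** -(iterations - 1)`. Operands outside the convergence range would first have to be scaled by a power of two, which is again a shift in hardware.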
[1] ISO/IEC FDIS 15938-4:2001(E), Information Technology - Multimedia Content Description Interface - Part 4: Audio
[2] Guojun Lu; Hankinson, T.
“A technique towards automatic audio classification and retrieval”
Signal Processing Proceedings, 1998. ICSP '98. 1998 Fourth International Conference on, Volume: 2, 12-16 Oct. 1998
[3] Zhang, T.; Jay Kuo, C.-C.
“Audio content analysis for online audiovisual data segmentation and classification”
Speech and Audio Processing, IEEE Transactions on, Volume: 9 Issue: 4, May 2001
[4] Tong Zhang; Kuo, C.-C.J.
“Classification and retrieval of sound effects in audiovisual data management”
Signals, Systems, and Computers, 1999. Conference Record of the Thirty-Third Asilomar Conference on, Volume: 1, 24-27 Oct. 1999
[5] Tong Zhang; Kuo, C.-C.J.
“Hierarchical classification of audio data for archiving and retrieving”
Acoustics, Speech, and Signal Processing, 1999. ICASSP '99. Proceedings., 1999 IEEE International Conference on, Volume: 6, 15-19 March 1999
[6] Wold, E.; Blum, T.; Keislar, D.; Wheaton, J.
“Content-based classification, search, and retrieval of audio”
Multimedia, IEEE, Volume: 3 Issue: 3, Fall 1996
[7] Tzanetakis, G.; Cook, P.
“Musical genre classification of audio signals”
Speech and Audio Processing, IEEE Transactions on, Volume: 10 Issue: 5, July 2002
[8] Casey, M.
“MPEG-7 sound-recognition tools”
Circuits and Systems for Video Technology, IEEE Transactions on, Volume: 11 Issue: 6, June 2001
[9] Goldhor, R.S.
“Recognition of environmental sounds”
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on, Volume: 1, 27-30 April 1993
[10] Brand, M.
“Structure and parameter learning via entropy minimization, with applications to mixture and hidden Markov models”
Acoustics, Speech, and Signal Processing, 1999. ICASSP '99. Proceedings., 1999 IEEE International Conference on, Volume: 3, 15-19 March 1999
[11] Brand, M.
“Structure discovery in conditional probability models via an entropic prior and parameter extinction”
Neural Comput., vol. 11, no. 5, pp. 1155-1183, 1999.
[12] Kak, S.C.; Barbir, A.O.
“The Brahmagupta algorithm for square rooting”
System Theory, 1989. Proceedings., Twenty-First Southeastern Symposium on, 26-28 March 1989
[13] Chen-Jen Huang; Jer-Min Jou
“Efficient Rapid Hardware Prototyping, Analyzing and Design of an HMM-based Speech Recognition Engine”
Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C., June 2000
[14] L.R. Rabiner
“A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition”
Proc. IEEE, 77(2):257-286, February 1989.