Graduate student: | Tsai, Shang-Hung (蔡尚宏) |
---|---|
Thesis title: | A Low-Power and Small Area Sound Recognition Chip Design with Noise Cancellation for Home Care System (具噪聲消除之低功率及低面積聲音辨識晶片設計應用於居家照護系統) |
Advisor: | Wang, Jhing-Fa (王駿發) |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2015 |
Academic year of graduation: | 103 (ROC calendar) |
Language: | English |
Number of pages: | 60 |
Chinese keywords: | 低功率, 低面積, 雜訊消除, 聲音辨識, 晶片設計 |
English keywords: | low-power, small area, noise cancellation, sound recognition, chip design |
The elderly population in Taiwan has grown rapidly, and the social burden has grown with it. The aging society has become an urgent social issue. We therefore propose "A Low-Power and Small Area Sound Recognition Chip Design with Noise Cancellation for Home Care System", in the hope that the chip can provide timely aid and care to elderly people, children, and others who need help.
This study proposes an automatic sound recognition (ASR) chip design that adopts Mel-frequency cepstral coefficients (MFCC) as the recognition feature. We modify the hardware circuit to address noise interference, recognition accuracy, unknown sound detection, power consumption, and chip area. Finally, this study proposes a chip architecture that combines voice activity detection (VAD) with clock gating to reduce the chip's power consumption in the idle state, so that it can operate for long periods and is more practical for home care.
For implementation, the chip was taped out in TSMC's 90 nm process via the Chip Implementation Center (CIC). The chip area is 1.15 × 1.15 mm², the package has 40 pins, the gate count is about 188k, the power dissipation is 2.803 mW, and the highest operating frequency is 10 MHz.
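The idea of combining VAD with clock gating can be illustrated with a minimal Python sketch. The abstract does not specify the VAD algorithm, frame size, or sample rate, so the short-time energy detector, `FRAME_LEN`, `ENERGY_THRESHOLD`, the 16 kHz rate, and all function names below are assumptions made for illustration. The software `if` only stands in for a hardware clock-gate enable; this is not the thesis's circuit.

```python
import math

FRAME_LEN = 160          # assumed: 10 ms frames at a 16 kHz sample rate
ENERGY_THRESHOLD = 1e-3  # assumed: chosen by hand for this toy signal


def frame_energy(frame):
    """Return the mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def vad_enable(samples, frame_len=FRAME_LEN, threshold=ENERGY_THRESHOLD):
    """Return one boolean per full frame: True when the frame is active.

    In hardware, this flag would play the role of the clock-gate enable
    for the feature-extraction and recognition blocks.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_energy(frame) > threshold)
    return flags


def gated_run(samples, extract, frame_len=FRAME_LEN, threshold=ENERGY_THRESHOLD):
    """Call extract(frame) only on active frames.

    Returns the list of results and the fraction of frames skipped.
    """
    flags = vad_enable(samples, frame_len, threshold)
    results = []
    for i, active in enumerate(flags):
        if active:  # downstream blocks only "receive a clock" here
            results.append(extract(samples[i * frame_len:(i + 1) * frame_len]))
    skipped = 1.0 - len(results) / len(flags) if flags else 0.0
    return results, skipped


# Toy input: 3 silent frames, 2 frames of a 440 Hz tone, 3 silent frames.
silence = [0.0] * (3 * FRAME_LEN)
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(2 * FRAME_LEN)]
signal = silence + tone + silence

flags = vad_enable(signal)
# frame_energy is a placeholder for a real feature extractor such as MFCC.
features, skipped = gated_run(signal, frame_energy)
```

In this toy signal the downstream extractor runs on only 2 of the 8 frames, which mirrors why gating idle logic can reduce standby power when most of the input is silence.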