
Graduate student: Jian, Kun-Ting
Thesis title: Algorithm and Hardware Architecture Design for a Multi-Sound-Source Recognition System
English title: A Multi-Sound Classification System Design Based on Support Vector Machine and Independent Component Analysis
Advisor: Wang, Jhing-Fa
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2005
Graduating academic year: 93 (ROC calendar)
Language: English
Pages: 73
Keywords: independent component analysis, support vector machine, sound classification, multi-sound
Every sound in our living environment has its own distinctive character. We can often identify a sound by its characteristics and thereby judge the surrounding situation, for example, the alarm that sounds when a fire breaks out. Classifying and recognizing this acoustic information would greatly help us understand our surroundings, especially for the hearing-impaired and for automatic security systems. Moreover, audio is an indispensable part of today's computer and multimedia applications: a very large amount of information is recorded in audio file formats, and sound classification also helps us search for the audio segments we want.
     In this thesis, we propose a multi-source environmental sound classifier based on independent component analysis, support vector machines, and MPEG-7 low-level audio descriptors. Independent component analysis is used to separate mixed sound sources, and each separated source is then fed into the recognition system. Using three MPEG-7 low-level audio descriptors, spectrum centroid, spectrum spread, and spectrum flatness, as the system's audio features, we propose a multi-sound recognition system combined with support vector machines. For single-sound recognition, we collected an audio database of 677 recordings in 15 classes, on which our classifier achieves a recognition accuracy of up to 89.1%.
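The three MPEG-7 low-level descriptors named above can be illustrated with a simplified sketch. Note that the normative MPEG-7 definitions operate on a log-frequency (octave-band) power spectrum, so the linear-frequency formulas below are simplified stand-ins for illustration, not the standard's exact descriptors:

```python
import numpy as np

def spectral_features(frame, sr=16000):
    """Simplified spectral centroid, spread, and flatness for one audio
    frame (illustrative stand-ins for the MPEG-7 AudioSpectrumCentroid,
    AudioSpectrumSpread, and AudioSpectrumFlatness descriptors)."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    power = spectrum / (spectrum.sum() + 1e-12)
    # centroid: power-weighted mean frequency
    centroid = np.sum(freqs * power)
    # spread: standard deviation of the spectrum around the centroid
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * power))
    # flatness: geometric mean over arithmetic mean of the power spectrum
    # (near 1 for noise-like frames, near 0 for tonal frames)
    flatness = np.exp(np.mean(np.log(spectrum + 1e-12))) / (np.mean(spectrum) + 1e-12)
    return centroid, spread, flatness
```

A pure 1 kHz tone, for instance, yields a centroid near 1000 Hz with small spread and flatness, while broadband noise drives the flatness toward 1.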

     Different kinds of sounds have different properties in our everyday environment, and we can make sense of our surroundings by recognizing and understanding these properties. For example, when we hear a fire alarm, we can infer that a fire has broken out. The ability to classify and identify sounds would be a great help in monitoring the surrounding environment, especially for the hearing-impaired and for security systems. Moreover, a large amount of information is stored in audio file formats, so audio classification is also useful for retrieving the audio segments we want.
     In this thesis, we present a multi-class audio classifier based on independent component analysis (ICA), support vector machines (SVM), and MPEG-7 low-level audio descriptors. ICA is used to separate mixed sounds before classification; three MPEG-7 low-level audio descriptors, spectrum centroid, spectrum spread, and spectrum flatness, serve as the features for sound classification; and we propose a classification method that combines SVM with K-nearest neighbors (KNN). On an audio database of 677 WAV files in 15 classes, experiments demonstrate that the proposed sound classifier achieves an 89.3% classification rate.
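The separate-then-classify pipeline described above can be sketched end to end. This is a minimal illustration using scikit-learn's FastICA and SVC on synthetic signals; the toy time-domain features (mean absolute value, zero-crossing rate) are assumptions standing in for the MPEG-7 descriptors, not the thesis's actual features or data:

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.svm import SVC

# Two synthetic sources mixed into two observed channels, as stand-ins
# for two recordings of overlapping environmental sounds.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 4000)
s1 = np.sign(np.sin(2 * np.pi * 5 * t))        # square-wave source
s2 = np.sin(2 * np.pi * 13 * t)                # sine source
S = np.c_[s1, s2] + 0.02 * rng.normal(size=(4000, 2))
A = np.array([[1.0, 0.6], [0.4, 1.0]])         # unknown mixing matrix
X = S @ A.T                                    # observed mixtures

# Step 1: blind source separation with FastICA.
est = FastICA(n_components=2, random_state=0).fit_transform(X)

# Step 2: train an SVM on toy per-frame features of the known sources,
# then label each separated component frame by frame.
def toy_features(sig, frame=200):
    frames = sig[: len(sig) // frame * frame].reshape(-1, frame)
    mav = np.abs(frames).mean(axis=1)          # mean absolute value
    zcr = (np.diff(np.signbit(frames).astype(np.int8), axis=1) != 0).mean(axis=1)
    return np.c_[mav, zcr]                     # zero-crossing rate

Xf = np.vstack([toy_features(S[:, 0]), toy_features(S[:, 1])])
y = np.array([0] * 20 + [1] * 20)
clf = SVC(kernel="rbf").fit(Xf, y)
preds = clf.predict(np.vstack(
    [toy_features(c / np.abs(c).max()) for c in est.T]))
```

Because ICA recovers sources only up to scale and sign, the separated components are max-normalized before feature extraction so the scale-dependent features remain comparable to the training data.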

    1 Introduction
      1.1 Background
      1.2 Previous Work
      1.3 Thesis Objectives
      1.4 Thesis Organization
    2 Audio Classification System
      2.1 System Overview of Audio Classification
      2.2 Frame-Based and Grade-Based Audio Classification
    3 Algorithm for Multi-Sound Classification
      3.1 Feature Selection and Extraction
        3.1.1 Audio Spectrum Centroid
        3.1.2 Audio Spectrum Spread
        3.1.3 Audio Spectrum Flatness
      3.2 Multi-Sound Separation Based on Independent Component Analysis
        3.2.1 "Nongaussian" Is Independent
        3.2.2 Whitening
        3.2.3 Approximating Negentropy
      3.3 Algorithm for the Audio Classification System
        3.3.1 Introduction to Support Vector Machines
        3.3.2 Multi-Class Support Vector Machines
        3.3.3 Improved Grade-Based Multi-Class Classification Method
    4 Algorithm Performance Results and Discussion
      4.1 Performance Evaluation Using Different Features and Methods for Single-Sound Classification
      4.2 Multi-Sound Classification Using Independent Component Analysis and Grade-Based Support Vector Machine
    5 VLSI Design for the SVM Classifier
      5.1 System Overview
      5.2 Improved Bit-Level Inner Product
      5.3 Architecture of the Kernel Function
        5.3.1 CORDIC Division Mode
        5.3.2 Fast-CORDIC Hyperbolic Cosine/Sine Mode
    6 Conclusions
    References
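The kernel-function architecture in the VLSI chapter is built around CORDIC in division and hyperbolic cosine/sine modes; hyperbolic CORDIC evaluates cosh and sinh with only shift-and-add iterations, which makes it attractive in hardware (e.g., the exponential needed by an RBF kernel via e^z = cosh z + sinh z). Below is a floating-point software sketch of the standard hyperbolic rotation mode, not the thesis's fixed-point datapath; the repetition of iterations 4, 13, 40, ... is the usual convergence fix for the hyperbolic variant:

```python
import math

def cordic_hyperbolic(theta, n=16):
    """Rotation-mode hyperbolic CORDIC: returns (~cosh(theta), ~sinh(theta))
    for |theta| < ~1.11, using only shift-add-style updates."""
    # Iteration indices start at 1; indices 4, 13, 40, ... are repeated
    # once each so the angle sequence converges.
    idx, k, repeat = [], 1, 4
    while len(idx) < n:
        idx.append(k)
        if k == repeat:
            idx.append(k)              # repeat this index once
            repeat = 3 * repeat + 1    # 4 -> 13 -> 40 -> ...
        k += 1
    # Aggregate gain of the hyperbolic micro-rotations, pre-compensated
    # by starting from x = 1/K.
    K = math.prod(math.sqrt(1 - 2.0 ** (-2 * i)) for i in idx)
    x, y, z = 1.0 / K, 0.0, theta
    for i in idx:
        d = 1.0 if z >= 0 else -1.0    # drive the residual angle to zero
        x, y, z = (x + d * y * 2.0 ** (-i),
                   y + d * x * 2.0 ** (-i),
                   z - d * math.atanh(2.0 ** (-i)))
    return x, y
```

In hardware the multiplications by 2^-i become barrel shifts and the atanh constants come from a small lookup table, so each iteration reduces to adds and shifts.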

    Full text publicly available on campus: 2006-08-29
    Full text publicly available off campus: 2006-08-29