Graduate student: | 林財貝 Lin, Cai-Bei |
---|---|
Thesis title: | 應用機率型SVMs與ICA於以內容為基礎音訊分類之研究 A Study on Content-based Audio Classification Using Probabilistic SVMs and ICA |
Advisor: | 王駿發 Wang, Jhing-Fa |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2006 |
Academic year of graduation: | 94 (ROC calendar) |
Language: | English |
Number of pages: | 51 |
Keywords (Chinese): | 聲音分類、支援向量機、獨立內容分析 |
Keywords (English): | independent component analysis, support vector machine, audio classification |
Every sound in our living environment has its own distinctive properties, and we can often identify a sound, and judge what is happening around us, from those properties. A fire alarm, for example, tells us that a fire has probably broken out. Classifying and identifying such sounds would be a great help in monitoring the surrounding environment, especially for hearing-impaired people and automatic security systems. Audio is also an indispensable part of today's computer and multimedia applications, and a large amount of information is recorded in audio file formats. Audio classification therefore also helps in searching for a wanted audio segment.
In this thesis, we present a home environmental audio classifier based on independent component analysis (ICA) and support vector machines (SVMs). We use ICA for feature extraction; it extracts independent components based on the statistical characteristics of the data. The audio features fall into three sets. The first set consists of perceptual features: total spectrum power, subband powers, brightness, bandwidth, and pitch. The second set consists of Mel-frequency cepstral coefficients (MFCC) and delta MFCC. The third set is the ICA-transformed MFCC feature, obtained by applying a transform matrix to the MFCC coefficients; this matrix is estimated iteratively by ICA from the MFCC coefficients of all the training audio. The classifier is built with probabilistic SVMs. We collected an audio database of 649 WAV files in 15 classes. On this database, the proposed classifier achieves a classification rate of up to 97.52%.
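As a rough illustration of the first feature set, the sketch below computes three of the perceptual features for a single frame. The abstract does not give the formulas, so the definitions here are assumptions based on common usage: total spectrum power as the sum of the power spectrum (no log scaling), brightness as the power-weighted frequency centroid, and bandwidth as the power-weighted standard deviation of frequency around that centroid. The function names, the frame length, and the test tone are hypothetical and are not taken from the thesis.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum |F(k)|^2 for bins 0..N/2 (pure Python, O(N^2))."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def perceptual_features(frame, sample_rate):
    """Return (total_power, brightness_hz, bandwidth_hz) for one frame.

    Assumed definitions (not confirmed by the thesis abstract):
      total_power = sum of the power spectrum
      brightness  = power-weighted mean frequency (spectral centroid)
      bandwidth   = power-weighted std. deviation around the centroid
    """
    n = len(frame)
    power = power_spectrum(frame)
    freqs = [k * sample_rate / n for k in range(len(power))]
    total = sum(power)
    if total == 0.0:
        # A silent frame has no defined centroid; report zeros.
        return 0.0, 0.0, 0.0
    brightness = sum(f * p for f, p in zip(freqs, power)) / total
    variance = sum((f - brightness) ** 2 * p for f, p in zip(freqs, power)) / total
    return total, brightness, math.sqrt(variance)


# Hypothetical test signal: a 1 kHz sine sampled at 8 kHz, 64 samples,
# so the tone falls exactly on DFT bin 8.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
total, brightness, bandwidth = perceptual_features(tone, sample_rate)
print(total, brightness, bandwidth)
```

For a pure tone that sits on a DFT bin, the brightness lands on the tone frequency and the bandwidth is close to zero; broadband sounds give a larger bandwidth. A practical system would use an FFT and windowed frames rather than this naive transform.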