
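Graduate student: 王昱敦
Wang, Yu-Don
Thesis title: 一個使用數位餘弦轉換直流係數差值之強健型聲紋系統
A Robust Audio Fingerprinting System Using DCT Direct Coefficients Difference
Advisor: 何裕琨
Ho, Yu-Kun
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2010
Graduation academic year: 98
Language: Chinese
Number of pages: 44
Chinese keywords: 聲紋 (audio fingerprint), 基於內容搜尋 (content-based search), 數位餘弦轉換 (discrete cosine transform)
English keywords: audio fingerprint, content-based, DCT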
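  • Traditional data retrieval has relied mostly on text indexing: keywords are entered to search a database for related records. With the arrival of the multimedia era, content-based retrieval is a reasonable and more effective way to search multimedia data. An audio fingerprint serves as a digest of a segment of audio, much like a human fingerprint: the fingerprint of an audio snippet is used to identify the original song it belongs to, which is a form of content-based multimedia search.
    This thesis proposes a robust audio fingerprinting system that uses differences between DCT direct (DC) coefficients. Working in the frequency domain of the audio, it first applies the Discrete Cosine Transform (DCT) to convert the frequency-domain information into DCT coefficients. Because most of the energy is concentrated in the first few coefficients, the fingerprint is encoded by comparing differences between DC coefficients. The resulting fingerprint is compact, robust, and easy to match in audio search. For search and matching, the database's sub-fingerprints are stored in a hash table and the hits for all sub-fingerprints are counted. The fingerprint blocks of the three songs with the most hits are then compared, and the song with the lowest bit error rate (BER), provided that it falls below the BER threshold set in this thesis, is returned as the recognition result.
    Validation shows that, compared with previous work, the proposed fingerprinting system produces more compact fingerprint data and requires less data to be compared, which reduces matching time. Regarding robustness, the experimental results show that the fingerprint retains high accuracy after a small amount of noise is added.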

    Database search has traditionally relied on text indexing: the user enters keywords to find related records. With the arrival of the multimedia era, content-based search is a more reasonable and effective way to query multimedia databases. An audio fingerprint is a digest of a short segment of an audio signal. Like a human fingerprint, it can be used to identify the song from which a snippet was taken, so the fingerprint of an audio snippet can serve as the query in a multimedia database search.
    We propose “A Robust Audio Fingerprinting System Using DCT Direct Coefficients Difference”. The system first applies the Discrete Cosine Transform (DCT) to the frequency-domain representation of the audio to obtain DCT coefficients. Because the DCT compacts most of the energy into the first few coefficients, the fingerprint is encoded from the differences between direct (DC) coefficients. The resulting fingerprint is compact and robust, and it supports accurate recognition.
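    The extraction step described above can be illustrated with a minimal sketch. The abstract does not give the framing, band layout, or bit-encoding rule, so everything specific here is an assumption made for illustration: the input is taken to be one frame's magnitude spectrum, the spectrum is split into equal-width bands, and a bit is set when a band's DC coefficient exceeds that of the next band. The names `dct_ii`, `dc_coefficients`, `sub_fingerprint`, and `num_bands` are hypothetical and do not come from the thesis.

```python
import math


def dct_ii(x):
    """Unnormalised DCT-II of a sequence (textbook definition)."""
    n = len(x)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k) for i, v in enumerate(x))
        for k in range(n)
    ]


def dc_coefficients(spectrum, num_bands):
    """Return the DC (k = 0) DCT coefficient of each band of one frame.

    Assumption: bands are equal-width slices of the magnitude spectrum;
    samples left over after the last full band are ignored.
    """
    width = len(spectrum) // num_bands
    return [
        dct_ii(spectrum[b * width:(b + 1) * width])[0]
        for b in range(num_bands)
    ]


def sub_fingerprint(spectrum, num_bands):
    """Encode one frame as (num_bands - 1) bits packed into an int.

    Assumed rule: bit b is 1 when band b's DC coefficient is larger
    than band b + 1's, otherwise 0.
    """
    dc = dc_coefficients(spectrum, num_bands)
    bits = 0
    for b in range(num_bands - 1):
        bits = (bits << 1) | (1 if dc[b] - dc[b + 1] > 0 else 0)
    return bits
```

    Because only the sign of each difference is kept, a uniform gain change applied to the whole frame leaves the bits unchanged, which is one plausible reason a difference-based encoding tolerates mild distortion.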
    We use hashing in the database search to reduce search time. Experiments verify that our fingerprinting system produces less fingerprint data and therefore needs less comparison time during recognition. The experimental results also show that the fingerprint remains highly accurate after white noise is added. The time interval between two sub-fingerprints is set to the maximum value within the acceptable range. The system handles the time alignment problem, and its recognition rate is relatively high.
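    The matching stage outlined in this record (hash-table lookup of sub-fingerprints, hit counting, comparison of the three most-hit songs, and a bit error rate threshold) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis implementation: the database is assumed to be a dictionary from song identifier to a list of integer sub-fingerprints, the block comparison is assumed to slide the query over each candidate and keep the lowest BER, and `bits` and `threshold` are caller-supplied values because the record does not state them. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(songs):
    """Hash table: sub-fingerprint value -> set of song ids containing it."""
    index = defaultdict(set)
    for song_id, fingerprints in songs.items():
        for fp in fingerprints:
            index[fp].add(song_id)
    return index


def bit_error_rate(a, b, bits):
    """Fraction of differing bits between two equal-length lists of ints."""
    errors = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return errors / (len(a) * bits)


def best_block_ber(query, candidate, bits):
    """Lowest BER of the query against every same-length window (assumed alignment)."""
    n = len(query)
    return min(
        bit_error_rate(query, candidate[i:i + n], bits)
        for i in range(len(candidate) - n + 1)
    )


def identify(query, songs, index, bits, threshold, top_n=3):
    """Return (song_id, ber) for the best match, or (None, ber) if above threshold."""
    hits = Counter()
    for fp in query:
        for song_id in index.get(fp, ()):
            hits[song_id] += 1  # one hit per query sub-fingerprint per song

    best_song, best_ber = None, None
    for song_id, _ in hits.most_common(top_n):
        if len(songs[song_id]) < len(query):
            continue  # candidate too short to hold the query block
        ber = best_block_ber(query, songs[song_id], bits)
        if best_ber is None or ber < best_ber:
            best_song, best_ber = song_id, ber

    if best_song is not None and best_ber < threshold:
        return best_song, best_ber
    return None, best_ber
```

    Restricting the BER comparison to the few songs with the most hash hits keeps the expensive bitwise comparison off the bulk of the database, which matches the stated goal of reducing search time.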

    Table of Contents
    Chapter 1 Introduction
    Chapter 2 Background
    2.1 Contents of PCM audio files
    2.2.1 Fourier transform
    2.2.2 Short-time Fourier transform
    2.2.3 Discrete cosine transform
    2.3 Hamming similarity and Hamming distance
    2.4 Audio fingerprint recognition systems
    2.4.1 A highly robust audio fingerprinting system
    2.4.2 Content-based fingerprinting using the wavelet transform
    2.4.3 A DCT audio fingerprinting system based on multiple hash tables
    Chapter 3 A Robust Audio Fingerprinting System Using DCT Direct Coefficients Difference
    3.1 Fingerprint extraction process
    3.2 Fingerprint matching with a hash table and result decision
    Chapter 4 Experimental Results
    4.1 Experimental equipment
    4.2 Bit error rate results after adding white noise
    4.3 Recognition results
    4.3.1 Recognition results with added white noise
    4.3.2 Recognition results with real recordings
    4.4 Comparison of this thesis with other work
    Chapter 5 Conclusions and Future Work


    Full-text availability: on campus from 2012-09-01
    Off campus from 2013-09-01