
Graduate student: 溫立全
Wen, Li-Chang
Thesis title: 多國語言語音文句檢索之部份比對演算法研究與可程式化系統單晶片設計
Partial Matching Algorithms and SOPC Design for Multi-Language Spoken Sentences Retrieval
Advisor: 王駿發
Wang, Jhing-Fa
Degree: Master
College and department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2004
Academic year of graduation: 92 (ROC calendar, 2003-2004)
Language: English
Number of pages: 59
Chinese keywords: 語音文句檢索, 部分比對, 語言獨立
English keywords: language independent, partial matching, spoken sentence retrieval
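  • This thesis proposes two new partial matching algorithms for spoken sentence retrieval and implements them on a personal digital assistant (PDA) and a system-on-a-programmable-chip (SOPC) development board.
    The algorithms first segment the query sentence and each database sentence into matching units of equal length and then form a matching plane. A matching plane contains several matching blocks; each matching block yields a local distortion between the query and database sentences, and the matching plane as a whole yields the global distortion. The global distortion is obtained by accumulating the local distortions in one of two ways: whole-plane accumulation or row-and-column accumulation. Because the algorithms compare speech features directly and need neither acoustic models nor language models, the resulting system is language independent.
    After evaluating the algorithms, we applied them to a spoken sentence retrieval system and implemented it on an HP iPAQ Pocket PC and on an SOPC development board.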

    This study presents two new partial matching algorithms for spoken sentence retrieval and implements them on a PDA and an ARM-based SOPC development board.
    In the proposed algorithms, the query and database sentences are first segmented into equal-size matching units. A matching plane consisting of matching blocks is then created, and a local similarity score is calculated for each matching block from the feature distance. The global similarity score of the matching plane indicates how similar the query and database sentences are. Two schemes are used to obtain the global similarity score: a whole-matching-plane accumulation scheme and a column-based row-based accumulation scheme. To improve the accuracy of the similarity estimate, the similarity score is calculated with the inverse distance weighting (IDW) technique. The proposed algorithms compare sentences at the feature level and require neither acoustic nor language models, so the proposed spoken sentence retrieval system is language independent. In terms of retrieval performance, the experiments also show that the proposed system outperforms a system that uses IBM ViaVoice, a large-vocabulary continuous-speech recognition (LVCSR) system.
    After evaluating the proposed algorithms, we implement the spoken sentence retrieval system on an HP iPAQ Pocket PC and build a hardware/software co-design version on an ARM-based SOPC development board for use in portable speech systems.
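    The whole-matching-plane scheme summarized above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the formulas, so the per-frame Euclidean distance, the IDW form 1 / (1 + d)^p, the averaging over all matching blocks, and every function name here are assumptions chosen for illustration only.

```python
import math


def segment(features, unit_size):
    """Split a list of feature vectors into equal-size matching units.

    Trailing frames that do not fill a whole unit are dropped (assumption).
    """
    count = len(features) // unit_size
    return [features[i * unit_size:(i + 1) * unit_size] for i in range(count)]


def unit_distance(unit_a, unit_b):
    """Mean Euclidean distance between corresponding frames (assumed measure)."""
    return sum(math.dist(x, y) for x, y in zip(unit_a, unit_b)) / len(unit_a)


def idw_similarity(distance, power=2.0):
    """Inverse distance weighting: a smaller distance gives a larger score.

    The form 1 / (1 + d)^p is an assumed IDW function, not taken from the thesis.
    """
    return 1.0 / (1.0 + distance) ** power


def matching_plane(query, sentence, unit_size):
    """Build the matching plane: one local similarity score per matching block."""
    q_units = segment(query, unit_size)
    s_units = segment(sentence, unit_size)
    return [[idw_similarity(unit_distance(q, s)) for s in s_units] for q in q_units]


def global_similarity(plane):
    """Whole-plane accumulation: average of all local scores (assumed normalization)."""
    blocks = [score for row in plane for score in row]
    return sum(blocks) / len(blocks) if blocks else 0.0


def retrieve(query, database, unit_size=2):
    """Return the name of the database sentence with the highest global score."""
    scores = {
        name: global_similarity(matching_plane(query, feats, unit_size))
        for name, feats in database.items()
    }
    return max(scores, key=scores.get)


# Toy two-dimensional "feature vectors"; real systems would use speech features.
query = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
database = {
    "near": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)],
    "far": [(10.0, 10.0), (11.0, 11.0), (12.0, 12.0), (13.0, 13.0)],
}
best = retrieve(query, database)
```

    Because the score depends only on distances between feature vectors, nothing in the sketch is tied to a particular language, which is the property the abstract emphasizes. The column-based row-based scheme would replace `global_similarity` with a different accumulation rule whose details the abstract does not state.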

    Chinese Abstract
    Abstract
    Acknowledgement
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Background and Previous Works
        1.2 Motivation
        1.3 Proposed Methodologies and System Architecture for Spoken Sentence Retrieval
        1.4 Thesis Organization
    Chapter 2 Partial Matching Algorithms for Spoken Sentence Retrieval
        2.1 Sentence Matching for Spoken Sentence Retrieval
        2.2 Semantic-Level Partial Matching Spoken Sentence Retrieval
        2.3 Feature-Level Partial Matching Spoken Sentence Retrieval
            2.3.1 Whole-Matching-Plane Based Algorithm
            2.3.2 Column-Based Row-Based Algorithm
                2.3.2.1 Basic Concept
                2.3.2.2 Consideration for Second Similar FPUs
                2.3.2.3 Consideration for Row-Based Matching
                2.3.2.4 CBRB Algorithm
    Chapter 3 Algorithms Performance Evaluations
        3.1 Experimental Environments
        3.2 Sentence Retrieval Using Whole-Matching-Plane Based Algorithm with Various Matching Unit Sizes and IDW Functions
            3.2.1 Mandarin Sentence Retrieval
            3.2.2 English Sentence Retrieval
            3.2.3 Taiwanese Sentence Retrieval
        3.3 Sentence Retrieval Using Column-Based Row-Based Algorithm with Fixed Parameter Settings and Various Matching Unit Sizes and Inverse Distance Weighting Functions
            3.3.1 Mandarin Sentence Retrieval
            3.3.2 English Sentence Retrieval
            3.3.3 Taiwanese Sentence Retrieval
        3.4 Sentence Retrieval with Various Parameter Settings Using Column-Based Row-Based Algorithm
            3.4.1 Mandarin Sentence Retrieval
            3.4.2 English Sentence Retrieval
            3.4.3 Taiwanese Sentence Retrieval
        3.5 Multi-Language Sentence Retrieval
            3.5.1 Multi-Language Sentence Retrieval Using the WMPB Algorithm
            3.5.2 Multi-Language Sentence Retrieval Using the CBRB Algorithm
        3.6 Spoken Sentence Retrieval Based on Acoustic and Language Models
    Chapter 4 An ARM-Based SOPC Hardware Design for CBRB Algorithm
        4.1 Hardware/Software Partitioning
        4.2 CBRB Hardware Architecture
            4.2.1 DTW Overview
            4.2.2 Memory Allocation
            4.2.3 Distortion Unit
            4.2.4 Processing Element Unit
            4.2.5 Accumulation Unit
            4.2.6 Design of Finite State Machine Controller
        4.3 Simulations and Verifications
            4.3.1 Distortion Unit Simulation
            4.3.2 PE Unit Simulation
            4.3.3 Accumulation Unit Simulation
    Chapter 5 Conclusions
    Reference


    Full-text availability: on campus, immediately; off campus, from 2004-08-31.