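| Graduate student: | 溫立全 Wen, Li-Chang |
|---|---|
| Thesis title: | 多國語言語音文句檢索之部份比對演算法研究與可程式化系統單晶片設計 Partial Matching Algorithms and SOPC Design for Multi-Language Spoken Sentences Retrieval |
| Advisor: | 王駿發 Wang, Jhing-Fa |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2004 |
| Academic year of graduation: | 92 (ROC calendar) |
| Language: | English |
| Number of pages: | 59 |
| Chinese keywords: | 語音文句檢索 (spoken sentence retrieval), 部分比對 (partial matching), 語言獨立 (language independence) |
| English keywords: | Language independent, partial matching, spoken sentence retrieval |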
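This thesis proposes two new partial matching algorithms for spoken sentence retrieval and implements them on a personal digital assistant (PDA) and on a system-on-a-programmable-chip (SOPC) development board.
The algorithms first segment the query sentence and each database sentence into matching units of equal length, which together form a matching plane. A matching plane contains several matching blocks. Each matching block yields a local distortion between the query and database sentences, and the matching plane as a whole yields the global distortion. The global distortion is computed by accumulating the local distortions, using either whole-plane accumulation or row-and-column accumulation. Because the algorithms compare speech features directly and require neither acoustic models nor language models, the resulting system is language independent.
After evaluating the algorithms, we applied them in a spoken sentence retrieval system and implemented it on an HP iPAQ Pocket PC and on an SOPC development board.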
This study presents two new partial matching algorithms for spoken sentence retrieval and implements them on a PDA and an ARM-based SOPC development board.
In the proposed algorithms, the query and database sentences are first segmented into equal-size matching units, and a matching plane consisting of matching blocks is created. For each matching block, a local similarity score is calculated from the feature distance. The global similarity score of the matching plane, which indicates how similar the query and database sentences are, is then obtained with either a whole-matching-plane accumulation scheme or a column-based and row-based accumulation scheme. To improve the accuracy of the similarity estimate, the similarity score is calculated with the inverse distance weighting (IDW) technique. Because the proposed algorithms compare sentences at the feature level and require neither acoustic nor language models, the resulting spoken sentence retrieval system is language independent. The experiments also show that the proposed system outperforms, in retrieval performance, a system built on IBM ViaVoice, a large-vocabulary continuous-speech recognition (LVCSR) system.
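The matching-plane idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract gives no formulas, so the block distance (mean Euclidean distance between frames), the IDW-style score `1 / (1 + d**power)`, and the max-based row/column accumulation are all assumed here, and every function name is hypothetical.

```python
import math

def split_units(frames, unit_len):
    """Cut a list of feature frames into equal-size matching units.
    Trailing frames that do not fill a whole unit are dropped (assumption)."""
    count = len(frames) // unit_len
    return [frames[i * unit_len:(i + 1) * unit_len] for i in range(count)]

def block_distance(unit_a, unit_b):
    """Mean Euclidean distance between corresponding frames (assumed metric)."""
    return sum(math.dist(a, b) for a, b in zip(unit_a, unit_b)) / len(unit_a)

def matching_plane(query, entry, unit_len, power=2.0):
    """Local similarity of every (query unit, database unit) matching block.
    The score 1 / (1 + d**power) is an assumed inverse-distance form."""
    q_units = split_units(query, unit_len)
    d_units = split_units(entry, unit_len)
    return [[1.0 / (1.0 + block_distance(q, d) ** power) for d in d_units]
            for q in q_units]

def whole_plane_score(plane):
    """Whole-plane accumulation: average over all matching blocks."""
    blocks = [score for row in plane for score in row]
    return sum(blocks) / len(blocks)

def row_column_score(plane):
    """Row/column accumulation (assumed form): average the best block of
    each row and of each column, then average the two results."""
    row_best = [max(row) for row in plane]
    col_best = [max(col) for col in zip(*plane)]
    return 0.5 * (sum(row_best) / len(row_best) + sum(col_best) / len(col_best))

def retrieve(query, database, unit_len, score=row_column_score):
    """Rank database sentence names by global similarity, best first.
    Assumes every sentence holds at least one full matching unit."""
    scored = [(score(matching_plane(query, frames, unit_len)), name)
              for name, frames in database.items()]
    return [name for _, name in sorted(scored, reverse=True)]

# Toy 2-D "feature frames"; real features would come from a speech front end.
query = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
database = {
    "contains_query": [(9.0, 9.0), (9.0, 9.0)] + query,
    "unrelated": [(9.0, 9.0)] * 6,
}
print(retrieve(query, database, unit_len=2))
```

Because scoring works directly on feature frames, nothing in the sketch depends on a particular language, which mirrors the language-independence argument in the abstract. The row/column variant rewards a database sentence that contains the query as a sub-span, since each query unit only needs one good partner block.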
After evaluating the proposed algorithms, we implemented the spoken sentence retrieval system on an HP iPAQ Pocket PC and built a hardware/software co-design version on an ARM-based SOPC development board for use in various portable speech systems.