| Graduate student: | 韓志杰 Han, Zhi-Jie |
|---|---|
| Thesis title: | 設計與實作一適用於網頁瀏覽之語音助理 Design and Implementation of a Voice Assistant for Web Browsing |
| Advisor: | 鄧維光 Teng, Wei-Guang |
| Degree: | 碩士 Master |
| Department: | 工學院 - 工程科學系碩士在職專班 Department of Engineering Science (on the job class) |
| Year of publication: | 2015 |
| Academic year of graduation: | 103 |
| Language: | Chinese |
| Number of pages: | 47 |
| Keywords (Chinese): | 人機互動、自然使用者介面、語音助理 |
| Keywords (English): | human-computer interaction, natural user interface, voice assistant |
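In daily life, people have become accustomed to relying on information systems to complete many tasks conveniently. The traditional computer interface (the WIMP interface) comprises windows, icons, menus, and pointers, all familiar means of interaction. Recent research in human-computer interaction, however, proposes making fuller use of the human senses (sight, hearing, touch, and so on) as the channel for communicating with and controlling machines, establishing a new design paradigm. To build a natural user interface that can be used intuitively, this study explores how speech can assist the process of web browsing. Specifically, the user gives spoken commands to a virtual voice assistant, which controls the web browser on the user's behalf to switch views or execute functions. This removes most of the operations that would otherwise require a keyboard and mouse, which can improve the efficiency of general users and make the information environment more approachable for particular groups (for example, senior citizens unfamiliar with conventional computer operation and people with disabilities). In addition, the voice assistant developed in this study needs only a web browser to run, so users bear no cost for additional software or hardware; because it leaves the user's existing online environment unchanged, it better fits the needs of the general public.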
Nowadays, many people are used to completing their daily tasks with the help of information technology. Over the years, the WIMP user interface (windows, icons, menus, and pointing devices) has become generally accepted by computer users. Nevertheless, recent research in human-computer interaction (HCI) indicates that the human senses can be exploited further to build a natural user interface (NUI). Among the alternatives, speech is one of the most common ways for people to communicate with each other. We thus propose a voice assistant (i.e., a software agent) that supports web browsing. Specifically, a user can control the web browser, scrolling the current web page or performing functions, by giving spoken commands to the voice assistant. The proposed voice assistant can help general users, the elderly, and people with disabilities. Furthermore, no software other than a web browser needs to be installed to use our voice assistant.
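As a rough illustration of the command flow described in the abstract, the following minimal sketch maps a recognized utterance to a browser action. It is not the thesis implementation: the command phrases, the `SCROLL_STEP` value, and the names `parseCommand`, `execute`, and `BrowserLike` are all hypothetical. The sketch assumes the transcript comes from some speech recognizer (in a browser, for instance, the Web Speech API) and keeps recognition itself out of scope.

```typescript
// Hypothetical command vocabulary; the abstract does not list the actual commands.
type Action =
  | { kind: "scroll"; dy: number }
  | { kind: "back" }
  | { kind: "unknown"; heard: string };

// Assumed scroll distance in pixels (illustrative value only).
const SCROLL_STEP = 400;

// Map a recognized transcript to a browser action.
function parseCommand(transcript: string): Action {
  const text = transcript.trim().toLowerCase();
  if (text === "scroll down") return { kind: "scroll", dy: SCROLL_STEP };
  if (text === "scroll up") return { kind: "scroll", dy: -SCROLL_STEP };
  if (text === "go back") return { kind: "back" };
  return { kind: "unknown", heard: text };
}

// Minimal surface of the browser that the sketch needs. In a real page,
// window.scrollBy and history.back could be adapted to this interface.
interface BrowserLike {
  scrollBy(x: number, y: number): void;
  back(): void;
}

// Carry out an action; returns false when the command was not understood.
function execute(action: Action, browser: BrowserLike): boolean {
  switch (action.kind) {
    case "scroll":
      browser.scrollBy(0, action.dy);
      return true;
    case "back":
      browser.back();
      return true;
    default:
      return false;
  }
}
```

Separating parsing from execution keeps the command vocabulary testable without a microphone or a live page; only the thin `BrowserLike` adapter would touch the actual browser.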