| Graduate student: | 林煜翔 Lin, Yu-Hsiang |
|---|---|
| Thesis title: | 家庭關懷機器人友善人機介面與對話管理及多媒體回饋系統 Friendly Human-Machine Interaction with Dialogue Management and Multimedia Feedback System on Home Care Robot |
| Advisor: | 王駿發 Wang, Jhing-Fa |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of publication: | 2017 |
| Graduation academic year: | 105 |
| Language: | English |
| Number of pages: | 55 |
| Keywords (Chinese): | 居家關懷、口語對話理解、對話管理、使用者介面 |
| Keywords (English): | home care, spoken language understanding, dialogue management, user interface |
In this thesis, we propose a system that combines a friendly human-machine interface, dialogue management, and multimedia feedback, and we apply it to a home care robot. The system interacts with users, provides information and entertainment, and aims for a better user experience that feels friendly and approachable. It is divided into four parts: dialogue management, multimedia applications, robot reaction in noisy environments, and a friendly user interface. Dialogue management analyzes the user's utterance, identifies the user's current intent, and makes the most appropriate response. Within dialogue management, we apply Spoken Language Understanding (SLU) to analyze the user's intent and extract the information contained in the sentence (for example, a mentioned region or location), together with fast command detection for command-type utterances. The analysis results update the system's dialogue state through probabilistic rules, and the system generates the task with the highest reward. These tasks are processed by the interface controller and dispatched to the corresponding multimedia application; the multimedia applications include audio and video entertainment, information queries, and several other services. To improve conversation quality during human-robot interaction, we use sound source localization to locate the speaker and rotate the robot to face the user. In addition, when the robot detects that it is in a noisy environment, it switches to a noise reaction corpus so that noise does not cause the system to make wrong decisions. The user interface is the most direct point of interaction with the user. We therefore improve the operating flow, add a touch mode of operation, and process system tasks in parallel, making the system more fluid and friendlier to operate.
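The dialogue state update and highest-reward task selection described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the intent names, task names, reward values, and the multiply-and-normalize update rule are all assumptions made for the example, and the `ask_repeat` task is a hypothetical stand-in for the noise reaction behavior.

```python
# Hypothetical reward table: REWARD[task][intent] is the payoff of running
# `task` when the user's true intent is `intent`. Values are illustrative.
REWARD = {
    "play_music":   {"play_music": 1.0,  "ask_weather": -1.0},
    "show_weather": {"play_music": -1.0, "ask_weather": 1.0},
    "ask_repeat":   {"play_music": 0.2,  "ask_weather": 0.2},
}


def update_belief(prior, slu_scores):
    """Update the dialogue state (a distribution over intents).

    Assumed rule: posterior is proportional to prior times the SLU
    confidence for each intent, followed by normalization.
    """
    unnormalized = {
        intent: prior[intent] * slu_scores.get(intent, 1e-6)
        for intent in prior
    }
    total = sum(unnormalized.values())
    return {intent: value / total for intent, value in unnormalized.items()}


def best_task(belief):
    """Return the task with the highest expected reward under the belief."""
    def expected_reward(task):
        return sum(belief[intent] * REWARD[task][intent] for intent in belief)
    return max(REWARD, key=expected_reward)


prior = {"play_music": 0.5, "ask_weather": 0.5}

# A confident SLU result versus an ambiguous one (e.g. a noisy room).
clear = update_belief(prior, {"play_music": 0.9, "ask_weather": 0.1})
noisy = update_belief(prior, {"play_music": 0.55, "ask_weather": 0.45})

print(best_task(clear))
print(best_task(noisy))
```

With these illustrative numbers, a confident SLU result leads to the matching task, while an ambiguous result makes the low-risk clarification task the better choice in expectation.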