Graduate student: | 楊智傑 Yang, Zhi-Jie |
---|---|
Thesis title: | 用於災難事件管理之線上訊息分群技術 Clustering Online Event-based Messages for Disaster Management |
Advisors: | 侯廷偉 Hou, Ting-Wei; 鄧維光 Teng, Wei-Guang |
Degree: | Master |
Department: | College of Engineering - Department of Engineering Science |
Year of publication: | 2020 |
Academic year of graduation: | 108 |
Language: | Chinese |
Number of pages: | 38 |
Chinese keywords: | 災難管理、災難偵測、訊息分群、社群媒體 |
English keywords: | disaster management, disaster detection, messages clustering, social media |
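Natural and man-made disasters, such as major traffic accidents, large fires, and even the COVID-19 outbreak that began at the end of 2019, have caused severe casualties and economic losses. For disaster management agencies, keeping track of the latest and accurate status of disaster events is therefore one of the most important tasks. Social media is one source of disaster messages; this study searches social media messages and applies data classification techniques to find disaster messages more effectively. The study uses the SVM technique in LibShortText to classify short texts. First, to improve the accuracy of the SVM classifier using the disaster-related social media messages collected so far, the study designs experiments to raise the classifier model's accuracy on disaster messages. The results show that the average accuracy of the improved SVM classifier model is 24.4% higher than that of the original model. Second, the study proposes a clustering method for historical disaster messages: messages are converted into vectors with Doc2vec, a clustering algorithm merges similar document vectors, and a post-processing step produces the final result. Experimental results show that, after a disaster has occurred, the proposed clustering method can help disaster management agencies comprehensively review historical disaster events.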
Natural and man-made disasters, such as traffic accidents, fires, and even COVID-19, have caused serious casualties and economic losses, so disaster management is increasingly important. Disaster relief organizations need the latest and most accurate information on disaster events. With the spread of social media in recent years, platforms such as Twitter, Facebook, and PTT allow users to quickly and conveniently share real-time information, including photos and videos, with friends, relatives, and communities. Social media has therefore become a source of disaster information. This research proposes a scheme that combines disaster message detection with disaster message clustering, demonstrated on a web-based system that displays the disaster information. First, experiments were performed to improve the original SVM classifier model used for disaster information detection, and the model was tuned according to the experimental results; the precision of disaster message detection increased by 24.4%. Second, a clustering method for historical disaster information is proposed: short texts are transformed into vectors by Doc2vec, similar document vectors are grouped by clustering algorithms, and the final clustering result (disaster events) is determined in a post-processing step. Experimental results show that the clustered disaster messages can help disaster management organizations review historical disaster events after a disaster.
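The clustering stage described in the abstract (document vectors, then clustering, then post-processing) can be illustrated with a minimal sketch. It is not the thesis's implementation. The abstract does not name the clustering algorithm or the post-processing rule, so the sketch assumes a greedy cosine-similarity threshold clustering and a hypothetical minimum-cluster-size filter. The toy 3-dimensional vectors stand in for Doc2vec output, and the names `cluster_vectors`, `post_process`, `threshold`, and `min_size` are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_vectors(vectors, threshold=0.8):
    """Greedy single-pass clustering (an assumed stand-in algorithm).

    Each vector joins the cluster whose centroid is most similar to it,
    provided the similarity reaches `threshold`; otherwise it starts a
    new cluster. Returns a list of clusters as lists of vector indices.
    """
    clusters = []  # each entry: {"members": [indices], "centroid": [floats]}
    for idx, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"members": [idx], "centroid": list(vec)})
        else:
            n = len(best["members"])
            # Update the running mean of the cluster centroid.
            best["centroid"] = [
                (c * n + x) / (n + 1) for c, x in zip(best["centroid"], vec)
            ]
            best["members"].append(idx)
    return [cluster["members"] for cluster in clusters]


def post_process(clusters, min_size=2):
    """Hypothetical post-processing: keep clusters with at least `min_size` messages."""
    return [members for members in clusters if len(members) >= min_size]


# Toy "document vectors" (placeholders for Doc2vec output, not real data).
vectors = [
    [1.0, 0.1, 0.0],  # message 0
    [0.9, 0.2, 0.0],  # message 1
    [0.0, 1.0, 0.1],  # message 2
    [0.1, 0.9, 0.0],  # message 3
    [0.0, 0.0, 1.0],  # message 4
]

events = post_process(cluster_vectors(vectors))
print(events)
```

In this toy run, messages 0 and 1 form one candidate event, messages 2 and 3 form another, and the isolated message 4 is dropped by the size filter. In practice the vectors would come from a trained Doc2vec model, and both the threshold and the filter would need tuning on real messages.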