
Graduate student: 謝欣蓉
Hsieh, Hsin-Jung
Thesis title: 應用機器學習於警政輿情蒐報之研究
A Study on Using Machine Learning to Identify Relevant Social Messages for Supporting Police Affairs
Advisors: 侯廷偉
Hou, Ting-Wei
鄧維光
Teng, Wei-Guang
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2025
Academic year of graduation: 113 (ROC calendar)
Language: Chinese
Number of pages: 62
Keywords (Chinese): 警政輿情, 輿情偵測, 文本分類, 社群媒體, 機器學習
Keywords (English): police-related social messages, social messages detection, text classification, social media, machine learning
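  • With the widespread use of social media and smartphones, the public increasingly shares nearby events and opinions online. After an incident occurs, people may upload the scene to social media before reporting it to the police, leaving police agencies unable to respond in time to the ensuing spread of public opinion and increasingly passive in the face of news coverage. Police agencies have therefore begun to pay attention to collecting and reporting public opinion. The current practice across counties and cities relies on manual identification, which is costly and suffers from timeliness problems. This study collects social media messages with a web crawler, builds a dataset through manual labeling, and trains models using a traditional machine learning method, SVM, and a deep learning method, ALBERT, with the aim of finding police-related public opinion messages effectively and promptly. Four experiments were designed to examine the effect of data from different time periods on classification stability, the models' ability to classify heterogeneous data, and the improvement in classification performance from mixed-data training. The results show that both SVM and ALBERT handle imbalanced data stably, while ALBERT shows better ability to identify police-related public opinion. This study verifies the feasibility of applying machine learning models to identifying police-related public opinion, and offers police agencies an empirical reference and technical recommendations for building future social media public opinion monitoring systems.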

    With the widespread use of social media and smartphones, online platforms have become a primary channel for the public to share what is happening around them and to express opinions. In some cases, incidents are posted on social media before being reported to 110 (the police emergency number), causing public opinion to spread rapidly. In such situations, police agencies must handle the case while also urgently formulating a response to keep public sentiment from deteriorating. To address this, police agencies have begun monitoring and collecting social messages related to police affairs.
    Currently, most police agencies rely on manual monitoring, which consumes excessive manpower and lacks timeliness. This study addresses the issue by collecting data through web crawlers, constructing a labeled dataset based on practical experience, and training classifiers using both a machine learning model, Support Vector Machine (SVM), and a deep learning model, ALBERT, with the aim of identifying police-related social messages effectively and promptly. Four experiments were conducted to evaluate the impact of temporal variations in training and testing data, the models’ ability to handle heterogeneous data, and the effects of mixed-data training on classification capability. Experimental results show that both SVM and ALBERT can handle imbalanced data reliably, while ALBERT shows better capability in detecting police-related messages.
    This research confirms the feasibility of integrating machine learning models into police-related message detection, and provides empirical evidence and technical recommendations for future development of automated social media monitoring systems by police agencies.
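
    The abstract reports that both models handle imbalanced data but does not say which metrics were used. As a minimal sketch, assuming evaluation by precision, recall, and F1 on the police-related class (an assumption, not the thesis's stated method), the following shows why accuracy alone can mislead when relevant messages are rare. The function name `binary_metrics` and the toy labels are hypothetical and made up for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus precision/recall/F1 for the positive class (1 = police-related)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relevant, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # irrelevant, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # irrelevant, ignored
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, made-up labels: 2 police-related messages among 20 (imbalanced).
y_true = [1, 1] + [0] * 18
always_negative = [0] * 20            # a classifier that never flags anything
one_hit = [1, 0] + [0] * 17 + [1]     # finds one relevant message, one false alarm

baseline = binary_metrics(y_true, always_negative)
model = binary_metrics(y_true, one_hit)
print(baseline)
print(model)
```

    In this toy case both predictors reach the same accuracy, yet only the second has a non-zero F1, which is why a positive-class metric is a reasonable choice for this kind of task.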

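    Abstract i
    Chapter 1 Introduction 1
    1.1 Research Motivation 1
    1.2 Research Background 2
    1.3 Research Objectives and Contributions 3
    Chapter 2 Related Work 5
    2.1 Social Media Data Analysis 5
    2.1.1 Basic Characteristics of Social Media 5
    2.1.2 Practical Applications of Social Media Data Analysis 6
    2.2 Related Work on Social Media Message Classification for Public Safety 7
    2.2.1 Disaster Monitoring 8
    2.2.2 Criminal Behavior Detection 10
    2.3 Risk Assessment of Public Opinion Diffusion 11
    Chapter 3 Methodology 13
    3.1 Database Sources and Research Workflow 13
    3.2 Data Preprocessing 17
    3.2.1 Chinese Word Segmentation 17
    3.2.2 Stop Words 18
    3.3 Model Training 19
    3.3.1 SVM (Support Vector Machine) 19
    3.3.2 ALBERT (A Lite BERT) 21
    3.4 Model Evaluation 22
    Chapter 4 Experiments 24
    4.1 Experiment Design 24
    4.2 Data Collection and Database Construction 25
    4.3 Binary Classification Model Training and Evaluation 27
    4.3.1 Model Training and Evaluation on the PTT Dataset 27
    4.3.2 Model Evaluation on Datasets from Different Sources 29
    4.3.3 Model Training and Evaluation on Datasets from Different Sources 30
    4.3.4 Model Training and Evaluation with Deep Learning 31
    4.4 Discussion of Experimental Results 33
    Chapter 5 Conclusions and Future Goals 38
    5.1 Conclusions 38
    5.2 Future Goals 39
    References 42
    Appendix 1. NCP System Test Implementation and User Experience 44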


    On campus: publicly available immediately
    Off campus: publicly available immediately