| Graduate student: | 許凱竣 Hsu, Kai-Chun |
|---|---|
| Thesis title: | 設計與實作基於關聯特徵規則之可攜式文件格式分析系統 Design and Implementation of a Malicious PDF Analysis System Based on Association Rules |
| Advisor: | 楊竹星 Yang, Chu-Sing |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 104 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 60 |
| Keywords: | Document Malware, PDF (Portable Document Format), Static Analysis, Data Mining |
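Because the Portable Document Format (PDF) is convenient, cross-platform, and easy to obtain, it has become the preferred format for exchanging most documents online. The rich feature set of PDF, however, also makes it an excellent tool for network attacks. By embedding malicious code or malicious files in a PDF document and pairing it with an enticing file name or title, an attacker can easily catch users off guard and make them the next victim. This thesis proposes a system that analyzes PDF documents based on association feature rules: it uses data mining to find associations among the specific functions or parameters used in malicious code, uses these associations as rules for matching documents, and combines them with several features that appear only in malicious documents to raise the success rate of the analysis. In addition, this thesis implements a reverse operation that removes all suspicious parts of a document while preserving its original readable content, returning a clean PDF document that the user can use safely.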
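As a rough illustration of the matching step described above, the following Python sketch checks the tokens extracted from one document against a list of mined rules and a set of malicious-only features. The function name `match_document`, the rule representation as token pairs, the placeholder `feature_x`, and the "any hit means suspicious" decision are all assumptions made for this example; the record does not specify how the thesis combines its rules and features.

```python
def match_document(tokens, rules, malicious_only_features):
    """Check one document's extracted tokens against mined rules.

    tokens: names (functions, parameters) extracted from the document.
    rules: (token_a, token_b) pairs that were found to co-occur in malware.
    malicious_only_features: names assumed to appear only in malicious files.
    """
    present = set(tokens)
    # A rule matches when both of its associated tokens occur in the document.
    matched = [(a, b) for a, b in rules if a in present and b in present]
    flagged = sorted(present & set(malicious_only_features))
    return {
        "matched_rules": matched,
        "flagged_features": flagged,
        # Assumed decision: any matched rule or flagged feature is suspicious.
        "suspicious": bool(matched or flagged),
    }


# Hypothetical inputs, not taken from the thesis.
rules = [("unescape", "eval")]
malicious_only_features = {"feature_x"}
report = match_document(["eval", "unescape", "substr"], rules, malicious_only_features)
clean = match_document(["substr"], rules, malicious_only_features)
```

Here `report["suspicious"]` is true because both tokens of the single rule are present, while `clean` matches nothing. A real system would likely weight rules rather than treat every hit equally.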
Thanks to cross-platform support, freely available viewer programs, and plentiful API support, the Portable Document Format (PDF) is now a popular transmission medium on the Internet. The rich API support gives users a better experience. However, some of these APIs are still under development and may therefore introduce vulnerabilities in PDF viewers, which give attackers openings to commit crimes online.
Static analysis has been studied for about ten years, and many effective methods have been proposed to detect malicious PDF files. Although these methods can perform well, they still depend on a human in the loop: people have to build the rules or define the file types, so such systems cannot be fully automated.
This research aims to find good ways to detect malicious PDF documents. Unlike previous research, which relies on developers' experience or observation to select features for the analysis system, this research uses data mining methods to trace word associations in collected malicious code and uses these associations to build the rules for analyzing PDF documents. The whole rule-building flow can be completed without human intervention, and a system using the proposed method reaches a detection accuracy of around 98%.
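To make the idea of mining word associations concrete, here is a minimal Python sketch of Apriori-style rule mining limited to pairs of tokens. It is not the thesis's implementation: the function `mine_rules`, the thresholds, and the sample token sets are assumptions chosen for illustration, and each "transaction" is assumed to be the set of function names extracted from one malicious script.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for token pairs."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for s in sets if itemset <= s) / n

    # Level 1: keep only tokens that are frequent on their own.
    items = sorted({item for s in sets for item in s})
    frequent = [i for i in items if support(frozenset([i])) >= min_support]

    rules = []
    # Level 2: candidate pairs are built only from frequent tokens,
    # which is the pruning step that Apriori relies on.
    for a, b in combinations(frequent, 2):
        pair_support = support(frozenset([a, b]))
        if pair_support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset([x]))
            if confidence >= min_confidence:
                rules.append((x, y, pair_support, confidence))
    return rules


# Illustrative token sets, not data from the thesis.
samples = [
    {"unescape", "eval", "getAnnots"},
    {"unescape", "eval", "substr"},
    {"unescape", "eval"},
    {"getAnnots", "substr"},
]
rules = mine_rules(samples)
```

With these four samples, only the pair `unescape` and `eval` co-occurs often enough, so the sketch yields one rule in each direction between them. Extending the search to larger itemsets, or switching to FP-growth, would follow the same support and confidence definitions.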
On campus: publicly available from 2021-01-28