| 研究生: |
黃首翰 Huang, Shuo-Han |
|---|---|
| 論文名稱: |
運用圖形探勘方法從惡意軟體活動行為中進行線上威脅偵測 Graph Mining on MALWARE Activities for Online THREAT Identification |
| 指導教授: |
莊坤達
Chuang, Kun-Ta |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年: | 2014 |
| 畢業學年度: | 102 |
| 語文別: | 英文 |
| 論文頁數: | 26 |
| 中文關鍵詞: | 圖形探勘 、惡意軟體 |
| 外文關鍵詞: | Graph Mining, Malware |
| 相關次數: | 點閱:96 下載:5 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
近年來,隨著網路的發展與普及,惡意軟體與電腦病毒的傳播變得越來越 容易。基於網路技術的惡意軟體透過更改其行為模式,或是使用zero-day攻 擊,來克服傳統利用病毒特徵碼進行的偵測。這對防毒軟體公司在抵禦這些 攻擊者們來說,是一大挑戰。沒有一間防毒軟體公司有絕對的自信宣稱他們 可以偵測到所有的病毒,因此,有很多線上病毒偵測的工具相應而生,這些 工具讓使用者能夠在防毒軟體捕抓不到病毒時,可以去搜尋可疑的檔案、網 址與IP。當人們使用這些線上偵測工具時,會留下許多query logs資訊,而 這些query logs資訊是相當有價值的,尤其當他們是來自電腦安全領域的專 家時。這些資訊不僅包含專家們的懷疑,也包含了許多時間上的特性,可以 用來分析什麼時候可能會有攻擊事件形成。從網路安全公司的觀察中可以 發現到,這些query logs彼此之間的差異很大,因為同樣的惡意軟體在不同 的環境上可以有非常多樣的行為變化。但這些query logs彼此之間會有些許 關係存在,若這些query logs的時間都很相近時,關係就會更為強烈。因此 我們可以找出這些關聯性,並識別出一些正在形成中的攻擊事件的因子。 基於這個想法,我們建立了一個惡意軟體行為圖,並發展一個雛形框架, 從query logs與目標(CVEs 或是 malware family, etc)之間的關聯,去找出擁 有時間集中特性(Temporal Concentrative Property)的關係,並進一步的識別 出Suspect Rising Attack Factor (SRAF)。在我們的實驗結果中,展現了我們 可以找出K-approximation targets來驗證我們的初始想法。
With the popularization of Internet, the spread of malwares or viruses become easier in recent years. Web-based malware overcomes signature-based detec- tion by modifying the behavior or using zero-day attack. It results in critical challenges for anti-virus companies to defend with those attackers. None of any anti-virus companies have the confident to say they could detect all of viruses, consequently, there are many online virus scan tools providing the capability for users searching the suspect files, url or ip, in case of anti-virus software didn’t catch it. When people use these online tools, they will leave query logs behind, and those query logs are valuable, especially if they come from domain experts. It provides information not only containing security domain experts’ suspicion but also having the time property for us to analyze what attack events could be formed. From the observation in network security company, those query logs are highly different because the same malware could have the diverse behavior in different environments. But those query logs may have some relationships between each others especially if they come within the same time interval. As such we can find those relationships and identify some factors of rising attacks. Based on this idea, we model a malware behavior graph and develop a proto-
v
type framework to find the relationship have Temporal Concentrative Property (TCP) between query logs and targets (CVEs or malware family, etc), and then further identify the Suspect Rising Attack Factor (SRAF). In our experimental results, it shows that we can find the k-approximation targets that give us a verification of the original idea.
[1] “Malware definition,” http://en.wikipedia.org/wiki/Malware.
[2] “Obfuscation,” http://en.wikipedia.org/wiki/Obfuscation (software).
[3] “Zero-Day attack,” http://en.wikipedia.org/wiki/Zero-day attack.
[4] “VirusTotal,” https://www.virustotal.com/en/.
[5] “ThreatExpert,” http://www.threatexpert.com/.
[6] M. Bailey, J. Oberheide, J. Andersen, Z. M. Mao, F. Jahanian, and J. Nazario, “Automated classification and analysis of internet malware,” in RAID, 2007, pp. 178–197.
[7] “TrendMicro,” http://www.trendmicro.tw/tw/index.html.
[8] “Neo4j,” http://www.neo4j.org/.
[9] M. Thorup and U. Zwick, “Approximate distance oracles,” in STOC, 2001, pp. 183–192.
[10] Z. Qi, Y. Xiao, B. Shao, and H. Wang, “Toward a distance oracle for billion-node graphs,” PVLDB, vol. 7, no. 1, pp. 61–72, 2013.
[11] E. Menahem, A. Shabtai, and A. Levhar, “Detecting malware through temporal function-based features,” in ACM Conference on Computer and Communications Security, 2013, pp. 1379–1382.
[12] N. Leibowitz, B. Baum, G. Enden, and A. Karniel, “The exponential learn- ing equation as a function of successful trials results in sigmoid perfor- mance,” in Journal of Mathematical Psychology, 2010, p. 338–340.
[13] J. Yuan, Y. Zheng, X. Xie, and G. Sun, “Driving with knowledge from the physical world,” in KDD, 2011, pp. 316–324.
[14] B. Shao, H. Wang, and Y. Li, “Trinity: a distributed graph engine on a memory cloud,” in SIGMOD, 2013, pp. 505–516.
[15] Z. Qi, Y. Xiao, B. Shao, and H. Wang, “Toward a distance oracle for billion-node graphs,” PVLDB, vol. 7, no. 1, pp. 61–72, 2013.
[16] A. D. Sarma, S. Gollapudi, M. Najork, and R. Panigrahy, “A sketch-based distance oracle for web-scale graphs,” in WSDM, 2010, pp. 401–410.
[17] M. Potamias, F. Bonchi, C. Castillo, and A. Gionis, “Fast shortest path distance estimation in large networks,” in CIKM, 2009, pp. 867–876.
[18] A. Gubichev, S. J. Bedathur, S. Seufert, and G. Weikum, “Fast and ac- curate estimation of shortest paths in large graphs,” in CIKM, 2010, pp. 499–508.
[19] K. Tretyakov, A. Armas-Cervantes, L. Garc ́ıa-Ban ̃uelos, J. Vilo, and M. Dumas, “Fast fully dynamic landmark-based estimation of shortest path distances in very large graphs,” in CIKM, 2011, pp. 1785–1794.
[20] J. Hegedus, Y. Miche, A. Ilin, and A. Lendasse, “Methodology for behavioral-based malware analysis and detection using random projections and k-nearest neighbors classifiers,” in CIS, 2011, pp. 1016–1023.
[21] Mr.B.Dwarakanath and Mr.A.Suthakar, “Prediction and detection of mal- ware using association rules,” in IJPCSC, 2012.
[22] “Random Walk,” http://en.wikipedia.org/wiki/Random walk.
[23] B. K. et al., “An efficient heuristic procedure for partitioning graphs,” in Bell Systems Journal, 1972, pp. 291–307.
[24] D. S. J. et al., “Optimization by simulated annealing: an experimental evaluation. part i, graph partitioning,” in Oper. Res., 1989, pp. 865–892.
[25] T. N. B. et al., “Genetic algorithm and graph partitioning,” in Computers, IEEE Transactions, 1996, pp. 841–855.
[26] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs.” in International Conference on Parallel Processing, 1995, pp. 113–122.
[27] V. Kumar and G. Karypis, “Multilevel k-way partitioning scheme for ir- regular graphs.” in J. Parallel Distrib. Comput., 1998, pp. 96–129.
[28] “CVE-2006-4688,” http://cve.scap.org.cn/CVE-2006-4688.html.