| Graduate student: | 吳禎祐 Wu, Jen-You |
|---|---|
| Thesis title: | 應用資料擾亂機制保護部分開放之敏感性靜態資料庫 Using Noise Addition on Protecting Partially Open, Sensitive and Static Databases |
| Advisor: | 侯廷偉 Hou, Ting-Wei |
| Degree: | 碩士 Master |
| Department: | 工學院 - 工程科學系碩士在職專班 Department of Engineering Science (on the job class) |
| Year of publication: | 2009 |
| Academic year of graduation: | 97 |
| Language: | Chinese |
| Number of pages: | 53 |
| Chinese keywords: | 資料庫安全 (database security), 資料擾亂 (data perturbation), 雜訊添加 (noise addition), 隱私保存 (privacy preservation), 敏感性資料 (sensitive data) |
| English keywords: | Noise Addition, Database Safety, Data Perturbation, Privacy Preservation, Sensitive Data |
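With the growth of e-commerce, paperless operations have gradually replaced traditional workflows, and hardware advances have greatly increased database processing speed. Against this background, fast, real-time database access is no longer difficult. However, a database that stores sensitive data can be attacked: a determined adversary can cross-reference an anonymized sensitive database with queries against public or semi-public databases to break the anonymity and obtain private information. In critical domains such as medical and military databases, this can cause irreparable loss. In addition, the convenience of the Internet has been accompanied by a steady rise in cybercrime. Yet pursuing database security single-mindedly, so that non-sensitive data can no longer be queried or incorrect data is returned, can also have harmful side effects. For example, a doctor with insufficient access to a patient's medical history may be unable to diagnose quickly and may lose the critical window for emergency treatment.
This thesis studies how to remove the unique identifiability of sensitive data in a database while preserving the correctness of the data's statistical values. We propose two new methods: (1) a "reducing noise addition mechanism", which combines the traditional noise addition mechanism with the concept of group noise reduction to average the data values within each group and thereby remove their uniqueness; and (2) a "random interval noise addition mechanism", which, for data values of higher sensitivity, adds noise and then takes an interval value to mask the original value. Finally, we implement these noise addition methods in a web-based system and compare their advantages and disadvantages.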
With the growth of electronic commerce, paperless operations are gradually replacing traditional ones, and database computing speed has increased substantially with advances in hardware. Against this background, fast, immediate access to data is no longer difficult. However, intruders may compromise databases that hold sensitive data by combining anonymized sensitive databases with publicly disclosed databases. If this happens to medical or military databases, it can cause damage that cannot be undone. Moreover, the convenience of the Internet is accompanied by an increase in network crime. Choosing whether to give users exact data or "perturbed" data is a dilemma. For example, if we export perturbed data, a doctor may be unable to carry out emergency treatment promptly because the patient's case history is not accurate enough.
This thesis aims to remove the uniqueness of sensitive data in databases while keeping statistical values accurate. We propose two new algorithms: Reducing Noise Addition and Random Interval Noise Addition. Reducing Noise Addition combines traditional noise addition with group noise reduction to generalize the data within groups, which removes the uniqueness of each tuple. Random Interval Noise Addition adds noise to the random values in groups and calculates an interval that covers the original data values. Finally, we implemented both algorithms in a web-based system to analyze their feasibility.
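
The abstract does not give the details of either algorithm, so the following Python sketch is only one possible reading of the two ideas, not the thesis's implementation. The function names, the use of bounded uniform noise, the grouping of records by sorted value, and the parameters `group_size`, `noise_scale`, and `half_width` are all assumptions made for illustration.

```python
import random
from statistics import mean


def reducing_noise_addition(values, group_size=3, noise_scale=1.0, rng=None):
    """Hypothetical sketch: add bounded zero-centred noise to each value,
    then replace every value by the mean of its group, so no record keeps
    a unique value while the overall mean shifts by at most noise_scale."""
    rng = rng or random.Random()
    # Assumption: groups are formed from neighbouring values after sorting.
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + group_size] for i in range(0, len(order), group_size)]
    # Fold an undersized tail into the previous group so that no record
    # ends up alone in its group.
    if len(groups) > 1 and len(groups[-1]) < group_size:
        tail = groups.pop()
        groups[-1].extend(tail)
    masked = [0.0] * len(values)
    for group in groups:
        noisy = [values[i] + rng.uniform(-noise_scale, noise_scale) for i in group]
        group_mean = mean(noisy)
        for i in group:
            masked[i] = group_mean
    return masked


def random_interval_noise_addition(value, noise_scale=5.0, half_width=10.0, rng=None):
    """Hypothetical sketch: perturb a sensitive value, then publish an
    interval around the perturbed value instead of the value itself."""
    if half_width < noise_scale:
        raise ValueError("half_width must be at least noise_scale to cover the original")
    rng = rng or random.Random()
    noisy = value + rng.uniform(-noise_scale, noise_scale)
    # Because |noisy - value| <= noise_scale <= half_width, the interval
    # always contains the original value without being centred on it.
    return (noisy - half_width, noisy + half_width)
```

For example, `reducing_noise_addition([52.0, 48.0, 61.0, 75.0, 39.0, 66.0, 58.0], group_size=3, noise_scale=2.0)` returns seven values that take only two distinct levels (one per group), and `random_interval_noise_addition(120.0)` returns a 20-unit-wide interval containing 120.0. Bounded uniform noise is used here only so that the shift in the mean has a hard limit; a different noise distribution would change that guarantee.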