
Graduate student: Sun, Rong-Chen
Thesis title: User Friendly Local Differential Privacy
Advisor: Paul Horton
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2020
Graduation academic year: 109 (ROC calendar)
Language: English
Number of pages: 63
Keywords: Local Differential Privacy, Privacy Preserving, Randomized Response
Access counts: views: 186; downloads: 8
In recent years, a growing number of privacy breaches have raised public awareness of privacy protection. People have begun to pay attention to privacy and security issues, and they strictly require service providers to guarantee that user data will not be disclosed. To achieve this, many enterprises have adopted local differential privacy (LDP) to collect data safely from data owners. The typical input domain of LDP is binary data (e.g., a yes/no answer); other values must first be re-encoded, and the LDP output for non-binary data is a bit array. Data users cannot use this form directly: they must design specific algorithms to interpret the bit array before performing the desired statistical operations. In practice, this is both unfriendly and inefficient for data users.

Therefore, this thesis proposes an improved LDP, Closed Local Differential Privacy (CLDP), whose output is a user-readable value that data users can use directly, making it far friendlier to them. Furthermore, this thesis proposes a method for resisting the averaging attack, preventing malicious users from accurately reconstructing true values from the system output. A security analysis of the proposed methods is also derived. Finally, the experimental results show that the proposed method preserves good data utility while remaining friendly to data users, who can use CLDP's output directly and thereby reduce algorithm development time.

Chapter 1 Introduction 1
  1.1 Background 1
  1.2 Problem Description 7
  1.3 Motivation 9
  1.4 Scenario 10
  1.5 Contribution 12
  1.6 Organization 14
Chapter 2 Related Work 15
  2.1 Differential Privacy 15
  2.2 Attribute-based Encryption 25
Chapter 3 Closed Local Differential Privacy 28
  3.1 CLDP 28
  3.2 Resist Averaging Attack 32
Chapter 4 System Analysis 33
  4.1 CLDP 33
  4.2 Resist Averaging Attack 35
Chapter 5 Experiment 36
  5.1 Experimental Environment and Datasets 36
  5.2 Evaluation Metrics 38
  5.3 Experimental Results 40
Chapter 6 Conclusion 55
  6.1 Conclusion 55
  6.2 Future Work 57
Reference 58

On campus: available immediately
Off campus: not available
The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.