成功大學博碩士論文系統

簡易檢索 / 詳目顯示

回結果列表

研究生：	張孟淮 Chang, Meng-Huai
論文名稱：	利用P2P網路實現分散式矩陣分解法－以Chord為例 Utilizing P2P network to implement distributed matrix factorization — Using Chord as an Example
指導教授：	劉任修 Liu, Ren-Shiou
學位類別：	碩士 Master
系所名稱：	管理學院 - 資訊管理研究所 Institute of Information Management
論文出版年：	2023
畢業學年度：	111
語文別：	中文
論文頁數：	45
中文關鍵詞：	矩陣分解法、P2P網路、分散式矩陣分解法、分散式儲存
外文關鍵詞：	Distributed matrix factorization, Distributed storage
相關次數：	點閱：147 下載：11
分享至:	分享至facebook 分享至twitter

查詢本校圖書館目錄查詢臺灣博碩士論文知識加值系統勘誤回報

矩陣分解法正是構築推薦系統的主要方法之一。然而，推薦系統雖然能夠輔助決策，但其計算的過程包含了大量用戶的敏感資訊，因此如何保護用戶隱私成為一個重要挑戰。另一方面，在不斷增加的用戶和商品項目下，傳統的矩陣分解法所需要的儲存成本也愈來愈龐大，終端設備上有限的儲存容量無法負擔如此大量且重複的資料，如何分擔資料儲存也成為了重大課題之一。
通過網際網路，以P2P網路模式的資料分享機制得以讓大量的資料(用戶潛在特徵向量、項目潛在特徵向量)不見得要儲存在同一個設備中，在需要使用時透過網路傳輸即可獲得必需的資料。另外，與集中式儲存資料的方式不同，分散式儲存的架構讓特定人士難以竄改項目的潛在特徵，藉此保有資料完整性。本研究透過P2P網路與結合應用於推薦系統中的矩陣分解法，希望在保有相似精準度的情況下，保護用戶隱私的同時得以分散重複資料的儲存，減少終端設備記憶體的負擔，且藉此維護資料完整性。
最後本研究的實驗得知，在將矩陣分解法之資料分散儲存，其預測能力與其他未分散儲存之模型差異不大，約介於-4.7%至+3.9%之間。另外，在建立模型的過程中，所需的網路傳輸成本會遠小於其它分散式矩陣分解法至少四倍，因此此方法也不失為一種保護隱私與資料完整性的一種選擇。

Matrix factorization's computational process involved contains a large amount of sensitive user information, making the protection of user privacy a significant challenge. With the increasing number of users and items, the storage cost required by traditional matrix factorization methods has become increasingly massive.
This study combines P2P networking with matrix factorization methods applied in recommendation systems, aiming to protect user privacy while reducing the storage burden of repetitive data and maintaining data integrity. The experiments conducted in this research revealed that when storing data using a distributed approach for matrix factorization, The predictive ability is not much different from other models without decentralized storage, ranging from -4.7% to +3.9%. The required network transmission cost during model building is significantly reduced by at least four times compared to other distributed matrix factorization methods. This approach can be considered as a choice for protecting privacy and ensuring data integrity.

摘要 i
EXTENDED ABSTRACT ii
誌謝 ix
目錄 x
表目錄 xii
圖目錄 xiii
第壹章 緒論 1
第一節 背景及動機 2
第二節 研究目的 4
第三節 研究貢獻 4
第四節 論文架構 4
第貳章 相關文獻探討 5
第一節 矩陣分解法6
第一小節 分散式矩陣分解法 8
第二節 P2P網路架構 10
第一小節 覆蓋網路 10
第二小節 分層覆蓋網路 14
第三節 聯盟式學習 16
第四節 小結 19
第參章 研究方法 20
第一節 P2P網路架構 22
第一小節 超級對等點 23
第二小節 一般對等點 25
第二節 系統運作 25
第一小節 潛在特徵向量更新機制 26
第二小節 推薦方法 29
第肆章 實驗與分析 30
第一節 實驗流程 30
第二節 資料集與資料前處理 31
第三節 實驗評估指標 33
第四節 實驗之環境與參數的設定 33
第一小節 模型建立方法 34
第二小節 參數設定 36
第五節 實驗結果 38
第伍章 結論與未來發展 42
參考文獻 43
                                    

Ammad-Ud-Din, M., Ivannikova, E., Khan, S. A., Oyomno, W., Fu, Q., Tan, K. E., & Flanagan, A. (2019). Federated Collaborative Filtering for Privacy-preserving Personalized Recommendation System. ArXiv, abs/1901.09888.

Chai, D., Wang, L., Chen, K., & Yang, Q. (2021). Secure Federated Matrix Factorization. IEEE Intelligent Systems, 36(5), 11–20.

Daswani, N., Garcia-Molina, H., & Yang, B. (2003). Open Problems in Data-sharing Peer-to-peer Systems. In International Conference on Database Theory (pp. 1–15).

Duriakova, E., Tragos, E. Z., Smyth, B., Hurley, N., Pe ̃na, F. J., Symeonidis, P., . . . Lawlor, A. (2019). PDMFRec: A Decentralised Matrix Factorisation with Tunable User-centric Privacy. In Proceedings of the 13th ACM Conference on Recommender Systems (pp. 457–461)

Garces-Erice, L., Biersack, E. W., Ross, K. W., Felber, P. A., & Urvoy-Keller, G. (2003). Hierarchical Peer-to-peer Systems. Parallel Processing Letters, 13(04), 643–657.

Gemulla, R., Nijkamp, E., Haas, P. J., & Sismanis, Y. (2011). Large-scale Matrix Factorization with Distributed Stochastic Gradient Descent. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 69–77).

Koren, Y. (2008). Factorization Meets the Neighborhood: A Multifaceted Collaborative Filtering Model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 426–434).

Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix Factorization Techniques for Recommender Systems. In Computer (p. 30-37).

Li, F., Wu, B., Xu, L., Shi, C., & Shi, J. (2014). A Fast Distributed Stochastic Gradient Descent Algorithm for Matrix Factorization. In Proceedings of the 3rd International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (pp. 77–87).

Liu, C., Yang, H.-c., Fan, J., He, L.-W., & Wang, Y.-M. (2010). Distributed Non-negative Matrix Factorization for Web-Scale Dyadic Data Analysis on Mapreduce. In Proceedings of the 19th International Conference on World Wide Web (pp. 681– 690).

Liu, P., Fan, Y., Huang, K., & Huang, G. (2020). Super Peer-based P2P VoD Architecture for Supporting Multiple Terminals. In International Conference of Pioneering Computer Scientists, Engineers and Educators (pp. 389–404).

McMahan, H. B., Moore, E., Ramage, D., & y Arcas, B. A. (2016). Federated Learning of Deep Networks Using Model Averaging. CoRR, abs/1602.05629.

Ratnasamy, S., Francis, P., Handley, M., Karp, R., & Shenker, S. (2001). A Scalable Content-addressable Network. In Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (pp.161–172).

Regulation, G. D. P. (2018). General Data Protection Regulation (GDPR). Intersoft Consulting, Accessed in October, 24(1).

Rowstron, A., & Druschel, P. (2001a). Pastry: Scalable, Decentralized Object Location, and Routing for Large-scale Peer-to-peer Systems. In IFIP/ACM International Conference on Distributed Systems Platforms and Open Distributed Processing (pp. 329–350).

Rowstron, A., & Druschel, P. (2001b). Storage Management and Caching in PAST, a Large-scale, Persistent Peer-to-peer Storage Utility. In Proceedings of the Eighteenth ACM Symposium on Operating Systems Principles (pp. 188–201).

Stoica, I., Morris, R., Liben-Nowell, D., Karger, D. R., Kaashoek, M. F., Dabek, F., & Balakrishnan, H. (2003). Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. IEEE/ACM Transactions on Networking, 11(1), 17–32.

Vallet, D., Friedman, A., & Berkovsky, S. (2014). Matrix Factorization without User Data Retention. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (pp. 569–580).

Zhao, B. Y., Huang, L., Stribling, J., Rhea, S. C., Joseph, A. D., & Kubiatowicz, J. D. (2004). Tapestry: A Resilient Global-scale Overlay for Service Deployment. IEEE Journal on Selected Areas in Communications, 22(1), 41–53.

校外：立即公開

簡易檢索 / 詳目顯示

相關論文