
Graduate student: 吳柏欣
Wu, Po-Hsin
Thesis title: 以深度學習方法結合個人隱含特徵與打卡紀錄做社群網路連結預測
A Deep Learning Approach to Link Prediction for Social Networks Based on Personal Latent-factor and Check-in Histories
Advisor: 劉任修
Liu, Ren-Shiou
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2018
Academic year of graduation: 106
Language: Chinese
Number of pages: 48
Keywords (Chinese): 深度學習, 社群網路, 連結預測, 矩陣分解, 地點評分
Keywords (English): Deep Learning, Social Networks, Link Prediction, Matrix Factorization, Location Rating
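  • With the rise of online social networks, the link prediction problem for social networks has emerged. In the past, many researchers made predictions from the friendship relations among users. However, the formation of a social network is usually closely tied to the geographical relationships between users, and many studies also indicate that social relationships help improve location recommender systems. In addition, inspired by recommender systems, we believe that people with the same preferences are more likely to become friends. We therefore use matrix factorization to extract the personal latent factors present in location ratings, and combine them with social relationships, check-in histories, and other data to predict the probability that a link exists in a social network.
    First, we use a network embedding method to extract the topological features of each node in the social network. We use matrix factorization to decompose the location rating matrix into a "personal" latent-factor matrix and a "location" latent-factor matrix, and use the personal latent-factor matrix as node features. Using short-, medium-, and long-term time windows, we apply the Jaccard index to users' check-in histories to extract the check-in pattern features of each pair of nodes. Finally, we apply the Hadamard function to the node features to obtain link features. Furthermore, to handle the highly nonlinear relationships among the multiple kinds of data and the noise present in large volumes of data, we use denoising autoencoders to build a multimodal deep neural network model, MSDA. MSDA can perform unsupervised feature extraction and is robust to noise in the data; the pre-trained model is then used to carry out supervised training on labeled data quickly. The results show that the MSDA model, which combines multiple kinds of features, improves on earlier network embedding methods that consider only network topology, predicts social links better, and provides social network users with better friend recommendations.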
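    The check-in pattern and link feature steps described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names (`jaccard`, `checkin_pattern`, `hadamard`), the representation of a check-in as a `(timestamp, venue_id)` pair, and the window lengths of 7, 30, and 365 days are hypothetical and are not taken from the thesis, which states only that short-, medium-, and long-term windows are used.

```python
from datetime import datetime, timedelta


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def checkin_pattern(checkins_u, checkins_v, now, windows_days=(7, 30, 365)):
    """Return one Jaccard score per time window over the venues two users visited.

    checkins_u and checkins_v are lists of (timestamp, venue_id) pairs.
    The window lengths are illustrative assumptions, not the thesis's settings.
    """
    features = []
    for days in windows_days:
        start = now - timedelta(days=days)
        venues_u = {venue for t, venue in checkins_u if start <= t <= now}
        venues_v = {venue for t, venue in checkins_v if start <= t <= now}
        features.append(jaccard(venues_u, venues_v))
    return features


def hadamard(x, y):
    """Element-wise product that turns two node vectors into one link vector."""
    if len(x) != len(y):
        raise ValueError("node vectors must have the same length")
    return [a * b for a, b in zip(x, y)]


# Hypothetical toy data for two users.
now = datetime(2018, 1, 31)
user_u = [(datetime(2018, 1, 30), "cafe"), (datetime(2018, 1, 10), "gym"),
          (datetime(2017, 6, 1), "park")]
user_v = [(datetime(2018, 1, 29), "cafe"), (datetime(2018, 1, 5), "library"),
          (datetime(2017, 7, 1), "park")]

pair_features = checkin_pattern(user_u, user_v, now)
link_features = hadamard([1.0, 2.0, 0.5], [2.0, 0.5, 4.0])
print(pair_features, link_features)
```

    The Hadamard product keeps the link vector the same length as the node vectors and is symmetric in its two arguments, so the feature of the pair (u, v) equals that of (v, u), which suits an undirected friendship link.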

    With the rise of online social networks, the problem of link prediction in social networks has emerged. In the past, many researchers used friendship relations to make predictions. However, the formation of a social network is usually related to the geographical relationships between users. Inspired by recommender systems, we believe that people who have similar preferences are more likely to be friends. Therefore, we use matrix factorization to extract users' personal latent factors from location ratings, and combine personal latent factors, social relationships, and check-in histories to predict the probability that a link exists in a social network.
    First, we use a network embedding method to extract the topological representation of each node in the social network. Next, matrix factorization is used to extract users' personal latent factors. Finally, we extract the check-in patterns between users using the Jaccard index over three periods of different lengths. In addition to feature extraction, to solve the highly nonlinear relationships among multiple kinds of data, we use denoising autoencoders to build a multimodal neural network model, MSDA. The experimental results show that the proposed MSDA model improves on earlier methods that consider only network topology, and that it outperforms the network embedding method in terms of AUC.
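    The matrix factorization step mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the abstract does not say how the factorization is solved, so stochastic gradient descent with L2 regularization is swapped in here as one common choice, and the names `factorize` and `rmse`, the hyperparameters, and the toy ratings are all hypothetical.

```python
import random


def factorize(ratings, n_users, n_items, k=2, steps=500, lr=0.02, reg=0.02, seed=0):
    """Factor sparse (user, item, rating) triples into P (users x k) and Q (items x k).

    Solved with plain SGD on squared error plus L2 regularization; this solver
    is an assumption, since the abstract names only "matrix factorization".
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error of the reconstruction on the observed ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(pf * qf for pf, qf in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return (total / len(ratings)) ** 0.5


# Hypothetical toy location ratings: (user index, location index, rating).
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = factorize(toy_ratings, n_users=3, n_items=3)
print(rmse(toy_ratings, P, Q))
```

    In this sketch each row of `P` plays the role of one user's personal latent-factor vector, which is the quantity the abstract describes using as a node feature; `Q` holds the corresponding location factors.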

    Abstract (Chinese)
    Extended Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    1 Introduction
      1.1 Background and Motivation
      1.2 Research Objectives
      1.3 Contributions
      1.4 Thesis Organization
    2 Related Work
      2.1 Similarity-Based Methods
        2.1.1 Local Methods
        2.1.2 Global Methods
        2.1.3 Quasi-Local Methods
      2.2 Deep Learning Methods
      2.3 Summary
    3 Methodology
      3.1 Problem Description
      3.2 Feature Extraction Methods
        3.2.1 Social Network Feature Extraction
        3.2.2 Location Rating Feature Extraction
        3.2.3 Check-in History Feature Extraction
      3.3 Model Architecture and Training Method
        3.3.1 Multimodal Stacked Denoising Autoencoder
        3.3.2 Training and Link Prediction Method
    4 Experiments and Analysis
      4.1 Experimental Framework
      4.2 Datasets and Feature Description
      4.3 Experimental Results and Analysis
        4.3.1 Evaluation Metrics
        4.3.2 Model Validation Method
        4.3.3 Experimental Environment and Parameter Settings
        4.3.4 Experimental Results and Analysis
    5 Conclusion and Future Work
    References

    Full text availability: on campus, public from 2020-07-20
    Off campus, public from 2020-07-20