
Graduate student: Wu, Cheng-Han (吳承翰)
Thesis title: Data Visualization and Relationship Discovery in a Scholarly Social Network (學術社群網路之資料視覺化與關係探索)
Advisor: Teng, Wei-Guang (鄧維光)
Degree: Master
Department: College of Engineering, Department of Engineering Science
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: English
Number of pages: 57
Keywords: social network analysis, scholarly social network, graph database
  • In social network analysis, data often involve complex, high-dimensional relationships; in an academic network, for example, scholars are linked by several kinds of relationships. Traditional relational databases are poorly suited to such data, which makes querying and data management inefficient. To address this, our study proposes an integrated approach that combines relational and non-relational databases. A non-relational graph database represents entities and their relationships as nodes and edges, which makes it better suited to exploring and analyzing connections. In our proposed system, scholar information and relationships are stored in a relational database, while the graph database is used to analyze and visualize those relationships more effectively. This design is inspired by the concept of the graph cube, which facilitates multidimensional data analysis. In addition, explicitly defining the relationships between scholars allows us to analyze them in depth and to identify collaboration networks more clearly. We apply the system to a dataset of Taiwanese engineering scholars. The analysis includes identifying co-authorships, collaborations, and other academic relationships, as well as exploring centrality metrics to determine key figures in the network. We also examine the "six degrees of separation" theory in scholarly networks by calculating the average path length among top scholars. The average path length among these top scholars is approximately 2.1, which indicates a closely collaborating network and efficient dissemination of information within the academic community.
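
    The two graph measures named in the abstract, average path length among a chosen group of scholars and a centrality score, can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy co-authorship graph, the scholar labels, and the function names are all hypothetical, and degree centrality stands in for whichever centrality metrics the thesis actually uses.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy co-authorship graph (undirected adjacency sets).
# The labels "A".."E" are placeholders, not data from the thesis.
COAUTHORS = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_path_length(graph, scholars):
    """Mean shortest-path length over all connected pairs in `scholars`."""
    lengths = []
    for u, v in combinations(scholars, 2):
        d = shortest_path_lengths(graph, u).get(v)
        if d is not None:  # skip pairs with no connecting path
            lengths.append(d)
    return sum(lengths) / len(lengths) if lengths else float("nan")


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


if __name__ == "__main__":
    print(average_path_length(COAUTHORS, ["A", "D", "E"]))
    print(degree_centrality(COAUTHORS))
```

    In this toy graph the pairs A-D, A-E, and D-E are 2, 3, and 1 hops apart, so the average path length for that group is 2.0. A value near 2, like the approximately 2.1 reported in the abstract, means that two scholars are typically linked through a single intermediary.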

    Chapter 1 Introduction
        1.1 Motivation and Overview
        1.2 Contributions of This Work
    Chapter 2 Preliminaries
        2.1 Limitations of Relational Databases in Discovery Relationships
            2.1.1 Overview of Using Relational Databases
            2.1.2 Challenges of Relationship Discovery in Relational Databases
        2.2 Benefits of Using a Graph Database for Discovery Relationships
            2.2.1 Description and Applications of Graph Databases
            2.2.2 Overcoming Traditional Database Limits with Graph Databases
            2.2.3 Multidimensional Heterogeneous Networks
        2.3 Multidimensional Graph Analytics
            2.3.1 Overview of a Graph Cube
            2.3.2 Graph Cube for Multidimensional Data Analysis
            2.3.3 Flow of Relationship Discovery
    Chapter 3 Our Proposed Scheme
        3.1 Footprint Engine
        3.2 Design of Our System
            3.2.1 Utilizing a Relational Database
            3.2.2 Utilizing a Graph Database
        3.3 Overview of Our System Workflow
        3.4 Defining Relationships and Syncing Databases
            3.4.1 Scholar Information
            3.4.2 Establishing Relationships
        3.5 Approaches of Scholarly Network Analysis
    Chapter 4 Experimental Studies
        4.1 Our Dataset for Experiments
            4.1.1 Experimental Dataset
            4.1.2 Using the Graph Database
        4.2 Neighbors of a Scholar
        4.3 Use Keywords to Discovery Scholars Relationships in the Field
        4.4 Identifying Conflicts of Interest
        4.5 Relationship Discovery between Two Scholars
        4.6 Analysis of Average Path Length Among Top Scholars: Exploring the Six Degrees of Separation Theory
        4.7 Centrality Analysis Among Scholars
    Chapter 5 Conclusions and Future Work
    Bibliography

