Graduate student: | 吳承翰 Wu, Cheng-Han |
---|---|
Thesis title: | 學術社群網路之資料視覺化與關係探索 Data Visualization and Relationship Discovery in a Scholarly Social Network |
Advisor: | 鄧維光 Teng, Wei-Guang |
Degree: | Master |
Department: | College of Engineering - Department of Engineering Science |
Year of publication: | 2024 |
Academic year of graduation: | 112 |
Language: | English |
Number of pages: | 57 |
Keywords (Chinese): | 社群網路分析, 學者社群網路, 圖資料庫 |
Keywords (English): | social network analysis, scholarly social network, graph database |
Usage: | Views: 88, Downloads: 7 |
In social network analysis, data often involve complex, high-dimensional relationships; in an academic network, for example, scholars are connected in many different ways. Traditional relational databases are poorly suited to such data, which makes querying and management inefficient. To address this, our study proposes an integrated approach that combines relational and non-relational databases. A non-relational graph database represents entities as nodes and their relationships as edges, which makes it better suited to exploring and analyzing connections. In the proposed system, scholar information and relationships are stored in a relational database, while the graph database is used to analyze and visualize those relationships more effectively. The design is inspired by the graph cube concept, which facilitates multidimensional data analysis. In addition, explicitly defining the relationships between scholars enables in-depth analysis and makes collaboration networks easier to identify. We apply the system to a dataset of Taiwanese engineering scholars. The analysis identifies co-authorships, collaborations, and other academic relationships, and uses centrality metrics to determine key figures in the network. We also examine the "six degrees of separation" theory in the scholarly network by computing the average path length among top scholars. The result, approximately 2.1, indicates high connectivity among these scholars and suggests close collaboration and efficient dissemination of information within the academic community.
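The two measures named in the abstract, average path length and centrality, can be illustrated on a toy graph. The sketch below is not the thesis's implementation: it uses plain Python rather than a graph database, the edge list and the single-letter scholar names are hypothetical placeholders, and degree centrality stands in for whichever centrality metrics the study actually used.

```python
from collections import deque

# Hypothetical toy co-authorship edges; placeholders, not data from the thesis.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("D", "E")]


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs_distances(graph, source):
    """Return hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_path_length(graph, nodes):
    """Mean shortest-path length over all connected pairs drawn from nodes."""
    nodes = sorted(set(nodes))
    total, pairs = 0, 0
    for i, u in enumerate(nodes):
        dist = bfs_distances(graph, u)
        for v in nodes[i + 1:]:
            if v in dist:  # skip pairs with no connecting path
                total += dist[v]
                pairs += 1
    return total / pairs if pairs else float("nan")


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


if __name__ == "__main__":
    g = build_graph(EDGES)
    print("average path length:", average_path_length(g, g))
    print("degree centrality:", degree_centrality(g))
```

Passing a subset of nodes to `average_path_length` restricts the average to that group while paths may still run through the rest of the graph, which mirrors measuring the separation among a selected set of top scholars inside a larger network.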