| 研究生: |
何助鴻 Ho, Chu-Hung |
|---|---|
| 論文名稱: |
基於道路環境下使用 MapReduce 架構處理異種鄰近物體之K-最短平均距離查詢 K-Shortest Average Distance Query on Heterogeneous Neighboring Objects in Road Networks by Using MapReduce |
| 指導教授: |
李強
Lee, Chiang |
| 學位類別: |
碩士 Master |
| 系所名稱: |
電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| 論文出版年: | 2017 |
| 畢業學年度: | 105 |
| 語文別: | 英文 |
| 論文頁數: | 61 |
| 中文關鍵詞: | 基於位置之查詢 、異種物體 、異種鄰近物體 、k-最短平均距離查詢 、道路環境 、MapReduce |
| 外文關鍵詞: | location-based queries, heterogeneous objects, heterogeneous neighboring objects, k-shortest average distance query, road networks, MapReduce |
| 相關次數: | 點閱:73 下載:5 |
| 分享至: |
| 查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
過去的研究大多專注於處理單一種類物體的基於位置之查詢 (location-based queries)。然而,在現實生活中,存在著許多不同種類的物體, 使用者可能對於不同種類的物體較感興趣。我們稱這些不同種類的物體為異種物體 (heterogeneous objects,簡稱為HOs),如果在 HOs 中任兩個物體的距離都小於或等於某個距離值,則此 HOs 被稱為異種鄰近物體 (heterogeneous neighboring objects,簡稱為HNOs)。對於處理 HNOs 的 location-based queries 相較於單一種類型態的物體來說更加複雜,原因是 HOs 之間的鄰近關係必須被考慮。在本論文中,我們基於道路環境和分散式環境,提出一種新穎且有用之 location-based query 來針對 HNOs 做處理,進而提供使用者符合期待的 HNOs 資訊。這個新穎的 location-based query 稱之為 k-最短平均距離查詢 (k-shortest average distance query,簡稱為 KSADQ)。給定一個正整數值 k、一個距離值 d 以及一個查詢點集合 Q,對於每一個查詢點 q 來說,KSADQ 找尋 k 組平均距離最短的 HNOs,而每一組 HNOs 當中的物體彼此之間的距離小於或等於 d。為了以分散式和平行化的方式有效率地處理 KSADQ,我們使用四叉樹將給定的空間分解成若干個單元, 讓各個機器能夠同時存取所需要的資料,並且在 MapReduce 軟體框架的基礎上,搭配適當的資料結構,設計了兩個演算法,分別是基本型 k-最短平均距離查詢演算法與改良型 k-最短平均距離查詢查詢演算法。最後,我們以真實道路資料和合成物體資料為基礎,並且以大量的實驗來證明我們提出演算法之效率及可擴展性。
In the past years, most of the researches focus on developing the query processing techniques on a single data type. However, in real-life applications users may be interested in obtaining information about multiple types of data objects, which are called the heterogeneous objects (HOs). Furthermore, the HOs are termed as the heterogeneous neighboring objects (HNOs), if the pairwise distances between the HOs are less than or equal to a user-defined distance. Obviously, processing the location-based queries on the HNOs is more sophisticated than that on a single type of objects, as the neighboring relationship between the HOs needs to be considered. In this thesis, we present a novel and applicable location-based query for the HNOs, called the k-shortest average distance query (KSADQ). Given a user-defined integer k, a user-defined distance d and a set of query objects Q, the KSADQ returns k-sets of HNOs with the shortest average distance for each query object, where the pairwise road distances among the HOs in each set of HNOs are less than or equal to d. To efficiently process the KSADQ in a distributed and parallel manner, we utilize the quadtree to divide the given space into several cells, so as to enable each computing node to simultaneously accesses the required data. On the basis of MapReduce programming model, we design two algorithms, the basic k-shortest average distance query algorithm and the improved k-shortest average distance query algorithm, with appropriate data structures. Comprehensive experiments using real road networks and synthetic objects datasets are conducted to demonstrate the efficiency and scalability of the proposed algorithms.
[1] A. Akdogan, U. Demiryurek, F. Banaei-Kashani, and C. Shahabi, "Voronoi-based geospatial query processing with mapreduce," in IEEE Second International Conference on Cloud Computing Technology and Science, pages 9-16, 2010.
[2] S. Aridhi, P. Lacomme, L. Ren and B. Vincent, "A mapreduce-based approach for shortest path problem in large-scale networks," in Engineering Applications of Artificial Intelligence, pages 151-165, 2015.2
[3] Y. Chen and J. M. Patel, "Efficient evaluation of all-nearest-neighbor queries," in International Conference on Data Engineering, pages 1056-1065, 2007.
[4] J. Dean and S. Ghemawat, "MapReduce: Simplified data processing on large clusters," in conference on Symposium on Opearting Systems Design and Implementation, 2004.
[5] H. Ferchichi and J. Akaichi, "Using mapreduce for efficient parallel processing of continuous k nearest neighbors in road networks," in Journal of Software and Systems Development, 2016.
[6] R. A. Finkel and J. L. Bentley, "Quad trees a data structure for retrieval on composite keys," in Acta informatica, 4(1), pages 1-9, 1974.
[7] A. Guttman, "R-trees: A dynamic index structure for spatial searching," in ACM SIGMOD, 14(2), pages 47-57, 1984.
[8] T. Hashem, T. Hashem, M. E. Ali, and L. Kulik, "Group trip planning queries in spatial databases," in International Symposium on Spatial and Temporal Databases, pages 259-276, 2013.
[9] C. C. Huang, J. L. Huang, T. C. Liang, J. Z. Wang, W. Y. Shih and W. C. Lee, "Nearest window cluster queries," in International Conference on Extending Database Technology, pages 341-352, 2016.
[10] Y. K. Huang and L. F. Lin, "Continuous within query in road networks," in International Wireless Communications and Mobile Computing Conference, pages 1176-1181, 2011.
[11] G. R. Hjaltason and H. Samet, "Distance browsing in spatial databases," in ACM Transactions on Database Systems, 24(2), pages 265-318, 1999.
[12] Y. K. Huang, C. H. Su, C. Lee and C. H. Ho, "Efficient evaluation of shortest average distance query on heterogeneous neighboring objects in Road Networks," in International Database Engineering and Applications Symposium, 2017.
[13] "hadoop," http://hadoop.apache.org/
[14] "HDFS," https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html
[15] G. Iwerks, H. Samet, and K. Smith, "Continuous k-nearest neighbor queries for continuously moving points with updates," in International Conference on Very Large Data Bases, 29, pages 512-523, 2003.
[16] C. Ji, T. Dong, Y. Li, Y. Shen, K. Li, W. Qiu, W. Qiu, and M. Guo, "Inverted grid-based knn query processing with mapreduce," in ChinaGrid Annual Conference, pages 25-32, 2012.
[17] D. Jiang, B. C. Ooi, L. Shi, and S. Wu, "The performance of mapreduce: An in-depth study," in International Conference on Very Large Data Bases, 3(1-2), pages 472-483, 2010.
[18] M. Kolahdouzan and C. Shahabi. "Voronoi-based aggregate k nearest neighbor search for spatial network databases," in International Conference on Very Large Data Bases, pages 840-851, 2004.
[19] S. Luo, Y. Luo, S. Zhou, G. Cong and J. Guan, "DISKs: A system for distributed spatial group keyword search on road networks," in Proceedings of the VLDB Endowment, 5(12), pages 1966-1969, 2012.
[20] W. Lu, Y. Shen, S. Chen, and B. C. Ooi, "Efficient processing of k nearest neighbor joins using mapreduce," in Proceedings of the VLDB Endowment, 5(10), pages 1016-1027, 2012.
[21] K. Mullesgaard, J. L. Pedersen, H. Lu and Y. Zhou, "Efficient skyline computation in mapeeduce,"in International Conference on Extending Database Technology, pages 37-48, 2014.
[22] A. Papadopoulos and Y. Manolopoulos, "Multiple range query optimization in spatial databases," in East European Symposium on Advances in Databases and Information Systems, pages 71-82, 1998.
[23] D. Papadias, Q. Shen, Y. Tao, and K. Mouratidis, "Group nearest neighbor queries," in International Conference on Data Engineering, pages 301-312, 2004.
[24] D. Papadias, J. Zhang, N. Mamoulis, Y. Tao and A. K. H. Tung. "Query processing in spatial network databases," in International Conference on Very Large Data Bases, pages 802-813, 2003.
[25] N. Roussopoulos, S. Kelley, and F. Vincent, "Nearest neighbor queries," in ACM SIGMOD, 24(2), pages 71-79, 1995.
[26] Y. Sun, J. Qi, Y. Zheng and R. Zhang, "K-nearest neighbor temporal aggregate queries," in International Conference on Extending Database Technology, pages 493-504, 2015.
[27] L. H. Van and A. Takasu, "An efficient distributed index for geospatial databases," in International Conference on Database and Expert Systems Applications, pages 28-42, 2015.
[28] C. Xia, H. Lu, B. C. Ooi, and J. Hu, "Gorder: An efficient method for knn join processing," in International Conference on Very Large Data Bases, 30, pages 756-767, 2004.
[29] T. Yokoyama, Y. Ishikawa, and Y. Suzuki, "Processing all k-nearest neighbor queries in hadoop," in International Conference on Web-Age Information Management, pages 346-351, 2012.
[30] M. L. Yiu, N. Mamoulis, and D. Papadias, "Aggregate nearest neighbor queries in road networks," in ACM Transactions on Knowledge and Data Engineering, 17(6), pages 820-833, 2005.
[31] D. Zhang, Y. M. Chee, A. Mondal, A. K. H. Tung, and M. Kitsuregawa, "Keyword search in spatial databases: Towards searching by document," in International Conference on Data Engineering, pages 688-699, 2009.
[32] D. Zhang, C. Y. Chan and K. L. Tan, "Nearest group queries," in International Conference on Scientific and Statistical Database Management, 2013.
[33] S. Zhang, J. Han, Z. Liu, K. Wang, and S. Feng, "Spatial queries evaluation with mapreduce," in International Conference on Grid and Cooperative Computing, pages 287-292, 2009.
[34] L. Zhu, Y. Jing, W. Sun, D. Mao, and P. Liu, "Voronoi-based aggregate nearest neighbor query processing in road networks," in SIGSPATIAL International Conference on Advances in Geographic Information Systems, pages 518-521, 2010.
[35] C. Zhang, F. Li, and J. Jestes, "Efficient parallel knn joins for large data in mapreduce," in International Conference on Extending Database Technology, pages 38-49, 2012.
[36] R. Zhong, G. Li, K. L. Tan, L. Zhou and Z. Gong, "G-Tree: An Efficient and Scalable Index for Spatial Search on Road Networks," in IEEE Transactions on Knowledge and Data Engineering, pages 2175-2189, 2015.