簡易檢索 / 詳目顯示

研究生: 李壯
Li, Zhuang
論文名稱: 為不同位置的用戶查找top-k個地理及文本相關群
Finding Top-k Spatial Textual Clusters for Users in Different Locations
指導教授: 李強
Lee, Chiang
學位類別: 碩士
Master
系所名稱: 電機資訊學院 - 資訊工程學系
Department of Computer Science and Information Engineering
論文出版年: 2019
畢業學年度: 107
語文別: 英文
論文頁數: 65
外文關鍵詞: spatial textual, cluster, top-k query, spatial database
相關次數: 點閱:45下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 隨著行動裝置和無線網路的普及化,造就了適地性服務( Location-Based Service, LBS)
    的熱門,其中又以空間-文本查詢(spatial-textual query)最廣為人知。使用者可以透過此
    種查詢,找尋其當前位置附近感興趣的物件或者區域。
    本文在 LBS 的環境下,同時考慮多個使用者,為他們推薦感興趣的區域,區域中會
    包含多個興趣點以供使用者去挑選。舉例來說,有兩個使用者約好一起逛街,其中一
    人想要喝coffee,另一人想吃pizza,那麼他們的需求就是pizza 和coffee,我們把需求
    稱為keyword。而具備這些keyword 的店家稱為使用者的興趣點,即POI(point of
    interest)。我們就要設法給他們推薦一些區域,區域中既有可以喝咖啡的POI,又有
    可以吃pizza 的POI。我們在論文中提出一種適合Spatial-textual 運用環境的劃分
    cluster 的機制,並且提出三個演算法,Baseline Algorithm, Grid Based Algorithm(GB)與
    IR-tree Based Algorithm(IRB)。GB 和IRB 均可以有效的減少查詢時間,而IRB 更加
    快速地縮小了搜尋cluster 要考慮的範圍以及減少要考慮的POI 數量,使其更有效率
    的解決所提出的問題。為此我們設計了一系列的實驗,來評估方法的效率,並驗證我
    們的演算法所找出的答案是有根據的。

    With the advent of mobile devices and wireless networks, location-based services (LBS)
    have become popular, best known as spatial-textual queries. Through such queries, users can
    find objects or areas of interest near their current location.
    In the context of LBS, this paper considers multiple users at the same time and recommends
    regions of interest for them, which will contain multiple interest points for users to choose.
    For example, there are two users who have made a date to go shopping together. One wants
    to drink coffee and the other wants to eat pizza, then their needing are pizza and coffee. We
    call these demands as keywords. The store with these keywords is called the user's point of
    interest, POI (point of interest). We will try to recommend some regions for them, which
    include POIs that can for drinking coffee and POIs can for eating pizza. In the paper, we
    propose a mechanism for partitioning clusters suitable for the spatial-textual environment,
    and propose three algorithms, Baseline Algorithm, Grid Based Algorithm (GB) and IR-tree
    Based Algorithm (IRB). Both GB and IRB can effectively reduce the query time, and IRB
    can more quickly reduce the scope of the search cluster and reduce the number of POIs to
    consider, so that it can solve the problem more efficiently. To this end we have designed a
    series of experiments to evaluate the efficiency of the methods and to verify that the answers
    we find are valid.

    摘要 ........................................................................................................................................I ABSRTACT......................................................................................................................... II ACKNOWLEDGEMENTS .............................................................................................. III CONTENTS ....................................................................................................................... IV List of Tables ....................................................................................................................... V List of Figures .................................................................................................................... VI I INTRODUCTION............................................................................................................. 1 II RELATED WORK.......................................................................................................... 6 Spatial textual search .................................................................................................. 6 Clustering ..................................................................................................................... 8 Top-k ........................................................................................................................... 10 III PRELIMINARIES....................................................................................................... 11 IV Algorithm...................................................................................................................... 21 4.1 Baseline Algorithm: ............................................................................................. 21 4.2 Grid based Algorithm: ........................................................................................ 27 4.3 IR-tree based Algorithm: .................................................................................... 33 Index: IR-tree..................................................................................................... 33 Index: keyword table......................................................................................... 36 lemma.................................................................................................................. 36 V Experiment ..................................................................................................................... 42 Experimental Environment ...................................................................................... 42 Experimental Results ................................................................................................ 43 5.1 Synthetic data............................................................................................... 43 5.2 real data ........................................................................................................ 56 VI Conclusions ................................................................................................................... 62 References........................................................................................................................... 63

    [1] M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander.Optics: ordering points to identify the clustering structure. In SIGMOD, pages 49–60, 1999.
    [2] K. Bøgh, A. Skovsgaard, and C. S. Jensen. Groupfinder: A new approach to top-k point-of-interest group retrieval. PVLDB, 6(12):1226–1229, 2013.
    [3] P. Bouros, S. Ge, and N. Mamoulis. Spatio-textual similarity joins. PVLDB, 6(1):1–12, 2012.
    [4] Dingming Wu.A Density-Based Approach to the Retrieval of Top-K Spatial Textual Clusters. In CIKM, 2016.
    [5] Kai Yao, Jianjun Li, Guohui Li1, and Changyin Luo. Efficient Group Top-k Spatial Keyword Query Processing. In APWeb, 2016.
    [6] I. D. Felipe, V. Hristidis, and N. Rishe. Keyword search on spatial databases. In ICDE, pages 656–665, 2008.
    [7] D. Wu, G. Cong, and C. S. Jensen. A framework for efficient spatial web object retrieval. VLDB J., 21(6):797–822, 2012.
    [8] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvåg. Efficient processing of top-k spatial keyword queries. In SSTD, pages 205–222, 2011.
    [9] K. Bøgh, A. Skovsgaard, and C. S. Jensen. Groupfinder: A new approach to top-k point-of-interest group retrieval. PVLDB, 6(12):1226–1229, 2013.
    [10] A. Skovsgaard and C. S. Jensen. Finding top-k relevant groups of spatial web objects. VLDB J., 24(4):537–555,2015.
    [11] D.-W. Choi, C.-W. Chung, and Y. Tao. A scalable algorithm for maximizing range sum in spatial databases. PVLDB,5(11):1088–1099, 2012.
    [12] J. Liu, G. Yu, and H. Sun. Subject-oriented top-k hot region queries in spatial dataset. In CIKM, pages 2409–2412, 2011.
    [13] Y. Tao, X. Hu, D.-W. Choi, and C.-W. Chung. Approximate maxrs in spatial databases. PVLDB, 6(13):1546–1557, 2013.
    [14] X. Cao, G. Cong, C. S. Jensen, and M. L. Yiu. Retrieving regions of interest for user exploration. PVLDB, 7(9):733–744, 2014.
    [15] Guttman, A.: R-trees: a dynamic index structure for spatial searching, vol. 14.ACM, 1984.
    [16] Papadias, D., Shen, Q., Tao, Y., Mouratidis, K.: Group nearest neighbor queries.In: Proceedings of 20th International Conference on Data Engineering, pp. 301–312, 2004.
    [17] Yiu, M.L., Mamoulis, Papadias, D.: Aggregate nearest neighbor queries in road networks. TKDE, 820–833, 2005.
    [18] Nick Roussopoulos, Stephen Kelley and Frédéric Vincent, “Nearest neighbor queries”, in ACM SIGMOD, pp. 71-79, 1995.
    [19] R. Benetis, C. S. Jensen, G. Karciauskas, and S. Saltenis,“Nearest neighbor and reverse nearest neighbor queries for moving objects”, in VLDB Journal, vol. 15, no. 3, pp. 229–249, 2006.
    [20] Reynold Cheng, Lei Chen, Jinchuan Chen, Xike Xie,“Evaluating probability threshold k-nearest-neighbor queries over uncertain data”, in EDBT, pp. 672-683, 2009.
    [21] Jiajia Li, Botao Wang, and Guoren Wang, “Efficient Probabilistic Reverse k-Nearest Neighbors Query Processing on Uncertain Data”, in DASFAA, pp. 456-471, 2013.
    [22] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD, pages 226–231, 1996.
    [23] M. Ankerst, M. M. Breunig, H.-P. Kriegel, and J. Sander. Optics: ordering points to identify the clustering structure. In SIGMOD, pages 49–60, 1999.
    [24] X. Wang and H. J. Hamilton. Dbrs: A density-based spatial clustering method with random sampling. In PAKDD, pages 563–575, 2003.
    [25] P. Liu, D. Zhou, and N. Wu. Vdbscan: Varied density base spatial clustering of applications with noise. In Service Systems and Service Management, pages 1–4, 2007.
    [26] J. Sander, M. Ester, H.-P. Kriegel, and X. Xu. Density-based clustering in spatial databases: The algorithm gdbscan and itsapplications. Data Min. Knowl. Discov., 2(2):169–194, 1998.
    [27] R. Fagin, A. Lotem, M. Naor, Optimal aggregation algorithms for middleware, J. Comput. Syst. Sci. 66 (4) (2003) 614–656.
    [28] G. Das, D. Gunopulos, N. Koudas, D. Tsirogiannis, Answering top-k queries using views, in: Proceedings of VLDB, 2006, pp. 451–462.
    [29] I.F. Ilyas, G. Beskales, M.A. Soliman, A survey of top-k query processing techniques in relational database systems, ACM Comput. Surv. 40 (4) (2008).
    [30] L. G. A. Marian, N. Bruno. Evaluating Top-k Queries Over Web Accesible Sources. TODS 29(2), 2004.
    [31] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nørvag. Efficient processing of top-k spatial keyword queries. In SSTD, pages 205–222, 2011.
    [32] G. Tsatsanifos and A. Vlachou. On processing top-k spatio-textual preference queries. In EDBT, pages 433–444, 2015.
    [33] Y.-Y. Chen, T. Suel, and A. Markowetz. Efficient query processing in geographic web search engines. In SIGMOD,pages 277–288, 2006.
    [34] G. Cong, C. S. Jensen, and D. Wu. Efficient retrieval of the top-k most relevant spatial web objects. PVLDB,2(1):337–348, 2009.
    [35] Xin, Cong Gao, Jensen Christian, Ooi Beng Chin, “Collective spatial keyword querying,” in Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, Pages 373-384, Athens, Greece, June 12 - 16, 2011.
    [36] L Chen, G Cong, CS Jensen, D Wu, “Spatial keyword query processing: an experimental evaluation,” in Proceedings of the 39th international conference on Very Large Data Bases (VLDB), Pages 217-228, January 2013.
    [37] JB Rocha-Junior, K Nørvåg, "Top-k spatial keyword queries on road networks," EDBT, 2012.
    [38] Siqiang Luo, Yifeng Luo, Shuigeng Zhou, Gao Cong, Jihong Guan, Zheng Yong,“Distributed Spatial Keyword Querying on Road Networks,”EDBT, 2014.
    [39] Dongxiang Zhang, Chee-Yong Chan, and Kian-Lee Tan, “Processing spatial keyword query as a top-k aggregation query,” in Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, 2014.
    [40] A Cary, O Wolfson, N Rishe, “Efficient and Scalable Method for Processing Top-k Spatial Boolean Queries”, SSDBM, 2010

    下載圖示 校內:2024-08-01公開
    校外:2024-08-01公開
    QR CODE