
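Graduate student: Zhang, Mieng-Zhi (張明智)
Thesis title: Searching Spatial Data to Support Data Mining (以空間資料搜尋支援資料探勘之研究)
Advisor: Hong, Jung-Hong (洪榮宏)
Degree: Master
Department: College of Engineering, Department of Geomatics
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Number of pages: 143
Keywords (Chinese): 資料選取知識 (data selection knowledge), 空間資料倉儲 (spatial data warehouse), 空間資料探勘 (spatial data mining), 詮釋資料 (metadata)
Keywords (English): spatial data mining, metadata, spatial data warehouse, data selection knowledge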
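      Spatio-temporal data mining is an emerging research direction that helps decision makers explore information hidden in large volumes of spatio-temporal data. Its success, however, presupposes that all data relevant to the object of analysis can be found completely and interpreted correctly. Because a spatial data warehouse stores large amounts of spatial data and provides a good environment for data analysis, it is natural to expect the warehouse to supply the data that spatio-temporal mining needs. Operators who do not necessarily understand the data warehouse must still be able to confirm that the data they obtain meet their requirements, so a more effective communication mechanism is clearly needed between the data demands of spatio-temporal mining and the data supply of the warehouse.

      This study therefore proposes a prototype architecture for a data warehouse search system that (1) builds a complete environment for data searching and application, and (2) uses data query rules, composed from metadata integration and the design of data selection knowledge, as a bridge between users and the data warehouse. The use of metadata provides consistent and appropriate descriptions of the data in the warehouse. The design of data selection knowledge analyzes the patterns that may exist in data requirement conditions and converts them into specific query rules that can be applied quickly in data searching. The study also designs the query interface and the packaging of the result web pages, so that users can communicate with the system after limited training. The search framework receives the user's query request and, using the configured data selection knowledge, converts it into an SQL query on the metadata. The advantage of the system is that users need little knowledge of the data warehouse: using this system alone, the data relevant to a query can be found and returned to the user interface. With the rapid growth of data warehouses in distributed GIS environments, the knowledge-assisted data search engine proposed in this study can increase the use of data and advance GIS technology.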

      Spatio-temporal data mining, a research topic that helps decision makers discover implicit information in massive volumes of spatio-temporal data, has received a lot of attention in recent years. The success of such data mining, however, depends on whether all spatial data related to the research domain can be found and correctly interpreted. Since a spatial data warehouse can store huge amounts of spatial data and also provides an environment for analysis, it is reasonable to expect it to fulfill the needs of spatio-temporal data mining. Nonetheless, users who are not familiar with how to interact with a spatial data warehouse need an efficient communication mechanism between their data requirements and the data the warehouse supplies.

      This thesis proposes a prototype data searching framework for spatial data warehouses that aims to 1) create a complete data searching and evaluation environment and 2) bridge the gap between users and the warehouse by integrating metadata and selection knowledge. Metadata provides unified and appropriate descriptions of the data in the warehouse. The design of data selection knowledge, on the other hand, analyzes the possible patterns of data requirements and transforms them into formalized rules that can be readily used in querying spatial data. A query interface based on the data mining requirements is designed so that users can interact with the system with limited training. The searching framework receives the request and, using its built-in selection knowledge, transforms it into an SQL-based query against the metadata. The merit of the proposed system is that users do not need a detailed understanding of the warehouse, yet all related data can still be found and sent to them. With the fast growth of data warehouses in the distributed GIS environment, such a knowledge-assisted data searching mechanism can promote the use of data and the advancement of GIS technology.
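
      The step in which a request is transformed into an SQL query against the metadata can be illustrated with a minimal sketch. It is not the thesis's implementation: the `metadata` table, its columns, the task names, and the contents of `SELECTION_RULES` are all hypothetical, chosen only to show how a rule tied to a mining task might add theme, spatial extent, and temporal conditions to a query.

```python
import sqlite3

# Hypothetical "data selection knowledge": each mining task maps to the
# metadata themes it needs and whether a temporal constraint applies.
SELECTION_RULES = {
    "spatial_classification": {"themes": ["land_use", "roads"], "temporal": False},
    "time_series": {"themes": ["rainfall"], "temporal": True},
}


def build_query(task, bbox, period):
    """Turn a user request into a parameterized SQL query over metadata."""
    rule = SELECTION_RULES[task]
    west, south, east, north = bbox
    placeholders = ", ".join("?" for _ in rule["themes"])
    # A dataset qualifies when its bounding box intersects the requested one.
    sql = (
        "SELECT dataset_id, title FROM metadata "
        f"WHERE theme IN ({placeholders}) "
        "AND west <= ? AND east >= ? AND south <= ? AND north >= ?"
    )
    params = list(rule["themes"]) + [east, west, north, south]
    if rule["temporal"]:
        start, end = period
        # The dataset's time span must overlap the requested period.
        sql += " AND begin_year <= ? AND end_year >= ?"
        params += [end, start]
    return sql + " ORDER BY dataset_id", params


def search(conn, task, bbox, period):
    """Run the generated query and return (dataset_id, title) rows."""
    sql, params = build_query(task, bbox, period)
    return conn.execute(sql, params).fetchall()


def make_demo_db():
    """Create an in-memory metadata table with made-up example records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata (dataset_id INTEGER, title TEXT, theme TEXT, "
        "west REAL, south REAL, east REAL, north REAL, "
        "begin_year INTEGER, end_year INTEGER)"
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "Land use 2003", "land_use", 120.0, 22.8, 120.5, 23.2, 2003, 2003),
            (2, "Road network", "roads", 121.0, 24.9, 121.7, 25.2, 2001, 2004),
            (3, "Rainfall stations", "rainfall", 120.1, 22.9, 120.4, 23.1, 1995, 2004),
        ],
    )
    return conn


conn = make_demo_db()
request_bbox = (120.2, 23.0, 120.3, 23.1)  # west, south, east, north
print(search(conn, "spatial_classification", request_bbox, (2000, 2004)))
print(search(conn, "time_series", request_bbox, (2000, 2004)))
```

      In this sketch the user supplies only a task name, an area, and a period; the rule table decides which metadata conditions are generated, which is the sense in which the user needs little knowledge of the warehouse.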

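    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
    §1.1 Research Background
    §1.2 Research Motivation and Objectives
    §1.3 Research Procedure and Methods
    §1.4 Thesis Organization
    Chapter 2 Literature Review
    §2.1 Data Mining Techniques
    §2.1.1 Data Mining Techniques and Their Development
    §2.1.2 Spatial Data Mining
    §2.2 Data Warehouses
    §2.2.1 Definition of a Data Warehouse
    §2.2.2 Spatial Data Warehouses
    §2.3 Metadata
    §2.3.1 Metadata Standards: Introduction and Development
    §2.3.2 Metadata Applications
    Chapter 3 Analysis of Data Requirements for Spatial Data Mining
    §3.1 Data Mining and Data Warehouse Applications
    §3.2 Spatial Data Requirements Analysis
    §3.2.1 Data Requirements of Spatial Classification Tasks
    §3.2.2 Data Requirements of Spatial Clustering and Spatial Error Tasks
    §3.2.3 Data Requirements of Spatial Association Rules
    §3.2.4 Data Requirements of Time Series Analysis
    §3.3 Results of the Data Requirements Analysis and Metadata Applications
    §3.3.1 Metadata Applications
    §3.3.2 Supporting Spatial Data Searching
    Chapter 4 System Architecture Concepts and Design
    §4.1 Design of the Data Warehouse Search System
    §4.2 Design of Spatial Data Requirement Patterns
    §4.2.1 Thematic Requirements for Spatial Data
    §4.2.2 Spatial Extent Requirements for Spatial Data
    §4.2.3 Coordinate Reference System Datum and Accuracy Requirements for Spatial Data
    §4.2.4 Completeness Requirements for Spatial Data
    §4.2.5 Temporal Constraint Requirements for Spatial Data
    §4.3 Data Selection Knowledge and Query Rules
    §4.3.1 Spatial Classification
    §4.3.2 Spatial Association Analysis
    §4.3.3 Spatial Clustering
    §4.3.4 Spatial Error
    §4.3.5 Time Series Analysis
    §4.4 Spatial Data Search Applications
    §4.4.1 Query Interface Design
    §4.4.2 Metadata Search and Matching
    §4.4.3 Packaging of Query Results
    Chapter 5 System Testing and Analysis
    §5.1 System Environment
    §5.2 Experimental Data
    §5.3 Data Search Cases and Analysis
    Chapter 6 Conclusions and Recommendations
    Chapter 7 References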


    Full-text availability: on campus, open immediately; off campus, open from 2005-07-14