| Graduate student: | 張明智 Zhang, Mieng-Zhi |
|---|---|
| Thesis title: | 以空間資料搜尋支援資料探勘之研究 Searching Spatial Data to Support Data Mining |
| Advisor: | 洪榮宏 Hong, Jung-Hong |
| Degree: | Master |
| Department: | College of Engineering - Department of Geomatics |
| Year of publication: | 2005 |
| Academic year of graduation: | 93 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 143 |
| Chinese keywords: | 資料選取知識、空間資料倉儲、空間資料探勘、詮釋資料 |
| English keywords: | spatial data mining, metadata, spatial data warehouse, data selection knowledge |
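Spatio-temporal data mining is a research direction that has emerged in recent years; it helps decision makers explore hidden information in the large volumes of data found in the spatio-temporal domain. Its success, however, presupposes that all data relevant to the subject of analysis can be completely searched and correctly interpreted. Because a spatial data warehouse stores large amounts of spatial data and provides a good environment for data analysis, it is natural to expect the data for spatio-temporal mining to be supplied by the data warehouse. For operators who may not understand the data warehouse, the question is how to confirm that the data obtained meets their needs; clearly, a more effective communication mechanism is needed between the data demands of spatio-temporal mining and the data supply of the data warehouse.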
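This study therefore proposes a prototype architecture for a data warehouse search system that (1) builds a complete environment for data search and application, and (2) uses data query rules, composed from the integration of metadata and the design of data selection knowledge, as a bridge between users and the data warehouse. The use of metadata provides consistent and appropriate descriptions of the data in the warehouse. The design of data selection knowledge analyzes the patterns that may exist in data requirement conditions and converts them into specific query rules that can be applied quickly to data searching. The study also designs a query interface and web pages that package the results, so that users can communicate with the system after limited training. The search framework receives the user's query request and, using the configured data selection knowledge, converts it into an SQL query against the metadata. The advantage of the system is that users need no extensive knowledge of the data warehouse: using the system alone, the data relevant to a query can be found and returned to the user interface. As data warehouses grow rapidly in distributed GIS environments, the knowledge-assisted data search engine proposed in this study can increase the use of data and advance GIS technology.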
Spatio-temporal data mining, a research topic that helps decision makers discover implicit information in massive volumes of spatio-temporal data, has received considerable attention in recent years. The success of such data mining, however, depends on whether all spatial data related to the research domain can be found and correctly interpreted. Since a spatial data warehouse can store huge amounts of spatial data and provide an environment for analysis, it is reasonable to expect that a spatial data warehouse can fulfill the needs of spatio-temporal data mining. Nonetheless, for users unfamiliar with interacting with a spatial data warehouse, an efficient communication mechanism is clearly needed.
This thesis proposes a prototype data searching framework for spatial data warehouses that aims to 1) create a complete environment for data searching and evaluation and 2) bridge the gap between users and the warehouse by integrating metadata with data selection knowledge. Metadata provides unified and appropriate descriptions of the data in the warehouse. The design of data selection knowledge, on the other hand, analyzes the possible patterns of data requirements and transforms them into formalized rules that can be readily used to query spatial data. A query interface based on data mining requirements is designed so that users can interact with the system after limited training. The searching framework receives the request and, using the built-in selection knowledge, transforms it into an SQL query against the metadata. The merit of the proposed system is that users do not need extensive knowledge of the warehouse, yet all related data can still be found and returned to them. With the fast growth of data warehouses in the distributed GIS environment, such a knowledge-assisted data searching mechanism can increase the use of data and advance GIS technology.
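The step in which selection rules turn a user request into an SQL query against the metadata can be illustrated with a minimal Python sketch. It is not the system built in the thesis: the `metadata` table layout, the three rule functions, the request keys (`theme`, `bbox`, `period`), and the sample rows are all hypothetical and chosen only for illustration, and SQLite stands in for whatever database the warehouse actually uses.

```python
import sqlite3

# Hypothetical metadata table: one row per dataset held in the warehouse.
SCHEMA = """
CREATE TABLE metadata (
    dataset_id TEXT PRIMARY KEY,
    theme TEXT,
    west REAL, south REAL, east REAL, north REAL,
    begin_year INTEGER, end_year INTEGER
)
"""

# Invented sample rows, used only to exercise the rules below.
ROWS = [
    ("roads_2003", "transportation", 120.0, 22.8, 120.5, 23.3, 2003, 2003),
    ("landuse_1995", "land use", 120.1, 22.9, 120.4, 23.2, 1995, 1995),
    ("rain_1990_2004", "rainfall", 119.9, 21.8, 122.1, 25.4, 1990, 2004),
    ("landuse_north", "land use", 121.3, 24.9, 121.7, 25.2, 2002, 2002),
]


def rule_theme(request):
    """Thematic rule: the dataset theme must equal the requested theme."""
    if "theme" not in request:
        return None
    return "theme = ?", [request["theme"]]


def rule_bbox(request):
    """Spatial rule: the dataset extent must overlap the requested box."""
    if "bbox" not in request:
        return None
    west, south, east, north = request["bbox"]
    clause = "west <= ? AND east >= ? AND south <= ? AND north >= ?"
    return clause, [east, west, north, south]


def rule_period(request):
    """Temporal rule: the dataset period must overlap the requested years."""
    if "period" not in request:
        return None
    start, end = request["period"]
    return "begin_year <= ? AND end_year >= ?", [end, start]


# The "selection knowledge" of this sketch is simply an ordered list of rules.
RULES = [rule_theme, rule_bbox, rule_period]


def build_query(request):
    """Translate a request dict into a parameterized SQL query on metadata."""
    clauses, params = [], []
    for rule in RULES:
        result = rule(request)
        if result is not None:
            clause, values = result
            clauses.append(f"({clause})")
            params.extend(values)
    sql = "SELECT dataset_id FROM metadata"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY dataset_id"
    return sql, params


def search(conn, request):
    """Run the generated query and return the matching dataset identifiers."""
    sql, params = build_query(request)
    return [row[0] for row in conn.execute(sql, params)]


def make_demo_connection():
    """Create an in-memory database holding the sample metadata rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn
```

With the sample rows, calling `search(make_demo_connection(), {"theme": "land use", "bbox": (120.2, 23.0, 120.3, 23.1), "period": (1990, 2000)})` returns only `landuse_1995`. Keeping each rule as a separate function means a new kind of selection knowledge could be added without changing the query builder, and the user never writes SQL directly.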