Author: Duy, Do (杜維)
Thesis title: Improving Data Warehouse Performance Through Efficient Maintenance of Basic Statistic Functions
Advisor: Huang, Yeu-Shiang (黃宇翔)
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2008
Graduation academic year: 96 (ROC calendar)
Language: English
Number of pages: 42
Keywords: Data warehouse, self-maintainability, statistic function
  • When a decision needs to be made, decision makers usually try to gather as much information as possible to support it. Simple but meaningful statistic functions are often used to retrieve that information. However, operational systems must spend considerable time and computation to meet these needs, which degrades the performance of daily operations. This is one of the major reasons data warehouses emerged in industry.
    A data warehouse contains a large amount of information, on which a large number of summary tables, or materialized aggregate views, are built to improve system performance. As changes, most notably new transactional data, are collected from the various data sources, all summary tables in the warehouse that depend on this data must be updated accordingly. Because the number of summary tables to maintain is often large, a critical issue in data warehousing is how to maintain them efficiently.
    In this thesis, we investigate the maintenance methods currently used in industry to enhance the data warehouse. Based on a review of the literature, we propose an approach in which additional auxiliary data is kept inside the data warehouse, serving as a kind of legacy that supports the maintenance of some statistic functions. The proposed approach substantially improves data warehouse maintenance: the time needed to maintain these basic statistic functions is reduced from minutes to seconds. Finally, we perform a comparative analysis to verify the proposed methods.
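    The idea of keeping auxiliary data so that statistic functions can be updated from the changes alone can be illustrated with a minimal Python sketch. This is not the thesis's actual design: the class name `GroupStats`, the choice of a sum of squares and a value-multiplicity table as auxiliary data, and the functions covered are all assumptions made for illustration. Count and sum can be updated from each inserted or deleted value directly; variance additionally needs the sum of squares; minimum and maximum can be updated on insertion but need extra information after a deletion, which the multiplicity table supplies here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class GroupStats:
    """One summary-table row plus hypothetical auxiliary data (illustrative only)."""
    count: int = 0
    total: float = 0.0
    total_sq: float = 0.0  # auxiliary: sum of squares, enough to update variance
    values: Counter = field(default_factory=Counter)  # auxiliary: value -> multiplicity

    def insert(self, x: float) -> None:
        """Apply one inserted source value without rescanning the base data."""
        self.count += 1
        self.total += x
        self.total_sq += x * x
        self.values[x] += 1

    def delete(self, x: float) -> None:
        """Apply one deleted source value without rescanning the base data."""
        if self.values[x] == 0:
            raise ValueError(f"value {x!r} is not present in this group")
        self.count -= 1
        self.total -= x
        self.total_sq -= x * x
        self.values[x] -= 1
        if self.values[x] == 0:
            del self.values[x]

    @property
    def mean(self) -> float:
        return self.total / self.count

    @property
    def variance(self) -> float:
        """Population variance computed as E[x^2] - (E[x])^2."""
        m = self.mean
        return self.total_sq / self.count - m * m

    @property
    def minimum(self) -> float:
        # Scans the distinct values in the auxiliary table, not the base rows.
        return min(self.values)

    @property
    def maximum(self) -> float:
        return max(self.values)


stats = GroupStats()
for amount in [2, 4, 4, 6]:
    stats.insert(amount)
stats.delete(2)  # the minimum is recovered from the auxiliary table
print(stats.count, stats.mean, stats.variance, stats.minimum)
```

    The trade-off in this sketch is extra storage for faster refresh: the multiplicity table grows with the number of distinct values per group, and the sum-of-squares formula can lose precision on large values, so a production design would likely need a more careful structure.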

    ABSTRACT I
    TABLE OF CONTENTS IV
    LIST OF TABLES VI
    LIST OF FIGURES VII
    CHAPTER 1 INTRODUCTION 1
    CHAPTER 2 LITERATURE REVIEW 4
    2.1 Data warehouse 4
    2.1.1 Basic concept 4
    2.1.2 Maintenance of Data warehouse 7
    2.1.3 Analysis in Data warehouse 9
    2.2 Maintenance in data warehouse 10
    2.2.1 Types of maintenance 11
    2.2.2 Methods of maintenance 12
    2.3 Self-maintenance 14
    CHAPTER 3 RESEARCH APPROACH 17
    3.1 Problem description 17
    3.1.1 The Concepts of Self-maintainability, semi-self-maintainability and non-self-maintainability 19
    3.1.2 Problem formulation 20
    3.2 Maintain statistic functions incrementally 22
    3.2.1 Maintain self-maintainable statistic functions 22
    3.2.2 Maintain semi-self-maintainable statistic functions 24
    3.2.2.1 Structure of the auxiliary data for semi-self-maintainable function 25
    3.2.2.2 Maintain auxiliary table of semi-self-maintainable functions 27
    3.2.3 Maintain non-self-maintainable function incrementally 29
    3.2.3.1 Structure of auxiliary table for non-self-maintainable function 29
    3.2.3.2 Using auxiliary table to maintain non-self-maintainable function 30
    CHAPTER 4 APPLICATION 32
    4.1 Experimental environment specification 32
    4.2 Statistical analysis 34
    4.3 Discussion on the finding 36
    CHAPTER 5 CONCLUSION 38
    5.1 Contribution 38
    5.2 Limitation 38
    5.3 Future research 39
    REFERENCES 40

    Agrawal, D., Abbadi, A. E., Singh, A., & Yurek, T. (1997). Efficient view maintenance at data warehouses. Paper presented at the ACM SIGMOD international conference on management of data, Tucson, Arizona, United States.
    Akinde, M. O., Jensen, O. G., & Bohlen, M. H. (1998). Minimizing detail data in data warehouses. Paper presented at the sixth international conference on extending database technology, Spain.
    Benander, A., Benander, B., Fadlalla, A., & James, G. (2000). Data warehouse administration and management. Information systems management, 17(1), 71-80.
    Blakeley, J. A., Larson, P.-A., & Tompa, F. W. (1986). Efficiently updating materialized views. Paper presented at the 1986 ACM SIGMOD international conference on management of data, Washington, D.C., United States.
    Chao, C.-M. (2004). Incremental maintenance of object-oriented data warehouses. Information sciences, 160, 91-110.
    Cho, M., Pei, J., & Wang, K. (2007). Answering ad hoc aggregate queries from data streams using prefix aggregate trees. Knowledge and information systems, 12(3), 301-329.
    Devlin, B. (2000). Data warehouse from architecture to implementation: Addison Wesley.
    Gupta, A., Mumick, I. S., & Subrahmanian, V. S. (1993). Maintaining views incrementally. Paper presented at the ACM SIGMOD international conference on management of data, Washington DC.
    Gupta, H., & Mumick, I. S. (2005). Selection of views to materialize in a data warehouse. IEEE transactions on knowledge and data engineering, 17, 24-43.
    Gupta, H., & Mumick, I. S. (2006). Incremental maintenance of aggregate and outerjoin expressions. Information systems, 31, 435-464.
    Imielinski, T., & Lipski, W., Jr. (1984). Incomplete information in relational databases. Journal of the ACM, 31(4), 761-791.
    Inmon, W. H. (1996). Building the data warehouse: Wiley.
    Inmon, W. H., Welch, J. D., & Glassey, K. L. (1997). Managing the data warehouse: Wiley.
    Kimball, R., Reeves, L., Ross, M., & Thornthwaite, W. (1998). The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses: Wiley.
    KDD Cup (1999) http://archive.ics.uci.edu/ml/databases/kddcup99/kddcup99.html.
    Laurent, D., Lechtenborger, J., Spyratos, N., & Vossen, G. (2001). Monotonic complements for independent data warehouses. The VLDB journal, 10, 295-315.
    Lee, K. Y., Song, J. H., & Kim, M. H. (2007). Reducing the cost of accessing relations in incremental view maintenance. Decision Support Systems, 43, 512-526.
    Li, H.-G., Yu, H., Agrawal, D., & Abbadi, A. E. (2007). Progressive ranking of range aggregates. Data & Knowledge Engineering, 63, 4-25.
    Liang, W., Li, H., Wang, H., & Orlowska, M. E. (1999). Making multiple views self-maintainable in a data warehouse. Data & Knowledge Engineering, 30, 121-134.
    Mannino, M. V., & Walter, Z. (2006). A framework for data warehouse refresh policies. Decision Support Systems, 42, 121-143.
    Mohania, M., & Kambayashi, Y. (2000). Making aggregate views self-maintainable. Data & Knowledge Engineering, 32, 87-109.
    Mumick, I. S., Quass, D., & Mumick, B. S. (1997). Maintenance of data cubes and summary tables in a warehouse. Paper presented at the 1997 ACM SIGMOD international conference on management of data, Tucson, Arizona, United States.
    Nemati, H. R., Steiger, D. M., Iyer, L. S., & Herschel, R. T. (2002). Knowledge warehouse: an architectural integration of knowledge management, decision support, artificial intelligence and data warehousing. Decision Support Systems, 33, 143-161.
    Ponniah, P. (2001). Data warehousing fundamentals: Wiley.
    Quass, D., Gupta, A., Mumick, I. S., & Widom, J. (1996). Making views self-maintainable for data warehousing. Paper presented at the PDIS.
    Sen, A. (2004). Metadata management: past, present and future. Decision Support Systems, 37, 151-173.
    Shi, Z., Huang, Y., He, Q., Xu, L., Liu, S., Qin, L., et al. (2007). MSMiner-a developing platform for OLAP. Decision Support Systems, 42, 2016-2028.
    Shu, H. (1997). View maintenance using conditional tables. Paper presented at the 5th DOOD, Berlin Heidelberg New York.
    Skyt, J., Jensen, C. S., & Pedersen, T. B. (2008). Specification-based data reduction in dimensional data warehouses. Information systems, 33, 36-63.
    Tremblay, M. C., Fuller, R., Berndt, D., & Studnicki, J. (2007). Doing more with more information: Changing healthcare planning with OLAP tools. Decision Support Systems, 43, 1305-1320.
    Watson, H. J., & Ariyachandra, T. (2005). Data warehouse architectures: factors in the selection decision and the success of the architectures. Athens, Georgia: Terry college of business, university of Georgia.
    Yeung, G. C. H., & Gruver, W. A. (2005). Multiagent immediate incremental view maintenance for data warehouses. IEEE transactions on systems man and cybernetics part a-systems and humans, 35, 305-310.
    Zhang, X., & Rundensteiner, E. A. (2002). Integrating the maintenance and synchronization of data warehouses using a cooperative framework. Information systems, 27, 219-243.

    Full-text availability: on campus, public from 2009-08-20; off campus, public from 2009-08-20