
Graduate student: Chang, Kai-Liang (張凱亮)
Thesis title: A Framework for Extracting Knowledge and Composing Domain Specific Contents
Advisor: Jiau, Hewijin Christine (焦惠津)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2003
Graduation academic year: 91 (ROC calendar)
Language: English
Number of pages: 61
Keywords (English): Framework, Knowledge engineering, Information extraction, Agent
  • When authors compose documents, they usually begin by collecting information pertinent to their composition context. They may want either to understand the subject matter better or to acquire materials that can be used in the document they are composing. Afterwards, authors need to devise a fluent logical structure that organizes the acquired materials and their own statements into the composition. To streamline this process, we propose a framework, ExcDoc (a framework for Extracting knowledge and composing Domain Specific contents), which facilitates both the preliminary material preparation and the subsequent document composition. The framework employs agents that extract information from specific information sources by consulting an ontology that captures the structure of each source. In addition, we iteratively elicit representative templates from documents of similar style, so that the templates reflect the logical structure of documents written from a specific perspective. Finally, an agent adopts a deployment strategy to place applicable materials into the templates for the authors' compositions.
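
The pipeline outlined in the abstract (ontology-guided extraction, a template of concept slots, and an agent that deploys materials into the template) might be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, the dictionary-based ontology, and the cue-phrase matching are all hypothetical and are not taken from the thesis, whose actual agents and ontology are not described in this record.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SemanticUnit:
    """A piece of extracted material tagged with the concept it conveys."""
    concept: str
    text: str


@dataclass
class Template:
    """Ordered concept slots standing in for a document's logical structure."""
    slots: List[str]


class ExtractionAgent:
    """Hypothetical agent; the 'ontology' is assumed to map concept -> cue phrases."""

    def __init__(self, ontology: Dict[str, List[str]]):
        self.ontology = ontology

    def extract(self, source: str) -> List[SemanticUnit]:
        units = []
        for sentence in source.split("."):
            sentence = sentence.strip()
            if not sentence:
                continue
            for concept, cues in self.ontology.items():
                if any(cue in sentence.lower() for cue in cues):
                    units.append(SemanticUnit(concept, sentence))
                    break  # each sentence is assigned to at most one concept
        return units


class DeploymentAgent:
    """Hypothetical strategy: put each unit into the slot named after its concept."""

    def deploy(self, template: Template, units: List[SemanticUnit]) -> Dict[str, List[str]]:
        draft = {slot: [] for slot in template.slots}
        for unit in units:
            if unit.concept in draft:
                draft[unit.concept].append(unit.text)
        return draft


# Toy data, invented for illustration only.
ontology = {
    "definition": ["is a", "is defined as"],
    "usage": ["is used", "applies to"],
}
source = (
    "An agent is a program that acts for a user. "
    "It is used to gather data. "
    "The sky is blue."
)

units = ExtractionAgent(ontology).extract(source)
draft = DeploymentAgent().deploy(Template(slots=["definition", "usage"]), units)
```

Sentences that match no concept are simply left out of the draft, which mirrors the idea that only applicable materials are deployed; a real deployment strategy would presumably be considerably richer than this slot lookup.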

    1 Introduction
      1.1 Motivation
      1.2 Focus
      1.3 Background
        1.3.1 Participating Agents
        1.3.2 Supportive Knowledge Facilities
      1.4 Objective and Contribution
    2 Related Work
      2.1 GETESS
      2.2 Ontobroker
    3 System Overview
      3.1 System Architecture
      3.2 System Development
      3.3 System Functionality
        3.3.1 Establish Draft
        3.3.2 Refine Document
    4 Template Elicitation
      4.1 Template Construct
      4.2 Characteristic Attributes of Organizational Concepts
      4.3 Issues of Template Acquisition
      4.4 A Template for Documents in a Specific Perspective
        4.4.1 Template Elicitation Process
        4.4.2 Notes
    5 Material Acquisition
      5.1 Rationale
      5.2 Semantic Unit Extraction
        5.2.1 Concepts Conveyed by Semantic Units
        5.2.2 Appropriate Granularity of Semantic Units
        5.2.3 Semantic Unit Construct
      5.3 Technical Document Processing
    6 Material Deployment
      6.1 Dominant Factors
        6.1.1 Subject Information
        6.1.2 Axioms
      6.2 Auxiliary Knowledge
        6.2.1 Semantic Level
        6.2.2 Information Sources
    7 Framework Extraction
      7.1 Domain Analysis
      7.2 Hot-spot Analysis
      7.3 Evolution of The Framework
    8 Empirical Results
      8.1 Prologue
      8.2 Experimental Background
      8.3 Sample Scenario
        8.3.1 Scenario 1
        8.3.2 Scenario 2
    9 Conclusion and Future Work
      9.1 Conclusion
      9.2 Future Work
    References


    Full text availability: on campus, public from 2005-07-23
    Off campus, public from 2008-07-23