National Cheng Kung University Electronic Theses and Dissertations System

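Graduate student:	Chen, Wei-nan (陳威男)
Thesis title:	Identifying Structured Resource Objects on the Web Using Hidden Markov Model (利用隱藏式馬可夫模型辨識網頁上的結構化資源)
Advisor:	Lu, Wen-hsiang (盧文祥)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication:	2009
Academic year of graduation:	97 (ROC calendar)
Language:	Chinese
Number of pages:	39
Keywords (translated from Chinese):	Structured Web Resources, Structured Resource Identification Model, Web Search
Keywords (English):	Sequential Structured Resource Objects Identification Model, Structured Web Resources, Web Search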

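The Web offers users a wealth of useful resources in many forms. With a general search engine, however, users can only supply keywords; the engine uses keyword matching to find related pages, and users must then look through those pages for the resource they want. Search engines therefore cannot respond effectively to a user's need for a specific resource. Looking at the resources available on the Web, we observed that some textual resources have a specific sequential structure: they consist of sentences that describe a method as ordered steps, and this series of step-by-step instructions serves to make or prepare something or to carry out some task. We define this kind of resource as Sequential Structured Resource Objects. To identify structured resources on web pages effectively, this thesis proposes a Sequential Structured Resource Objects Identification Model, which is trained statistically on the structure of structured resources and can then identify them on web pages.
The experiments in this thesis used three different kinds of structured resources for training and testing. According to our experimental results, for all three kinds the proposed model can effectively identify whether the content of a web page is a structured resource. If the range of resource types is expanded in the future, the model could be applied to all pages on the Web.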

There are many useful web resources for users on the Web. However, users can only use keywords to search in general search engines, and usually have to find the web resources they want by themselves. In fact, search engines cannot respond to user demands effectively.
We observed that some textual web resources are structured. These web resources are sets of instructions for making, preparing, or doing something, and they are composed of sequential sentences. We call this kind of web resource sequential structured resource objects. To identify sequential structured resource objects, we propose a Sequential Structured Resource Objects Identification Model that learns the structure of sequential structured resource objects and then identifies them on the Web.
In this thesis, we collected three kinds of sequential structured resource objects to train the model. Our model can effectively identify sequential structured resource objects on the Web for all three kinds of web resources.

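Table of Contents
Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1	Introduction
1.1	Motivation
1.2	Approach
1.3	Thesis Organization
Chapter 2	Related Work and References
2.1	Related Work on Web Resources
2.2	Related Work on Structured Web Resources
2.3	Related Work on Databases
Chapter 3	Methodology
3.1	Problem Description
3.2	System Architecture
3.3	Sequential Structured Resource Objects Identification Model
3.3.1	Structured Resource Identification Model
3.3.2	Extract Feature States
3.3.3	Object Emission Model, Action Sequence Model, and Resource Object Model
Chapter 4	Experiments
4.1	Experimental Datasets
4.2	Evaluation Method
4.3	Accuracy Evaluation and Error Analysis for Structured Resource Identification
4.4	Accuracy Evaluation and Error Analysis for Structured versus Unstructured Resource Identification
4.5	Evaluation of Learning Effectiveness and Error Analysis for the Structured Resource Model
Chapter 5	Conclusions and Future Work
5.1	Conclusions
5.2	Future Work and Directions
References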
                                    

Made public on 2009-08-24
