
Author: 黃詩婷 (Huang, Shih-Ting)
Thesis title: 微網誌資料之產品特徵識別 (Identification of Item Features in Microblogging Data)
Advisor: 高宏宇 (Kao, Hung-Yu)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2015
Academic year of graduation: 103 (ROC calendar)
Language: English
Number of pages: 54
Keywords (Chinese): 微網誌, 社群網路, 意見探勘, 資訊擷取
Keywords (English): Microblogging, Social Network, Opinion Mining, Information Extraction
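  • In recent years, the popularity of social networking sites and the ubiquity of smartphones have led more and more people to post on these sites. Users have gradually become accustomed to using these services to share their lives and opinions. Capturing useful information from this large volume of messages has therefore become important, and it is one of the key issues in information extraction in recent years. Many studies analyze the opinions users express on social networks to understand how intensely the public discusses certain products or items. Most of them, however, only analyze whether opinions about a product or item are positive or negative, and cannot reveal in detail which factors cause these products or items to be discussed. We therefore analyze the features of products or items in microblog message threads in order to obtain more precise expressions of opinion.
    In our research, we aim to use opinion mining on social networking sites such as Twitter to find the features that make a product or item popular, along with related items. This thesis proposes using sentiment words to identify the features of products and items, and builds a system that extracts item features on microblogging services. The method is divided into two stages. The data processing stage turns microblogging data into useful information. The feature extraction stage analyzes the sentiment in messages to extract features, then distinguishes the relations among product-related features and groups them according to how these features appear in message conversation threads.
    Finally, the results show that our method can effectively identify the features most discussed by the public for each product under different topics, and can group conceptually similar features. We invited five users to evaluate the experimental results, which show that our method outperforms every baseline method under different thresholds. In addition, another three users evaluated the feature grouping results, which show that the method performs well across different topics.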

    In recent years, microblogging services have become very popular, and almost everyone has a smartphone or tablet. As a result, more and more users share their daily lives on these platforms. The larger the volume of real-time information generated by millions of users, the more important it becomes to extract useful information from microblogging services. Many existing works analyze users' opinions in microblogs to gauge how popular products or items are. They usually assign an opinion polarity directly to the items; however, this does not reveal which features make the items popular with the public.
    In this work, we use opinion mining to find the relevant and significant features of items from microblogging services such as Twitter. We construct a sentiment-based framework to identify the relevant features in microblogs. Our method consists of two stages. First, the data processing stage processes the raw data from microblogging services. Then, in the second stage, we extract the relevant features from these messages based on their sentiment characteristics, use the extracted features to construct a relevant feature network with Pointwise Mutual Information, and group the features according to their conceptual relations. Our system can therefore be applied to learn the characteristics of a product quickly and explicitly.
    In our experiments, our system effectively identifies the popular item features in different domains, and features that share a concept cluster together in small groups. We invited five users to evaluate the results, which show that our final method generally outperforms the baselines, even under different thresholds. In addition, another three users evaluated the performance of feature grouping, and the results across different domains show that our method performs well on average.
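
    The abstract states that the relevant feature network is built with Pointwise Mutual Information (PMI). The sketch below illustrates only the general PMI computation, PMI(x, y) = log2(p(x, y) / (p(x) p(y))). It is not the thesis's implementation: the function name `pmi_scores`, the input format (one set of feature terms per conversation), and the choice of the conversation as the co-occurrence unit are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(conversations):
    """Return PMI for every pair of feature terms that co-occur.

    `conversations` is a hypothetical input format: a list in which each
    element is the set of feature terms found in one conversation.
    Probabilities are estimated as the fraction of conversations that
    contain a term (or both terms of a pair).
    """
    n = len(conversations)
    single = Counter()  # conversations containing each term
    pair = Counter()    # conversations containing both terms of a pair
    for feats in conversations:
        feats = set(feats)
        single.update(feats)
        pair.update(combinations(sorted(feats), 2))

    scores = {}
    for (x, y), c_xy in pair.items():
        p_xy = c_xy / n
        p_x = single[x] / n
        p_y = single[y] / n
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy data, invented for illustration only.
toy = [
    {"battery", "screen"},
    {"battery", "screen"},
    {"camera"},
    {"camera", "price"},
]
scores = pmi_scores(toy)
```

    Pairs with a high PMI co-occur more often than their individual frequencies would suggest, so one plausible way to turn the scores into a network is to keep an edge for each pair above a chosen threshold. Whether the thesis does this, and with what threshold, is not stated in this record.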

    CHINESE ABSTRACT
    ABSTRACT
    TABLE LISTING
    FIGURE LISTING
    1. INTRODUCTION
        1.1 Background
        1.2 Motivation
        1.3 Method abstract
        1.4 Paper structure
    2. RELATED WORK
        2.1 Sentiment analysis
            2.1.1 Lexicon approach
            2.1.2 Machine learning approach
        2.2 Feature extraction
    3. METHOD
        3.1 Overall architecture of our approach
        3.2 Tweet conversation collecting
        3.3 Text Information Processing
            3.3.1 Part-of-Speech tagging (POS)
            3.3.2 The advertisement or spam tweets removal
        3.4 Opinion feature filtering
        3.5 Feature extraction
        3.6 Feature grouping
    4. EXPERIMENTS
        4.1 Dataset
        4.2 Opinion feature filtering
        4.3 Feature extraction
        4.4 Feature grouping
    5. CONCLUSIONS AND FUTURE WORKS
    6. REFERENCES


    Full text availability: on campus from 2016-04-27; off campus from 2020-04-27