
Graduate student: 黃微芬 (Huang, Wei-Fen)
Thesis title: 改善實體識別與特徵擷取之評論摘要方法 (Developing a Summarization Method by Improving Entity Identification and Feature Extraction)
Advisor: 王惠嘉 (Wang, Hei-Chia)
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2017
Academic year of graduation: 105
Language: Chinese
Number of pages: 66
Keywords: Product feature extraction, Sentiment analysis, Opinion summarization
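    With Web 2.0 technologies, platforms such as social media, e-commerce sites, forums, and blogs let consumers share their experience of using products through written word-of-mouth. Many consumers search online for reviews before buying a product, and these opinions often influence their decisions to purchase products or subscribe to services. As the number of Internet users grows, the volume of reviews keeps increasing, causing information overload that makes it hard for consumers to quickly find the information they need. Although some sites offer ratings and product recommendations, these are general overall assessments that cannot fully reflect ratings of the functions users are interested in, because what many consumers really care about is product details rather than price alone.
    Among the many product categories, 3C products (computers, communications, and consumer electronics) are relatively expensive, so consumers tend to spend more effort collecting and evaluating information before making a purchase decision; a summary of review opinions is therefore needed to meet their needs. In addition, among 3C products the penetration rate of smartphones has reached 82%, which shows how prevalent mobile devices have become, so this study uses smartphones as the subject for data validation.
    Most previous studies have dealt with English reviews. However, Chinese and English differ in word usage and grammatical structure, so analysis methods differ and feature extraction is comparatively harder for Chinese than for English; this study therefore analyzes Chinese reviews. It comprises three main tasks, namely product feature extraction, identification of the semantic orientation of opinion sentences, and review summarization with calculation of an opinion score for each feature, and consolidates Chinese product reviews in a feature-based summary.
    Finally, experiments were conducted to validate the proposed method. The results show that the method can effectively extract product names from reviews, reaching an F-measure of 0.73, while the POS pattern rules reach an F-measure of 0.58.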

    With Web 2.0, consumers can share their product experience through word-of-mouth on social media, e-commerce sites, forums, and other platforms. Many buyers search for reviews, which often affect their decision to purchase products. As the number of Internet users grows rapidly, the number of reviews increases, causing information overload that makes it difficult for consumers to quickly find the information they need. Some e-commerce sites provide ratings and product recommendations, but this information does not fully reflect users' interests, because many consumers really care about the details of the product, such as image quality. In particular, 3C products are relatively expensive, so consumers tend to spend more effort collecting and evaluating information before making a purchase decision. It is therefore necessary to provide a summary of reviews to meet consumer needs.
    Currently, the smartphone penetration rate is about 82%, which shows the prevalence of mobile devices, so this study uses mobile phone reviews for data validation. The study comprises three main tasks: product feature extraction, identification of the semantic orientation of opinion sentences, and review summarization with calculation of each feature's opinion score. The proposed method consolidates Chinese product reviews in a feature-based summary.
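    The last of the three tasks, a feature-based summary with a per-feature opinion score, can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the score used here, (positive - negative) / total, is purely an assumption for illustration, as are the function name `summarize_by_feature` and the sample data; the input is assumed to be (feature, polarity) pairs already produced by the extraction and sentiment steps.

```python
from collections import defaultdict


def summarize_by_feature(opinions):
    """Group (feature, polarity) pairs into a feature-based summary.

    polarity is +1 for a positive opinion sentence and -1 for a negative one;
    pairs with polarity 0 are skipped. The score formula is an ASSUMPTION for
    illustration, not the formula used in the thesis.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for feature, polarity in opinions:
        if polarity == 0:
            continue  # neutral sentences carry no opinion
        key = "positive" if polarity > 0 else "negative"
        counts[feature][key] += 1

    summary = {}
    for feature, c in counts.items():
        total = c["positive"] + c["negative"]
        score = (c["positive"] - c["negative"]) / total  # ranges from -1 to +1
        summary[feature] = {"positive": c["positive"],
                            "negative": c["negative"],
                            "score": score}
    return summary


# Hypothetical opinion pairs, as if extracted from smartphone reviews.
opinions = [("screen", 1), ("screen", 1), ("battery", -1),
            ("screen", -1), ("battery", -1)]
summary = summarize_by_feature(opinions)
for feature, stats in sorted(summary.items()):
    print(feature, stats)
```

    Keeping the positive and negative counts next to the score lets a reader of the summary see how many opinions support each value.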
    The experimental results are evaluated by precision, recall, and F-measure. They show that our entity identification method outperforms state-of-the-art methods, with an F-measure of 0.73, and that our POS pattern extraction method can identify feature and opinion words with an F-measure of 0.58.
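    For readers unfamiliar with these metrics, the sketch below shows how precision, recall, and F-measure are commonly computed for an extraction task. It assumes the balanced F1 variant (the abstract does not say which F-measure weighting was used) and exact matching between extracted and gold items; the function name `precision_recall_f1` and the product names are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Return (precision, recall, F1) for extracted items versus gold items.

    ASSUMPTION: balanced F1 with exact string matching; the thesis may use a
    different weighting or matching rule.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold and extracted product names.
gold = {"iPhone 7", "Galaxy S7", "Xperia XZ", "HTC 10"}
predicted = {"iPhone 7", "Galaxy S7", "HTC 10", "Zenfone"}
p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

    In this toy case three of the four extracted names are correct and three of the four gold names are found, so precision, recall, and F1 all equal 0.75.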

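    Chapter 1 Introduction
        1.1 Research Background and Motivation
        1.2 Research Objectives
        1.3 Research Scope and Limitations
        1.4 Research Process
        1.5 Thesis Outline
    Chapter 2 Literature Review
        2.1 Entity Identification
        2.2 Product Feature Extraction
            2.2.1 Term Frequency Inverse Document Frequency (TF-IDF)
            2.2.2 Chi-square Statistic Measure (CHI)
            2.2.3 Mutual Information (MI)
            2.2.4 Information Gain (IG)
        2.3 Sentiment Analysis
        2.4 Summarization
        2.5 Summary
    Chapter 3 Research Method
        3.1 Research Framework
        3.2 Data Collection and Preprocessing Module
        3.3 Entity Identification Module
        3.4 Feature and Opinion Word Extraction Module
        3.5 Sentiment Analysis Module
        3.6 Summary
    Chapter 4 System Implementation and Validation
        4.1 System Environment Setup
        4.2 Experimental Method
            4.2.1 Data Sources
            4.2.2 Evaluation Metrics
        4.3 Parameter Settings
            4.3.1 Parameter 1: n-gram length in the entity identification module
            4.3.2 Parameter 2: length n of the POS patterns
            4.3.3 Parameter 3: top k POS patterns
            4.3.4 Parameter 4: threshold for feature semantic similarity
        4.4 Experimental Results and Analysis
            4.4.1 Experiment 1
            4.4.2 Experiment 2
            4.4.3 Experiment 3
            4.4.4 Experiment 4
            4.4.5 Experiment 5
            4.4.6 Experiment 6
        4.5 Summary
    Chapter 5 Conclusion
        5.1 Research Results
        5.2 Future Research Directions
    References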


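    Full-text availability: on campus, open from 2022-08-22; off campus, not open.
    The electronic thesis has not been authorized for public release; for the print copy, consult the library catalog.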