
Graduate student: 高啟原
Kao, Chi-Yuan
Thesis title: 個人化微網誌潛在熱門話題發掘與探測方法 - 以Facebook為例
A Potential Hot Topic Discovery and Probing Approaches for Microblog Users - An Example on Facebook
Advisor: 郭耀煌
Kuo, Yau-Hwang
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2013
Academic year of graduation: 101
Language: English
Number of pages: 49
Chinese keywords: 微網誌 (microblog), 潛在熱門話題 (potential hot topic), 話題探針 (topic probing)
English keywords: microblog, potential hot topic, topic probing
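  • Microblogs have grown rapidly in recent years and have produced an enormous amount of data. People can hardly absorb this much data in full, so some researchers try to extract the important, valuable information from it so that users can easily understand what is currently happening on microblogs. Rather than finding hot topics that are already widely known, discovering potential hot topics is more valuable, and can be used for ad placement, social network management, and online marketing.
    This thesis proposes a method, PHTDP, for discovering potential hot topics. PHTDP consists of three parts: the first periodically performs topic detection and tracking on the collected data, the second finds potential hot topics in the results of the first part, and the third uses topic probes to verify the potential of those topics.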

    Microblogs have grown rapidly in recent years, generating very large amounts of data. This volume of data is difficult for people to absorb. Some researchers therefore try to extract the important or valuable information so that people can easily see what is happening on microblogs. Compared with detecting hot topics that are already known to a large number of people, discovering potential hot topics is more valuable for advertising, social network management, and marketing.
    This thesis proposes an approach named PHTDP, which concentrates on potential hot topic discovery. PHTDP is composed of three parts. First, topic detection and tracking (TDT) is performed periodically on the collected data. Second, potential hot topics are extracted from the results of the first part. Third, the potential of these topics is verified by topic probing.

    LIST OF CONTENTS
    LIST OF TABLES
    LIST OF ALGORITHMS
    LIST OF FIGURES
    CHAPTER 1. INTRODUCTION
      1.1 MOTIVATION
      1.2 PROBLEM
      1.3 BACKGROUND
      1.4 CHALLENGE
      1.5 CONTRIBUTION
      1.6 ORGANIZATION OF THIS THESIS
    CHAPTER 2. RELATED WORK
      2.1 MICROBLOGS
      2.2 TOPIC DETECTION AND TRACKING
      2.3 HOT TOPIC EXTRACTION
      2.4 EMERGING TOPIC DETECTION
      2.5 POPULARITY PREDICTION
      2.6 TIME SERIES PREDICTION
    CHAPTER 3. METHOD
      3.1 PROBLEM FORMULATION
      3.2 OVERVIEW
      3.3 TOPIC DETECTION AND TRACKING
        3.3.1 DATA PREPROCESSING
        3.3.2 ONLINE POSTS CLUSTERING
        3.3.3 REPRESENTATIVE WORD FINDING
      3.4 POTENTIAL HOT TOPIC
        3.4.1 HOT TOPIC DEFINITION
        3.4.2 SEQUENCE MODELING
        3.4.3 POTENTIAL HOT TOPIC DISCOVERY
      3.5 TOPIC PROBING
        3.5.1 PROBING PROCEDURE
        3.5.2 PROBE GENERATING
        3.5.3 PROBE EVALUATION
    CHAPTER 4. EXPERIMENTAL RESULTS ANALYSIS
      4.1 DATA COLLECTION
      4.2 HOT TOPIC MODELING
      4.3 PREDICTION RESULT
      4.4 DISCUSSION
    CHAPTER 5. CONCLUSION AND FUTURE WORK
      5.1 CONCLUSION
      5.2 FUTURE WORK
    REFERENCES

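    On campus: publicly available from 2018-08-28
    Off campus: not publicly available
    The electronic thesis has not yet been authorized for public release; for the print copy, please check the library catalog.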