| Graduate student: | 黃繼民 Huang, Chi-Min |
|---|---|
| Thesis title: | 以訊框及語義探勘為基礎之自動化地點標記方法 Frame-based Semantics Mining for Automatic Place Labeling |
| Advisor: | 曾新穆 Tseng, Shin-Mu |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2013 |
| Academic year of graduation: | 101 |
| Language: | English |
| Number of pages: | 62 |
| Keywords (Chinese): | 地點標記、使用者行為分析、類別不平衡問題、多層式分類 |
| Keywords (English): | Place labeling, User behavior analysis, Class imbalance problem, Multi-level classification |
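In recent years, research on automatically labeling places from users' smartphone data has attracted considerable attention. Although the literature already offers many techniques for automatic place labeling, most of them simply recast the task as a multi-class classification problem. Moreover, because users often carry out a variety of different activities at places that share the same label, building a flexible labeling model on such a heterogeneous dataset is also a challenging problem. In this thesis, we propose a novel approach called Frame-based Semantics Mining, which combines users' smartphone data and uses both user behavior and environmental information to label places. This study is the first to consider both semantic similarity and frame features in users' smartphone data. A series of comprehensive experiments on real data from the Nokia Mobile Data Challenge [31] shows that the proposed frame-based semantics mining method for automatic place labeling labels places effectively.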
In recent years, research on automatic place labeling based on users’ smartphone data has attracted considerable attention. However, most proposed techniques simply recast automatic place labeling as a multi-class classification problem. Furthermore, because users often perform many different activities in places that share the same semantic label, building a flexible labeling model on such heterogeneous data is also a challenging issue. In this thesis, we propose a novel approach named Frame-based Semantics Mining (FS-Mining) that integrates users’ smartphone data to label a place based on both the users’ behaviors and the environment of the place. To the best of our knowledge, this is the first work on automatic place labeling that considers both the similarity between semantic labels and the frame features in users’ smartphone data. Comprehensive experimental evaluations on a real dataset from the Nokia Mobile Data Challenge [31] show that the proposed FS-Mining labels places effectively.
[1] https://research.nokia.com/page/12000
[2] https://foursquare.com/
[3] https://www.everytrail.com/
[4] S. Bergamaschi, E. Domnori, F. Guerra, M. Orsini, R. T. Lado, and Y. Velegrakis, "Keymantic: semantic keyword-based searching in data integration systems," Proceedings of the VLDB Endowment, vol. 3, pp. 1637-1640, 2010.
[5] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, pp. 123-140, 1996.
[6] P. Bühlmann and B. Yu, "Analyzing Bagging," The Annals of Statistics, vol. 30, pp. 927-961, 2002.
[7] A. Buja and W. Stuetzle, "Observations on bagging," Statistica Sinica, vol. 16, pp. 323-351, 2006.
[8] N. Chawla, K. Bowyer, L. Hall, and W. Kegelmeyer, "SMOTE: Synthetic minority over-sampling technique," Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, 2002.
[9] N. Chawla, A. Lazarevic, L. Hall, and K. Bowyer, "SMOTEBoost: Improving prediction of the minority class in boosting," Proceedings of European Conference on Principles and Practice of Knowledge Discovery in Databases, pp. 107-119, Cavtat-Dubrovnik, Croatia, 2003.
[10] X. Chen, B. Gerlach, and D. Casasent, "Pruning Support Vectors for Imbalanced Data Classification," Proceedings of International Joint Conference on Neural Networks, pp. 1883-1888, Canada, 2005.
[11] Y. Chon, Y. Kim, and H. Cha, "Autonomous place naming system using opportunistic crowdsensing and knowledge from crowdsourcing," Proceedings of the ACM/IEEE Conference on Information Processing in Sensor Networks, pp. 19-30, Philadelphia, USA, 2013.
[12] Y. Chon, Y. Kim, H. Shin, and H. Cha, "Topic Modeling-based Semantic Annotation of Place using Personal Behavior and Environmental Features," Proceedings of the Mobile Data Challenge by Nokia Workshop, co-located with Pervasive 2012, Newcastle, United Kingdom, 2012.
[13] B. V. Dasarathy and B. V. Sheela, "Composite Classifier System-Design - Concepts and Methodology," Proceedings of the IEEE, vol. 67, pp. 708-713, 1979.
[14] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm," Journal of the Royal Statistical Society, Series B (Methodological), vol. 39, pp. 1-38, 1977.
[15] T. Do and D. Gatica-Perez, "The Places of Our Lives: Visiting Patterns and Automatic Labeling from Longitudinal Smartphone Data," IEEE Transactions on Mobile Computing, vol. PP, p. 1, 2013.
[16] P. Domingos, "MetaCost: A General Method for Making Classifiers Cost-Sensitive," Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 155-164, San Diego, CA, USA, 1999.
[17] A. Estabrooks, T. Jo, and N. Japkowicz, "A Multiple Resampling Method for Learning from Imbalanced Data Sets," Computational Intelligence, vol. 20, pp. 18-36, 2004.
[18] Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm," Proceedings of the International Conference on Machine Learning, pp. 148-156, Bari, Italy, 1996.
[19] Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, vol. 55, pp. 119-139, 1997.
[20] B. Guc, M. May, Y. Saygin, and C. Körner, "Semantic Annotation of GPS Trajectories," Proceedings of the AGILE International Conference on Geographic Information Science, Girona, Spain, 2008.
[21] V. Hegde, J. X. Parreira, and M. Hauswirth, "Semantic Tagging of Places Based on User Interest Profiles from Online Social Networks," Proceedings of European Conference on Information Retrieval, pp. 218-229, Moscow, Russia, 2013.
[22] T. R. Hoens, Q. Qian, N. V. Chawla, and Z.-H. Zhou, "Building Decision Trees for the Multi-class Imbalance Problem," Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 122-134, Malaysia, 2012.
[23] C.-W. Hsu and C.-J. Lin, "A comparison of methods for multi-class support vector machines," IEEE Transactions on Neural Networks, vol. 13, pp. 415-425, 2002.
[24] C.-M. Huang, J. J.-C. Ying, and V. S. Tseng, "Mining Users Behaviors and Environments for Semantic Place Prediction," Proceedings of the Mobile Data Challenge by Nokia Workshop, co-located with Pervasive 2012, Newcastle, United Kingdom, 2012.
[25] S.-J. Huang, Y. Yu, and Z.-H. Zhou, "Multi-label hypothesis reuse," Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 525-533, Beijing, China, 2012.
[26] S.-J. Huang and Z.-H. Zhou, "Multi-label learning by exploiting label correlations locally," Proceedings of the 26th AAAI Conference on Artificial Intelligence, pp. 949-955, Toronto, Canada, 2012.
[27] M. Kubat and S. Matwin, "Addressing the Curse of Imbalanced Data Sets: One Sided Sampling," Proceedings of the 14th International Conference on Machine Learning, pp. 179-186, Nashville, Tennessee, USA, 1997.
[28] M. Kubat and S. Matwin, "Learning When Negative Examples Abound," Proceedings of the European Conference on Machine Learning, pp. 146-153, Prague, Czech Republic, 1997.
[29] L. I. Kuncheva, Combining Pattern Classifiers: Methods and Algorithms: Wiley-Interscience, 2004.
[30] L. I. Kuncheva and C. J. Whitaker, "Controlling the diversity in classifier ensembles through a measure of agreement," Pattern Recognition, vol. 38, no. 11, pp. 2195-2199, 2005.
[31] J. K. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T. Do, O. Dousse, J. Eberle, and M. Miettinen, "The Mobile Data Challenge: Big Data for Mobile Computing Research," Proceedings of the Mobile Data Challenge by Nokia Workshop, co-located with Pervasive 2012, Newcastle, United Kingdom, 2012.
[32] D. Lian and X. Xie, "Learning location naming from user check-in histories," Proceedings of ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, pp. 112-121, Chicago, IL, USA, 2011.
[33] L. Liao, D. Fox, and H. Kautz, "Location-based activity recognition using relational markov networks," Proceedings of the International Joint Conference on Artificial Intelligence, pp. 773-778, Edinburgh, Scotland, United Kingdom, 2005.
[34] L. Liao, D. Fox, and H. Kautz, "Extracting places and activities from GPS traces using hierarchical conditional random fields," International Journal of Robotics Research, vol. 26, pp. 119-134, 2007.
[35] X. Y. Liu, J. X. Wu, and Z. H. Zhou, "Exploratory Undersampling for Class-Imbalance Learning," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 39, pp. 539-550, 2009.
[36] H.-Y. Lo, K.-W. Chang, S.-T. Chen, T.-H. Chiang, C.-S. Ferng, C.-J. Hsieh, Y.-K. Ko, T.-T. Kuo, H.-C. Lai, K.-Y. Lin, C.-H. Wang, H.-F. Yu, C.-J. Lin, H.-T. Lin, and S.-D. Lin, "An Ensemble of Three Classifiers for KDD Cup 2009: Expanded Linear Model, Heterogeneous Boosting, and Selective Naive Bayes," Journal of Machine Learning Research - Proceedings Track, vol. 7, pp. 57-64, 2009.
[37] R. Montoliu, A. Martínez-Usó, and J. Martínez-Sotoca, "Semantic place prediction by combining smart binary classifiers," Proceedings of the Mobile Data Challenge by Nokia Workshop, co-located with Pervasive 2012, Newcastle, United Kingdom, 2012.
[38] M. Pazzani, C. Merz, P. Murphy, K. Ali, T. Hume, and C. Brunk, "Reducing misclassification costs," Proceedings of the International Conference on Machine Learning, pp. 217-225, New Brunswick, NJ, 1994.
[39] T. Rattenbury, N. Good, and M. Naaman, "Towards automatic extraction of event and place semantics from flickr tags," Proceedings of ACM SIGIR Special Interest Group on Information Retrieval, pp. 103-110, Amsterdam, The Netherlands, 2007.
[40] N. Ravi, N. Dandekar, P. Mysore, and M. L. Littman, "Activity Recognition from Accelerometer Data," Proceedings of Association for the Advancement of Artificial Intelligence, pp. 1541-1546, Pittsburgh, Pennsylvania, USA, 2005.
[41] J. D. Rodríguez, A. P. Martínez, and J. A. Lozano, "Sensitivity Analysis of k-Fold Cross Validation in Prediction Error Estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, pp. 569-575, 2010.
[42] L. Rokach, Pattern Classification Using Ensemble Methods: World Scientific, 2010.
[43] S. M. Ross, Introduction to probability and statistics for engineers and scientists: Elsevier Academic Press, 2009.
[44] A. Sae-Tang, M. Catasta, and L. K. McDowell, "Report on Dedicated Task 1: Semantic Place Prediction," Proceedings of the Mobile Data Challenge by Nokia Workshop, co-located with Pervasive 2012, Newcastle, United Kingdom, 2012.
[45] R. E. Schapire, "The Strength of Weak Learnability," Machine Learning, vol. 5, pp. 197-227, 1990.
[46] Y. Sun, M. Kamel, and Y. Wang, "Boosting for learning multiple classes with imbalanced class distribution," Proceedings of the IEEE International Conference on Data Mining, pp. 592-602, Hong Kong, China, 2006.
[47] J. Wu, H. Xiong, and J. Chen, "COG: local decomposition for rare class analysis," Data Mining and Knowledge Discovery, vol. 20, pp. 191-220, 2010.
[48] G. Xu, Y. Gu, P. Dolog, Y. Zhang, and M. Kitsuregawa, "SemRec: A Semantic Enhancement Framework for Tag Based Recommendation," Proceedings of the AAAI Conference on Artificial Intelligence, pp. 1267-1272, San Francisco, California, USA, 2011.
[49] Z. Yan and D. Chakraborty, "SAMMPLE: Detecting Semantic Indoor Activities in Practical Settings using Locomotive Signatures," Proceedings of International Symposium on Wearable Computers, pp. 37-40, Newcastle, United Kingdom, 2012.
[50] Y. Yang and J. Pedersen, "A comparative study on feature selection in text categorization," Proceedings of the International Conference on Machine Learning, pp. 412-420, Nashville, Tennessee, USA, 1997.
[51] M. Ye, K. Janowicz, C. Mulligann, and W.-C. Lee, "What you are is when you are: the temporal dimension of feature types in location-based social networks," Proceedings of ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, pp. 102-111, Chicago, IL, USA, 2011.
[52] M. Ye, D. Shou, W.-C. Lee, P. Yin, and K. Janowicz, "On the semantic annotation of places in location-based social networks," Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 520-528, San Diego, CA, USA, 2011.
[53] J. J.-C. Ying, W.-C. Lee, T.-C. Weng, and V. S. Tseng, "Semantic Trajectory Mining for Location Prediction," Proceedings of the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 34-43, Chicago, IL, USA, 2011.
[54] H.-F. Yu, H.-Y. Lo, H.-P. Hsieh, J.-K. Lou, T. G. McKenzie, J.-W. Chou, P.-H. Chung, C.-H. Ho, C.-F. Chang, Y.-H. Wei, J.-Y. Weng, E.-S. Yan, C.-W. Chang, T.-T. Kuo, Y.-C. Lo, P. T. Chang, C. Po, C.-Y. Wang, Y.-H. Huang, C.-W. Hung, Y.-X. Ruan, Y.-S. Lin, S.-D. Lin, H.-T. Lin, and C.-J. Lin, "Feature Engineering and Classifier Ensemble for KDD Cup 2010," Proceedings of the KDD Cup 2010 Workshop, pp. 1-16, Washington, DC, 2010.
[55] J. Yuan, Y. Zheng, and X. Xie, "Discovering regions of different functions in a city using human mobility and POIs," Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 186-194, Beijing, China, 2012.
[56] Z.-H. Zhou, Ensemble Methods: Foundations and Algorithms: Chapman & Hall/CRC, 2012.
[57] Z.-H. Zhou and X.-Y. Liu, "Training cost-sensitive neural networks with methods addressing the class imbalance problem," IEEE Transactions on Knowledge and Data Engineering, vol. 18, pp. 63-77, 2006.
[58] Z.-H. Zhou and X.-Y. Liu, "On Multi-Class Cost-Sensitive Learning," Computational Intelligence, vol. 26, pp. 232-257, 2010.
[59] Y. Zhu, Y. Sun, and Y. Wang, "Predicting Semantic Place and Next Place via Mobile Data," Proceedings of the Mobile Data Challenge by Nokia Workshop, co-located with Pervasive 2012, Newcastle, United Kingdom, 2012.
[60] Y. Zhu, E. Zhong, Z. Lu, and Q. Yang, "Feature Engineering for Place Category Classification," Proceedings of the Mobile Data Challenge by Nokia Workshop, co-located with Pervasive 2012, Newcastle, United Kingdom, 2012.
On campus: publicly available from 2018-08-29