| Graduate student: | 吳政瑋 Wu, Cheng-Wei |
|---|---|
| Thesis title: | 高效性高效益型樣探勘演算法之研究 A Study of Efficient Algorithms for Mining High Utility Patterns in Transactional and Event Databases |
| Advisor: | 曾新穆 Tseng, Vincent S. |
| Degree: | 博士 Doctor |
| Department: | 電機資訊學院 - 資訊工程學系 Department of Computer Science and Information Engineering |
| Year of publication: | 2015 |
| Academic year of graduation: | 103 |
| Language: | English |
| Number of pages: | 126 |
| Chinese keywords: | 高效益型樣、高效益項目集、前k高效益項目集、封閉高效益項目集、高效益情節 |
| English keywords: | Data mining, utility mining, top-k high utility itemset, closed+ high utility itemset, high utility episode |
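High utility pattern mining is an emerging research topic in data mining. It differs from traditional pattern mining in that it considers not only properties such as the occurrence frequency of items in a dataset or the associations among them, but also the utility of items (e.g., value or importance) as a new concept, so it can be applied widely in domains such as commerce, biomedicine, and engineering. High utility itemset mining is one important research topic in this area. Its main goal is to discover, from a transactional database, the itemsets whose utility is no less than a user-specified minimum utility threshold (e.g., product combinations whose profit is at or above the threshold). Although many related studies have been proposed in recent years, existing work still has the following shortcomings: (1) Users find it hard to set the minimum utility threshold. If the threshold is set too high, too few patterns are produced; if it is set too low, too many unimportant patterns are produced and execution efficiency drops. Users who want only the most important high utility itemsets lack an efficient way to find them. (2) For users who need a threshold to mine high utility itemsets, existing methods often generate too many redundant itemsets during mining, which causes problems in execution performance and storage space, and these problems become more severe on dense datasets. (3) Although sequential data such as event databases are common in many applications, no method yet exists that can efficiently mine high utility patterns from such more complex data.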
To address these issues, this dissertation investigates a series of novel high utility pattern mining problems: (1) top-k high utility itemset mining; (2) closed+ high utility itemset mining; (3) high utility episode mining.
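In the first research topic, to address the difficulty users have in setting the threshold, this dissertation is the first to propose a novel framework for mining top-k high utility itemsets, where k is the number of high utility itemsets the user wishes to find. In addition, we propose TKU (mining Top-K Utility itemsets), the first algorithm for efficiently mining top-k high utility itemsets from transactional databases, together with its related techniques. Experimental results show that TKU mines top-k high utility itemsets from transactional databases very efficiently.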
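In the second research topic, this dissertation is the first to propose a lossless compact representation of high utility itemsets, called closed+ high utility itemset (CHUI), to solve the problem of existing methods generating too many redundant itemsets during mining. Mining only CHUIs reduces the time and memory required for mining without losing any information about the high utility itemsets. CHUIs are therefore also called non-redundant high utility itemsets. To mine CHUIs, we propose three efficient algorithms: AprioriHC, AprioriHC-D, and CHUD (Closed+ High Utility itemset Discovery). Furthermore, we propose DAHU (Derive All High Utility itemsets), the first method for efficiently recovering all high utility itemsets from CHUIs. Experimental results show that the proposed representation compresses high utility itemsets very effectively and that the proposed methods mine CHUIs very efficiently. Combining CHUD with DAHU not only finds all high utility itemsets but even outperforms the best existing high utility itemset mining methods.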
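In the third research topic, targeting event databases, a more complex kind of sequential data, this study is the first to propose a novel framework for mining high utility episodes, which finds combinations of sequential events that are both high in utility and related, namely high utility episodes. Under this framework, we propose a way to compute the utility of an episode and the first algorithm for mining high utility episodes, UP-Span (Utility ePisodes mining by Spanning prefixes). Experimental results show that UP-Span mines high utility episodes very efficiently and scales well.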
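The common aim of the three topics explored in this study is, based on the needs of practical applications and the shortcomings of existing methods, to propose a series of novel high utility pattern mining algorithms that effectively provide users with concise and important high utility patterns. We integrate the three research topics systematically in this dissertation, analyze the advantages and limitations of the proposed methods from both theoretical and practical perspectives, and verify the superiority of the proposed methods through a series of experiments. The main contributions of this study can, on the one hand, have a strong impact on the theoretical development of the high utility pattern mining field and, on the other hand, be applied widely across application domains, creating innovative value for human life.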
High utility pattern mining is an emerging topic in data mining. It differs from traditional pattern mining in that it considers not only the basic properties of items (such as frequency or associations) but also their utility (e.g., profit). Hence, it can be applied to domains such as commerce, biomedicine, and engineering. One important research issue in this field is high utility itemset (abbreviated as HUI) mining, which refers to discovering itemsets having a utility no less than a user-specified minimum utility threshold min_util in transactional databases. Though many studies have been conducted on HUI mining, existing methods have the following deficiencies. First, it is not easy for users to set an appropriate min_util threshold: a threshold that is too high yields too few HUIs, and one that is too low yields too many. Second, for users who need to use min_util to find HUIs in dense databases, existing methods may produce too many redundant HUIs, which not only seriously degrades the performance of the mining task but also makes the results difficult to analyze. Third, HUI mining cannot discover ordered sets of items carrying high utility in sequential data such as event databases, even though event databases are essential to many real-life applications.
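To make the HUI definition concrete, the following is a minimal brute-force sketch, not one of the algorithms proposed in the dissertation. It assumes the definition commonly used in the utility mining literature: the utility of an itemset is the sum of quantity times unit profit over the transactions that contain every item of the itemset. The toy transactions, the profit table, and the function names `utility` and `mine_huis` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def mine_huis(transactions, unit_profit, min_util):
    """Brute force: enumerate every itemset, keep those with utility >= min_util."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u >= min_util:
                result[itemset] = u
    return result


huis = mine_huis(transactions, unit_profit, min_util=15)
print(huis)  # {('a',): 15, ('a', 'c'): 19}
```

On this toy data, {a, c} has a higher utility than {a}, while {a, b} has a lower one (9), so utility is neither monotone nor anti-monotone under set inclusion. This is why support-based pruning from frequent itemset mining cannot be applied to utility directly.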
To resolve the above issues and meet the different requirements in various applications, this dissertation addresses a series of novel high utility pattern mining problems, including (1) top-k high utility itemset mining, (2) closed+ high utility itemset mining and (3) high utility episode mining.
Regarding the first research topic, we address the first problem mentioned above by proposing a novel framework called top-k high utility itemset mining, where k is the desired number of HUIs to be mined. In this framework, we propose an algorithm called TKU (mining Top-K Utility itemsets) for efficiently mining the complete set of top-k high utility itemsets in transactional databases. Experiments show that the performance of TKU is very close to that of the optimal case of state-of-the-art HUI mining algorithms.
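The top-k formulation can be illustrated with a small sketch in which the user supplies k instead of min_util. This is a naive enumeration, not TKU: it keeps the k best itemsets seen so far in a min-heap, and the smallest utility in the heap acts as an internally raised threshold. It assumes the common quantity-times-unit-profit utility definition; the toy data and the name `top_k_huis` are hypothetical, and ties at the k-th utility are broken arbitrarily.

```python
import heapq
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    return sum(
        sum(t[item] * unit_profit[item] for item in itemset)
        for t in transactions
        if all(item in t for item in itemset)
    )


def top_k_huis(transactions, unit_profit, k):
    """Return the k itemsets with the highest utility as (utility, itemset) pairs."""
    items = sorted({item for t in transactions for item in t})
    heap = []  # min-heap holding at most k (utility, itemset) pairs
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u <= 0:
                continue  # itemset never occurs in any transaction
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # heap[0][0] is the current internal threshold; raise it.
                heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, reverse=True)


top3 = top_k_huis(transactions, unit_profit, k=3)
print(top3)  # [(19, ('a', 'c')), (15, ('a',)), (11, ('b', 'c', 'd'))]
```

The sketch still visits every itemset, so it only shows what the output of top-k mining looks like. An efficient algorithm would use the rising internal threshold to prune the search space.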
Regarding the second research topic, we tackle the problem of redundancy of HUIs by proposing a lossless and compact representation of HUIs, named closed+ high utility itemset (abbreviated as CHUI). Mining CHUIs reduces both the execution time and the memory consumption of the mining task. CHUIs are non-redundant HUIs because the set of all CHUIs retains the information of all HUIs. Three algorithms, named AprioriHC, AprioriHC-D, and CHUD (Closed+ High Utility itemset Discovery), are proposed for efficiently mining CHUIs. Moreover, a novel algorithm called DAHU (Derive All High Utility itemsets) is proposed for efficiently recovering all HUIs from CHUIs. Experiments show that the proposed algorithms are very efficient even in some cases where the state-of-the-art algorithms fail to complete the mining task.
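The idea of keeping only closed HUIs can be sketched as follows. The abstract does not spell out the definition, so this sketch assumes the usual notion of closedness from frequent itemset mining (an itemset is closed if no proper superset has the same support), combined with the common quantity-times-unit-profit utility definition. It is a brute-force illustration, not AprioriHC, AprioriHC-D, or CHUD, and it does not model the additional information that the "+" representation stores to make recovery of all HUIs lossless. The toy data and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}
items = sorted({item for t in transactions for item in t})


def support(itemset, transactions):
    """Number of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if all(item in t for item in itemset))


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    return sum(
        sum(t[item] * unit_profit[item] for item in itemset)
        for t in transactions
        if all(item in t for item in itemset)
    )


def is_closed(itemset, items, transactions):
    """Closed (assumed definition): no one-item extension has the same support.

    Checking one-item extensions is enough because support can only
    shrink or stay equal as items are added.
    """
    s = support(itemset, transactions)
    return s > 0 and all(
        support(itemset + (extra,), transactions) < s
        for extra in items
        if extra not in itemset
    )


def mine(min_util, closed_only):
    """Brute-force HUI mining, optionally restricted to closed itemsets."""
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u >= min_util and (
                not closed_only or is_closed(itemset, items, transactions)
            ):
                result[itemset] = u
    return result


all_huis = mine(min_util=10, closed_only=False)
closed_huis = mine(min_util=10, closed_only=True)
print(len(all_huis), len(closed_huis))  # 5 3
```

On this toy data, the HUIs {a} and {b, d} are dropped from the closed set because {a, c} and {b, c, d} occur in exactly the same transactions, which is the kind of redundancy a closed representation removes.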
As to the third research topic, we propose the first framework for mining high utility episodes (abbreviated as HUEs), that is, ordered sets of relevant events carrying high utility. In this framework, we define a new way to calculate the utility of episodes and propose an efficient algorithm called UP-Span (Utility ePisodes mining by Spanning prefixes) for mining the complete set of HUEs in event databases.
The above three research topics share one major research aim: devising a series of novel utility mining methods for efficiently discovering different types of high utility patterns from transactional or event databases, with the goal of providing users with important and concise mining results. Experimental results on both real and synthetic datasets show that the proposed algorithms significantly outperform the state-of-the-art utility mining algorithms. Ultimately, this dissertation aims to contribute impactful research methods and high-potential applications for human life by advancing the emerging field of utility pattern mining.