| Graduate Student: | 尹鈞緯 Yin, Jiun-Wei |
|---|---|
| Thesis Title: | 新穎之疾病風險樣式探勘與評估系統 : 以慢性腎臟病為例 A Novel Framework for Disease Risk Pattern Mining and Assessment: An Application on Chronic Kidney Disease |
| Advisor: | 曾新穆 Tseng, Shin-Mu |
| Degree: | 碩士 Master |
| Department: | 電機資訊學院 - 醫學資訊研究所 Institute of Medical Informatics |
| Year of Publication: | 2014 |
| Graduation Academic Year: | 102 |
| Language: | English |
| Number of Pages: | 88 |
| Keywords (Chinese): | 資料探勘 、疾病風險樣式 、疾病風險評估 、慢性腎臟病 |
| Keywords (English): | data mining, disease risk pattern, disease risk assessment, chronic kidney disease |
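In medicine, the causes of most diseases remain unknown, and some diseases show no obvious symptoms in their early stages. As a result, most diseases are very difficult to detect or prevent. To achieve effective prevention and early treatment, we need a better understanding of the risk factors and development patterns of diseases. The traditional way of finding disease risk factors and patterns, however, is to examine suspicious factors observed in clinical practice. The risk factors and patterns summarized manually from clinical experience in this way are very limited in number, and many infrequently occurring factors are easily overlooked.
In this study, we propose a "disease risk pattern mining and assessment system" that provides medical decision support through the analysis of medical data. We aim to use data mining techniques to analyze electronic health records, find potential high-risk factors and patterns of diseases, and provide directions for medical research on those diseases. We further use the discovered high-risk patterns to build predictive models that identify high-risk patients before disease onset or at an early stage. In addition, to validate the system, we use the National Health Insurance Research Database and take chronic kidney disease as the example target disease in our experiments. According to evaluation by medical experts, some of our experimental results are indeed potentially novel findings that can provide reliable and forward-looking directions for future medical research.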
The risk patterns of many diseases remain unknown. Some diseases cause no obvious symptoms in their initial stage, which makes them hard to detect or predict. To achieve effective early treatment and prevention, we need a better understanding of the risk patterns and risk factors of diseases. Traditional methods for finding disease risk patterns primarily examine clinical observations one by one with statistical approaches. However, the observations that can be manually validated through clinical experience tend to be limited in number.
In this work, we propose a novel framework for disease risk pattern mining and assessment based on data mining techniques. To evaluate the proposed framework, we use the National Health Insurance Research Database of Taiwan and take chronic kidney disease (CKD) as the target disease. We found a number of risk patterns for CKD in the form of association rules. According to evaluation by medical experts, some of these discoveries are potentially new findings on CKD that may bring new insight to further medical studies.
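As an illustration of what a risk pattern expressed as an association rule looks like, the following minimal sketch scores one rule of the form {conditions} → CKD by its support, confidence, and lift. It is not the framework described in the thesis: the patient records, the condition names, and the helper functions `support` and `rule_metrics` are all hypothetical and chosen only to show the arithmetic.

```python
from typing import Dict, List, Set

# Hypothetical toy records: each set lists the conditions seen for one patient.
# These are NOT drawn from the National Health Insurance Research Database.
RECORDS: List[Set[str]] = [
    {"hypertension", "diabetes", "CKD"},
    {"hypertension", "diabetes", "CKD"},
    {"hypertension", "CKD"},
    {"hypertension", "diabetes"},
    {"diabetes"},
    {"gout"},
    {"hypertension"},
    {"gout", "CKD"},
]


def support(itemset: Set[str], records: List[Set[str]]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    hits = sum(1 for record in records if itemset <= record)
    return hits / len(records)


def rule_metrics(antecedent: Set[str], consequent: str,
                 records: List[Set[str]]) -> Dict[str, float]:
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    sup_rule = support(antecedent | {consequent}, records)
    sup_antecedent = support(antecedent, records)
    sup_consequent = support({consequent}, records)
    # Confidence: how often the consequent appears when the antecedent does.
    confidence = sup_rule / sup_antecedent if sup_antecedent else 0.0
    # Lift: confidence relative to the baseline rate of the consequent.
    lift = confidence / sup_consequent if sup_consequent else 0.0
    return {"support": sup_rule, "confidence": confidence, "lift": lift}


if __name__ == "__main__":
    metrics = rule_metrics({"hypertension", "diabetes"}, "CKD", RECORDS)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

A lift above 1 means the outcome is more frequent among patients matching the antecedent than in the whole toy population. Whether such a rule is clinically meaningful still requires expert review, as the abstract notes.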