| Graduate student: | 謝至嘉 Hsieh, Chih-Chia |
|---|---|
| Thesis title: | 以羅吉斯迴歸模型與機器學習方法預測急診病患菌血症之研究 Prediction to the bacteremia in the patients in the emergency department with logistic regression and machine learning |
| Advisor: | 馬瀰嘉 Ma, Mi-Chia |
| Degree: | Master |
| Department: | College of Management - Executive Master of Business Administration (EMBA) |
| Year of publication: | 2022 |
| Academic year of graduation: | 110 |
| Language: | Chinese |
| Number of pages: | 27 |
| Keywords (Chinese): | 菌血症, 血液細菌培養, 機器學習, 羅吉斯迴歸, 淨重新分類指數 |
| Keywords (English): | bacteremia, blood culture, machine learning, logistic regression, net reclassification index |
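When bacteria infect a patient's bloodstream, the condition is called bacteremia; it often leads to severe complications and carries high mortality. Bacteremia is diagnosed from blood culture results, but the risk of bacteremia is often overestimated: only about 4% to 7% of blood cultures actually grow a pathogen. At the same time, the rate of false positives caused by contamination is roughly equal to the true positive rate. Excessive blood culture testing increases the financial and labor burden on the healthcare system and exposes patients to unnecessary antibiotic treatment. This study aims to use logistic regression and machine learning to build models that predict bacteremia in emergency department patients and reduce unnecessary blood cultures.
This was a retrospective study conducted at a single medical center; a total of 40,395 patients with suspected bacteremia were included. Patients' basic information, past medical history, vital signs at triage, and laboratory values were used as the variables for building the predictive models. All data were split into a training set (70%) and a validation set (30%). Predictive models were built with logistic regression, support vector machine (SVM), and random forest, and were then compared using the area under the receiver operating characteristic curve (AUROC, abbreviated as AUC) and the net reclassification index (NRI).
Of all patients included in the study, 4,058 were diagnosed with bacteremia. The logistic regression model reached an AUC of 0.725, with a confidence interval (CI) of (0.708, 0.742); the SVM model reached an AUC of 0.730 (CI: 0.713 - 0.747); the random forest model reached an AUC of 0.725 (CI: 0.708 - 0.742). Comparing the SVM model with the logistic regression model, the NRI was 0.016 with a p-value greater than 0.01; comparing the random forest model with the logistic regression model, the NRI was -0.012 with a p-value greater than 0.01. The three methods performed comparably, and there is no statistical evidence that the logistic regression model predicts better than the machine learning models. The advantage of the machine learning models is that they are more flexible and not limited to linear relationships, whereas the advantage of the logistic regression model is that the effect of each independent variable on the response variable is easy to interpret; each has its own strengths. Medical researchers should keep an open mind and consider adopting machine learning as part of their methods in future studies.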
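The AUC values and confidence intervals reported above can be illustrated with a minimal sketch. The abstract does not say how the intervals were computed, so the Hanley-McNeil standard error used here is an assumption, and the function names and toy data are hypothetical, not taken from the thesis.

```python
import math


def auc_mann_whitney(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate CI for an AUC using the Hanley-McNeil standard error.
    Assumption: the thesis may have used a different method."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(var)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Toy data (hypothetical): 1 = bacteremia, 0 = no bacteremia.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]

auc = auc_mann_whitney(labels, scores)
ci_low, ci_high = hanley_mcneil_ci(auc, n_pos=3, n_neg=4)
print(f"AUC = {auc:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; on a validation set the size of the one described here, a rank-based computation would be the usual choice.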
Bloodstream infection, known as bacteremia, usually results in severe complications and high mortality. This study aims to build models with logistic regression and machine learning that predict bacteremia and prevent unnecessary blood cultures.
This was a hospital-wide retrospective study; the basic information, past medical history, vital signs at triage, and laboratory parameters of 40,395 patients were used as the variables to build the predictive models. The predictive models were built with logistic regression, support vector machine (SVM), and random forest, respectively. Model performance was assessed with the area under the receiver operating characteristic curve (AUROC, abbreviated as AUC for simplicity) and the net reclassification index (NRI).
The logistic regression model reached a best AUC of 0.725, with a confidence interval (CI) of (0.708, 0.742). The best AUC of the SVM model was 0.73 (CI: 0.713 - 0.747). The best AUC of the random forest model was 0.705 (CI: 0.688 - 0.722). Comparing the SVM model with the logistic regression model, the NRI was 0.016 with a p-value > 0.01. Comparing the random forest model with the logistic regression model, the NRI was -0.012 with a p-value > 0.01. There is no statistical evidence that the logistic regression model performs better than the machine learning models. The advantage of the machine learning models is that they are more flexible and not limited to linear relationships. The advantage of the logistic regression model is that the influence of each independent variable on the response variable is easy to explain. Each method has its own advantages. Since the performance of machine learning appears to equal that of traditional logistic regression, medical researchers could be more open to adopting machine learning methods in their future studies.
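As a rough illustration of how an NRI like the ones above compares two models, the sketch below computes it from predicted probabilities. The abstract does not state whether the NRI was category-based or category-free, nor which risk cutoffs were used, so both variants, the function name, the 0.10 cutoff, and the toy data are assumptions for illustration only.

```python
from bisect import bisect_right


def net_reclassification_index(labels, p_old, p_new, cutoffs=None):
    """NRI of a new model against an old one.

    With cutoffs=None, any increase or decrease in predicted probability
    counts as a move (category-free NRI). With a sorted list of cutoffs,
    only moves between risk categories count (categorical NRI).
    """
    def level(p):
        return p if cutoffs is None else bisect_right(cutoffs, p)

    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0
    for y, old, new in zip(labels, p_old, p_new):
        delta = level(new) - level(old)
        if y == 1:
            n_event += 1
            up_event += delta > 0
            down_event += delta < 0
        else:
            n_nonevent += 1
            up_nonevent += delta > 0
            down_nonevent += delta < 0
    # Moving up is correct for events; moving down is correct for non-events.
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


# Toy data (hypothetical): 1 = bacteremia, 0 = no bacteremia.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
p_old = [0.30, 0.05, 0.20, 0.08, 0.12, 0.04, 0.15, 0.02]  # e.g. logistic regression
p_new = [0.35, 0.12, 0.15, 0.09, 0.06, 0.05, 0.08, 0.03]  # e.g. SVM

nri_categorical = net_reclassification_index(labels, p_old, p_new, cutoffs=[0.10])
nri_continuous = net_reclassification_index(labels, p_old, p_new)
print(nri_categorical, nri_continuous)
```

A positive NRI means the new model moves more cases in the correct direction than in the wrong one; values near zero, such as the 0.016 and -0.012 reported here, indicate little net reclassification in either direction.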