
Graduate student: 張碩恒 (Chang, Shuo-Heng)
Thesis title: Data Augmentation and Data Balancing using Restrained Generative Adversarial Network Structure: An example of Intradialytic Hypotension Data (original title: 使用結構限制GAN架構進行資料擴增及資料平衡:以透析低血壓資料為例)
Advisor: 呂執中 (Lyu, Jr-Jung)
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: Chinese
Number of pages: 53
Keywords: data augmentation, data balancing, generative adversarial network, restrained structure, intradialytic hypotension
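    When chronic kidney disease progresses to end-stage renal disease, patients require hemodialysis, and the most common risk during a dialysis session is intradialytic hypotension. Intradialytic hypotension directly or indirectly aggravates comorbidities and complications such as coronary heart disease and cerebrovascular disease, and it can cause impaired consciousness, shock, weakness, cramps, and gastrointestinal symptoms. In mild cases it lowers the patient's quality of life; in severe cases it directly threatens the patient's life. Clinical decisions made during hemodialysis therefore further affect the risks of the procedure.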
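    Machine learning has been widely adopted in medicine in recent years; predictions made during care can help medical staff assess a patient's condition and offer clinical recommendations. Training a good model is not easy, however. Training results are often unsatisfactory because there is too little data, or because the data are imbalanced, so that the features of the minority class are not prominent enough and model performance suffers.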
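    A generative adversarial network (GAN) is an unsupervised learning architecture in which two neural networks compete against each other. GANs were used mainly for image generation in the past; in the last three years they have also been applied to augmenting numerical data, with the generator producing samples realistic enough to pass for real data so that a dataset can be augmented and balanced. This study therefore proposes using the IWGAN-GP architecture to augment and balance a hemodialysis dataset, and then uses the augmented and balanced datasets to predict intradialytic hypotension, with the aim of improving prediction accuracy and reducing the time needed for data collection.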
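    To make the gradient-penalty idea behind a Wasserstein GAN with gradient penalty concrete, here is a minimal sketch in plain Python. It assumes a tiny one-hidden-layer tanh critic whose input gradient can be written analytically. The critic, its weights, the coefficient `lam = 10.0`, and the toy two-feature rows are all illustrative assumptions; they are not the IWGAN-GP architecture, hyperparameters, or data used in this thesis.

```python
import math
import random

def critic(x, W, b, v):
    """Toy critic (an assumption): f(x) = sum_j v[j] * tanh(W[j] . x + b[j])."""
    return sum(vj * math.tanh(sum(wi * xi for wi, xi in zip(Wj, x)) + bj)
               for Wj, bj, vj in zip(W, b, v))

def critic_input_grad(x, W, b, v):
    """Analytic gradient of the toy critic with respect to its input x."""
    grad = [0.0] * len(x)
    for Wj, bj, vj in zip(W, b, v):
        z = sum(wi * xi for wi, xi in zip(Wj, x)) + bj
        scale = vj * (1.0 - math.tanh(z) ** 2)  # derivative of tanh, times v[j]
        for i, wi in enumerate(Wj):
            grad[i] += scale * wi
    return grad

def wgan_gp_critic_loss(real, fake, W, b, v, lam=10.0, rng=random):
    """mean f(fake) - mean f(real) + lam * mean (||grad f(x_hat)|| - 1)^2.

    real and fake must contain the same number of rows.
    """
    assert len(real) == len(fake)
    n = len(real)
    wasserstein = (sum(critic(x, W, b, v) for x in fake) / n
                   - sum(critic(x, W, b, v) for x in real) / n)
    penalty = 0.0
    for x_r, x_f in zip(real, fake):
        eps = rng.random()  # random interpolation weight in [0, 1)
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(x_r, x_f)]
        g = critic_input_grad(x_hat, W, b, v)
        norm = math.sqrt(sum(gi * gi for gi in g))
        penalty += (norm - 1.0) ** 2  # push the gradient norm toward 1
    return wasserstein + lam * penalty / n

# Hypothetical critic weights and toy rows with two numeric features each.
W = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]
b = [0.0, 0.1, -0.2]
v = [1.0, -0.5, 0.3]
real = [[0.2, 0.9], [0.4, 0.7], [0.3, 0.8]]
fake = [[0.8, 0.1], [0.6, 0.3], [0.9, 0.2]]
loss = wgan_gp_critic_loss(real, fake, W, b, v, lam=10.0, rng=random.Random(0))
```

    In a real implementation the critic would be a trained neural network and the input gradient would come from automatic differentiation; the analytic gradient here only keeps the sketch dependency-free.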
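    The results indicate that the proposed IWGAN-GP method improves model performance for both data augmentation and data balancing, and achieves a higher ROC-AUC (ROC-AUC = 0.972) than a recent Korean study on intradialytic hypotension prediction (Lee et al., 2021). This can ease the frequent shortage of data in medical big data and help hemodialysis patients avoid the reduced quality of life and the threat to life that intradialytic hypotension brings. Performing data balancing and data augmentation at the same time, however, does not improve the predictive model's performance.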
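    For readers unfamiliar with the metric, ROC-AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison in plain Python. The labels and scores are made-up values for illustration, not results from this thesis or from Lee et al. (2021).

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = hypotension event) and predicted risk scores.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]
auc = roc_auc(labels, scores)  # 8 of the 9 positive/negative pairs are ordered correctly
```

    The pairwise loop is quadratic in the number of rows, which is fine for a small illustration; library implementations typically use a sort-based computation instead.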

    Keywords: data augmentation, data balancing, generative adversarial network, restrained structure, intradialytic hypotension

    With the rapid development of big data and machine learning, big data analysis has found several applications in medicine. However, medical datasets are typically small and imbalanced, which results in poor model performance and neglect of minority-class features. A generative adversarial network (GAN) is an unsupervised learning architecture in which two neural networks compete to achieve superior performance. GANs were first deployed for image generation; since 2019, they have also been used to generate numerical data. This work proposes using a Wasserstein GAN with gradient penalty architecture to augment and balance a hemodialysis dataset, in order to improve the performance of the decision model when data are limited. The augmented and balanced dataset was then used to predict intradialytic hypotension. Although the performance improvement is modest, the mechanism is relatively easy to apply and could be extended to other applications to examine its effectiveness.
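    As a rough illustration of what "augment" and "balance" mean in terms of row counts, the sketch below computes how many synthetic rows a generator would need to supply for each class. The helper `synthetic_rows_needed`, the `augment_factor` parameter, and the 90/10 class split are hypothetical; this record does not state the class ratio or the generation amounts used in the thesis.

```python
from collections import Counter

def synthetic_rows_needed(labels, augment_factor=1.0):
    """Rows to generate per class so each class reaches augment_factor * largest class size.

    augment_factor = 1.0 balances the classes without enlarging the majority class;
    values above 1.0 balance and augment at the same time.
    """
    counts = Counter(labels)
    target = int(round(augment_factor * max(counts.values())))
    return {cls: max(0, target - n) for cls, n in counts.items()}

# Hypothetical imbalanced label column: 90 sessions without an event, 10 with one.
labels = [0] * 90 + [1] * 10
balance_only = synthetic_rows_needed(labels)                            # {0: 0, 1: 80}
balance_and_double = synthetic_rows_needed(labels, augment_factor=2.0)  # {0: 90, 1: 170}
```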

    Keywords: data augmentation, data balancing, generative adversarial network, restrained structure, intradialytic hypotension

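    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Research Process
    Chapter 2 Literature Review
      2.1 Current State of Kidney Disease
      2.2 Dialysis Treatment
      2.3 Intradialytic Hypotension
      2.4 Medical Big Data and Hemodialysis
      2.5 GAN Data Augmentation Strategies
    Chapter 3 Research Method
      3.1 Research Framework
      3.2 Data Collection
      3.3 Data Preprocessing
      3.4 Building the IWGAN-GP Architecture
      3.5 Data Generation and Prediction
        3.5.1 Data Augmentation with IWGAN-GP
        3.5.2 Data Balancing with IWGAN-GP
        3.5.3 Simultaneous Data Balancing and Data Augmentation with IWGAN-GP
      3.6 Model Evaluation
    Chapter 4 Empirical Study and Analysis of Results
      4.1 Data Description
      4.2 Model Evaluation
        4.2.1 Original Training Set
        4.2.2 Data Augmentation
        4.2.3 Data Balancing
        4.2.4 Simultaneous Data Balancing and Data Augmentation
      4.3 Summary and Discussion
      4.4 Managerial Implications
    Chapter 5 Conclusions and Recommendations
      5.1 Research Conclusions
      5.2 Directions for Future Research
    References
    Appendix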

    Antoniou, A., Storkey, A., & Edwards, H. (2018). Data augmentation generative adversarial networks. arXiv preprint arXiv:1711.04340.
    Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein GAN. arXiv preprint arXiv:1701.07875.
    Bermúdez-López, M., Arroyo, D., Betriu, A., Masana, L., Fernández, E., & Valdivielso, J. (2017). New perspectives on CKD-induced dyslipidemia. Expert Opinion on Therapeutic Targets, 21. https://doi.org/10.1080/14728222.2017.1369961
    Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321-357. https://doi.org/10.1613/jair.953
    Chen, J.-B., Wu, K.-C., Moi, S.-H., Chuang, L.-Y., & Yang, C.-H. (2020). Deep learning for intradialytic hypotension prediction in hemodialysis patients. IEEE Access, 8, 82382-82390. https://doi.org/10.1109/access.2020.2988993
    Chen, X., & Ishwaran, H. (2012). Random forests for genomic data analysis. Genomics, 99(6), 323-329. https://doi.org/10.1016/j.ygeno.2012.04.003
    Cutler, D. R., Edwards, T. C., Beard, K. H., Cutler, A., Hess, K. T., Gibson, J., & Lawler, J. J. (2007). Random forests for classification in ecology. Ecology, 88(11), 2783-2792. https://doi.org/10.1890/07-0539.1
    Fanelli, G., Dantone, M., Gall, J., Fossati, A., & Van Gool, L. (2013). Random forests for real time 3D face analysis. International Journal of Computer Vision, 101(3), 437-458. https://doi.org/10.1007/s11263-012-0549-0
    Flythe, J. E., Xue, H., Lynch, K. E., Curhan, G. C., & Brunelli, S. M. (2015). Association of mortality risk with various definitions of intradialytic hypotension. Journal of the American Society of Nephrology, 26(3), 724-734. https://doi.org/10.1681/asn.2014020222
    Gabutti, L., Machacek, M., Marone, C., & Ferrari, P. (2005). Predicting intradialytic hypotension from experience, statistical models and artificial neural networks. Journal of Nephrology, 18(4), 409-416.
    Gangwar, A. K., & Ravi, V. (2019). WiP: Generative adversarial network for oversampling data in credit card fraud detection. In Information Systems Security (pp. 123-134). https://doi.org/10.1007/978-3-030-36945-3_7
    Genuer, R., Poggi, J.-M., Tuleau-Malot, C., & Villa-Vialaneix, N. (2017). Random forests for big data. Big Data Research, 9, 28-46. https://doi.org/10.1016/j.bdr.2017.07.003
    Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial networks. arXiv preprint arXiv:1406.2661.
    Gul, A., Miskulin, D., Harford, A., & Zager, P. (2016). Intradialytic hypotension. Current Opinion in Nephrology and Hypertension, 25, 1. https://doi.org/10.1097/MNH.0000000000000271
    Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. C. (2017). Improved Training of Wasserstein GANs. NIPS.
    He, H., Bai, Y., Garcia, E. A., & Li, S. (2008). ADASYN: adaptive synthetic sampling approach for imbalanced learning. 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).
    He, Z., Zuo, W., Kan, M., Shan, S., & Chen, X. (2018). AttGAN: facial attribute editing by only changing what you want. arXiv preprint arXiv:1711.10678.
    Hu, F., & Li, H. (2013). A novel boundary oversampling algorithm based on neighborhood rough set model: NRSBoundary-SMOTE. Mathematical Problems in Engineering, 2013, 694809. https://doi.org/10.1155/2013/694809
    Huang, J. C., Tsai, Y. C., Wu, P. Y., Lien, Y. H., Chien, C. Y., Kuo, C. F., Hung, J. F., Chen, S. C., & Kuo, C. H. (2020). Predictive modeling of blood pressure during hemodialysis: a comparison of linear model, random forest, support vector regression, XGBoost, LASSO regression and ensemble method. Computer Methods and Programs in Biomedicine, 195, 105536. https://doi.org/10.1016/j.cmpb.2020.105536
    Hwang, J., & Kim, K. (2020). An efficient domain-adaptation method using GAN for fraud detection. International Journal of Advanced Computer Science and Applications, 11(11), 94-103.
    Kanda, E., Kashihara, N., Matsushita, K., Usui, T., Okada, H., Iseki, K., Mikami, K., Tanaka, T., Wada, T., Watada, H., Ueki, K., & Nangaku, M. (2018). Guidelines for clinical evaluation of chronic kidney disease: AMED research on regulatory science of pharmaceuticals and medical devices. Clinical and Experimental Nephrology, 22. https://doi.org/10.1007/s10157-018-1615-x
    Kohavi, R., & Provost, F. (1998). Glossary of Terms. Machine Learning, 2, 271-274. https://doi.org/10.1023/A:1017181826899
    Kotanko, P., Garg, A. X., Depner, T., Pierratos, A., Chan, C. T., Levin, N. W., Greene, T., Larive, B., Beck, G. J., Gassman, J., Kliger, A. S., & Stokes, J. B. (2015). Effects of frequent hemodialysis on blood pressure: results from the randomized frequent hemodialysis network trials. Hemodialysis International, 19(3), 386-401. https://doi.org/10.1111/hdi.12255
    Kupyn, O., Budzan, V., Mykhailych, M., Mishkin, D., & Matas, J. (2018). DeblurGAN: blind motion deblurring using conditional adversarial networks. arXiv preprint arXiv:1711.07064.
    Kwon, C., Park, S., Ko, S., & Ahn, J. (2021). Increasing prediction accuracy of pathogenic staging by sample augmentation with a GAN. PLOS ONE, 16(4), e0250458. https://doi.org/10.1371/journal.pone.0250458
    Lee, H., Yun, D., Yoo, J., Yoo, K., Kim, Y. C., Kim, D. K., Oh, K. H., Joo, K. W., Kim, Y. S., Kwak, N., & Han, S. S. (2021). Deep learning model for real-time prediction of intradialytic hypotension. Clinical Journal of the American Society of Nephrology, 16(3), 396-406. https://doi.org/10.2215/CJN.09280620
    Levey, A. S., Astor, B. C., Stevens, L. A., & Coresh, J. (2010). Chronic kidney disease, diabetes, and hypertension: what's in a name? Kidney International, 78(1), 19-22. https://doi.org/10.1038/ki.2010.115
    Lin, Y. C., Lin, Y.-C., Peng, C. C., Chen, K.-C., Chen, H.-H., Fang, T.-C., Sung, S.-Y., & Wu, M.-S. (2018). Effects of cholesterol levels on mortality in patients with long-term peritoneal dialysis based on residual renal function. Nutrients, 10. https://doi.org/10.3390/nu10030300
    Liu, Y. S., Yang, C. Y., Chiu, P. F., Lin, H. C., Lo, C. C., Lai, A. S., Chang, C. C., & Lee, O. K. (2021). Machine learning analysis of time-dependent features for predicting adverse events during hemodialysis therapy: model development and validation study. Journal of Medical Internet Research, 23(9), e27098. https://doi.org/10.2196/27098
    Majnik, M., & Bosnić, Z. (2013). ROC analysis of classifiers in machine learning: a survey. Intelligent Data Analysis, 17(3), 531-558. https://doi.org/10.3233/ida-130592
    Mustafa, R., Bdair, F., Akl, E., Garg, A., Thiessen Philbrook, H., Salameh, H., Kisra, S., Nesrallah, G., Al-Jaishi, A., Patel, P., Mustafa, A., & Schunemann, H. (2015). Effect of lowering the dialysate temperature in chronic hemodialysis: a systematic review and meta-analysis. Clinical Journal of the American Society of Nephrology, 11. https://doi.org/10.2215/CJN.04580415
    Ozenne, B., Subtil, F., & Maucort-Boulch, D. (2015). The precision–recall curve overcame the optimism of the receiver operating characteristic curve in rare diseases. Journal of Clinical Epidemiology, 68(8), 855-859. https://doi.org/10.1016/j.jclinepi.2015.02.010
    Rampanelli, E., Ochodnicky, P., Vissers, H., Butter, L., Claessen, N., Calcagnì, A., Kors, L., Gethings, L., Bakker, S., Borst, M. H., Navis, G., Liebisch, G., Speijer, D., Weerman, M., Jung, B., Aten, J., Steenbergen, E., Schmitz, G., Ballabio, A., & Aerts, J. (2018). Excessive dietary lipid intake provokes an acquired form of lysosomal lipid storage disease in the kidney. The Journal of Pathology, 246. https://doi.org/10.1002/path.5150
    Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., & Lee, H. (2016). Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396.
    Rezapour, M., Khavanin Zadeh, M., & Sepehri, M. M. (2013). Implementation of predictive data mining techniques for identifying risk factors of early AVF failure in hemodialysis patients. Computational and Mathematical Methods in Medicine, 2013, 830745. https://doi.org/10.1155/2013/830745
    Rezapour, M., & Khavaninzadeh, M. (2014). Association between non-matured arterio-venus fistula and blood pressure in hemodialysis patients. Medical journal of the Islamic Republic of Iran, 28, 144-144. https://pubmed.ncbi.nlm.nih.gov/25695002
    Rivera, G., Florencia, R., García, V., Ruiz, A., & Sánchez-Solís, J. P. (2020). News classification for identifying traffic incident points in a Spanish-speaking country: a real-world case study of class imbalance learning. Applied Sciences, 10(18), 6253. https://doi.org/10.3390/app10186253
    Rocha, A., Sousa, C., Teles, P., Coelho, A., & Xavier, E. (2016). Effect of dialysis day on intradialytic hypotension risk. Kidney and Blood Pressure Research, 41(2), 168-174. https://doi.org/10.1159/000443418
    Sands, J., Usvyat, L., Sullivan, T., Segal, J., Zabetakis, P., Kotanko, P., Maddux, F., & Diaz-Buxo, J. (2014). Intradialytic hypotension: frequency, sources of variation and correlation with clinical outcome. Hemodialysis International, 18. https://doi.org/10.1111/hdi.12138
    Stefánsson, B. V., Brunelli, S. M., Cabrera, C., Rosenbaum, D., Anum, E., Ramakrishnan, K., Jensen, D. E., & Stålhammar, N.-O. (2014). Intradialytic hypotension and risk of cardiovascular disease. Clinical Journal of the American Society of Nephrology, 9(12), 2124-2132. https://doi.org/10.2215/cjn.02680314
    Tanaka, F., & Aranha, C. (2019). Data augmentation using GANs. arXiv preprint arXiv:1904.09135.
    Vaccari, I., Orani, V., Paglialonga, A., Cambiaso, E., & Mongelli, M. (2021). A generative adversarial network (GAN) technique for Internet of Medical Things data. Sensors, 21(11). https://doi.org/10.3390/s21113726
    Vadakedath, S., & Kandi, V. (2017). Dialysis: A review of the mechanisms underlying complications in the management of chronic renal failure. Cureus. https://doi.org/10.7759/cureus.1603
    Wang, W., Wang, C., Cui, T., & Li, Y. (2020). Study of restrained network structures for wasserstein generative adversarial networks (WGANs) on numeric data augmentation. IEEE Access, 8, 89812-89821. https://doi.org/10.1109/access.2020.2993839
    Webster, A. C., Nagler, E. V., Morton, R. L., & Masson, P. (2017). Chronic kidney disease. Lancet, 389(10075), 1238-1252. https://doi.org/10.1016/s0140-6736(16)32064-5
    Zhu, J.-Y., Park, T., Isola, P., & Efros, A. A. (2020). Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593.
    Zhu, J.-Y., Zhang, R., Pathak, D., Darrell, T., Efros, A. A., Wang, O., & Shechtman, E. (2018). Toward multimodal image-to-image translation. arXiv preprint arXiv:1711.11586.
    Zhu, X., Liu, Y., Qin, Z., & Li, J. (2017). Data augmentation in emotion classification using generative adversarial networks. arXiv preprint arXiv:1711.00648.

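    Full-text availability: on campus, public from 2027-09-21; off campus, public from 2027-09-21.
    The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.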