Graduate student: 卓芝吟
CHO, CHIH-YIN
Thesis title: 以深度強化學習早期偵測藥物不良反應之研究
Early Detecting Adverse Drug Reaction with Deep Reinforcement Learning
Advisor: 李昇暾
Li, Sheng-Tun
Degree: Master
Department: College of Management - Institute of Information Management
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: Chinese
Number of pages: 53
Chinese keywords: 藥物不良反應, 單類別分類器, 時間序列早期預測, 強化學習, 文字探勘
English keywords: adverse drug reactions, one class classification, early prediction on time series, reinforcement learning, text mining
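  • An adverse drug reaction (ADR) is a harmful and unexpected or excessive reaction that occurs after a patient takes a drug. Unlike side effects, which are mild and predictable, ADRs generally work against the patient's treatment: mild cases may cause discomfort from which the patient recovers unaided, while severe cases can cause permanent harm or even threaten life. Beyond the harm to patients, drug therapy training material published by the World Health Organization notes that the medical costs caused by ADRs are very high, which makes ADRs an issue that deserves attention worldwide. This study therefore proposes a discriminant model for ADRs that uses progress notes to identify whether a patient experienced an ADR during hospitalization, in the hope of contributing to the healthcare industry and to ADR-related fields.
    This study uses clinical electronic medical records written in SOAP format for its experiments. In the preprocessing stage, the dataset is filtered through combinations of medical dictionaries to remove less meaningful words, and features are extracted to improve the model's accuracy. A Support Vector Data Description (SVDD) model classifies whether a record indicates an ADR. Because the records vary in temporal length, the model is further combined with reinforcement learning, using a DQN framework with an LSTM, so that it can predict as early as possible whether a patient has an ADR. In the experiments over the full time span, the method reaches 92% accuracy, and accuracy broken down by time period is also above 75%, outperforming the other machine learning algorithms. In addition to predicting whether an ADR occurs, the study provides experts with an ADR proportion assessment table that shows, for each patient, the probability of having and of not having an ADR, which gives experts a firmer basis for medication evaluation and helps them diagnose and treat the corresponding patients more effectively.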
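The dictionary-filtering step of the preprocessing stage can be sketched as follows. This is a minimal illustration only: the two term sets, the tokenizer, and the function names are hypothetical stand-ins, since this record does not list the medical dictionaries or the tokenization the thesis actually used.

```python
import re

# Hypothetical stand-ins for the medical dictionaries; the thesis's actual
# dictionaries are not listed in this record.
DRUG_TERMS = {"amoxicillin", "vancomycin", "warfarin"}
SYMPTOM_TERMS = {"rash", "itching", "fever", "nausea"}


def tokenize(note: str) -> list[str]:
    """Lowercase a progress note and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", note.lower())


def filter_by_dictionaries(note: str, *dictionaries: set[str]) -> list[str]:
    """Keep only tokens that appear in at least one of the given dictionaries."""
    vocabulary = set().union(*dictionaries)
    return [tok for tok in tokenize(note) if tok in vocabulary]


# Toy SOAP-style note, invented for this example.
note = "S: Patient reports itching and rash after Amoxicillin. O: afebrile."
kept = filter_by_dictionaries(note, DRUG_TERMS, SYMPTOM_TERMS)
print(kept)
```

Passing the dictionaries as separate arguments makes it easy to compare a single dictionary against a combination of dictionaries on the same notes.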

    An adverse drug reaction (ADR) is a harmful, serious, and unintended result of taking a medicine. Unlike a side effect, which is predictable, an ADR is generally deleterious to the patient's treatment. Mild ADRs can cause recoverable discomfort, while severe ADRs may cause permanent injury or even endanger life.
    Our study therefore proposes a discriminant model for ADRs that uses progress notes to identify whether a patient has an ADR during hospitalization.
    We applied an SVDD model to classify ADRs and combined it with a reinforcement learning framework, using DQN and LSTM, to handle records of varying length. This enables the model to predict as early as possible whether a patient has an ADR.
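Early prediction with reinforcement learning is commonly framed as an agent that, after each new note, either waits for more data or commits to a label. The sketch below shows only the environment side of such a setup. The action set, the reward values, and the `wait_penalty` parameter are assumptions made for illustration; this record does not state the thesis's actual state, action, or reward definitions, and the DQN and LSTM components are omitted.

```python
from dataclasses import dataclass

# Assumed action set: wait for the next note, or commit to a prediction.
WAIT, PREDICT_ADR, PREDICT_NO_ADR = 0, 1, 2


@dataclass
class EarlyClassificationEnv:
    """One patient's sequence of per-note feature vectors plus the true label."""
    features: list[list[float]]   # one feature vector per time step
    has_adr: bool                 # ground-truth label for the whole stay
    wait_penalty: float = 0.05    # assumed cost of delaying one more step
    t: int = 0

    def reset(self) -> list[float]:
        """Start a new episode at the first note."""
        self.t = 0
        return self.features[0]

    def step(self, action: int):
        """Return (next_state, reward, done) for the chosen action."""
        last = self.t == len(self.features) - 1
        if action == WAIT and not last:
            self.t += 1
            return self.features[self.t], -self.wait_penalty, False
        if action == WAIT:
            # Waiting past the final note ends the episode without a prediction.
            return self.features[self.t], -1.0, True
        correct = (action == PREDICT_ADR) == self.has_adr
        return self.features[self.t], (1.0 if correct else -1.0), True


# Toy episode: wait once, then predict an ADR.
env = EarlyClassificationEnv(features=[[0.1], [0.7], [0.9]], has_adr=True)
state = env.reset()
state, r1, done1 = env.step(WAIT)
state, r2, done2 = env.step(PREDICT_ADR)
```

The small per-step penalty is what pushes an agent toward early decisions: each extra note costs reward, so waiting only pays off when it makes a correct prediction more likely.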
    In the experiments over the entire period, our method reaches an accuracy of 92%, and its accuracy by time period is above 75%, outperforming the other machine learning algorithms.
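Accuracy by time period can be computed by grouping predictions according to how much of each stay had been observed when the prediction was made. The sketch below is illustrative: the period labels and the toy records are invented for the example and are not the thesis's data or results, and the function name is hypothetical.

```python
from collections import defaultdict


def accuracy_by_period(records):
    """Compute accuracy per period.

    records: iterable of (period, true_label, predicted_label) tuples.
    Returns a dict mapping each period to its fraction of correct predictions.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for period, truth, pred in records:
        totals[period] += 1
        hits[period] += int(truth == pred)
    return {period: hits[period] / totals[period] for period in totals}


# Toy records (made up): label 1 = ADR, label 0 = no ADR.
records = [
    ("first 25%", 1, 1), ("first 25%", 0, 1),
    ("first 25%", 0, 0), ("first 25%", 1, 1),
    ("full", 1, 1), ("full", 0, 0), ("full", 0, 0), ("full", 1, 1),
]
scores = accuracy_by_period(records)
print(scores)
```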
    In addition to predicting whether a patient will have an ADR, our study provides an assessment table that shows each patient's probability of having an ADR, giving experts a firmer basis for drug evaluation.

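    Table of Contents
    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Research Process
    Chapter 2 Literature Review
      2.1 Adverse Drug Reaction Issues
      2.2 Research on Adverse Drug Reactions and Related Medical Topics
      2.3 Multivariate Early Time Series Prediction
      2.4 One-Class Classification
        2.4.1 One-class SVM
        2.4.2 Support Vector Data Description (SVDD)
      2.5 Topic Models
        2.5.1 Latent Dirichlet Allocation (LDA)
      2.6 Deep Learning
        2.6.1 Long Short-Term Memory (LSTM)
      2.7 Reinforcement Learning
        2.7.1 Basic Architecture of Reinforcement Learning
        2.7.2 Q-Learning
        2.7.3 Deep Q-Network (DQN)
    Chapter 3 Research Method
      3.1 Problem and Notation Definitions
        3.1.1 Problem Definition
        3.1.2 Notation Definition
      3.2 Research Framework
      3.3 Data Preprocessing
      3.4 Feature Extraction
      3.5 Anomaly Detection Framework
      3.6 Reinforcement Learning Framework
        3.6.1 State, Action, and Environment Settings
        3.6.2 Reward Settings
        3.6.3 Agent Settings
      3.7 Model Building Process
    Chapter 4 Experimental Results and Analysis
      4.1 Dataset
        4.1.1 Dataset Description
        4.1.2 Dataset Processing
      4.2 Hyperparameter Settings
      4.3 Evaluation Metrics
        4.3.1 Early Prediction Evaluation Metrics
        4.3.2 Model Accuracy Evaluation Metrics
      4.4 Experimental Results
        4.4.1 Dictionary Comparison
        4.4.2 Comparison of Topic Counts
        4.4.3 Comparison of Algorithms and Time Periods
        4.4.4 Dataset Adjustment
        4.4.5 Adverse Drug Reaction Proportion Assessment
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions and Contributions
      5.2 Future Work and Research Limitations
    References

    List of Figures
    Figure 1-1. Research flowchart
    Figure 2-1. Optimal hyperplane constructed by a support vector machine
    Figure 2-2. Gaussian kernel function (Tax & Duin, 2004)
    Figure 2-3. Schematic of the LDA model
    Figure 2-4. Recurrent neural network architecture
    Figure 2-5. Long short-term memory network architecture
    Figure 2-6. Basic architecture of reinforcement learning
    Figure 3-1. Research framework
    Figure 3-2. Data preprocessing steps
    Figure 4-1. Perplexity by number of topics (single dictionary)
    Figure 4-2. Perplexity by number of topics (dictionary mix 1)
    Figure 4-3. Perplexity by number of topics (dictionary mix 2)
    Figure 4-4. Confusion matrix

    List of Tables
    Table 3-1. Summary of medical dictionaries
    Table 4-1. Dataset statistics
    Table 4-2. Dataset examples
    Table 4-3. Hyperparameter settings
    Table 4-4. Evaluation metrics for dictionary filtering
    Table 4-5. Comparison of different dictionaries with different numbers of topics
    Table 4-6. Comparison of algorithms for different dictionary combinations over the same time period
    Table 4-7. k-fold cross-validation (5-fold)
    Table 4-8. Comparison of different algorithms across time periods
    Table 4-9. Comparison of datasets
    Table 4-10. Adverse drug reaction proportion assessment table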

    Beijer, H. J. M., & de Blaey, C. J. (2002). Hospitalisations caused by adverse drug reactions (ADR): a meta-analysis of observational studies. Pharmacy World and Science, 24(2), 46-54. doi:10.1023/A:1015570104121
    Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166. doi:10.1109/72.279181
    Blei, D. M., & Lafferty, J. D. (2006). Dynamic topic models. International conference on Machine learning, Association for Computing Machinery, New York, US, 113-120. doi:10.1145/1143844.1143859
    Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent dirichlet allocation. Journal of machine learning research, 3(Jan), 993-1022.
    Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine learning, 20(3), 273-297. doi:10.1007/BF00994018
    El-allaly, E., Sarrouti, M., En-Nahnahi, N., & Alaoui, S. O. E. (2020). A LSTM-Based Method with Attention Mechanism for Adverse Drug Reaction Sentences Detection. Advances in Intelligent Systems and Computing, Springer, Cham, 17-26. doi:10.1007/978-3-030-36664-3_3
    Gao, X. (2018). Deep reinforcement learning for time series: playing idealized trading games. arXiv preprint arXiv:1803.03916.
    Guo, Y., Barnes, S. J., & Jia, Q. (2017). Mining meaning from online ratings and reviews: Tourist satisfaction analysis using latent dirichlet allocation. Tourism management, 59, 467-483. doi:10.1016/j.tourman.2016.09.009
    Hajjar, E. R., Cafiero, A. C., & Hanlon, J. T. (2007). Polypharmacy in elderly patients. The American journal of geriatric pharmacotherapy, 5(4), 345-351. doi:10.1016/j.amjopharm.2007.12.002
    Heinrich, G. (2005). Parameter estimation for text analysis. Technical report.
    Hitchen, L. (2006). Adverse drug reactions result in 250 000 UK admissions a year. BMJ: British Medical Journal, 332, 1109. doi:10.1136/bmj.332.7550.1109
    Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780. doi:10.1162/neco.1997.9.8.1735
    Hoffman, M., Bach, F., & Blei, D. (2010). Online learning for latent dirichlet allocation. Advances in neural information processing systems, 23.
    Huynh, T., He, Y., Willis, A., & Rüger, S. (2016). Adverse drug reaction classification with deep neural networks. COLING, International Committee on Computational Linguistics, Osaka, Japan, 877-887.
    Jeon, E., Kim, Y., Park, H., Park, R. W., Shin, H., & Park, H.-A. (2020). Analysis of adverse drug reactions identified in nursing notes using reinforcement learning. Healthcare Informatics Research, 26(2), 104-111. doi:10.4258/hir.2020.26.2.104
    Jia, S., Zhang, X., Wang, X., & Liu, Y. (2018). Fake reviews detection based on LDA. International Conference on Information Management (ICIM), IEEE, 280-283. doi:10.1109/INFOMAN.2018.8392850
    Jiang, Z., Xu, D., & Liang, J. (2017). A deep reinforcement learning framework for the financial portfolio management problem. arXiv preprint arXiv:1706.10059.
    Jo, Y., Lee, L., & Palaskar, S. (2017). Combining LSTM and latent topic modeling for mortality prediction. arXiv preprint arXiv:1709.02842.
    Kamaruddin, S., & Ravi, V. (2016). Credit card fraud detection using big data analytics: use of PSOAANN based one-class classification. International Conference on Informatics and Analytics, Association for Computing Machinery, New York, US, 1-8. doi:10.1145/2980258.2980319
    Khan, S. S., & Madden, M. G. (2009). A survey of recent trends in one class classification. Irish conference on artificial intelligence and cognitive science, Springer, Berlin, 188-197. doi:10.1007/978-3-642-17080-5_21
    Kraemer, H. C. (2014). Kappa coefficient. Wiley StatsRef: statistics reference online, 1-4. doi:10.1002/9781118445112.stat00365.pub2
    Li, Y. (2017). Deep reinforcement learning: An overview. arXiv preprint arXiv:1701.07274.
    Li, Y., Ni, P., & Chang, V. (2020). Application of deep reinforcement learning in stock trading strategies and stock forecasting. Computing, 102(6), 1305-1322. doi:10.1007/s00607-019-00773-w
    Lin, L. J. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine learning, 8(3), 293-321. doi:10.1007/BF00992699
    Lin, Y. W., Zhou, Y., Faghri, F., Shaw, M. J., & Campbell, R. H. (2019). Analysis and prediction of unplanned intensive care unit readmission using recurrent neural networks with long short-term memory. PloS one, 14(7), e0218942. doi:10.1371/journal.pone.0218942
    Martinez, C., Perrin, G., Ramasso, E., & Rombaut, M. (2018). A deep reinforcement learning approach for early classification of time series. European Signal Processing Conference (EUSIPCO), IEEE, 2030-2034. doi:10.23919/EUSIPCO.2018.8553544
    Mikolov, T. (2012). Statistical language models based on neural networks. Presentation at Google, Mountain View, 2nd April, 80, 26.
    Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. doi:10.1038/nature14236
    Ruff, L., Zemlyanskiy, Y., Vandermeulen, R., Schnake, T., & Kloft, M. (2019). Self-attentive, multi-context one-class classification for unsupervised anomaly detection on text. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, ACL, Florence, Italy, 4061-4071. doi:10.18653/v1/P19-1398
    Rumshisky, A., Ghassemi, M., Naumann, T., Szolovits, P., Castro, V. M., McCoy, T. H., & Perlis, R. H. (2016). Predicting early psychiatric readmission with natural language processing of narrative discharge summaries. Translational Psychiatry, 6(10), e921-e921. doi:10.1038/tp.2015.182
    Schölkopf, B., Williamson, R. C., Smola, A., Shawe-Taylor, J., & Platt, J. (1999). Support vector method for novelty detection. Advances in neural information processing systems, 12(3), 582-588.
    Schölkopf, B., Platt, J. C., Shawe-Taylor, J., Smola, A. J., & Williamson, R. C. (2001). Estimating the support of a high-dimensional distribution. Neural Computation, 13(7), 1443-1471. doi:10.1162/089976601750264965
    Shepherd, G., Mohorn, P., Yacoub, K., & May, D. W. (2012). Adverse drug reaction deaths reported in United States vital statistics, 1999-2006. Annals of Pharmacotherapy, 46(2), 169-175. doi:10.1345/aph.1P592
    Sultana, J., Cutroneo, P., & Trifirò, G. (2013). Clinical and economic burden of adverse drug reactions. Journal of pharmacology & pharmacotherapeutics, 4(Suppl 1), S73-S77. doi:10.4103/0976-500X.120957
    Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine learning, 3(1), 9-44. doi:10.1007/BF00115009
    Tax, D. M. J., & Duin, R. P. W. (2004). Support vector data description. Machine learning, 54(1), 45-66. doi:10.1023/B:MACH.0000008084.60811.49
    Wang, Q., Lopes, L. S., & Tax, D. M. J. (2004). Visual object recognition through one-class learning. International Conference Image Analysis and Recognition, Springer, Berlin, Heidelberg, 463-470. doi:10.1007/978-3-540-30125-7_58
    Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine learning, 8(3), 279-292. doi:10.1007/BF00992698
    Xin, J., Zhao, H., Liu, D., & Li, M. (2017). Application of deep reinforcement learning in mobile robot path planning. Chinese Automation Congress (CAC), IEEE, Jinan, China, 7112-7116. doi:10.1109/CAC.2017.8244061
    Xing, Z., Pei, J., & Yu, P. S. (2009). Early prediction on time series: A nearest neighbor approach. International Joint Conference on Artificial Intelligence, Citeseer, Pasadena, US, 1297-1302.
    Yang, C. C., Yang, H., Jiang, L., & Zhang, M. (2012). Social media mining for drug safety signal detection. International workshop on smart health and wellbeing, Association for Computing Machinery, New York, US, 33-40. doi:10.1145/2389707.2389714
    Zhang, M., & Geng, G. (2019). Adverse drug event detection using a weakly supervised convolutional neural network and recurrent neural network model. Information, 10(9), 276. doi:10.3390/info10090276

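    Full-text access on campus: not public
    Full-text access off campus: not public
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.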