
Graduate Student: 林英嘉 (Lin, Ying-Jia)
Title: 基於深度學習的醫學文字探勘與序列標註任務 (Deep Learning Based Medical Text Mining and Sequence Tagging)
Advisor: 高宏宇 (Kao, Hung-Yu)
Degree: Doctor
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2024
Academic Year of Graduation: 112
Language: English
Pages: 118
Chinese Keywords: 自然語言處理、語言模型、醫學文字探勘、深度學習、序列標註、提示微調、少樣本學習、無監督式學習
English Keywords: Natural Language Processing, Language Model, Medical Text Mining, Deep Learning, Sequence Tagging, Prompt Tuning, Few-shot Learning, Unsupervised Learning
ORCID: https://orcid.org/0000-0003-4347-0232
Views: 141; Downloads: 10
Deep learning is widely used in natural language processing (NLP) tasks such as machine translation, text classification, and sequence tagging. In this dissertation, we focus on applying deep learning to medical text mining and sequence tagging, and we propose four deep learning models for these problems. The first model, MPTR (multi-label prompt-tuning for radiology report mining), is a prompt-tuning model for mining free-text radiology reports. Compared with traditional dictionary-based methods, MPTR better captures subtle differences in report wording and greatly reduces the amount of manually labeled data needed to build a text-mining model for liver computed tomography (CT) reports. The second model, SEC (Saliency Equivalent Concatenation), is a data augmentation method for few-shot text classification: it enriches the information in each data instance in an unsupervised manner to improve few-shot classification performance. The third model, PhenoLocalizer, is a sequence tagging model for medical term normalization and span extraction in dysmorphology physical examination reports. PhenoLocalizer improves the localization of descriptions related to Human Phenotype Ontology (HPO) terms in clinical examination reports and, combined with the initial predictions of any existing method, normalizes them to the corresponding HPO concepts. Because PhenoLocalizer is trained as a sequence tagging model, we then use this class of models to study a general-domain sequence tagging problem, the Chinese word segmentation (CWS) task, which is our fourth model. For this model we propose an unsupervised training framework that incorporates the linguistic knowledge a pre-trained language model acquires during pre-training, improving both the training efficiency and the segmentation performance of previous unsupervised CWS methods. The MPTR and PhenoLocalizer models have been evaluated on real-world medical text mining tasks and can benefit clinical applications such as benign feature identification, clinical decision support, and automatic labeling for medical imaging research. We also apply SEC to MPTR and show experimentally that SEC improves the performance of MPTR. The last model is the most advanced unsupervised CWS model at the time of writing; compared with common supervised CWS models, it can be applied to unlabeled text, for example to segment emerging Internet slang or words in rare languages.

Deep learning has been widely used in natural language processing (NLP) tasks, such as machine translation, text classification, and sequence tagging. In this dissertation, we focus on the application of deep learning to medical text mining and sequence tagging techniques, and we propose four deep learning models for these problems. The first model is MPTR, a multi-label prompt-tuning model for radiology report mining. MPTR alleviates the shortcomings of dictionary-based methods and greatly reduces the need for labeled data when building deep learning models to extract cancer-related features from liver computed tomography (CT) reports. The second model is SEC (Saliency Equivalent Concatenation), a data augmentation approach for few-shot text classification. SEC improves the performance of few-shot text classification by enriching the information in each data instance. The third model is PhenoLocalizer, a sequence tagging model for medical term normalization and extraction in dysmorphology physical examination reports. PhenoLocalizer enhances the localization of Human Phenotype Ontology (HPO) term mentions in clinical examination reports and normalizes them to the corresponding HPO concepts in combination with an initial prediction set from any existing method. Since we leverage sequence tagging techniques for PhenoLocalizer, we next study another sequence tagging problem in the general domain, the Chinese word segmentation (CWS) task, which is addressed by the fourth model we propose: an unsupervised training framework that leverages a pre-trained language model to improve the training efficiency and segmentation performance of previous unsupervised CWS approaches. The proposed MPTR and PhenoLocalizer approaches have been evaluated on real-world medical text mining tasks and can benefit clinical applications such as benign feature identification, clinical decision making, and automatic labeling for medical image research. In addition, we apply SEC to MPTR and verify in our experiments that SEC improves the performance of MPTR. The last model is the most advanced unsupervised CWS model at the time of writing. Compared with supervised CWS models, our approach can segment unlabeled text, such as words originating from the Internet or words in rare languages.
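To make the multi-label prompt-tuning idea behind MPTR more concrete, the following is a minimal, hypothetical sketch of cloze-style prompt classification with a masked language model: a template containing a mask token is appended to a report sentence, and the probabilities the model assigns to hand-picked verbalizer words at the mask position decide whether a finding is reported as present or absent. The checkpoint name, template wording, and verbalizer words are illustrative assumptions, not the dissertation's actual MPTR configuration (which additionally tunes the prompt on a small labeled set and generates prompts automatically).

```python
# Minimal sketch (not the dissertation's code): cloze-style prompt classification
# for one finding in a radiology report sentence. Checkpoint, template, and
# verbalizer words below are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any masked-LM checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME).eval()

# Verbalizer: label words whose mask-position probabilities stand in for each class.
VERBALIZER = {"present": ["yes"], "absent": ["no"]}  # assumption

def classify_finding(report_sentence: str, finding: str) -> str:
    """Fill a cloze template and map mask-token probabilities to a label."""
    template = f"{report_sentence} {finding}: {tokenizer.mask_token}."
    inputs = tokenizer(template, return_tensors="pt")
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_index].squeeze(0)
    probs = logits.softmax(dim=-1)
    scores = {
        label: sum(probs[tokenizer.convert_tokens_to_ids(w)].item() for w in words)
        for label, words in VERBALIZER.items()
    }
    return max(scores, key=scores.get)

print(classify_finding("A 2 cm arterially enhancing lesion is seen in liver segment VI.", "tumor"))
```

In a few-shot setting, the same template would then be tuned on a handful of manually labeled reports per finding, and a multi-label report is scored by running one such cloze query per finding.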

摘要 (Chinese Abstract) i
Abstract iii
誌謝 (Acknowledgements) v
Table of Contents vii
List of Tables x
List of Figures xii
Chapter 1. Introduction 1
  1.1 Deep Learning in Natural Language Processing 2
  1.2 Scarcity of Labeled Medical Data 4
  1.3 Few-shot Medical Text Mining for Reports 4
  1.4 Data Augmentation Technique for Few-shot Learning 6
  1.5 Medical Text Mining and Sequence Tagging 7
  1.6 Sequence Tagging Application in the General Field 9
  1.7 Overview of Implemented Approaches 9
    1.7.1 How to improve few-shot learning performance for report labeling? 10
    1.7.2 Can data augmentation help few-shot learning? 10
    1.7.3 How to design a sequence tagging model to better locate entities in reports? 11
    1.7.4 Can we also apply sequence tagging to medical word segmentation? 11
Chapter 2. Related Work 12
  2.1 Few-shot Medical Text Mining for Reports 12
    2.1.1 Traditional Approaches 12
    2.1.2 Machine Learning Approaches for Report Labeling 13
  2.2 Data Augmentation Technique for Few-shot Learning 13
    2.2.1 Meta-learning 13
    2.2.2 Data augmentation 14
  2.3 Medical Text Mining with Sequence Tagging 16
    2.3.1 Traditional Approaches 16
    2.3.2 PhenoTagger 16
    2.3.3 PhenoBERT 17
  2.4 Sequence Tagging Application in the General Field 19
    2.4.1 Language Modeling for UCWS 19
    2.4.2 Incorporation of Pre-trained Knowledge for UCWS 19
Chapter 3. Few-shot Medical Text Mining for Reports 20
  3.1 Contributions 20
  3.2 Dataset 21
    3.2.1 Data Collection 21
    3.2.2 Data Labeling and Pre-processing 21
    3.2.3 Dataset Splits 21
  3.3 System Background 22
  3.4 Method 24
    3.4.1 Problem Definition 24
    3.4.2 Verbalizer 24
    3.4.3 Training 26
    3.4.4 Automatic Prompt Generation 27
  3.5 Experiments 29
    3.5.1 Baseline Methods 29
    3.5.2 Implementation Details 32
    3.5.3 Performance Comparison 33
    3.5.4 Case Study 34
    3.5.5 Effectiveness of Negation Handling 35
    3.5.6 Performance Comparison for Different Training Sizes 38
    3.5.7 Ablation Study for the Verbalizer 38
  3.6 Discussion 39
    3.6.1 Comparison with Dictionary-based Tools 39
    3.6.2 Comparison with BERT Fine-Tuning 40
    3.6.3 Comparison with GPT-4 40
    3.6.4 Practical Implications and Applications 41
  3.7 Summary 42
Chapter 4. Data Augmentation Technique for Few-shot Learning 43
  4.1 Contributions 43
  4.2 Method 45
    4.2.1 Few-shot text classification with Meta-learning 45
    4.2.2 Problem formulation 45
    4.2.3 Meta-learning 45
    4.2.4 Saliency-equivalent concatenation 46
  4.3 Experiments 47
    4.3.1 Datasets 47
    4.3.2 Baselines 48
    4.3.3 Implementation Details 49
    4.3.4 Results 50
  4.4 Summary 61
Chapter 5. Medical Text Mining with Sequence Tagging 62
  5.1 Contributions 62
  5.2 Approach 62
    5.2.1 Motivation 62
    5.2.2 System Overview 63
    5.2.3 Pre-processing 63
    5.2.4 Training 63
    5.2.5 Inference 64
  5.3 Dataset 65
  5.4 Results 66
  5.5 Summary 68
Chapter 6. Sequence Tagging Application in the General Field 69
  6.1 Contributions 69
  6.2 Method 69
    6.2.1 Segment Model 70
    6.2.2 Classifier 74
    6.2.3 Training Framework 74
  6.3 Experiments 75
    6.3.1 Datasets 75
    6.3.2 Implementation Details 75
    6.3.3 Results 76
    6.3.4 Training Speed Comparison 77
  6.4 Analysis 77
    6.4.1 Use Self-Training? 77
    6.4.2 Segmentation Examples 78
    6.4.3 Comparison of Model Performance on Different Segmentation Lengths 79
    6.4.4 Performance comparison for different tagging schemas 80
    6.4.5 Performance without Early Stopping 80
    6.4.6 Re-implementation Results 81
    6.4.7 More Segmentation Examples 84
  6.5 Summary 85
  6.6 Limitations 85
Chapter 7. Conclusion 86
  7.1 Concluding Remarks 86
  7.2 Theoretical Insights 87
    7.2.1 MPTR 87
    7.2.2 SEC 88
    7.2.3 PhenoLocalizer 88
    7.2.4 UCWS 88
References 90
Publication List 101


Full text available: on campus from 2025-02-01; off campus from 2025-02-01.