
Graduate student: Chung, Wen-Chieh (鍾文傑)
Thesis title: Case screening, summary table creation and legal judgment prediction system for construction litigation based on deep learning (基於深度學習之工程訴訟案件篩選與歷審統計表建立及案件預測系統)
Advisor: Wang, Jhing-Fa (王駿發)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: English
Number of pages: 54
Keywords: case screening, information extraction, IDF, part of speech (POS), text similarity, judgment prediction, BERT
    With the rapid development of artificial intelligence in recent years, deep learning has been applied in a growing number of fields, and law is one of them. This study proposes a deep-learning-based system for construction litigation that performs case screening, summary table creation, and legal judgment prediction. The system has three parts. The first part is case screening: from the judgment data provided by the Judicial Yuan of the Republic of China, cases that belong to construction litigation are selected in two steps, with an accuracy of 93.55%. The second part is case summary table creation: regular expressions extract information from the judgments issued at each instance of a case, and the results are compiled into a per-case summary table, with an accuracy of 86.75%. The third part is case prediction, which has three outputs: (1) a statistical table for cases of the same type, (2) past cases similar to the input case, and (3) the predicted court judgment. For the statistical table, the system matches the category of the input case and outputs a table, prepared in advance by legal experts, of litigation gain and litigation duration statistics for that category. For similar cases, a word embedding model vectorizes the words in the judgment texts and in the input case; each word is then weighted by IDF and part of speech (POS), the cosine similarity is computed, and the 10 most similar cases are listed. For judgment prediction, a BERT model vectorizes the judgment text and a neural network then predicts the court's decision; the accuracy is 88.89% for litigation duration and 82.22% for litigation gain. The system lets users learn the likely outcome of a case before filing and then assess whether to initiate a lawsuit.
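    The similar-case step in the abstract (word vectors weighted by IDF and POS, then ranked by cosine similarity) can be illustrated with a minimal sketch. This is not the thesis implementation: the toy embeddings, the POS weight values, the smoothed IDF formula, and the choice to aggregate word vectors by weighted average are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical POS weights (assumption): content words count more than function words.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.8}
DEFAULT_POS_WEIGHT = 0.2


def compute_idf(corpus):
    """Smoothed IDF over a corpus of documents, each a list of (word, pos) pairs."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update({word for word, _ in doc})
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def document_vector(doc, embeddings, idf):
    """Weighted average of word vectors, with weight = IDF * POS weight."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word, pos in doc:
        vec = embeddings.get(word)
        if vec is None:
            continue  # skip out-of-vocabulary words
        weight = idf.get(word, 1.0) * POS_WEIGHTS.get(pos, DEFAULT_POS_WEIGHT)
        total = [t + weight * v for t, v in zip(total, vec)]
        weight_sum += weight
    if weight_sum == 0.0:
        return total
    return [t / weight_sum for t in total]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def top_k_similar(query, corpus, embeddings, k=10):
    """Return up to k (document index, similarity) pairs, most similar first."""
    idf = compute_idf(corpus)
    q_vec = document_vector(query, embeddings, idf)
    scored = [
        (i, cosine_similarity(q_vec, document_vector(doc, embeddings, idf)))
        for i, doc in enumerate(corpus)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy data (assumption): 2-dimensional vectors and English tokens for readability.
EMBEDDINGS = {
    "contract": [1.0, 0.0],
    "delay": [0.8, 0.6],
    "payment": [0.0, 1.0],
    "the": [0.5, 0.5],
}
CORPUS = [
    [("the", "DET"), ("contract", "NOUN"), ("delay", "NOUN")],
    [("the", "DET"), ("payment", "NOUN")],
]
QUERY = [("contract", "NOUN"), ("delay", "NOUN")]

ranking = top_k_similar(QUERY, CORPUS, EMBEDDINGS, k=10)
```

    In this toy setup the first document ranks above the second because it shares the heavily weighted nouns with the query, while the low POS weight keeps the function word from dominating either vector. A real pipeline would replace the toy inputs with segmented, POS-tagged judgment text and trained word embeddings.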

    Chinese Abstract
    Abstract
    Acknowledgements
    Content
    Table List
    Figure List
    Chapter 1 Introduction
        1.1 Background
        1.2 Motivation
        1.3 Objective
        1.4 Organization
    Chapter 2 Related Work
        2.1 The Survey of Case Screening
        2.2 The Survey of Information Extraction
        2.3 The Survey of Judicial Case Similarity Calculation
        2.4 The Survey of Judicial Case Prediction
        2.5 The Survey of BERT
            2.5.1 The Architecture of BERT
            2.5.2 Input Representation of BERT
            2.5.3 Multi-head Self-attention
            2.5.4 Pre-training BERT
            2.5.5 Applications of BERT
    Chapter 3 Case screening, summary table creation and legal judgment prediction system for construction litigation
        3.1 System Overview
            3.1.1 Case Screening for Construction Litigation
            3.1.2 Case Summary Table Creation for Construction Litigation
            3.1.3 Case Prediction for Construction Litigation
        3.2 Case Screening for Construction Litigation
            3.2.1 Case Citation Prefix Filter
            3.2.2 Classifier for Construction Litigation
        3.3 Case Summary Table Creation for Construction Litigation
            3.3.1 Crawler Module
            3.3.2 Judgment Analysis Module
        3.4 Case Prediction for Construction Litigation
            3.4.1 Case Similarity Calculation Module
            3.4.2 Judgment Prediction Module
            3.4.3 Case Category Matching Module
    Chapter 4 Experiments
        4.1 Experiment for Case Screening
            4.1.1 Dataset for Case Screening
            4.1.2 Experimental Results for Case Screening
        4.2 Experiment for Information Extraction
        4.3 Experiment for Case Similarity Calculation
            4.3.1 Dataset for Case Similarity Calculation
            4.3.2 Experimental Results for Case Similarity Calculation
        4.4 Experiment for Judgment Prediction
            4.4.1 Dataset for Judgment Prediction
            4.4.2 Experimental Results for Judgment Prediction
    Chapter 5 Conclusions
        5.1 Conclusions
        5.2 Future Works
    Appendix
    Reference


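    Full-text availability: on campus, open from 2025-08-31; off campus, not open.
    The electronic thesis has not yet been authorized for public release; for the print copy, consult the library catalog.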