Author: Fang, Shen-Wei (方伸維)
Thesis title: Utilizing Instance-based Learning to Facilitate Customer Reviews for Banks
Advisor: Teng, Wei-Guang (鄧維光)
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: English
Number of pages: 47
Keywords: suspicious activity detection, know your customer (KYC), customer reviews, instance-based learning, time series analysis, wavelet transform
  • With the growing wave of anti-money laundering and counter-terrorism financing efforts, financial institutions spend a significant amount of money and time each year on Know Your Customer (KYC) reviews. Identifying which customers most deserve priority review, however, remains a major challenge. Traditional methods rely on fixed rules or on the subjective judgment of reviewers, which can lead to false positives or false negatives. The main objective of this work is therefore to establish diverse, quantifiable reference criteria based on customers' differing profiles, in order to identify outlier customers and support banks' periodic customer reviews, and to use time series analysis to find suspicious temporal patterns in customers' transaction behavior. Specifically, we adopt an instance-based learning approach that considers each target customer individually. We divide the data into profile data and transaction behavior data and, building on the central idea of “identifying customers whose profiles and behaviors are inconsistent,” calculate an outlier score for each customer to assist decision making. For customers with moderate risk, we then combine time series analysis with the wavelet transform to search for suspicious patterns at multiple resolutions in the time series data. In summary, this work makes two contributions. First, we establish a scoring model that helps bank personnel conduct customer reviews against objective and diverse criteria. Second, we use time series analysis and the wavelet transform to find multi-resolution suspicious patterns in time series data, which helps identify potentially risky transaction behavior and determine whether a customer needs further review.
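The abstract does not state the scoring formula, the distance measure, or the wavelet configuration, so the following is only a minimal sketch of the two ideas under stated assumptions, not the thesis's implementation. It assumes a Gower-style distance over mixed-type profile attributes, a z-score of one behavior value against the k nearest profile peers as the outlier score, and an unnormalized Haar transform (pairwise averages and half-differences) for the multi-resolution view. All function names, field names, and sample values are hypothetical.

```python
from statistics import mean, pstdev


def profile_distance(a, b, numeric_ranges):
    """Gower-style distance between two profiles (assumed measure).

    Numeric fields contribute a range-normalised absolute difference,
    categorical fields contribute 0 on a match and 1 on a mismatch.
    """
    total = 0.0
    for key, value in a.items():
        if key in numeric_ranges:
            span = numeric_ranges[key] or 1.0  # avoid dividing by zero
            total += abs(value - b[key]) / span
        else:
            total += 0.0 if value == b[key] else 1.0
    return total / len(a)


def outlier_score(target_id, profiles, behaviour, k=3):
    """z-score of the target's behaviour against its k nearest profile peers."""
    target = profiles[target_id]
    numeric_ranges = {}
    for key, value in target.items():
        if isinstance(value, (int, float)):
            values = [p[key] for p in profiles.values()]
            numeric_ranges[key] = max(values) - min(values)
    others = [cid for cid in profiles if cid != target_id]
    peers = sorted(
        others,
        key=lambda cid: profile_distance(target, profiles[cid], numeric_ranges),
    )[:k]
    peer_values = [behaviour[cid] for cid in peers]
    sigma = pstdev(peer_values) or 1.0  # fall back when peers are identical
    return (behaviour[target_id] - mean(peer_values)) / sigma


def haar_levels(series):
    """Unnormalised Haar decomposition of a power-of-two-length series.

    Returns (overall_average, details), where details[0] holds the finest
    resolution and details[-1] the coarsest.
    """
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = list(series)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(left - right) / 2 for left, right in pairs])
        approx = [(left + right) / 2 for left, right in pairs]
    return approx[0], details


# Hypothetical data: customer D has a student profile but very large volume.
profiles = {
    "A": {"age": 30, "occupation": "student"},
    "B": {"age": 31, "occupation": "student"},
    "C": {"age": 29, "occupation": "student"},
    "D": {"age": 32, "occupation": "student"},
    "E": {"age": 60, "occupation": "executive"},
}
monthly_volume = {"A": 1000.0, "B": 1200.0, "C": 900.0, "D": 50000.0, "E": 60000.0}

if __name__ == "__main__":
    print(outlier_score("D", profiles, monthly_volume))
    print(haar_levels([2, 2, 2, 2, 2, 2, 10, 2]))
```

In this sketch a customer whose behavior is far from that of customers with similar profiles receives a score of large magnitude, and a short spike in a transaction series appears as a large detail coefficient at the finest level, while slower shifts appear at coarser levels.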

    Chapter 1 Introduction
      1.1 Motivation and Overview
      1.2 Contributions of This Work
    Chapter 2 Preliminaries
      2.1 Basics of Anomaly Detection
      2.2 Know Your Customer
      2.3 Characteristics of the Banking Data
      2.4 Model-based vs. Instance-based Learning
      2.5 Utilizing Peer Group Analysis Concept to Detect Suspicious Customers
      2.6 Building an Explainable Model
    Chapter 3 Proposed Scheme of Instance-based Method
      3.1 Design of Our Proposed Scheme
      3.2 Dealing with Mixed-typed Attributes Data and Time Series Data
      3.3 Our Model Architecture
      3.4 Model Interpretability
    Chapter 4 Prototyping and Empirical Studies
      4.1 Datasets Used in Our Experiments
      4.2 Experiment Results
      4.3 Case Studies
      4.4 Discussion of the Experimental Results
    Chapter 5 Conclusions and Future Works
    Bibliography
