| Graduate Student: | 潘柏儒 Pan, Bo-Ru |
|---|---|
| Thesis Title: | 主成分分析用於人工智慧晶片之硬體加速器 Hardware Accelerator for AI on Chip via Principal Component Analysis |
| Advisor: | 李國君 Lee, Gwo-Giun |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
| Year of Publication: | 2021 |
| Academic Year: | 109 |
| Language: | English |
| Pages: | 71 |
| Chinese Keywords: | 主成分分析、賈柏濾波器、演算法暨架構共同探索、硬體加速器、特殊應用積體電路 |
| English Keywords: | principal component analysis, Gabor filter, algorithm architecture co-design, accelerator, ASIC |
**Chinese Abstract (translated):** This thesis proposes a hardware architecture for the principal component analysis (PCA) algorithm, designed primarily for Gabor filters. PCA concentrates the significant information onto the principal eigen-axes, and the symmetry of the PCA result of the Gabor filter is exploited to reduce computational complexity. With the recent rise of edge AI, the proposed PCA hardware architecture targets small area and low power. The design flow is based on algorithm/architecture co-design: the algorithm's complexity, including the number of operations, degree of parallelism, memory size, and transfer bandwidth, is analyzed to derive a dataflow model. The proposed architecture is synthesized in a TSMC 90 nm process and operates at a clock rate of 142.8 MHz.
**English Abstract:** This thesis proposes a hardware architecture for principal component analysis (PCA). The design goal is to apply PCA to a Gabor filter bank and exploit the symmetry of the PCA result to reduce computational complexity. As edge AI has become a new trend in AI applications in recent years, this thesis aims to develop a PCA hardware architecture with small area and low power. The design flow is based on algorithm/architecture co-design: the number of operations, degree of parallelism, memory configuration, and data transfer are analyzed, and a dataflow model is then built. The proposed architecture is synthesized in TSMC 90 nm technology and operates at 142.8 MHz.
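The complexity-reduction idea behind the abstracts (apply PCA to a Gabor filter bank so that filtering with the whole bank reduces to a few principal-component convolutions plus a linear recombination) can be sketched in software. The code below is a minimal NumPy illustration under hypothetical parameters (9x9 kernels, 8 orientations, a 99% variance threshold); it is not the thesis's hardware design, which targets an ASIC eigensolver rather than `numpy.linalg.eigh`.

```python
import numpy as np

def gabor_kernel(size, theta, sigma=2.0, lam=4.0):
    """Real part of a 2-D Gabor filter; sigma and lam are hypothetical."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

# Build a small filter bank: 8 orientations, 9x9 kernels, one row per filter.
bank = np.stack([gabor_kernel(9, t).ravel()
                 for t in np.linspace(0, np.pi, 8, endpoint=False)])

# PCA: eigendecomposition of the covariance of the centered filter bank.
mean = bank.mean(axis=0)
cov = np.cov(bank - mean, rowvar=False)     # 81x81 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
order = np.argsort(eigvals)[::-1]           # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep the k principal axes that capture ~99% of the variance.
k = int(np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), 0.99)) + 1
coeffs = (bank - mean) @ eigvecs[:, :k]     # each filter as k coefficients

# Filtering an image with all 8 orientations now reduces to convolving
# with the k principal filters and linearly mixing the results.
print(k, "principal filters replace", bank.shape[0], "orientation filters")
```

The same linear-algebra structure is what makes a hardware mapping attractive: the k principal filters are fixed, so the per-pixel work is k convolutions plus a small matrix multiply, and the symmetry of the Gabor PCA result can prune redundant multipliers further.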
On-campus access: available from 2026-10-25.