| Graduate Student: | 戴源 (Tai, Yuan) |
|---|---|
| Thesis Title: | 通用於各內核大小卷積與矩陣運算之計算核心設計 (Design of a Kernel-agnostic Compute Core for Convolution and GEMM) |
| Advisor: | 陳中和 (Chen, Chung-Ho) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication: | 2024 |
| Academic Year of Graduation: | 112 |
| Language: | Chinese |
| Pages: | 101 |
| Keywords (Chinese, translated): | CNN edge inference accelerator, depth-first computation, 1×1 convolution kernel decomposition, matrix-to-convolution mapping |
| Keywords (English): | CNN Edge Inference Accelerator, Depth-first Computation, 1×1 Convolution Mapping, Matrix to Convolution Mapping |
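The keywords reference mapping matrix operations (GEMM) onto 1×1 convolution. As a minimal numpy sketch of the underlying equivalence (not the thesis's hardware mapping): a GEMM `C = A @ B` can be viewed as a 1×1 convolution in which each row of `A` is one spatial position with `K` input channels, and each column of `B` is one `K`-channel 1×1 filter.

```python
import numpy as np

# Hedged sketch: GEMM C = A @ B recast as a 1x1 convolution.
# A: (M, K) activations -> feature map with M "pixels" and K channels
# B: (K, N) weights     -> N filters, each of shape (K, 1, 1)
M, K, N = 6, 4, 5
rng = np.random.default_rng(0)
A = rng.standard_normal((M, K))
B = rng.standard_normal((K, N))

# Reference GEMM result
C_gemm = A @ B

# 1x1 convolution view: at every spatial position m, the output for
# channel n is a dot product over the K input channels -- exactly one
# row-times-column product of the GEMM.
inp = A.reshape(M, 1, 1, K)   # (pixels, H=1, W=1, in_channels)
filt = B.T.reshape(N, K)      # N filters over K channels each
C_conv = np.einsum('mhwk,nk->mn', inp, filt)

assert np.allclose(C_gemm, C_conv)
```

In this view, kernel sizes larger than 1×1 reduce to the same per-pixel dot product after an im2col-style rearrangement, which is what makes a single compute core usable for both convolution and GEMM.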