
Author: Huang, Wei-Cheng (黃韋程)
Title: A Fully Analog Computing-in-Memory Design with Zero-Crossing-Based Amplification for Edge-AI Devices
Advisor: Chang, Soon-Jyh (張順志)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2024
Academic year of graduation: 112 (2023-2024)
Language: English
Number of pages: 175
Keywords: fully analog Computing-in-Memory (CIM), charge-domain computation, zero-crossing-based amplification, weighted capacitor splitting method, overshoot cancellation technique, AI processor
This thesis presents a fully analog Computing-in-Memory processor for artificial-intelligence edge devices. Previous CIM architectures have relied on digital-to-analog converters, analog-to-digital converters, and large numbers of registers to pass signals between neural-network layers, and the resulting quantization noise can limit network accuracy. This thesis therefore proposes using amplifiers to emulate signal transmission between neurons, improving network precision, reducing the number of registers, and enabling fully analog neural-network computation next to the sensor. Four techniques are proposed to realize this circuit: (1) a nine-transistor computing cell based on the SRAM cell, which uses charge redistribution to perform multiply-accumulate (MAC) operations more linearly than current-domain approaches; (2) capacitors split according to the weight ratios, combined with charge averaging and capacitor switching, to realize signed 8-bit weight computation; compared with conventional digital shift-adder summation, the proposed method is faster, more power-efficient, and not limited by quantization noise; (3) the activation function merged into the zero-crossing-based amplifier, avoiding unnecessary amplification of transmitted signals and removing the static power consumption of the transmission circuit; and (4) an overshoot cancellation technique that removes the overshoot produced during zero-crossing-based amplification.
The chip was fabricated in TSMC's 180 nm CMOS 1P6M process with a core area of 20.98 mm². It performs analog-input / INT8-weight / analog-output computation for a 784x100x10 fully connected neural network. Operating at 1.3 V with a 100 MHz clock, it achieves a throughput of 441 GOPS, a MAC energy efficiency of 1880 TOPS/W, a whole-network energy efficiency of 117 TOPS/W, and an MNIST recognition accuracy of about 91.48%.
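The abstract describes the charge-domain computation only qualitatively. As an idealized illustration (assuming perfectly matched unit capacitors C_u, no parasitics, and my own notation rather than the thesis's), the charge redistribution of technique (1) and the binary-ratio charge sharing of technique (2) behave as

\[
V_{\mathrm{MAC},k} = \frac{\sum_{i=1}^{N} b_{i,k}\,C_u\,V_{x,i}}{N\,C_u} = \frac{1}{N}\sum_{i=1}^{N} b_{i,k}\,V_{x,i},
\qquad
V_{\mathrm{out}} = \frac{\sum_{k=0}^{6} 2^{k}\,V_{\mathrm{MAC},k}}{\sum_{k=0}^{6} 2^{k}} \;\propto\; \sum_{i=1}^{N} |w_i|\,V_{x,i},
\]

where V_{x,i} is the i-th analog input, b_{i,k} is bit k of the weight magnitude |w_i|, and the seven magnitude bit-planes are combined through capacitors sized in 2^k ratios; the sign bit and the ReLU are handled in the subsequent zero-crossing amplification stage.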

This thesis presents a fully analog processing Computing-in-Memory (CIM) chip designed for artificial intelligence (AI) edge devices. Conventional CIM architectures typically use digital-to-analog converters (DACs), analog-to-digital converters (ADCs), and extensive registers to transfer signals between neural-network layers; however, the resulting quantization noise can noticeably degrade network accuracy. This thesis therefore proposes using amplifiers to emulate signal transmission between neurons, enhancing network accuracy and reducing the number of registers, thereby realizing fully analog neural-network computation near the sensor. Four techniques are proposed to achieve this: (1) A static random-access memory (SRAM)-based 9T1C computing cell performs multiply-accumulate (MAC) computation through charge redistribution, achieving better linearity than current-domain computation. (2) Capacitors are split in binary-weight ratios and combined with a switched-capacitor (SC) scheme to realize signed 8-bit weighted MAC operations through charge sharing. Compared with the conventional digital shift-adder-based method, this approach is not only faster and more power-efficient but also maintains high accuracy as long as the noise floor is sufficiently low. (3) The ReLU function is integrated into the zero-crossing-based amplifier to avoid unnecessary signal amplification, yielding a significant reduction in energy consumption. (4) An overshoot cancellation technique resolves the overshoot errors produced during zero-crossing-based amplification.
The proof-of-concept prototype was fabricated in TSMC's standard 180 nm 1P6M CMOS technology. It occupies a core area of 20.98 mm² and supports analog-input / INT8-weight / analog-output MAC operations for a 784x100x10 fully connected neural network. Measurements show that, operating from a 1.3 V supply at a 100 MHz clock, the chip achieves a throughput of 441 GOPS, an energy efficiency of 1880 TOPS/W for MAC operations, and a whole-neural-network energy efficiency of 117 TOPS/W. The inference accuracy on the MNIST dataset is about 91.48%.
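To make the data path concrete, the following is a minimal behavioral sketch in Python of the scheme the abstract describes: per-bit-plane charge redistribution, binary-ratio capacitor splitting with charge sharing for signed INT8 weights, and a ReLU folded into the readout. It is an idealized model for intuition only, not the chip's circuit; the function names and the positive/negative array split used to handle the sign are my assumptions.

import numpy as np

def bitplane_average(x, bits):
    # Charge redistribution: N equal unit capacitors each sample x[i] when the
    # weight bit is 1 (and 0 V when it is 0), then settle to the average voltage.
    return float(np.sum(x * bits)) / len(x)

def charge_domain_mac(x, w, w_bits=8):
    # x: analog input voltages; w: signed integer weights (INT8 range).
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=int)
    pos = np.where(w > 0, w, 0)   # magnitudes routed to the "positive" array
    neg = np.where(w < 0, -w, 0)  # magnitudes routed to the "negative" array

    def weighted_share(mag):
        # Binary-ratio capacitor splitting: bit-plane k is charge-shared through a
        # capacitor of relative size 2**k, so the bit-plane averages combine with
        # binary weights directly in the charge domain (no digital shift-adder).
        planes = np.array([bitplane_average(x, (mag >> k) & 1)
                           for k in range(w_bits - 1)])
        caps = 2.0 ** np.arange(w_bits - 1)
        return float(np.dot(caps, planes) / caps.sum())

    v_mac = weighted_share(pos) - weighted_share(neg)  # sign via array subtraction
    return max(v_mac, 0.0)                             # ReLU folded into the readout

# Example: four analog inputs against four signed INT8 weights.
x = np.array([0.2, 0.5, 0.9, 0.1])
w = np.array([64, -32, 127, 5])
print(charge_domain_mac(x, w))  # voltage proportional to max(sum(x * w), 0)

In this idealized model the output is exactly proportional to max(Σ x_i·w_i, 0); on silicon the same quantity is limited by capacitor mismatch, the comparator's noise floor, and the overshoot that technique (4) cancels, rather than by quantization noise.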

Table of Contents
Abstract (Chinese)
Abstract
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 Background of Artificial Intelligence
  1.2 Memory-Centric Computation
  1.3 Near-/In-Sensor Computation
  1.4 Motivation
  1.5 Thesis Organization
Chapter 2 Overview of Neural Networks and Data Management
  2.1 Introduction to Neural Networks
  2.2 Basics of Deep Neural Networks
    2.2.1 Convolutional Neural Networks
    2.2.2 Fully Connected Neural Networks
    2.2.3 Non-Linear Activation Functions
    2.2.4 Pooling Functions
    2.2.5 Normalization Functions
  2.3 Quantization for Neural Networks
    2.3.1 Linear Quantization
    2.3.2 Non-Linear Quantization
    2.3.3 Quantization-Aware Training
  2.4 Introduction to Data Management
    2.4.1 Data Flow
    2.4.2 Data Reuse
    2.4.3 Local Memory
Chapter 3 Fundamentals of Conventional Computing-in-Memory
  3.1 Introduction to Computing-in-Memory
    3.1.1 Concept of Computing-in-Memory
    3.1.2 Dataflow and Structure of Computing-in-Memory
  3.2 Digital-to-Analog Conversion
    3.2.1 Basic Concept of DACs
    3.2.2 Static Specifications of DACs
    3.2.3 Output Types of DACs in CIM Macros
  3.3 Mixed-Signal Computation with SRAM
    3.3.1 6T SRAM Bit-Cell
    3.3.2 Current-Based Multiplication and Accumulation
    3.3.3 Charge-Based Multiplication and Accumulation
  3.4 Mixed-Signal Computation with RRAM
    3.4.1 Classic RRAM Bit-Cell
    3.4.2 Current-Based Multiplication and Accumulation
    3.4.3 Inevitable Read-Disturb Issue
  3.5 Analog-to-Digital Conversion
    3.5.1 The Concept and Operation of SAR ADCs
    3.5.2 Quantization Error
    3.5.3 Static Specifications of ADCs
  3.6 Digital Shift Adder
Chapter 4 Proposed Architecture: A Fully Analog Processing Computing-in-Memory for Edge-AI Devices
  4.1 Introduction
  4.2 Proposed Architecture
  4.3 Proposed Techniques
    4.3.1 Analog Computing Unit Cell
    4.3.2 MAC Operation Technique Supporting INT8 Weights
    4.3.3 Zero-Crossing-Based Amplifier with ReLU
    4.3.4 Overshoot Cancellation with DAC
  4.4 Circuit Implementation
    4.4.1 Peripheral Circuits of SRAM
    4.4.2 Continuous-Time Comparator
    4.4.3 Switched-Capacitor-Based Integrator
    4.4.4 Bias Circuits
    4.4.5 MAX Circuit
    4.4.6 Capacitors on SRAM Cells for Computing
  4.5 Analysis of Non-Ideal Effects
    4.5.1 Noise Model of MAC Operations
    4.5.2 Noise Model of the Current-Mirror Op-Amp
    4.5.3 Linearity Analysis of the Zero-Crossing Amplifier
Chapter 5 Simulation and Measurement Results
  5.1 Layout and Chip Floor Plan
  5.2 Simulation Results
  5.3 Die Micrograph and Measurement Setup
  5.4 Measurement Results
  5.5 Chip Demonstration
Chapter 6 Conclusion and Future Works
Bibliography

Full-text release: on campus 2026-06-30; off campus 2026-06-30. The electronic thesis has not yet been authorized for public access; please consult the library catalog for the printed copy.