
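Author: Yu, Shih-Chieh (游世杰)
Thesis title: Design of Special Function Unit with Dual-Precision Function Approximation (俱雙重精準度函數近似之特殊函數單元設計)
Advisor: Kuo, Chih-Hung (郭致宏)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar, 2016-2017)
Language: Chinese
Number of pages: 91
Keywords (Chinese): piecewise polynomial approximation, Taylor series, look-up table, special function unit, dual-precision computation, low power
Keywords (English): IEEE 754, special function unit, Taylor series, piecewise polynomial approximation, dual-precision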
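  • Special function units approximate their target functions with polynomials. The polynomial coefficients are computed in advance and stored in look-up tables to speed up the hardware computation, but the look-up tables take up most of the total area. This thesis proposes a piecewise polynomial approximation architecture that uses non-uniform segmentation to reduce the area required by the look-up tables. We examine the approximation quality and algorithmic characteristics of Taylor series expansion and minimax approximation. Because the Taylor series expansion is easier to analyze, we design the segmentation algorithm by computing the remainder of the Taylor polynomial. In hardware, a precision-select line can reduce the polynomial order so that the unit outputs a low-precision approximation, giving a variable-precision, low-power architecture. The hardware architecture is written in Verilog HDL, synthesized with Design Compiler in a TSMC 40 nm process, and verified by simulation. Memory Compiler is used to model the look-up-table memory units, and PrimeTime is used to analyze the power consumption of the dual-precision approximation unit in its different modes.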
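The idea of choosing segment boundaries from the Taylor remainder can be illustrated with a minimal sketch. It is not the thesis's algorithm: the target function 1/x on [1, 2), the greedy left-to-right splitting, the midpoint expansion points, and the names `nonuniform_edges`, `uniform_count`, and `max_error` are all assumptions made for illustration. The thesis additionally adjusts boundaries and expansion points iteratively and designs an address generator, neither of which is modeled here.

```python
import math

def nonuniform_edges(tol, lo=1.0, hi=2.0):
    """Greedy segment boundaries for a 2nd-order Taylor fit of 1/x on [lo, hi)."""
    edges = [lo]
    while edges[-1] < hi:
        left = edges[-1]
        # Lagrange remainder for an expansion at the segment midpoint:
        # |R2| <= max|f'''| * h^3 / 3!, and f'''(x) = -6/x^4 peaks at the
        # left edge, so |R2| <= h^3 / left^4 where h is the half-width.
        half_width = (tol * left ** 4) ** (1.0 / 3.0)
        edges.append(min(hi, left + 2.0 * half_width))
    return edges

def uniform_count(tol, lo=1.0, hi=2.0):
    """Segments needed if every segment uses the worst-case (leftmost) width."""
    half_width = (tol * lo ** 4) ** (1.0 / 3.0)
    return math.ceil((hi - lo) / (2.0 * half_width))

def max_error(edges, samples=50):
    """Largest sampled error of the midpoint 2nd-order Taylor polynomials."""
    worst = 0.0
    for left, right in zip(edges, edges[1:]):
        c = 0.5 * (left + right)
        for k in range(samples + 1):
            x = left + (right - left) * k / samples
            d = x - c
            approx = 1.0 / c - d / c ** 2 + d ** 2 / c ** 3
            worst = max(worst, abs(1.0 / x - approx))
    return worst

tol = 2.0 ** -24  # hypothetical error target, chosen only for this example
edges = nonuniform_edges(tol)
print(len(edges) - 1, uniform_count(tol), max_error(edges) <= tol)
```

Because the third derivative of 1/x shrinks as x grows, segments toward the right end of the interval can be wider for the same error bound, which is why the non-uniform split needs fewer table entries than a uniform one in this toy case.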

    This paper presents a non-uniform segmentation method for a special function unit based on Taylor series approximation. Compared with uniform segmentation, our method reduces the size of the look-up tables by decreasing the number of segments. By reducing the polynomial order that is computed and partitioning the multipliers, our architecture can output a low-precision result with lower power consumption.
    The proposed design is synthesized with the TSMC 40-nm cell library, and the implementation results based on 2nd-order polynomial approximation show that our method reduces the total area by 15% in single precision.
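The low-precision mode described in the abstract, where the polynomial order is reduced, can be sketched in software as follows. This is only a numerical illustration under stated assumptions: the function 1/x, the expansion point 1.5, and the names `taylor_coeffs_reciprocal` and `evaluate` are hypothetical, and the sketch does not model multiplier partitioning, fixed-point word lengths, or power.

```python
def taylor_coeffs_reciprocal(c):
    """2nd-order Taylor coefficients of 1/x about c, lowest order first."""
    return [1.0 / c, -1.0 / c ** 2, 1.0 / c ** 3]

def evaluate(coeffs, dx, low_precision=False):
    """Horner evaluation; low-precision mode drops the highest-order term."""
    active = coeffs[:-1] if low_precision else coeffs
    acc = 0.0
    for coeff in reversed(active):
        acc = acc * dx + coeff
    return acc

# Hypothetical expansion point and input, chosen only for this example.
c, x = 1.5, 1.51
coeffs = taylor_coeffs_reciprocal(c)
full = evaluate(coeffs, x - c)                      # 2nd-order result
low = evaluate(coeffs, x - c, low_precision=True)   # 1st-order result
print(abs(1.0 / x - full), abs(1.0 / x - low))
```

In this sketch the same coefficient table serves both modes, and the low-precision path simply skips one multiply-add step, which mirrors the idea of trading accuracy for less computation.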

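    Chinese Abstract (p. I)
    Extended English Abstract (p. II)
    Acknowledgments (p. IX)
    Table of Contents (p. X)
    List of Figures (p. XII)
    List of Tables (p. XIV)
    Chapter 1 Introduction (p. 1)
      1-1 Preface (p. 1)
      1-2 Research Motivation (p. 2)
      1-3 Research Contributions (p. 3)
      1-4 Thesis Organization (p. 4)
    Chapter 2 Background and Related Work (p. 5)
      2-1 IEEE 754 Standard (p. 5)
        2-1-1 Half precision (p. 6)
        2-1-2 Single precision (p. 6)
        2-1-3 Double precision (p. 7)
      2-2 Special Function Unit (p. 7)
      2-3 Approximation Algorithms (p. 13)
        2-3-1 Taylor series expansion method (p. 14)
        2-3-2 Minimax approximation algorithm (p. 15)
      2-4 Literature Review (p. 17)
        2-4-1 Minimax second-order polynomial approximation design (p. 19)
        2-4-2 Third-order polynomial design based on Taylor series expansion (p. 22)
        2-4-3 Non-uniform segmentation design and address generator design (p. 25)
    Chapter 3 Dual-Precision Low-Power Polynomial Approximation Unit Design (p. 27)
      3-1 Comparison of approximation quality of different approximation algorithms (p. 28)
      3-2 Non-uniform segmentation analysis and design (p. 33)
        3-2-1 Computation of interval boundaries and expansion points (p. 35)
        3-2-2 Non-uniform segmentation method for the approximation interval (p. 38)
      3-3 Iterative adjustment of interval boundaries and expansion points (p. 39)
      3-4 Address generator design (p. 45)
      3-5 Error analysis (p. 49)
      3-6 Hardware architecture for the approximation computation (p. 51)
      3-7 Variable-precision architecture analysis and design (p. 54)
      3-8 Summary (p. 62)
    Chapter 4 Experimental Environment and Data Analysis (p. 63)
      4-1 Experimental environment (p. 63)
      4-2 Comparison of uniform and non-uniform segmentation (p. 64)
      4-3 Parameter settings in the hardware architecture (p. 67)
      4-4 Comparison with results from related literature (p. 70)
      4-5 Dual-precision hardware results (p. 75)
    Chapter 5 Conclusions and Future Work (p. 86)
      5-1 Conclusions (p. 86)
      5-2 Future Work (p. 86)
    References (p. 88)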


    Full-text availability, on campus: public from 2019-07-05
    Full-text availability, off campus: public from 2019-07-05