Graduate student: Lin, Chia-Yang (林家洋)
Thesis title: Mercurius: A High Speed and Flexible AMBA Architecture (Mercurius: 一個高速且彈性的前瞻微處理器匯流排架構)
Advisor: Chin, Wen-Long (卿文龍)
Degree: Master
Department: College of Engineering - Department of Engineering Science
Year of publication: 2012
Graduation academic year: 100
Language: Chinese
Number of pages: 64
Chinese keywords: 系統單晶片, 前瞻微處理器匯流排架構, 共享式記憶體交換, 統計多工特性, 暫存器轉移階層
English keywords: System-on-Chip, Advanced Micro-controller Bus Architecture, shared-memory switching architecture, statistical multiplexing, register-transfer level model
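  • With the rapid development of semiconductor technology, the functions that can be implemented on a single chip have grown in both number and complexity, and System-on-Chip (SoC) design has therefore become the prevailing design approach. The Advanced Micro-controller Bus Architecture (AMBA) is an on-chip bus architecture widely used in SoCs; it provides a standard bus protocol for connecting the components on a chip. However, AMBA cannot support multiple outstanding transactions, and all masters and slaves share a single bus channel, which greatly reduces the system bandwidth.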
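    Accordingly, this study proposes a shared-memory switching architecture based on the AMBA bus protocol, named Mercurius, as a seamless replacement for the traditional bus architecture: existing devices can use the platform without any modification and gain multiple outstanding transactions, a large increase in bandwidth, and easy expansion. It operates as follows. Transactions issued by a master to a slave are stored in the shared memory; a dynamic control method looks up the routing table, and according to the result the transactions are read out of the shared memory in order and dispatched to the slave side. Every transaction from the master side can thus be written into the shared buffer memory and later read out and delivered to the slave side. The statistical multiplexing property of shared memory lets the system reach optimal performance and support multiple outstanding transactions, so ideally all master-slave pairs can transfer data in parallel without having to wait behind a blocked transaction. If a slave cannot keep up with a master's transactions, the flow control function of Mercurius not only prevents buffer overflow but also allocates system resources fairly, so that resources are used as well as possible. The shared-memory architecture allows the transactions of all masters to be written at the same time and all transactions to be read out to the slaves at the same time, which gives it a virtual point-to-point character; it also supports master-to-master communication. As a result, the shared-memory architecture not only achieves several times the throughput of a traditional bus architecture but is also more flexible.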
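    In terms of implementation, this thesis realizes the hardware architecture in TSMC 90 nm technology and evaluates its performance at the register-transfer level (RTL). The design specification is 8x8 (the system supports eight devices, and the user can define the required numbers of masters and slaves). Even in the least favorable case for the comparison (zero-delay slave responses, which is the ideal case for AMBA), the overall data transfer bandwidth is at least 3 times that of AMBA AHB. When slave responses are delayed, the AHB bus stays occupied by the device that currently owns it, which lowers its performance; experimental data show that the bandwidth of Mercurius is then at least 4 times that of AMBA AHB.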
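
    The operation flow described above (store in shared memory, look up the routing table, read out per slave, apply flow control) can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the thesis's RTL design: the class and method names, the address-range routing table, and the per-master buffer quota used as the flow-control rule are all hypothetical choices made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Transaction:
    master: int   # issuing master (hypothetical numbering)
    address: int  # target address, used for the routing lookup
    data: int


class SharedMemorySwitch:
    """Toy model: one shared buffer pool plus one FIFO of buffer indices per slave."""

    def __init__(self, routing_table, num_buffers=16, per_master_quota=4):
        # routing_table: list of (base, limit, slave_id); an assumed address map.
        self.routing_table = routing_table
        self.buffers = [None] * num_buffers       # the shared memory
        self.free = deque(range(num_buffers))     # indices of free buffers
        self.queues = {s: deque() for _, _, s in routing_table}
        self.in_use = {}                          # master -> buffers currently held
        self.quota = per_master_quota             # assumed flow-control rule

    def route(self, address):
        """Routing-table lookup: map an address to a slave id."""
        for base, limit, slave in self.routing_table:
            if base <= address < limit:
                return slave
        raise ValueError(f"no route for address {address:#x}")

    def write(self, txn):
        """Master side. Returns False (back-pressure) instead of overflowing."""
        if not self.free or self.in_use.get(txn.master, 0) >= self.quota:
            return False
        slave = self.route(txn.address)
        idx = self.free.popleft()
        self.buffers[idx] = txn
        self.queues[slave].append(idx)            # link the buffer into the slave's queue
        self.in_use[txn.master] = self.in_use.get(txn.master, 0) + 1
        return True

    def read(self, slave):
        """Slave side. Returns the oldest pending transaction for this slave, or None."""
        if not self.queues[slave]:
            return None
        idx = self.queues[slave].popleft()
        txn = self.buffers[idx]
        self.buffers[idx] = None
        self.free.append(idx)                     # return the buffer to the pool
        self.in_use[txn.master] -= 1
        return txn
```

    Because each slave has its own queue of buffer indices, `read(1)` can return a transaction for slave 1 while transactions for slave 0 are still waiting, and `write` refuses a master that already holds its quota of buffers rather than letting the pool overflow.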

    With advances in silicon technology, increasingly complex hardware components can be integrated on a single chip, so System-on-Chip (SoC) design has become popular in recent years. The Advanced Micro-controller Bus Architecture (AMBA) is an on-chip bus architecture widely used in SoCs. It provides a standard bus protocol for connecting the components on a chip. However, AMBA cannot support multiple outstanding transactions, and all components share the same bus channel. If a slave is not ready to respond, no other transaction can bypass the blocked one, which greatly reduces performance.
    To address this issue, this thesis proposes a shared-memory switching architecture based on the AMBA bus protocol, named Mercurius. It can seamlessly replace the traditional bus architecture; that is, Mercurius is compatible with existing devices without any modification. All transactions initiated by masters are stored in the shared memory and are then dispatched to the slaves after a dynamic lookup in the routing table. Owing to the statistical multiplexing property of shared memory, the system can achieve optimal performance and support multiple outstanding transactions, so ideally all master-slave pairs can transfer data in parallel and bypass a blocked transaction. Under successive and slow transactions, the flow control mechanism not only prevents the shared buffer from overflowing but also allocates resources evenly among all devices. The shared-memory architecture can absorb all transactions initiated by the masters and send them out to the slaves at the same time, which gives it a virtual point-to-point character; hence the data transfer bandwidth of Mercurius is several times that of a traditional bus-based interconnect. It also allows master-to-master communication between hardware components, which can reduce the extra connection area.
    The performance is evaluated with register-transfer level (RTL) models implemented in TSMC 90 nm technology. The design specification is 8x8 (that is, the system supports eight devices, and the configuration can be set by the user). Simulation results show that the data transfer bandwidth is at least three times that of the AMBA Advanced High-performance Bus (AHB), even under the condition most favorable to AHB (zero-delay slave responses). When a slave is not ready to respond, so that on AHB no other transaction can bypass the blocked one, simulation results show that the data transfer bandwidth is at least four times that of AMBA AHB.
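
    The blocking behavior mentioned in the abstract (on a shared bus, no transaction can bypass one that is waiting on a slow slave) can be illustrated with a toy cycle count. This is a deliberately idealized sketch, not the thesis's simulation setup: the function names are hypothetical, each transaction is reduced to a (slave, service cycles) pair, and the switched case assumes that writing into the shared buffer costs nothing.

```python
from itertools import accumulate


def shared_bus_completion(txns):
    """txns: list of (slave_id, service_cycles) in issue order.

    One transfer owns the bus at a time, so completion times are a running sum.
    """
    return list(accumulate(cycles for _, cycles in txns))


def switched_completion(txns):
    """Each slave drains its own queue independently of the other slaves."""
    busy_until = {}
    done = []
    for slave, cycles in txns:
        busy_until[slave] = busy_until.get(slave, 0) + cycles
        done.append(busy_until[slave])
    return done


# Slave 0 needs 5 cycles for its transfer; slave 1 is ready immediately.
txns = [(0, 5), (1, 1), (1, 1)]
print(shared_bus_completion(txns))  # [5, 6, 7]
print(switched_completion(txns))    # [5, 1, 2]
```

    In this toy model the two transfers to slave 1 finish at cycles 1 and 2 instead of 6 and 7, because they no longer wait behind the slow transfer to slave 0. The numbers come from the toy model only and say nothing about the bandwidth ratios measured in the thesis.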

    Chinese Abstract
    Abstract
    Acknowledgments
    Contents
    List of Tables
    List of Figures
    1 Introduction
    2 Background
        2.1 AMBA AHB
        2.2 Linked Lists
    3 Mercurius: A High Speed and Flexible AMBA Architecture
        3.1 Overall Architecture
        3.2 Operation Steps
        3.3 Master-to-Master Communication
        3.4 Topology
        3.5 Flow Control
        3.6 Hardware Implementation
    4 Simulation and Verification
        4.1 Verification Method
        4.2 Simulation Results
    5 Physical Data
        5.1 Timing
        5.2 Area
        5.3 Power
        5.4 Chip Layout
        5.5 FPGA Emulation
    6 Conclusion and Future Work
    References
    Curriculum Vitae

    Full text: on campus, public from 2017-08-31; off campus, public from 2017-08-31