| Graduate student: | 陳立偉 Chen, Li-Wei |
|---|---|
| Thesis title: | 高速系統單晶片-Mercurius於嵌入式系統單晶片雛型驗證平台Concord II驗證及網路式路由器-AnyNoC之實現 Verification of High Performance SoC-Mercurius on Embedded MPSoC Rapid Prototyping Platform-Concord II and Implementation of Network-on-Chip Router-AnyNoC |
| Advisor: | 卿文龍 Chin, Wen-Long |
| Degree: | Master |
| Department: | College of Engineering - Department of Engineering Science |
| Year of publication: | 2014 |
| Academic year of graduation: | 102 |
| Language: | Chinese |
| Number of pages: | 103 |
| Keywords (translated from Chinese): | network-on-chip, on-chip communication, system-on-chip, flow control, packet switching, output queue |
| Keywords (English): | Network on Chip, interconnection networks, System-on-Chip, flow control, packet switching, Output Queue Buffering |
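Today's consumer electronics market clearly shows that communication, data, audio/video, and many other functions have been integrated into single devices. The smartphone is one example: it integrates a variety of chips to handle multimedia applications such as video recording, photography, pico-projection, and augmented reality (AR). Combined with the rapid development of 4G/WIFI/LTE wireless broadband technologies, the demand for data-transfer bandwidth is far stronger than it used to be.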
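Meanwhile, in telecommunications, multimedia entertainment, and handheld mobile devices, the continued advance of semiconductor technology has led many silicon intellectual property (SIP) vendors to develop multifunctional, high-performance, power-efficient IP such as DSP, GPU, and WIFI blocks. Because these modules are reusable and programmable, they greatly shorten the System-on-Chip (SoC) design schedule. SIP vendors nevertheless face a difficulty: whether their IP remains compatible across different environments. When system designers want to integrate various IP blocks quickly, defining a standard bus protocol is the most efficient approach, because it makes the vendors' IP portable.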
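An embedded system integrates software, hardware, and mobile applications, and at the lowest level of the system the bus handles communication among the IP blocks. To create a smooth user experience, an efficient bus is the key to SoC design. In summary, the important issues for today's buses are how to make IP blocks compatible with one another, how to transfer packets quickly, and how to partition the system sensibly into regions.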
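As SoCs integrate more and more IP, bus scalability and the high-performance requirements of particular IP blocks have become challenges for traditional buses. The Mercurius high-speed bus architecture, proposed earlier by a senior member of our laboratory, achieves transaction performance three times faster than a traditional bus. This work verifies it on a complete embedded system and further improves the functionality and performance of the earlier design, raising performance to five times that of a traditional bus under typical traffic. It also designs a router core for Network on Chip, called AnyNoC, which retains the low latency, high throughput, and scalability of the earlier work. Using virtual point-to-point packet transmission channels, the router receives data from all ingress ports simultaneously and immediately forwards all packets to the egress ports. This avoids bus contention when multiple devices access the bus at once and greatly increases the bandwidth between IP blocks. The router core uses shared-memory output-queue packet switching to make full use of the packet buffer. The output-queue architecture eliminates head-of-line (HOL) blocking and supports multiple outstanding transactions, so that system performance is optimized. The flow control module prevents packet buffer overflow while allocating system resources fairly, uses the packet buffer more effectively, and improves quality of service (QoS) on the SoC.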
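The head-of-line blocking effect mentioned above can be illustrated with a small cycle-count model. This is a minimal sketch and not the AnyNoC design: the function names, the fixed-priority arbitration, the unbounded output queues, and the rule that each port moves one packet per cycle are all assumptions made for illustration. Each packet is represented only by its destination egress port.

```python
from collections import deque

def input_queued_cycles(inputs):
    """Cycles needed when only the head of each input FIFO may be sent."""
    queues = [deque(q) for q in inputs]
    cycles = 0
    while any(queues):
        claimed = set()  # egress ports already granted in this cycle
        for q in queues:  # fixed-priority arbitration (an assumption)
            if q and q[0] not in claimed:
                claimed.add(q.popleft())
            # otherwise the head waits and blocks everything queued behind it
        cycles += 1
    return cycles

def output_queued_cycles(inputs):
    """Cycles needed when arriving packets go straight into per-output queues."""
    pending = [deque(q) for q in inputs]
    out_queues = {}  # hypothetical unbounded shared buffer, keyed by egress port
    cycles = 0
    while any(pending) or any(out_queues.values()):
        # every ingress port writes one packet into the shared buffer
        for q in pending:
            if q:
                dest = q.popleft()
                out_queues.setdefault(dest, deque()).append(dest)
        # every egress port sends one packet
        for oq in out_queues.values():
            if oq:
                oq.popleft()
        cycles += 1
    return cycles

# Two ingress ports; both heads target egress port 0.
traffic = [[0, 1], [0, 2]]
print(input_queued_cycles(traffic), output_queued_cycles(traffic))
```

In the input-queued model, the packet for egress port 2 waits behind a head packet that lost arbitration for port 0, even though port 2 is idle. In the output-queued model both arrivals are accepted in the same cycle, so the later packets are not held back.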
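In AnyNoC's connection scheme, adjacent router cores communicate through packet-based transfers, and suitable network interfaces (NI) support IP blocks that use various communication protocols (AHB, AXI, OCP). AnyNoC's multi-segment network architecture can be flexibly extended to a large number of IP blocks or buses, establishing a globally asynchronous locally synchronous (GALS) clocking system within a single chip. Each router core and each network interface has its own independent clock, and cores and devices communicate by handshaking. This greatly reduces the complexity of the network's clock tree and lowers overall clock-tree power consumption. The handshake-based asynchronous circuitry also avoids clock skew in large SoCs, and asynchronous data buffers prevent the metastable outputs that can arise when different clock domains communicate.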
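The abstract does not say how the asynchronous data buffers are built. One widely used technique for passing FIFO pointers between clock domains is Gray coding, in which consecutive pointer values differ in exactly one bit, so a pointer sampled mid-transition resolves to either the old or the new value. The sketch below assumes that technique purely for illustration; the function names and the 8-entry depth are hypothetical and are not taken from the thesis.

```python
def to_gray(n: int) -> int:
    """Convert a binary count to its Gray-code representation."""
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    """Recover the binary count from a Gray-code value."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

DEPTH = 8  # hypothetical FIFO depth (must be a power of two)

for ptr in range(DEPTH):
    nxt = (ptr + 1) % DEPTH
    # binary pointers may flip several bits at once; Gray pointers flip one
    print(ptr, bits_changed(ptr, nxt), bits_changed(to_gray(ptr), to_gray(nxt)))
```

With plain binary pointers, the wrap from 7 to 0 changes three bits at once, and a receiver in another clock domain could sample an arbitrary mixture of old and new bits. With Gray-coded pointers only one bit changes per step, including the wrap.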
In this paper, the performance of a packet-switched router is improved by means of a shared-memory output-queue technique. The enhanced router not only offers good scalability and high resource utilization, but also delivers superior communication performance owing to its virtual point-to-point characteristic. In the proposed router, a virtual point-to-point connection is set up at each dedicated port, yielding higher throughput without interrupting slower devices or causing congestion at any of the other ports. In addition, the head-of-line (HOL) blocking problem is resolved by using a flit buffer to construct an output-queued switch. In implementing the flit buffer, the buffer addresses are maintained by a dynamic linked list, and packet loss is avoided through flow control. The packet-switched router is implemented in two different environments. In the first case, the router is implemented in a real-time embedded system in place of the traditional shared bus (e.g., the AHB bus). In the second case, the proposed router is used to construct a network-on-chip (NoC). The performance of the proposed router, designated AnyNoC, is evaluated by comparing the overall system throughput of the constructed NoC with that obtained using a traditional packet-switched NoC. The results show that AnyNoC yields a significant reduction in network latency compared with a traditional router architecture.
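The linked-list buffer management described above can be sketched in software. The following is a minimal model under stated assumptions, not the thesis implementation: the class name `SharedFlitBuffer`, its methods, and the simple "buffer full" backpressure signal are hypothetical. It shows one shared pool of flit slots, a free list, and one linked list per output port, all threaded through a single next-pointer array.

```python
class SharedFlitBuffer:
    """Shared flit memory whose slots are chained into per-output linked lists."""

    def __init__(self, num_slots, num_outputs):
        if num_slots < 1:
            raise ValueError("num_slots must be at least 1")
        self.data = [None] * num_slots
        # next_ptr[i] is the slot that follows slot i in whichever list holds it;
        # initially every slot is chained into the free list
        self.next_ptr = list(range(1, num_slots)) + [None]
        self.free_head = 0
        self.head = [None] * num_outputs  # oldest flit of each output queue
        self.tail = [None] * num_outputs  # newest flit of each output queue

    def can_accept(self):
        """Flow-control signal: False tells the sender to hold its flit."""
        return self.free_head is not None

    def enqueue(self, output, flit):
        """Store a flit for the given output; refuse instead of dropping."""
        if not self.can_accept():
            return False
        slot = self.free_head
        self.free_head = self.next_ptr[slot]  # unlink slot from the free list
        self.data[slot] = flit
        self.next_ptr[slot] = None
        if self.head[output] is None:
            self.head[output] = slot
        else:
            self.next_ptr[self.tail[output]] = slot
        self.tail[output] = slot
        return True

    def dequeue(self, output):
        """Remove and return the oldest flit queued for the given output."""
        slot = self.head[output]
        if slot is None:
            return None
        flit = self.data[slot]
        self.head[output] = self.next_ptr[slot]
        if self.head[output] is None:
            self.tail[output] = None
        self.data[slot] = None
        self.next_ptr[slot] = self.free_head  # push slot back on the free list
        self.free_head = slot
        return flit
```

Because any free slot can serve any output queue, a busy output can use most of the memory while idle outputs use none, which is one way to read the abstract's claim of high resource utilization. When the pool is exhausted, `enqueue` returns `False` so the sender retries later and no flit is lost.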