簡易檢索 / 詳目顯示

研究生: 畢玉泉
Pi, Yu-Chuan
論文名稱: 於分散式運算平台上使用非同步階段推進技術之平行化細部繞線
Parallel Detailed Routing utilizing Asynchronous-Phase-Proceeding Technique on Distributed Computing Platform
指導教授: 何宗易
Ho, Tsung-Yi
學位類別: 碩士
Master
系所名稱: 電機資訊學院 - 資訊工程學系
Department of Computer Science and Information Engineering
論文出版年: 2012
畢業學年度: 100
語文別: 英文
論文頁數: 42
中文關鍵詞: 平行運算分散式運算非同步階段推進細部繞線
外文關鍵詞: parallel computing, distributed computing, asynchronous-phase-proceeding, detailed routing
相關次數: 點閱:161下載:3
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 隨著積體電路規模的擴大,電子設計自動化領域當中的問題複雜度也有爆炸性的成長。為了解決這些複雜的問題,已有許多最佳化演算法被提出,然而執行時間的減少幅度依然有其極限。近年來,隨著平行運算與分散式運算越來越普及,已有越來越多應用程式被修改或開發為適用於平行運算環境的版本,擁有一個有效率的平行化電子設計自動化演算法將對於處理複雜的問題有很大的幫助。常見的平行化演算法流程是將問題切割成數個階段,並分別對每個階段做平行化的處理。由於一個已平行化階段的完成是取決於最後一個完成的工作,所以提早將它們的工作完成的處理器將會處於閒置狀態,一直到這個階段完成為止。這種多階段平行化的方式會因為累積每個個別階段的閒置時間而浪費太多運算能力,對平行處理的效能有不良影響。在此篇論文中,我們以分散式運算環境為基礎,發展了一個稱為非同步階段推進的新技術。非同步階段推進可用來克服平行處理多階段演算法時造成的效能不彰,利用此技術並可節省程式的執行時間。為了顯示非同步階段推進技術的效能,我們將其應用在一個細部繞線問題上。此問題可說是在實體設計流程中最為複雜的問題之一。將細部繞線演算法進行平行化並不直覺,有許多因素將對效能有不良影響。我們採取了一個特別的分割方式對電路佈局進行分割,並可保留其平行處理的能力。實驗結果顯示,與現今最新的細部繞線程式相比,我們的作法可大幅提昇執行速度,縮短所需的時間。

    As the scale of integration advanced to exascale, the complexity of electronic design automation (EDA) problems are poised to rapidly grow. To solve these complex problems, several optimization algorithms are proposed. However, the improvement of execution time is still limited. Recently, as parallel computing and distributed computing become more ubiquitous, more and more applications are modified or developed to be adapted on parallel computing environment. Having an efficient parallel EDA algorithm will be of great value on dealing with complex problems. Conventional parallel algorithms and flows divide the problem into several phases and each phase is separately handled in parallel. Since the completion of a parallelized phase is dominated by the completion of the latest work, processors which complete their work earlier will be idle until the phase is completed. On accumulating the idling time of each individual phase, the multi-phase parallelization scheme wastes too much computing power. It has a detrimental effect on the performance of parallelization. In this thesis, we develop a novel infrastructure utilizing asynchronous-phase-proceeding (APP) technique based on a distributed computing environment. APP is able to overcome the inefficiency of parallelized multi-phase algorithms and conserve the execution time. To demonstrate the efficiency of APP, we apply it on the detailed routing problem that is one of the most complex problem in physical design flow. Parallelizing a detailed routing algorithm well is not intuitive. There are several factors which harm the performance. We take a special mechanism to partition the layout region, and keep the ability of parallelization. Compared with the state-of-the-art detailed router, our experimental results show a great speed-up on execution time.

    摘要 i Abstract iii Table of Contents v List of Tables vii List of Figures viii 1 Introduction 1 2 Preliminaries 7 2.1 Problem Formulation 7 2.2 Previous Work 9 2.3 Distributed Computing and Gearman 10 3 Methodology 13 3.1 Overview 13 3.2 Partitioning Mechanism 15 3.3 Viaduct Generation 18 3.4 Segment Allocation 20 3.5 Asynchronous-Phase-Proceeding (APP) 22 3.6 Improvement of APP by Task Scheduling 25 4 Experimental Results 29 5 Conclusion 36 Bibliography 37

    [1] “Condor–high throughput computing,” http://www.cs.wisc.edu/condor/.
    [2] “Gearman,” http://gearman.org/.
    [3] “IBM-Place 1.0 benchmark suites,” http://er.cs.ucla.edu/benchmarks/ibm-place/.
    [4] G. M. Amdahl, “Validity of the single processor approach to achieving large scale computing capabilities,” in Proceedings of Spring Joint Computer Conference, pp. 483–485, 1967.
    [5] M. A. Baker, G. C. Fox, and H. W. Yau, “Cluster computing review,” Northeast Parallel Architectures Center, Tech. Rep., Nov. 1995.
    [6] S. Batterywala, N. Shenoy, W. Nicholls, and H. Zhou, “Track assignment: A desirable intermediate step between global routing and detailed routing,” in Proceedings of IEEE International Conference on Computer-Aided Design, pp. 59–66, 2002.
    [7] Y. Chang, Y. Lee, and T. Wang, “NTHU-Route 2.0: A fast and stable global router,” in Proceedings of IEEE International Conference on Computer-Aided Design, pp. 338–343, 2008.
    [8] H. Chen, C. Hsu, and Y. Chang, “High-performance global routing with fast overflow reduction,” in Proceedings of IEEE Asia and South Pacific Design Automation Conference, pp. 582–587, 2009.
    [9] M. Cho, K. Lu, K. Yuan, and D. Pan, “Boxrouter 2.0: Architecture and implementation of a hybrid and robust global router,” in Proceedings of IEEE International Conference on Computer-Aided Design, pp. 503–508, 2007.
    [10] M. Cho and D. Pan, “Boxrouter: A new global router based on box expansion and progressive ilp,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 12, pp. 2130–2143, 2007.
    [11] J. Dean and S. Ghemawat, “MapReduce: Simplified data processing on large clusters,” Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.
    [12] M. Edahiro, “Parallelizing fundamental algorithms such as sorting on multi-core processors for EDA acceleration,” in Proceedings of IEEE Asia and South Pacific Design Automation Conference, pp. 230–233, 2009.
    [13] H. El-Rewini and T. Lewis, “Scheduling parallel program tasks onto arbitrary target machines,” Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138–153, 1990.
    [14] I. Foster and C. Kesselman, Eds., The grid: Blueprint for a new computing infrastructure. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1999.
    [15] J. Gao, P. Wu, and T. Wang, “A new global router for modern designs,” in Proceedings of IEEE Asia and South Pacific Design Automation Conference, pp. 232–237, 2008.
    [16] Y. Han, D. Ancajas, K. Chakraborty, and S. Roy, “Exploring high throughput computing paradigm for global routing,” in Proceedings of IEEE International Conference on Computer-Aided Design, pp. 298–305, 2011.
    [17] K. Hsu, S. Sinha, Y. Pi, C. Chiang, and T. Ho, “A distributed algorithm for layout geometry operations,” in Proceedings of IEEE Design Automation Conference, pp. 182–187, 2011.
    [18] M. Iverson, F. Özgüner, O. Gregory, and G. Follen, “Parallelizing existing applications in a distributed heterogeneous environment,” in Heterogeneous Computing Workshop, pp. 93–100, 1995.
    [19] Y. Kang, “Improving scheduling of tasks using delay adjust in a heterogeneous environment,” in Proceedings of IEEE International Conference on Biomedical Engineering and Informatics, vol. 7, pp. 2790–2794, 2010.
    [20] W. Liu, W. Kao, Y. Li, and K. Chao, “Multi-threaded collision-aware global routing with bounded-length maze routing,” in Proceedings of IEEE Design Automation Conference, pp. 200–205, 2010.
    [21] Y. Lu, H. Zhou, L. Shang, and X. Zeng, “Multicore parallel min-cost flow algorithm for CAD applications,” in Proceedings of IEEE Design Automation Conference, pp. 832–837, 2009.
    [22] M. Moffitt, “Maizerouter: Engineering an effective global router,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 11, pp. 2017–2026, 2008.
    [23] M. Ozdal and M. Wong, “Archer: A history-based global routing algorithm,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 4, pp. 528–540, 2009.
    [24] M. Pan and C. Chu, “FastRoute: A step to integrate global routing into placement,” in Proceedings of IEEE International Conference on Computer-Aided Design, pp. 464–471, 2006.
    [25] M. Pan and C. Chu, “FastRoute 2.0: A high-quality and efficient global router,” in Proceedings of IEEE Asia and South Pacific Design Automation Conference, pp. 250–255, 2007.
    [26] J. Polo, D. Carrera, Y. Becerra, M. Steinder, and I. Whalley, “Performance-driven task co-scheduling for MapReduce environments,” in IEEE Network Operations and Management Symposium, pp. 373–380, Apr. 2010.
    [27] J. Roy and I. Markov, “High-performance routing at the nanometer scale,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 6, pp. 1066–1077, 2008.
    [28] S. Sapatnekar, E. Haritan, K. Keutzer, A. Devgan, D. Kirkpatrick, S. Meier, D. Pryor, and T. Spyrou, “Reinventing EDA with manycore processors,” in Proceedings of IEEE Design Automation Conference, pp. 126–127, 2008.
    [29] H. Topcuoglu, S. Hariri, and M. Wu, “Performance-effective and low-complexity task scheduling for heterogeneous computing,” IEEE Transactions on Parallel and Distributed Systems, vol. 13, no. 3, pp. 260–274, Mar. 2002.
    [30] J. Ullman, “NP-complete scheduling problems,” Journal of Computer and System Sciences, vol. 10, no. 3, pp. 384–393, 1975.
    [31] T. Wu, A. Davoodi, and J. Linderoth, “A parallel integer programming approach to global routing,” in Proceedings of IEEE Design Automation Conference, pp. 194–199, 2010.
    [32] T. Wu, A. Davoodi, and J. Linderoth, “GRIP: Global routing via integer programming,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 1, pp. 72–84, 2011.
    [33] T. Wu, L. Xie, and A. Davoodi, “A parallel and randomized algorithm for large-scale discrete Dual-Vt assignment and continuous gate sizing,” in Proceeding of IEEE International Symposium on Low Power Electronics and Design, pp. 45–50, 2008.
    [34] Y. Xu, Y. Zhang, and C. Chu, “FastRoute 4.0: Global router with efficient via minimization,” in Proceedings of IEEE Asia and South Pacific Design Automation Conference, pp. 576–581, 2009.
    [35] Y. Zhang and C. Chu, “RegularRoute: An efficient detailed router with regular routing patterns,” in Proceedings of IEEE International Symposium on Physical Design, pp. 45–52, 2011.
    [36] Y. Zhang, Y. Xu, and C. Chu, “FastRoute 3.0: A fast and high quality global router based on virtual capacity,” in Proceedings of IEEE International Conference on Computer-Aided Design, pp. 344–349, 2008.

    下載圖示 校內:2013-02-16公開
    校外:2013-02-16公開
    QR CODE