
Graduate student: Wang, Ying-Wei (王應瑋)
Thesis title: RaceWeir: Hardware-Assisted Data Race Detection in Multi-core System (RaceWeir: 用於多核心平台系統之硬體輔助資料競爭偵測機制)
Advisor: Chen, Chung-Ho (陳中和)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2014
Academic year of graduation: 102
Language: English
Number of pages: 61
Keywords: cache coherence, data race detection, multi-core system, multi-threaded program debugging

    As multicore platforms become more widespread, programmers increasingly adopt multithreaded programming to improve performance. However, multithreaded applications are prone to synchronization bugs such as data races. Unfortunately, existing software race detectors are slow, while published hardware-assisted race detectors are imprecise in reporting data races.
    In this thesis, we propose two hardware-based data race detection mechanisms to catch synchronization bugs in a multicore system. We first develop a hardware implementation of a hybrid data race detection mechanism based on the lockset and happens-before algorithms. We find that the way this hybrid mechanism handles barriers adversely affects the detection capability of the happens-before algorithm. We therefore remove most of the lockset storage and logic, keeping only the lock state and the access ID (the ID of the last accessing thread) for each variable. On this basis, we propose the improved happens-before detection mechanism, which our experiments show to be a precise data race detector with a lower cache area requirement. To keep detection precise, we also develop a backup and restore technique that prevents cached metadata from being lost on cache eviction.
    We evaluate the two proposed detection mechanisms on ten SPLASH2 benchmarks. The simulations reveal existing races in the SPLASH2 benchmarks and show that the backup and restore technique significantly enhances detection capability. With the improved happens-before detection mechanism, execution takes about 1.22x to 2.55x the normal execution time.
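
    The happens-before algorithm named in the abstract is commonly realized in software with per-thread vector clocks. The following is a minimal software sketch of that general idea, not the RaceWeir hardware mechanism: the class name, method names, and data layout are hypothetical, and the thesis's lock state, access ID, barrier handling, and metadata backup are not modeled.

```python
def happens_before(a, b):
    """True if vector clock a is component-wise <= vector clock b."""
    return all(x <= y for x, y in zip(a, b))


class HappensBeforeDetector:
    """Illustrative vector-clock race detector (hypothetical names)."""

    def __init__(self, n_threads):
        # One vector clock per thread; each thread starts at 1 in its own slot.
        self.clocks = [[0] * n_threads for _ in range(n_threads)]
        for t in range(n_threads):
            self.clocks[t][t] = 1
        self.lock_clocks = {}  # lock -> clock saved at the last release
        self.last_write = {}   # addr -> (thread id, clock snapshot)
        self.last_reads = {}   # addr -> {thread id: clock snapshot}
        self.races = []        # (addr, earlier thread, later thread, kind)

    def acquire(self, tid, lock):
        # Acquiring a lock orders this thread after the last releaser.
        released = self.lock_clocks.get(lock)
        if released is not None:
            self.clocks[tid] = [max(a, b)
                                for a, b in zip(self.clocks[tid], released)]

    def release(self, tid, lock):
        # Publish the current clock, then start a new epoch for this thread.
        self.lock_clocks[lock] = list(self.clocks[tid])
        self.clocks[tid][tid] += 1

    def _check(self, tid, addr, prev_tid, prev_clock, kind):
        # Two accesses race if they come from different threads and the
        # earlier one is not ordered before the current one.
        if prev_tid != tid and not happens_before(prev_clock, self.clocks[tid]):
            self.races.append((addr, prev_tid, tid, kind))

    def read(self, tid, addr):
        if addr in self.last_write:
            w_tid, w_clock = self.last_write[addr]
            self._check(tid, addr, w_tid, w_clock, "write-read")
        self.last_reads.setdefault(addr, {})[tid] = list(self.clocks[tid])

    def write(self, tid, addr):
        if addr in self.last_write:
            w_tid, w_clock = self.last_write[addr]
            self._check(tid, addr, w_tid, w_clock, "write-write")
        for r_tid, r_clock in self.last_reads.get(addr, {}).items():
            self._check(tid, addr, r_tid, r_clock, "read-write")
        self.last_write[addr] = (tid, list(self.clocks[tid]))
        self.last_reads[addr] = {}


demo = HappensBeforeDetector(2)
# "x" is protected by lock "L" in both threads, so no race is reported.
demo.acquire(0, "L"); demo.write(0, "x"); demo.release(0, "L")
demo.acquire(1, "L"); demo.write(1, "x"); demo.release(1, "L")
# "y" is written by both threads without synchronization.
demo.write(0, "y"); demo.write(1, "y")
print(demo.races)  # prints [('y', 0, 1, 'write-write')]
```

    In this sketch the per-variable metadata is a full clock snapshot for each recorded access. Keeping that storage small is the kind of cost the abstract addresses by retaining only a lock state and an access ID per variable.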

    Chapter 1 Introduction
        1.1 Motivation
        1.2 Contribution
        1.3 Organization of the Thesis
    Chapter 2 Background
        2.1 Lockset Data Race Detection Algorithm
        2.2 Happens-Before Data Race Detection Algorithm
        2.3 Hybrid Data Race Detection Algorithm
    Chapter 3 Related Work
        3.1 Overheads of Data Race Detection
        3.2 Software-Based Detecting Mechanisms
        3.3 Hardware-Based Detecting Mechanisms
        3.4 Cache Coherency
        3.5 Summary
    Chapter 4 System Framework
        4.1 Debug Coprocessor Micro-Architecture
        4.2 Detection Mechanisms
            4.2.1 Improved Happens-Before Detection Mechanism
        4.3 Data Cache Architecture
        4.4 Metadata Communication
        4.5 The Debug Coprocessor Software API
    Chapter 5 Debug Mode Operation
        5.1 Metadata Backup and Restoration
            5.1.1 Maintaining Vector Clock Status
            5.1.2 Metadata Backup Mechanism
            5.1.3 Metadata Restore Mechanism
        5.2 Using AID to Avoid Inaccurate Lstate State Transition
        5.3 Core Vector Clock Rollovers Issue
        5.4 Handling Barrier
        5.5 Operation of Debugging Mode
    Chapter 6 Experimental Result
        6.1 Experimental Setup
        6.2 Demonstration of Data Race Detection
        6.3 Effectiveness of Finding Existing Races
        6.4 Impact of Data Cache Eviction
        6.5 Performance Overhead
            6.5.1 Extra Bus Traffic Overhead
            6.5.2 Coprocessor Detection Overhead
            6.5.3 Backup and Restore Mechanism Overhead
            6.5.4 Performance Overhead of Write Back Policy
        6.6 Discussion
    Chapter 7 Conclusions
    References


    Full text availability: on campus, public from 2016-08-18
    Off campus, public from 2016-08-18