Author: Liu, Wei-Cheng (劉惟成)
Thesis title: Codepecker: A Debugging Tool for Detecting Abnormal Threads of Parallel Programs on Shared Memory Multi-core Systems
Thesis title (Chinese): Codepecker:一個用於偵測多核心平行程式的異常執行緒之除錯工具
Advisor: Shieh, Ce-Kuen (謝錫堃)
Co-advisor: Huang, Tzu-Chi (黃祖基)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering
Year of publication: 2010
Academic year of graduation: 98 (ROC calendar, i.e. 2009-2010)
Language: English
Number of pages: 44
Keywords (Chinese): 異常偵測, 多核心, 多執行緒
Keywords (English): Anomaly detection, multi-core, multithreaded
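  • Multi-core processors are advancing rapidly and offer high computing power and low energy consumption, so nearly all computers on the market now adopt a multi-core architecture. As a result, the demand for writing parallel programs keeps growing, and parallel programs use more and more threads. Most parallel programs use a large number of threads that all behave in the same way. However, debugging such a program with traditional debugging tools means inspecting the threads one by one to find the abnormal thread that causes the error, which is both tedious and very time-consuming.
    This thesis proposes Codepecker, a debugging tool that automatically detects threads with abnormal behavior in parallel programs on multi-core systems. Codepecker detects abnormal threads by comparing the execution behavior of the threads against one another. Codepecker also offers several advantages, such as high transparency, good portability, and low system overhead. Users can use Codepecker to quickly detect the abnormal threads in a parallel program and then use a debugger to find the root cause of the error, which saves considerable debugging cost.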

    Because multi-core processors, with their high performance and low power consumption, have become popular in the computer market, programmers have begun to develop multithreaded parallel programs for various applications. However, using traditional debugging tools to locate abnormal threads is a laborious, complicated procedure that costs considerable time.
    In this thesis, Codepecker is proposed to automatically locate the abnormal threads in a parallel program that contains multiple threads performing similar activities on shared memory multi-core systems. Codepecker locates abnormal threads by identifying threads that behave differently from the others, and it offers several advantages, such as high transparency, high portability, and low overhead. Programmers can use Codepecker to locate abnormal threads rapidly and then use a debugger to find the root cause of the bug.
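    The idea of flagging threads that behave differently from the others can be illustrated with a minimal sketch. This is not Codepecker's implementation, which the abstract does not describe in detail. The function names, the sample traces, and the 3x-median threshold below are all hypothetical choices made for illustration, assuming each thread's recorded events and run time are already available as plain Python data.

```python
from collections import Counter
from statistics import median


def abnormal_by_order(traces):
    """Return ids of threads whose event sequence differs from the most common one."""
    counts = Counter(tuple(seq) for seq in traces.values())
    majority, _ = counts.most_common(1)[0]
    return sorted(tid for tid, seq in traces.items() if tuple(seq) != majority)


def abnormal_by_time(times, factor=3.0):
    """Return ids of threads whose run time exceeds factor x the median run time.

    The factor is an arbitrary illustrative threshold, not a value from the thesis.
    """
    mid = median(times.values())
    return sorted(tid for tid, t in times.items() if t > factor * mid)


# Hypothetical per-thread recordings: thread id -> ordered list of events.
traces = {
    0: ["init", "lock", "update", "unlock"],
    1: ["init", "lock", "update", "unlock"],
    2: ["init", "update", "lock", "unlock"],  # updates before taking the lock
    3: ["init", "lock", "update", "unlock"],
}
# Hypothetical per-thread execution times in seconds.
times = {0: 1.02, 1: 0.98, 2: 1.01, 3: 4.50}

print(abnormal_by_order(traces))  # [2]
print(abnormal_by_time(times))    # [3]
```

    The sketch relies on the same assumption the abstract states: the threads perform similar activities, so the majority behavior can serve as the reference and no separate specification of correct behavior is needed.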

    Abstract (Chinese)
    Abstract
    Content
    Figures
    Tables
    Chapter 1 Introduction
    Chapter 2 Related Works
      2.1. Detecting a Specific Concurrency Bug
      2.2. Visualizing and Handling the Enormous Information
      2.3. Anomaly Detecting System
        2.3.1. Dickinson et al.
        2.3.2. Yuan et al.
        2.3.3. Mirgorodskiy et al.
        2.3.4. DMTracker
        2.3.5. Lan et al.
    Chapter 3 System Design
      3.1. Codepecker Overview
      3.2. Determination Factors
      3.3. Codepecker Components
    Chapter 4 Implementation
      4.1. System Architecture
      4.2. Implementation of Synchronization Point Monitor
      4.3. Implementation of Execution Time Recorder
      4.4. Implementation of Synchronization Point Replayer
      4.5. Implementation of Function Execution Recorder
      4.6. Implementation of Variable Access Recorder
      4.7. Implementation of Runtime Information Analyzer
      4.8. Implementation of Codepecker Console
    Chapter 5 Performance
      5.1. Codepecker Overhead Breakdowns
      5.2. Demonstration of Location Bug with Codepecker
        5.2.1 Detecting Function Execution Order Abnormality
        5.2.2 Detecting Variable Access Order Abnormality
        5.2.3 Detecting Lock Access Order Abnormality
        5.2.4 Detecting Variable Value Variability Abnormality
        5.2.5 Detecting Thread Execution Time Abnormality
      5.3. Performance Impact
    Chapter 6 Conclusion
    Reference

    [1] D. Atanasov, "General purpose GPU programming," 2005.
    [2] NVIDIA. Available: http://www.nvidia.com/page/home.html
    [3] ATI. Available: http://www.amd.com/tw/Pages/AMDHomePage.aspx
    [4] NVIDIA, "NVIDIA CUDA Programming Guide 2.0," 2008. Available: http://developer.download.nvidia.com/compute/cuda/2_0/docs
    [5] R. Chandra, Parallel programming in OpenMP: Morgan Kaufmann, 2001.
    [6] V. Adve, et al., "Requirements for data-parallel programming environments," IEEE Parallel & Distributed Technology: Systems & Technology, vol. 2, p. 58, 1994.
    [7] B. Massingill, et al., "SIMD: an additional pattern for PLPP (pattern language for parallel programming)," 2007, pp. 1-15.
    [8] L. Chew and D. Lie, "Kivati: Fast Detection and Prevention of Atomicity Violations," 2010.
    [9] J.-D. Choi, et al., "Efficient and precise datarace detection for multithreaded object-oriented programs," presented at the Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation, Berlin, Germany, 2002.
    [10] D. Engler and K. Ashcraft, "RacerX: effective, static detection of race conditions and deadlocks," 2003, p. 252.
    [11] B.-C. Kim, et al., "Visualizing Potential Deadlocks in Multithreaded Programs," presented at the Proceedings of the 10th International Conference on Parallel Computing Technologies, Novosibirsk, Russia, 2009.
    [12] S. Lu, et al., "AVIO: Detecting Atomicity Violations via Access-Interleaving Invariants," Micro, IEEE, vol. 27, pp. 26-35, 2007.
    [13] Y. Yu, et al., "Racetrack: efficient detection of data race conditions via adaptive tracking," 2005, p. 234.
    [14] Z. Letko, et al., "AtomRace: data race and atomicity violation detector and healer," presented at the Proceedings of the 6th workshop on Parallel and distributed systems: testing, analysis, and debugging, Seattle, Washington, 2008.
    [15] D. C. Arnold, et al., "Stack Trace Analysis for Large Scale Debugging," in Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International, 2007, pp. 1-10.
    [16] C. Chen, "The parallel debugging architecture in the intel debugger," Parallel Computing Technologies, pp. 444-451, 2003.
    [17] M. Geimer, et al., "A scalable tool architecture for diagnosing wait states in massively parallel applications," Parallel Computing, vol. 35, pp. 375-388, 2009.
    [18] J. Labarta, et al., "Scalability of visualization and tracing tools," pp. 869-876.
    [19] A. Nataraj, et al., "A framework for scalable, parallel performance monitoring using tau and mrnet," Under submission, 2008.
    [20] M. Noeth, et al., "ScalaTrace: Scalable compression and replay of communication traces for high-performance computing," Journal of Parallel and Distributed Computing, vol. 69, pp. 696-710, 2009.
    [21] P. Roth and B. Miller, "On-line automated performance diagnosis on thousands of processes," 2006, p. 80.
    [22] A. Mirgorodskiy, et al., "Problem diagnosis in large-scale computing environments," 2006, pp. 11-11.
    [23] P. Barham, et al., "Using Magpie for request extraction and workload modelling," 2004, pp. 259-272.
    [24] M. Chen, et al., "Pinpoint: Problem determination in large, dynamic internet services," 2002.
    [25] W. Dickinson, et al., "Finding failures by cluster analysis of execution profiles," 2001, pp. 339-348.
    [26] Q. Gao, et al., "Dmtracker: finding bugs in large-scale parallel programs by detecting anomaly in data movements," SC, vol. 7, pp. 1-12, 2007.
    [27] Z. Lan, et al., "Toward automated anomaly identification in large-scale systems," IEEE Transactions on Parallel and Distributed Systems, 2009.
    [28] C. Yuan, et al., "Automated known problem diagnosis with event traces," ACM SIGOPS Operating Systems Review, vol. 40, p. 388, 2006.
    [29] G. Florez, et al., "Detecting anomalies in high-performance parallel programs," 2004.
    [30] R. Agarwal and S. Stoller, "Run-time detection of potential deadlocks for programs with locks, semaphores, and condition variables," 2006, p. 60.
    [31] S. Bensalem and K. Havelund, "Dynamic deadlock analysis of multi-threaded programs," Hardware and Software, Verification and Testing, pp. 208-223, 2006.
    [32] S. Park, et al., "CTrigger: exposing atomicity violation bugs from their hiding places," ACM SIGPLAN Notices, vol. 44, pp. 25-36, 2009.
    [33] M. Xu, et al., "A serializability violation detector for shared-memory server programs," 2005, p. 14.
    [34] L. Lamport, "Time, clocks, and the ordering of events in a distributed system," Communications of the ACM, vol. 21, pp. 558-565, 1978.

    Full text availability: on campus, public from 2013-07-05
    off campus, public from 2013-07-05