| Graduate student: | 劉惟成 Liu, Wei-Cheng |
|---|---|
| Thesis title: | Codepecker:一個用於偵測多核心平行程式的異常執行緒之除錯工具 Codepecker: A Debugging Tool for Detecting Abnormal Threads of Parallel Programs on Shared Memory Multi-core Systems |
| Advisor: | 謝錫堃 Shieh, Ce-Kuen |
| Co-advisor: | 黃祖基 Huang, Tzu-Chi |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2010 |
| Academic year of graduation: | 98 (ROC calendar) |
| Language: | English |
| Pages: | 44 |
| Keywords (Chinese): | 異常偵測、多核心、多執行緒 |
| Keywords (English): | Anomaly detection, multi-core, multithreaded |
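Multi-core processors are developing rapidly and offer high computing power and low energy consumption, so nearly all computers on the market today adopt a multi-core architecture. As a result, programmers increasingly need to write parallel programs, and those programs use ever more threads. Most parallel programs use a large number of threads that all exhibit the same behavior. When a traditional debugging tool is used on such a program, however, the programmer must inspect the threads one by one to find the abnormal thread that causes the error, which is both tedious and time-consuming.
This thesis proposes Codepecker, a debugging tool that automatically detects threads with abnormal behavior in parallel programs on multi-core systems. Codepecker detects abnormal threads by comparing the similarities and differences in execution behavior among threads. It also offers several advantages, such as high transparency, good portability, and low system overhead. Users can quickly detect the abnormal threads of a parallel program with Codepecker and then use a debugger to find the root cause of the error, which substantially reduces debugging cost.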
Multi-core processors have become popular in the computer market because of their high performance and low power consumption, and programmers increasingly develop multithreaded parallel programs for a variety of applications. However, locating abnormal threads with traditional debugging tools is a laborious, complicated procedure that costs considerable time.
This thesis proposes Codepecker, a tool that automatically locates abnormal threads in a parallel program whose threads perform similar activities on shared memory multi-core systems. Codepecker locates abnormal threads by identifying threads that behave differently from the others, and it offers several advantages, including high transparency, high portability, and low overhead. Programmers can use Codepecker to locate abnormal threads rapidly and then use a debugger to determine the root cause of the bug.
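The idea of flagging threads that behave differently from their peers can be illustrated with a minimal sketch. The abstract does not say how Codepecker records or compares thread behavior, so everything below is an assumption made for illustration: the per-thread list of called function names, the median-based "typical" profile, the normalised L1 distance, and the names `find_abnormal_threads` and `threshold` are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from statistics import median


def find_abnormal_threads(traces, threshold=0.5):
    """Return the sorted IDs of threads whose call profile is far from typical.

    `traces` maps a thread ID to the list of function names that thread
    called. This input format is a hypothetical stand-in for whatever
    behavior data a real tool would collect.
    """
    # Summarise each thread as a count of calls per function.
    profiles = {tid: Counter(calls) for tid, calls in traces.items()}
    functions = sorted({name for p in profiles.values() for name in p})

    # Assumed notion of "typical": the per-function median across threads.
    typical = {f: median(p[f] for p in profiles.values()) for f in functions}
    scale = sum(typical.values()) or 1  # avoid division by zero

    abnormal = []
    for tid, profile in profiles.items():
        # L1 distance between this thread's profile and the typical one.
        distance = sum(abs(profile[f] - typical[f]) for f in functions)
        if distance / scale > threshold:
            abnormal.append(tid)
    return sorted(abnormal)


# Three workers behave alike; thread 3 never unlocks and spins instead.
example_traces = {
    0: ["lock", "work", "unlock"] * 10,
    1: ["lock", "work", "unlock"] * 10,
    2: ["lock", "work", "unlock"] * 10,
    3: ["lock", "work"] * 10 + ["spin"] * 50,
}
print(find_abnormal_threads(example_traces))  # [3]
```

The median is used here because a single misbehaving thread does not shift it, which matches the abstract's premise that most threads perform similar activities. A real tool would still hand the flagged thread to a debugger to find the root cause.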