簡易檢索 / 詳目顯示

研究生: 陶志行
Tau, Chih-Shing
論文名稱: 磁碟陣列系統之容錯演算
Towards Fault-Tolerance Intuitively in the Disk Array Systems
指導教授: 王宗一
Wang, Tzone-I
學位類別: 博士
Doctor
系所名稱: 工學院 - 工程科學系
Department of Engineering Science
論文出版年: 2006
畢業學年度: 94
語文別: 英文
論文頁數: 129
中文關鍵詞: 同位配置演算法磁碟陣列容錯
外文關鍵詞: fault tolerance, parity placement, RAID, disk array, algorithm
相關次數: 點閱:199下載:2
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  •   磁碟陣列系統為了要達到高可信度,一種可以同位配置(parity placement)為基礎能有效且直覺的磁碟容錯方法是有必要的。這篇論文提出一個既簡單又方便的磁碟回復演算法(或解碼方法),處理磁碟陣列系統中經常發生的錯誤問題。此演算法僅使用同位(parity)及互斥(exclusive-or)的二進位運算,能比其他方法須使用有限場(finite field)運算更快速地回復錯誤。這篇論文首先介紹此演算法將原為資料/同位配置的問題轉換成二進位線性系統問題加以解決的原理;其次將此演算法套用至已知能解決二個磁碟錯誤的方法-奇偶編碼(EVENODD)法,展示其如何運作過程。同時也套用在新提出能解決三個磁碟錯誤的方法-HDD1及HDD2(Horizontal and Dual Diagonal水平及雙對角線同位)。除此之外,由前述討論推論出能更有效率地應用到任何已知以同位配置方法之結果。最後,此論文提出另一種純粹以互斥運算為基礎能回復雙磁碟錯誤的解碼演算法-列與斜式同位(Row-Oblique Parity (ROP))法,此演算法可證明其在同位建立(construction)及錯誤重建(reconstruction)過程的運算有最佳的複雜度,而且在額外同位資訊儲存與存取方面亦有最好的性能。ROP架構的簡單性讓我們能在既有的RAID架構上便可以實現,其解碼演算法既簡單直覺且部份解碼步驟可以平行處理執行之,使得磁碟錯誤可以更快速地恢復。與其他RAID磁碟陣列系統的容錯方法相比較,此篇論文所提出的容錯演算比較簡單且更有效率,它們可以軟體實現之,而且可以在不修改現有硬體架構下以硬體實現之。

     In order to achieve high reliability in disk array systems, an efficient and intuitive decoding method for tolerating disk failures based on parity placement scheme needs to be explored. This dissertation proposes a simple and convenient recovery algorithm (or decoding method) to deal with the faulty disks problem (given k, generally less than or equal to 3) occurring frequently in disk array system (given the number of disks N). It is based on modulo 2 arithmetic, parity, and exclusive-OR operations to make the recovery speed faster than other schemes that require computation over finite fields. To begin with, the data/parity placement problem is transformed into the problem of constructing k(N – 1) linear equations so that the problem can be solved immediately. Then, it illustrates how this method works through a known scheme, EVENODD, which tolerates double disk failure in disk array systems, and how it also works through HDD1 (Horizontal and Dual Diagonal) and HDD2, new schemes introduced here to improve triple parity placement schemes to enhance the reliability of a disk array system. Moreover, it is inferred that the proposed method may be applied to any known parity placement scheme and perform even more efficiently. Finally, this dissertation proposes an XOR-based decoding algorithm, Row-Oblique Parity (ROP), for protecting against double disk failure in disk array systems. ROP is provably optimal in computational complexity, both during construction and reconstruction. It is optimal in the amount of redundant information stored and accessed. The simplicity of ROP allowed us to implement it within the current available RAID framework. The decoding algorithms this dissertation proposes here are rather simple and some steps of its decoding procedure can be executed in parallel that makes faster the disk failures recovery. In comparison with other schemes in RAID systems, these decoding methods are simpler and more efficient because it can be implemented by current available software technology. Moreover, most of them do not even need to modify the hardware during the implementation.

    摘要                               i Abstract                             ii Acknowledgements                         iii Contens                              iv List of Tables                           vii List of Figures                           viii Chapter 1  Introduction                      1     1.1 Historical Overview                   3     1.2 RAID Survey                      7      1.2.1 What does RAID stand for?              8      1.2.2 Basic concepts                    9      1.2.3 The different RAID levels               12      1.2.4 RAID implementation                15      1.2.5 RAID and fault tolerance                19     1.3 Related Works                     20     1.4 Dissertation Outline                    22 Chapter 2  Preliminaries                      24     2.1 Metrics for Redundancy in Disk Arrays            24     2.2 Parity in t Dimensions                   25     2.3 Linear Codes                      27     2.4 Implementing Reconstruction               29 Chapter 3  An Alternative Decoding Algorithm for EVENODD Code in Disk Array Systems                            32     3.1 Introduction                      32     3.2 The EVENODD Scheme                 33      3.2.1 The encoding procedure                33      3.2.2 The decoding procedure                35      3.2.3 Algebraic description of EVENODD scheme       37      3.2.4 Efficiency in small writes               38      3.2.5 Parity stripe unit ratio                 39     3.3 The Alternative Decoding Algorithm of EVENODD Scheme   39      3.3.1 The new decoding method              40     3.4 Simulate Results and Analysis              44      3.4.1 Decoding efficiency                 44     3.5 Summary                       48 Chapter 4  Parity Placement Schemes for Tolerating Triple Column Disk Failure in Disk Array System                        50     4.1 Introduction                     50     4.2 Related Works                    52     4.3 HDD1 Scheme                    54      4.3.1 The encoding method of HDD1 scheme         54      4.3.2 The decoding method of HDD1 scheme         56      4.3.3 Correctness of HDD1’s decoding method        61     4.4 HDD2 Scheme                    62      4.4.1 The encoding method of HDD2 scheme         62      4.4.2 The decoding method of HDD2 scheme         64      4.4.3 Correctness of HDD2’s decoding method        67     4.5 Performance Analysis                  67      4.5.1 Encoding efficiency analysis              67      4.5.2 Decoding efficiency analysis              69      4.5.3 Efficiency analysis in small writes           72      4.5.4 Parity stripe unit ratio analysis             72      4.5.5 Workload of the HDD schemes            73     4.6 Summary                        74 Chapter 5  Efficient Fault Tolerant Method in Disk Array Systems 76     5.1 Introduction                      76     5.2 The Recovering Algorithms               76      5.2.1 Decoding method                 77     5.3 Illustrative Parity Placement Schemes           80      5.3.1 EVENODD scheme                 80      5.3.2 DH scheme                    87     5.4 Performance Analysis                  93     5.5 Summary                      95 Chapter 6  Independent Row-Oblique Parity for Double Disk Failure Correction 97     6.1 Introduction                     97     6.2 Row-Oblique Parity Algorithm              98      6.2.1 Construction algorithm               99      6.2.2 Reconstruction algorithm              102      6.2.3 Correctness of the reconstruction algorithm      109     6.3 Performance Analysis                 113      6.3.1 Construction efficiencies analysis           113      6.3.2 Reconstruction efficiencies analysis           116      6.3.3 Efficiencies analysis in small writes          119      6.3.4 Parity block ratio analysis              119     6.4 Summary                      120 Chapter 7  Conclusion and Future Work               121 Bibliography                          123 自述                              129

    [1]  G.A. Alvarez, W.A. Burkhard, and F. Cristian, “Tolerating Multiple
       Failures in RAID Architectures with Optimal Storage and Uniform
       Declustering,” Proceedings of the 24th Annual International Symposium
       on Computer Architecture, pp.62-72, Denver, Colorado, Jun. 1997.
    [2]  G.A. Alvarez, W.A. Burkhard, L.J. Stockmeyer, F. Cristian, “Declustered
       Disk Array Architectures with Optimal and Near-Optimal Parallelism,”
       Proceedings of the 25th Annual International Symposium on Computer
       Architecture, Issue 6, pp.109-120, Mar. 1998.
    [3]  D. Anderson, J. Dykes, and E. Riedel, “More than an interface – SCSI
       vs. ATA,” 2nd USENIX Conference on File and Storage Technologies, pp.
       245-257, San Francisco, CA, Mar. 2003.
    [4]  J.L. Bare and T.F. Chen, “An Effective On-Chip Preloading Scheme to
       reduce Data Access Penalty,” Proceeding of Supercomputing ’91, pp. 176-
       186, 1991.
    [5]  C.G. Bell, “The Future of High Performance Computers in Science and
       Engineering,” Comm. Of the ACM, vol. 32, No. 9, Sep. 1989.
    [6]  E.R. Berlekamp, “On Decoding Bose-Chaudhuri-Hocquenghem Codes,” IEEE
       Trans. Inform. Theory, IT 11, pp. 577-579, 1965.
    [7]  D. Bitton, J. Gray, “Disk Shadowing,” 14th Very Large Database
       Conference, 1988.
    [8]  M. Blaum, “A Class of Byte-Correcting Array Codes,” IBM Research
       Report, RJ5652(57151), May 1987.
    [9]  M. Blaum, “A Coding Technique for Recovery Against Double Disk Failures
       in Disk Arrays,” in Proc. IEEE Int. Conf. on Comm., vol. 3, pp. 1366-
       1368, Chicago, IL, Jun. 1992.
    [10] M. Blaum, H. Hao, R. Mattson, and J. Menon, “A Coding Technique for
       Double Disk Failures in Disk Arrays,” U.S. Patent, 5271012, Dec. 1993.
    [11] M. Blaum and R. Roth, “New Array Codes for Multiple Phased Burst
       Correction,” IEEE Trans. Inform. Theory, vol. 39, No. 1, pp. 66-77,
       Jan. 1993.
    [12] M. Blaum, J. Bruck, and A. Vardy, “Binary Codes with Large Symbols,”
       in Proceedings 1994 IEEE International Symposium on Information Theory,
       pp. 508, Jun. 1994.
    [13] M. Blaum, J. Bruck, & J. Menon, “EVENODD: an Efficient Scheme for
       Tolerating Double Disk Failures in RAID Architectures,” IEEE
       Transactions Computers, vol. 44, No. 2, pp. 192-202, 1995.
    [14] H. Boral, D. DeWitt, “Database Machine: An Idea Whose Time Has
       Passed?,”Database Machines, Ed. H.O. Leilich, M. Missikoff, Springer-
       Verlag, Sep. 1983.
    [15] W. Burkhard and J. Menon, “Disk array storage system reliability,”
       Proceedings of the International Symposium on Fault-Tolerant Computing,
       pp. 432-441, Toulouse, France, Jun. 1993.
    [16] P.M. Chen, “An Evaluation of redundant Arrays of Disks Using an
       Amdah15890,” Technical Report UCB/CSD 89/506, University of California
       at Berkeley, May 1989.
    [17] P.M. Chen, D.A. Patterson, “Maximizing Performance in a Striped Disk Array,” Proc. of the 1990 ACM SIGARCH 17th Ann. Int. Symp. of Computer Architecture, Seattle WA, May 1990.
    [18] P.M. Chen, E.K. Lee, G.A. Gibson, R.H. Katz, and D.A. Patterson, “RAID:
       High-Performance, Reliable Secondary Storage,” ACM Computing Surveys,
       vol. 26, No. 2, pp. 145-185, Jun. 1994.
    [19] A. Clements, “The Principles of Computer Hardware,” Third Ed., Oxford
       University Press, New York, 2000.
    [20] “RAID Technology White Paper,” Dell Computer Corporation, 1999.
    [21] G.D. Forncy, Jr., “Concatenated Codes,” MIT Press, 1966.
    [22] T. Fuja, C. Heegard, and M. Blaum, “Cross Parity Check Convolutional
       Code,” IEEE Trans. Inform. Theory, vol.35, No. 6, pp. 1264-1276, Jul.
       1989.
    [23] G.R. Ganger, B.L. Worthinton, R.Y. Hou, Y.N. Patt, “Disk Arrays: High-
       Performance High-Reliability Storage Subsystems”, IEEE Comput., Vol.
       27, Issue 3, pp. 30-36, 1994.
    [24] P.P. Gelsinger, P.A. Gargini, G.H. Parker, Y.C. Yu, “ Microprocessors
       Circa 2000,” IEEE Spectrum, Oct. 1989.
    [25] G.A. Gibson, L Hellerstein, R.M. Karp, R.H. Katz, D.A. Patterson,
       “Coding Techniques for Handling Failures in Large Disk Arrays,”
       Computer Science Technical Report, CSD88-477, University of California,
       Berkeley, 1988.
    [26] G.A. Gibson, “Redundant Disk Arrays: Reliable, Parallel Secondary
       Storage,” PhD Dissertation, University of California, Berkeley CA,
       Technical Report UCB/CSD 91/613, April 1991.
    [27] Golay, Marcel J.E., “Notes on Digital Coding,” Proceedings of the IRE,
       Vol. 37, pp. 657, Jun. 1949.
    [28] S.W. Golomb, “Theory of Transformation Groups of Polynomials Over GF(2)
       with Applications to Linear Shift Register Sequences,” Information
       Sciences, Dec. 1968.
    [29] R. Goodman and M. Sayano, “Size Limits on Phased Burst Error Correcting
       Array Codes,” Electronics Letters, vol. 26, No. 1, pp. 55-56, 1990.
    [30] R. Goodman, R.J. MacEliece, and M. Sayano, “Phase Burst Correcting
       Array Codes,” IEEE Trans. Inform. Theory, vol. 39, No. 2, pp. 684-693,
       Mar. 1993.
    [31] R.L. Graham, D.E. Knuth, and O. Patashnik, “Concrete Mathematics: A
       Foundation for Computer Science,” Second Ed., Addison-Wesley, Reading,
       MA, pp. 129-130, 1994.
    [32] H. Grassmann, “A New Branch of Mathematics: The Ausdehnungslehre of
       1844 and other works,” Open Court, Illinois, 1995.
    [33] H. Grassmann, “Extension Theory,” American Mathematical Society,
       London Mathematical Society, 2000
    [34] J. Gray, “Why do Computers Stop and What Can Be Done About It?,”
       Tandem Technical Report 85.7, Jun. 1985.
    [35] R.W. Hamming, “Error Detecting and Error Correcting Codes,” Bell
       System Technical Journal, vol. 29, pp. 147-160, 1950.
    [36] A. 37, “Information Storage Technology: A Look at the Future,” IEEE
       Computer, Vol. 18, Jul. 1985.
    [37] E. Horowitz and S. Sahni, “Fundamentals of Computer Algorithms,”
       Computer Science Press, pp. 488-489, 1978.
    [38] P.H. Hsieh, I.Y. Chen, Y.T. Lin and S.Y. Kuo, “An XOR based Reed-
       Solomon algorithm for advanced RAID systems,” in Proc. of the 19th IEEE
       International Symposium on Defect and Fault Tolerance in VLSI Systems
       (DFT’04), pp. 165-172, Oct. 2004.
    [39] N. Jarwala and D.K. Pradhan, “Cost Analysis of On-Chip Error Control
       Coding for Fault-Tolerant Computers,” Proc. FTCS-17, Computer Society
       Press, Los Alamitos, Calif., Order No. 778, pp. 354-359, 1988.
    [40] M. Kasahara, Y. Sugiyama, S. Hirasawa, and T. Namekawa, “New classes of
       binary codes constructed on the basis of concatenated codes and product
       codes,” IEEE Trans. Inform. Theory, vol. IT-22, No.4, pp.462-468, Jul.
       1975.
    [41] R.H. Katz, G.A. Gibson, D.A. Patterson, “Disk System Architecture for
       High Performance Computing”, IEEE Proc. 77, pp. 1842-1850, Dec. 1989.
    [42] M.H. Kryder, “Data Storage in 2000 – Trends in Data Storage
       Technologies,” IEEE Trans. on Magnetics, vol. 25, No. 6, Nov. 1989.
    [43] N.K. Lee, S.B. Yang, & K.W. Lee, “Efficient Parity Placement Schemes
       for Tolerating up to Two Disk Failures in Disk Arrays,” Journal of
       Systems Architecture, 46, pp. 1383-1402, 2000.
    [44] Lin, T.T. Yao, “Design and Evaluation of an On-line Predictive
       Diagnostic System,” Ph.D. Dissertation, Carnegie Mellon University,
       Apr. 1988.
    [45] Shu Lin, Daniel J. Costello, Jr., “Error Control Coding: Fundamentals
       and Applications,” Englewood Cliffs, N.J. 07632, Prentice-Hall, Inc.,
       pp. 52-65, 1989.
    [46] F.J. MacWilliams & N.J.A. Sloane, “The Theory of Error-Correcting
       Codes,” Amsterdam, The Netherlands: North-Holland, 1977.
    [47] D.E. Muller, “Application of Boolean algebra to switching circuit
       design and to error detection,” IRE Trans., EC-3, pp. 6-12, 1954.
    [48] R. Muntz and J. Lui, “Performance analysis of disk arrays under
       failure,” Proceedings of the 16th VLDB Conference, pp. 162-173,
       Brisbane, Jun. 1990.
    [49] A. Papoulis, “Probability, random variables, and stochastic
       processes,” 2nd Edition, McGraw-Hill, NY, 1984.
    [50] C.Park, “Efficient Placement of Parity and Data to Tolerate Two Disk
       Failures in Disk Array Systems,” IEEE Trans. Parallel Distribut. Syst.,
       vol. 6, No. 11, pp.1177-1184, 1995.
    [51] A.M. Patel, “Multitrack Error Correction with Cross-Parity Check
       Coding,” IBM Technical Report, TR02.813, Dec. 1978.
    [52] A.M. Patel, “Adaptive Cross Parity Code for a High Density Magnetic
       Tape Subsystem,” IBM J. Res. Develop., vol. 29, No. 6, pp. 546-562,
       1985.
    [53] D.A. Patterson, G.A. Gibson, & R.H. Katz, “A Case for Redundant Arrays
       of Inexpensive Disks (RAID),” in Proc. of ACM SIGMOD Conf. Data
       Management, pp.109-116, Chicago, IL, 1988.
    [54] D.A. Patterson and J.L. Hennessy, “Computer Architecture – A
       Quantitative Approach,” Second Ed., Morgan Kaufmann, San Francisco, CA,
       1996.
    [55] W. W. Peterson, “Encoding and error-correction procedures for Bose-
       Chaudhuri Codes,” IEEE Transactions on Information Theory. Vol. IT-6,
       Sep. 1960.
    [56] Peterson, W. Wesley, E.J. Weldon, Jr., “Error-Correcting Codes,”
       Second Edition, M.I.T. Press, pp. 131-136, 1972.
    [57] D.K. Pradhan, ed., “Fault-Tolerant Computing: Theory and Techniques,”
       Prentice Hall, Englewood Cliffs, N.J., 1986.
    [58] E. Prange, “Cyclic Error-Correcting codes in two symbols,” Technical
       report AFCRC-TN-57-103, Air Force Cambridge Research Center, Cambridge,
       Mass., 1957.
    [59] P. Prusinkiewicz, & S. Budkowski, “A Double Track Error-Correction Code
       for Magnetic Tape,” IEEE Transactions Computers., vol. 25, No. 6, pp.
       642-645, Jun. 1976.
    [60] S.K. Ramam, V. Pentkovski, and J. Keshava, “Implementing Streaming SIMD
       Extensions on the Pentium III Processor,” IEEE Micro, vol. 20, no. 4,
       pp. 47-57, Jul./Aug. 2000.
    [61] T.R.N. Rao and E. Fujiwara, “Error-Control Coding for Computer
       Systems,” Prentice Hall, Englewood Cliffs, N.J., 1989.
    [62] I. S. Reed, “A class of multiple error correcting codes and the
       decoding scheme, ” IEEE Trans. Info. Theory, vol. IT-4, pp. 38-49, 1954.
    [63] I. S. Reed and G. Solomon, “Polynomial codes over certain finite
       fields,” MIT Lincoln Laboratory Group Report. No.47.23, 31 December
       1958.
    [64] I. S. Reed and G. Solomon, “A decoding procedure for polynomial
       codes,” MIT Lincoln Laboratory Group Report. No.47.24, 6 Mar. 1959.
    [65] I. S. Reed and G. Solomon, “Polynomial codes over certain finite
       fields,” SIAM J. Appl. Math., Vol 8, No 2, pp. 300-304, Jun. 1960.
    [66] I. S. Reed, Deutsch, Hsu, Truong, Wang, and Yeh, “The VLSI
       implementation of a Reed-Solomon encoder using Berlekamp’s bit-serial
       multiplier algorithm,” IEEE Transactions on Computers. vol. C-33, No.
       10, Oct. 1984.
    [67] I. S. Reed and X. Chen, “Error-control coding for data network,”
       Kluwer Academic Publishers, 1999.
    [68] K. Salem and H. Garcia-Molina, “Disk striping,” Proceedings of the
       Second International Conference on Data Engineering, pp. 109-116, Los
       Angeles, California, Feb. 1986.
    [69] M. Schulze, G.A. Gibson, R.H. Katz, D.A. Patterson, “How Reliable Is a
       RAID?,” IEEE COMPCON, pp. 118-123, Spring 1989.
    [70] C. E. Shannon, “A mathematical theory of communication,” Bell System
       Technical Journal, vol. 27, pp. 379-423 and pp. 623-656, Jul. and Oct.
       1948.
    [71] Q. Xin, E. Miller, T. Schwarz, D. Long, S. Brandt, and W. Litwin,
       “Reliabilty mechanisms for very large storage systems,” 20th IEEE/11th
       NASA Boddard Conference on Mass Storage Systems and Technologies, pp.
       146-156, San Diego, CA, Apr. 2003.
    [72] D. Gorenstein, W. Wesley Peterson, and N. Zierler, “Two-error
       correcting Bose-Chaudhuri codes are quasi-perfect,” Information and
       Control, vol. 3, No. 3, pp.291-294, Sep. 1960.

    下載圖示
    2007-01-24公開
    QR CODE