
Graduate Student: Liu, An-Tai (劉安泰)
Thesis Title: Revisit to Klein's stochastic model for finding an optimal inspection and maintenance policy (重新審視Klein的隨機模型以尋找最佳檢查和維護策略)
Advisor: Sheu, Ruey-Lin (許瑞麟)
Degree: Master
Department: Department of Mathematics, College of Science
Publication Year: 2022
Academic Year: 110
Language: English
Pages: 48
Keywords: Markov decision process, fractional programming, linear programming
  • Klein's 1962 paper constructed a complete Markov decision process model. Its goal
    is to describe a naturally deteriorating system as a Markov chain, in order to find
    an optimal inspection and maintenance policy that minimizes the average total
    operating cost per unit time. Klein studied the equilibrium state of this Markov
    decision process, formulated the problem as a linear fractional program, and
    converted it into a linear program to solve. From a modern viewpoint, however,
    neither the mathematical structure nor the notation of that paper is clearly and
    completely defined, which hurts its readability. Taking Kallenberg's 2004 "Markov
    Decision Processes" as a template, this thesis re-characterizes the state space,
    decision space, decision costs, and decision epochs of the Markov decision process
    in Klein's 1962 article, together with the initial distribution and transition
    probabilities of the Markov chain, and constructs a real-world application as a
    concrete example of the mathematical model. Numerical computation of the decision
    system under the Markov property confirms that an optimal inspection and
    maintenance policy can indeed be found.

    In 1962, Klein constructed a Markov decision process model whose objective is to
    find an optimal inspection and maintenance policy minimizing the average cost per
    unit time of a deteriorating system with Markov properties. Klein studied the
    equilibrium state of this Markov decision process, modeling the problem as a linear
    fractional program and converting it into a linear program to obtain the solution.
    From the viewpoint of modern mathematics, however, neither the mathematical
    structure nor the notation system in the paper is clearly and completely treated,
    so the readability of Klein's paper leaves much room for improvement. In this
    thesis, we adopt the approach of Kallenberg's "Markov Decision Processes" (2004) to
    reconstruct Klein's 1962 paper. The state space, decision space, costs, decision
    epochs, and the initial distribution and transition probabilities of the Markov
    chain are now rigorously defined, yielding a high-resolution description of Klein's
    paper. In addition, a practical scenario is constructed as a concrete example to
    supplement Klein's model. After numerical computation, we show that linear
    programming is indeed able to find the optimal inspection and maintenance policies.
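The linear-programming route the abstract describes can be sketched on a toy deteriorating system. The numbers, state names, and the SciPy solver below are illustrative assumptions of this sketch, not Klein's data or code (the thesis itself uses MATLAB); the LP itself is the classical average-cost formulation in the style of Manne's paper cited in the bibliography:

```python
import numpy as np
from scipy.optimize import linprog

# Toy deteriorating system (illustrative numbers, NOT Klein's data):
# states 0=good, 1=worn, 2=failed; actions 0=keep running, 1=replace.
# P[i, a, j] = transition probability from state i to state j under action a.
P = np.array([
    [[0.8, 0.2, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.7, 0.3], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],
])
c = np.array([[0.0, 5.0],    # c[i, a] = one-step cost of action a in state i
              [2.0, 5.0],
              [10.0, 5.0]])

S, A = c.shape
n = S * A  # one LP variable x[i, a] per state-action pair

# Average-cost LP: minimize sum_{i,a} c[i,a] x[i,a] subject to
# stationarity (flow balance) and x forming a probability distribution.
A_eq = np.zeros((S + 1, n))
for j in range(S):
    for a in range(A):
        A_eq[j, j * A + a] += 1.0             # outflow of state j
    for i in range(S):
        for a in range(A):
            A_eq[j, i * A + a] -= P[i, a, j]  # inflow into state j
A_eq[S, :] = 1.0                              # total mass = 1
b_eq = np.zeros(S + 1)
b_eq[S] = 1.0

res = linprog(c.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)
x = res.x.reshape(S, A)
# States with zero stationary mass are transient under the optimal policy,
# so their action must be fixed separately; argmax is only meaningful
# on states that carry positive mass.
policy = x.argmax(axis=1)
print("minimal average cost per unit time:", res.fun)
print("policy on recurrent states (0=keep, 1=replace):", policy[:2])
```

With these assumed numbers the optimal stationary policy keeps the machine while good and replaces it once worn, at average cost 5/6 per period. Klein's actual model layers inspection decisions and delayed observation on top of this skeleton, and passes through a linear fractional objective before reaching an LP of this shape.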

    1 Introduction
      1.1 A historical account of Markov decision process
      1.2 Practical special cases and generalization of Markov decision process
    2 Interpret Klein's model and analyze the linear fractional problem
      2.1 Construct the deterioration process
      2.2 The decision rule of inspection and maintenance
      2.3 The combined process with decision rule and deterioration process
      2.4 The steady state distribution
      2.5 The cost structure
      2.6 The average cost per unit time
      2.7 Converting a fractional program into a linear programming problem
    3 An illustrative example for finding an optimal policy
      3.1 Construct a practical example to illustrate Klein's model
      3.2 The deterioration process of the printer case
      3.3 The decision rule with inspection, maintenance and replacement
      3.4 A new Markov chain with combined transition probability
      3.5 Costing related to printers
      3.6 Problem conversion process and the MATLAB solution
    4 Final Discussion
    Bibliography
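The conversion step named in Section 2.7 is, in outline, the Charnes–Cooper transformation from reference [64]; the generic notation below is a sketch of the idea, not Klein's exact symbols. Assuming $d^{\top}x + \beta > 0$ on the feasible set, the fractional program

```latex
\min_{x \ge 0,\ Ax \le b}\ \frac{c^{\top}x + \alpha}{d^{\top}x + \beta}
\qquad\Longrightarrow\qquad
\begin{aligned}
\min_{y,\,t}\ & c^{\top}y + \alpha t \\
\text{s.t.}\ & Ay - bt \le 0, \\
             & d^{\top}y + \beta t = 1, \\
             & y \ge 0,\ t \ge 0,
\end{aligned}
```

via the substitution $t = 1/(d^{\top}x + \beta)$, $y = tx$; an optimal pair $(y^{*}, t^{*})$ of the linear program recovers $x^{*} = y^{*}/t^{*}$ of the fractional one.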

    [1] Bellman, R. Dynamic programming. Princeton University Press, Princeton (1957).
    [2] Shapley, L.S. Stochastic games. Proceedings of the National Academy of Sciences (1953), 1095-1100.
    [3] Howard, R.A. Dynamic programming and Markov processes. MIT Press (1960).
    [4] De Ghellinck, G.T. Les problèmes de décisions séquentielles. Cahiers du Centre de Recherche Opérationelle (1960), 161-179.
    [5] D'Epenoux, F. Sur un problème de production et de stockage dans l'aléatoire. Revue Française de Recherche Opérationelle (1960), 3-16.
    [6] Manne, A.S. Linear programming and sequential decisions. Management Science (1960), 259-267.
    [7] Blackwell, D. Discrete dynamic programming. Annals of Mathematical Statistics (1962), 719-726.
    [8] Derman, C. Finite state Markovian decision processes. Academic Press, New York (1970).
    [9] Hinderer, K. Foundations of non-stationary dynamic programming with discrete time parameter. Springer-Verlag, New York (1970).
    [10] Kushner, H. Introduction to stochastic control. Holt, Rinehart and Winston, New York (1971).
    [11] Mine, H. and S. Osaki. Markov decision processes. American Elsevier, New York (1970).
    [12] Ross, S.M. Applied probability models with optimization applications. Holden-Day, San Francisco (1970).
    [13] Van Nunen, J.A.E.E. Contracting Markov decision processes. Mathematical Centre Tract 71, Mathematical Centre, Amsterdam (1976).
    [14] Van der Wal, J. Stochastic dynamic programming. Mathematical Centre Tract 139, Mathematical Centre, Amsterdam (1981).
    [15] Kallenberg, L.C.M. Linear programming and finite Markovian control problems. Mathematical Centre Tract 148, Mathematical Centre, Amsterdam (1983).
    [16] Federgruen, A. Markovian control problems: functional equations and algorithms. Mathematical Centre Tract 97, Mathematical Centre, Amsterdam (1984).
    [17] Vrieze, O.J. Stochastic games with finite state and action spaces. CWI Tract 33, Centre for Mathematics and Computer Science, Amsterdam (1987).
    [18] Hernández-Lerma, O. Adaptive Markov control processes. Springer-Verlag, New York (1987).
    [19] Altman, E. Constrained Markov decision processes. Chapman & Hall/CRC, Boca Raton, Florida (1999).
    [20] Sennott, L.I. Stochastic dynamic programming and the control of queueing systems. Wiley, New York (1999).
    [21] Bertsekas, D.P. Dynamic programming and stochastic control. Academic Press, New York (1976).
    [22] Whittle, P. Optimization over time: dynamic programming and stochastic control. Volume I, Wiley, New York (1982).
    [23] Ross, S.M. Introduction to stochastic dynamic programming. Academic Press, New York (1983).
    [24] Dietz, H.M. and V. Nollau. Markov decision problems with countable state space. Akademie-Verlag, Berlin (1983).
    [25] Bertsekas, D.P. Dynamic programming: deterministic and stochastic models. Prentice-Hall, Englewood Cliffs (1987).
    [26] Denardo, E.V. Dynamic programming: models and applications. Prentice-Hall, Englewood Cliffs (1982).
    [27] Heyman, D.P. and M.J. Sobel. Stochastic models in Operations Research. Volume II, McGraw-Hill, New York (1984).
    [28] White, D.J. Markov decision processes. Wiley, Chichester (1993).
    [29] Puterman, M.L. Markov decision processes. Wiley, New York (1994).
    [30] Bertsekas, D.P. Dynamic programming and optimal control I. Athena Scientific, Belmont, Massachusetts (1995).
    [31] Hernández-Lerma, O. and J.B. Lasserre. Discrete-time Markov control processes: basic optimality criteria. Springer-Verlag, New York (1996).
    [32] Filar, J.A. and O.J. Vrieze. Competitive Markov decision processes. Springer-Verlag, New York (1997).
    [33] Kallenberg, L.C.M. Markov decision processes. Leiden University (2004).
    [34] Beckmann, M. An inventory model for arbitrary interval and quantity distributions of demands. Management Science Vol. 8 (1961), 35-57.
    [35] Cheng, F. A periodic review inventory model with demand influenced by promotion decisions. Management Science Vol. 45 (1999), 1463-1611.
    [36] Giannoccaro, I. Inventory management in supply chains: a reinforcement learning approach. Management Science Vol. 78 (2002), 153-161.
    [37] Breiman, L. Stopping-rule problems. In: E.F. Beckenbach (ed.), Applied Combinatorial Mathematics, Wiley, New York (1964), 284-319.
    [38] Ross, S.M. Applied probability models with optimization applications. Holden-Day, San Francisco (1970).
    [39] Denardo, E.V. Dynamic programming: models and applications. Prentice-Hall (1982).
    [40] Hordijk, A. and N.M. van Dijk. Time-discretization for controlled Markov processes. I. General approximation results. Kybernetika (1996), 1-16.
    [41] Howard, R.A. Dynamic programming and Markov processes. MIT Press, Cambridge (1960).
    [42] Schweitzer, P.J. Iterative solution of the functional equations of undiscounted Markov renewal programming. Journal of Mathematical Analysis and Applications (1971), 495-501.
    [43] Gittins, J.C. Bandit processes and dynamic allocation indices. Journal of the Royal Statistical Society, Series B (1979), 148-177.
    [44] Whittle, P. Multi-armed bandits and the Gittins index. Journal of the Royal Statistical Society, Series B (1980), 143-149.
    [45] Ross, S.M. Introduction to stochastic dynamic programming. Academic Press, New York (1983).
    [46] Varaiya, P.P., J.C. Walrand and C. Buyukkoc. Extensions of the multi-armed bandit problem: the discounted case. IEEE Transactions on Automatic Control 30 (1985), 426-439.
    [47] Tsitsiklis, J.N. A lemma on the multi-armed bandit problem. IEEE Transactions on Automatic Control 31 (1986), 576-577.
    [48] Weber, R.R. On the Gittins index for multi-armed bandits. Annals of Applied Probability 2 (1992), 1024-1033.
    [49] Chen, Y.-R. and M.N. Katehakis. Linear programming for finite state bandit problems. Mathematics of Operations Research 11 (1986), 180-183.
    [50] Kallenberg, L.C.M. A note on M.N. Katehakis and Y.-R. Chen's computation of the Gittins index. Mathematics of Operations Research 11 (1986), 184-186.
    [51] Katehakis, M.N. and A.F. Veinott Jr. The multi-armed bandit problem: decomposition and computation. Mathematics of Operations Research 12 (1987), 262-268.
    [52] Ben-Israel, A. and S.D. Flam. A bisection/successive approximation method for computing Gittins indices. Zeitschrift für Operations Research 34 (1990), 411-422.
    [53] Liu, J.Y. and K. Liu. An algorithm on the Gittins index. Systems Science and Mathematical Science 7 (1994), 106-114.
    [54] Sherif, Y.S. and M.L. Smith. Optimal maintenance policies for systems subject to failure - a review. Naval Research Logistics Quarterly 28 (1981), 47-74.
    [55] Derman, C. On optimal replacement rules when changes of state are Markovian. In: R. Bellman (ed.), Mathematical Optimization Techniques, University of California Press, Berkeley (1963), 201-210.
    [56] Kolesar, P. Minimum cost replacement under Markovian deterioration. Management Science 12 (1966), 694-706.
    [57] Derman, C. Finite state Markovian decision processes. Academic Press, New York (1970).
    [58] Kao, E.P.C. Optimal replacement rules when changes of state are semi-Markovian. Operations Research 21 (1973), 1231-1249.
    [59] Katehakis, M.N. and C. Derman. Optimal repair allocation in a series system. Mathematics of Operations Research 9 (1984), 615-623.
    [60] Smith, D.R. Optimal repair of a series system. Operations Research 26 (1978), 653-662.
    [61] Klein, M. Inspection-maintenance-replacement schedules under Markovian deterioration. Management Science Vol. 9 (1962).
    [62] Manne, A. Linear programming and sequential decisions. Management Science Vol. 6 (1960), 259-267.
    [63] Derman, C. On sequential decisions and Markov chains. Management Science Vol. 9 (1962), 16-24.
    [64] Charnes, A. and W.W. Cooper. Programming with linear fractional functionals. The Technological Institute (1962).

    Full-text availability: On-campus: immediate open access; Off-campus: immediate open access.