
Author: Lin, Wei-Ann (林蔚安)
Thesis title: Emulators for the computer experiments with both qualitative and quantitative factors (在電腦實驗中同時具備定性及定量因子的模擬器)
Advisor: Chen, Ray-Bing (陳瑞彬)
Degree: Doctor
Department: Department of Statistics, College of Management
Year of publication: 2025
Academic year of graduation: 113
Language: English
Number of pages: 71
Keywords: Category tree structure, Cooling system design, Gaussian process, Mixed-input Gaussian process, Multi-fidelity Gaussian process
    This thesis introduces the Category Tree Gaussian Process (ctGP), a novel tree-based surrogate modeling approach for computer experiments with both qualitative and quantitative factors. ctGP uses a tree structure that partitions the data by the qualitative factors, which gives a computationally efficient and scalable method that maintains prediction accuracy even when the qualitative factors have many levels. Its performance in single-fidelity scenarios is demonstrated on a cooling system design problem. This research then extends ctGP to multi-fidelity data, where the fidelity levels differ in accuracy and computational cost. The proposed multi-fidelity ctGP framework is analyzed under three settings: hierarchical splitting, indicator-based splitting, and purely categorical splitting. Numerical experiments with synthetic functions and the Borehole model demonstrate the method's adaptability across these scenarios and show higher prediction accuracy than existing models. Finally, the cooling system design problem is revisited with multi-fidelity data to illustrate the practical benefits of the proposed approach.
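    The partition-then-fit idea in the abstract can be illustrated with a minimal sketch. This is not the ctGP algorithm itself: the thesis grows a tree by binary splitting and then prunes it, whereas the sketch below assumes the simplest possible partition, one leaf per level of a single qualitative factor, and fits an independent Gaussian process with a fixed squared-exponential kernel in each leaf. All function names, the lengthscale, and the nugget value are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale):
    """Squared-exponential correlation between rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def fit_gp(x, y, lengthscale=0.15, nugget=1e-8):
    """Fit a constant-mean GP interpolator with fixed (assumed) hyperparameters."""
    mean = y.mean()
    k = rbf_kernel(x, x, lengthscale) + nugget * np.eye(len(x))
    weights = np.linalg.solve(k, y - mean)

    def predict(x_new):
        # Posterior mean of the GP at the new quantitative inputs.
        return mean + rbf_kernel(x_new, x, lengthscale) @ weights

    return predict


def fit_per_category(x, z, y):
    """Assumed partition: one leaf (and one GP) per level of the factor z."""
    return {level: fit_gp(x[z == level], y[z == level]) for level in np.unique(z)}


def predict_per_category(model, x_new, z_new):
    """Route each new point to the GP of its category; unseen levels give NaN."""
    out = np.full(len(x_new), np.nan)
    for level, predict in model.items():
        mask = z_new == level
        if mask.any():
            out[mask] = predict(x_new[mask])
    return out


# Toy data: one quantitative input on [0, 1] and one qualitative factor
# with three levels, each level having its own response curve.
x_grid = np.linspace(0.0, 1.0, 8)[:, None]
x = np.vstack([x_grid, x_grid, x_grid])
z = np.repeat([0, 1, 2], len(x_grid))
y = np.concatenate([
    np.sin(2 * np.pi * x_grid[:, 0]),
    np.cos(2 * np.pi * x_grid[:, 0]),
    x_grid[:, 0] ** 2,
])

model = fit_per_category(x, z, y)
x_test = np.array([[0.25], [0.25], [0.25]])
z_test = np.array([0, 1, 2])
print(predict_per_category(model, x_test, z_test))
```

    Because each leaf only inverts the correlation matrix of its own data, the cost scales with the leaf sizes rather than with the full sample size, which is the kind of saving the abstract refers to. A tree that groups several levels into one leaf would need a mixed-input correlation inside that leaf, which this sketch omits.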

    Abstract (in Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1. Introduction and Motivation
        1.1. Introduction
        1.2. Motivation
    Part I. Surrogate Modeling for Single-Fidelity Scenarios
    Chapter 2. Category Tree GP Model for Mixed Inputs
        2.1. Preliminaries
            2.1.1. Gaussian Process (GP)
            2.1.2. Mixed-Input GP
        2.2. Basic Idea of the proposed method
        2.3. Tree Growing: Binary Splitting
        2.4. Tree Pruning
        2.5. Computational Complexity
        2.6. Parameter Estimation and Predictions
    Chapter 3. Numerical Study
        3.1. Monte Carlo Example
        3.2. A synthetic example with one qualitative variable
        3.3. Borehole function example
    Chapter 4. Emulation Results For the Cooling System Design and Conclusion in Single-Fidelity Scenarios
        4.1. Parallel fins design
        4.2. Cross-cut extrusion design
        4.3. Data availability
        4.4. Summary and Concluding Remarks for Surrogate Modeling in Single-Fidelity Scenarios
    Part II. Surrogate Modeling for Multi-Fidelity Data Scenarios
    Chapter 5. Multi-Fidelity Gaussian Process Model for Mixed-Input Data
        5.1. Auto-regressive Qualitative and Quantitative Gaussian Process
        5.2. Extension of ctGP to Multi-Fidelity Models
            5.2.1. Treating Fidelity as a Categorical Factor
            5.2.2. Using the Auto-Regressive Framework
            5.2.3. Combining Tree-Structure Evaluation with Auto-Regressive Modeling
    Chapter 6. Numerical Results for Multi-Fidelity Modeling and Conclusion in Single Fidelity Scenarios
        6.1. Synthetic function with two levels of fidelities
            6.1.1. Function setting
            6.1.2. Simulation set-ups
            6.1.3. Performance comparison
        6.2. Borehole function with two levels of fidelities
            6.2.1. Test Functions
            6.2.2. Simulation setup
            6.2.3. Performance comparison
        6.3. Modified Borehole function with four levels of fidelities
            6.3.1. Function setting
            6.3.2. Simulation setup
            6.3.3. Performance comparison
        6.4. Summary and Concluding Remarks for Surrogate Modeling in Multi-Fidelity Scenarios
    References
    Appendix A. BCD algorithm for estimating cross-correlation matrix
    Appendix B. Modified BCD algorithm for data with different quantitative input locations across categories


    Full-text availability (off-campus): open immediately