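Graduate student: 鄭暘諭 (Cheng, Yang-Yu)
Thesis title: 利用經驗貝氏方法估計錯誤發現率 (Estimation of False Discovery Rate Using Empirical Bayes Method)
Advisor: 馬瀰嘉 (Ma, Mi-Chia)
Degree: Master
Department: College of Management - Department of Statistics
Year of publication: 2016
Academic year of graduation: 104
Language: Chinese
Number of pages: 31
Keywords (Chinese): EM algorithm, Bayesian analysis, familywise type I error rate, false discovery rate
Keywords (English): EM Algorithm, Bayesian Approach, FWER, FDR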
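  • In multiple testing problems, if the significance level of each individual test is not adjusted and is still set to α, the overall error rate of m tests can inflate to as much as mα. Previous literature shows that, when null hypotheses are false, methods that control the familywise type I error rate (familywise error rate; FWER) suffer from low power and from individual type I error rates that fall below the significance level. When many hypotheses are tested simultaneously, the primary question is how to control the type I error rate. Controlling the FWER is the widely known approach; another possible solution is to control the false discovery rate (FDR). For either FWER or FDR, the loss of power that arises when null hypotheses are not true can be reduced by obtaining a more accurate estimate of the number of true null hypotheses.
    This thesis assumes that the data for several genes each follow a normal mixture distribution and that the parameters have prior distributions. The Bayesian posterior distribution and the EM algorithm are used to estimate the proportion of true null hypotheses in the distribution, and from it the number of true null hypotheses and the FDR.
    When the number of genes is sufficient and the number of patients is large, the higher the actual proportion of true null hypotheses, the smaller the RMSE of the proposed EBay estimator and the more accurate the estimate. Monte Carlo simulation is used to examine performance under different parameter combinations. The estimator of Ma & Chao (2011), which applies the McNemar test, has a large estimation error if the significance level is kept at α=0.05; the estimator proposed by Benjamini & Hochberg (2000) is unstable when the proportion of mutated genes is set to be random; and the estimator of Ma & Tsai (2011), which applies the Friedman test, shows the same behavior.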

    In multiple testing problems, if the significance level of each individual test is left unadjusted at α, the overall type I error rate across m hypotheses can inflate to as much as mα.
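As a rough illustration of this inflation: under the added assumption that the m tests are independent and every null hypothesis is true, the familywise error rate is exactly 1 - (1 - α)^m, and mα is its Bonferroni upper bound. The function names below are illustrative and do not come from the thesis.

```python
def fwer_independent(alpha: float, m: int) -> float:
    """FWER of m independent level-alpha tests when all nulls are true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_bound(alpha: float, m: int) -> float:
    """Upper bound m * alpha on the FWER, capped at 1."""
    return min(1.0, m * alpha)


# Compare the exact value (under independence) with the bound.
for m in (1, 5, 10, 50):
    print(m, round(fwer_independent(0.05, m), 4), bonferroni_bound(0.05, m))
```

The bound is close to the exact value for small m and becomes loose as m grows, which is why mα is best read as a worst case rather than the error rate itself.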

    This study assumes that the data for each of several genes follow a normal mixture distribution and that the parameters have prior distributions. We use the Bayesian posterior distribution and the EM algorithm to estimate the proportion of true null hypotheses, and from it the number of true null hypotheses and the FDR.
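The general idea can be sketched with a deliberately simplified model; this is not the thesis's estimator. The sketch assumes one z-score per gene, a null component fixed at N(0, 1), a single normal alternative component, and plain maximum-likelihood EM with no prior distributions. The FDR step is a generic plug-in estimate for the rule "reject when |z| >= c". All names, the simulated data, and the cutoff c = 2.5 are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    u = (x - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def em_null_proportion(z, n_iter=500, tol=1e-8):
    """EM for z_i ~ pi0 * N(0, 1) + (1 - pi0) * N(mu1, sigma1^2).

    Returns (pi0, mu1, sigma1). The null component is held fixed (an assumption).
    """
    m = len(z)
    mean_z = sum(z) / m
    sd_z = math.sqrt(sum((x - mean_z) ** 2 for x in z) / m)
    # Crude starting values: equal weights, moment-based alternative mean.
    pi0, mu1, sigma1 = 0.5, 2.0 * mean_z, max(sd_z, 1e-3)
    for _ in range(n_iter):
        # E-step: posterior probability that each z_i comes from the null.
        w = []
        for x in z:
            f0 = pi0 * normal_pdf(x, 0.0, 1.0)
            f1 = (1.0 - pi0) * normal_pdf(x, mu1, sigma1)
            w.append(f0 / (f0 + f1) if f0 + f1 > 0.0 else 0.5)
        # M-step: update the null proportion and the alternative component.
        n0 = sum(w)
        n1 = max(m - n0, 1e-12)
        new_pi0 = n0 / m
        mu1 = sum((1.0 - wi) * x for wi, x in zip(w, z)) / n1
        var1 = sum((1.0 - wi) * (x - mu1) ** 2 for wi, x in zip(w, z)) / n1
        sigma1 = max(math.sqrt(var1), 1e-3)  # floor avoids a degenerate component
        converged = abs(new_pi0 - pi0) < tol
        pi0 = new_pi0
        if converged:
            break
    return pi0, mu1, sigma1


def estimated_fdr(z, pi0, c):
    """Plug-in FDR estimate for the rule 'reject when |z| >= c'."""
    rejections = sum(abs(x) >= c for x in z)
    null_tail = math.erfc(c / math.sqrt(2.0))  # P(|Z| >= c) under N(0, 1)
    return min(1.0, len(z) * pi0 * null_tail / max(rejections, 1))


# Simulated data (hypothetical): 2000 genes, 80% true nulls, alternatives N(3, 1).
random.seed(0)
z_scores = [random.gauss(0.0, 1.0) if random.random() < 0.8 else random.gauss(3.0, 1.0)
            for _ in range(2000)]

pi0_hat, mu1_hat, sigma1_hat = em_null_proportion(z_scores)
m0_hat = pi0_hat * len(z_scores)  # estimated number of true null hypotheses
fdr_hat = estimated_fdr(z_scores, pi0_hat, c=2.5)
print(pi0_hat, m0_hat, fdr_hat)
```

Placing priors on the mixture parameters, as the abstract describes, would change the M-step; the sketch only shows how an estimated null proportion feeds into the estimated number of true nulls and the FDR.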

    We compare the performance of these estimators under different parameter settings through Monte Carlo simulation. The estimator based on the McNemar test proposed by Ma & Chao (2011) can have a large estimation error when the significance level is kept at α=0.05. The estimator proposed by Benjamini & Hochberg (2000) is unstable when the proportion of mutated genes is set to be random. The estimator based on the Friedman test proposed by Ma & Tsai (2011) behaves the same way. When the number of genes is sufficient and the number of patients is large, the higher the proportion of true null hypotheses, the smaller the RMSE of the proposed EBay estimator, and hence the more accurate the estimate.
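A Monte Carlo comparison of this kind can be sketched as follows. The sketch does not reproduce the thesis's simulation design or any of the estimators named above. As a stand-in, it uses a simple tail-count estimator of the null proportion (the share of p-values above a cutoff `lam`, rescaled), and the data model, sample sizes, and function names are all assumptions made for illustration.

```python
import math
import random


def two_sided_p_value(z):
    """P(|Z| >= |z|) for Z ~ N(0, 1)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def tail_count_pi0(p_values, lam=0.5):
    """Stand-in estimator: rescaled share of p-values above lam, capped at 1."""
    above = sum(p > lam for p in p_values)
    return min(1.0, above / (len(p_values) * (1.0 - lam)))


def rmse_of_pi0_estimator(pi0, m, mu1, n_rep, seed=0):
    """Monte Carlo RMSE of the stand-in estimator for a given true pi0."""
    rng = random.Random(seed)
    squared_error = 0.0
    for _ in range(n_rep):
        # Each gene is a true null with probability pi0, else drawn from N(mu1, 1).
        z = [rng.gauss(0.0, 1.0) if rng.random() < pi0 else rng.gauss(mu1, 1.0)
             for _ in range(m)]
        estimate = tail_count_pi0([two_sided_p_value(x) for x in z])
        squared_error += (estimate - pi0) ** 2
    return math.sqrt(squared_error / n_rep)


# RMSE across several true null proportions (settings are hypothetical).
for true_pi0 in (0.5, 0.7, 0.9):
    print(true_pi0, round(rmse_of_pi0_estimator(true_pi0, m=500, mu1=3.0, n_rep=200), 4))
```

Swapping `tail_count_pi0` for another estimator while keeping the same seed and settings gives a like-for-like RMSE comparison.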

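    Chapter 1 Introduction
    Chapter 2 Literature Review
    Chapter 3 Research Method
    Chapter 4 Statistical Simulation
      Section 1 Real Data Example
      Section 2 Simulation Procedure
      Section 3 Simulation Results
    Chapter 5 Conclusions and Suggestions
    References
    Appendix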

    I. English
    1.Benjamini, Y., Hochberg, Y. (1995). “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing”, Journal of the Royal Statistical Society, Series B, 57, pp.289-300.
    2.Benjamini, Y., Hochberg, Y. (2000). “On the Adaptive Control of the False Discovery Rate in Multiple Testing with Independent Statistics”, Journal of Educational and Behavioral Statistics, 25, pp.60-83.
    3.Benjamini, Y. and Liu, W. (1999). “A Step-down Multiple Hypotheses Testing Procedure that Controls the False Discovery Rate under Independence”, Journal of Statistical Planning and Inference, 82(1-2), pp.163-170.
    4.Diebolt, J., Robert, C. P. (1994). “Estimation of Finite Mixture Distributions through Bayesian Sampling”, Journal of the Royal Statistical Society, Series B, 56 (2), pp.363-375.
    5.Efron, B. (2007). “Correlation and Large-scale Simultaneous Significance Testing”, Journal of the American Statistical Association, 102, pp.93-103.
    6.Fraley, C. and Raftery, A.E. (2007). “Bayesian Regularization for Normal Mixture Estimation and Model-based Clustering”, Journal of Classification, 24, pp.155-181.
    7.Friguet, C., Kloareg, M., Causeur, D. (2009). “A Factor Model Approach to Multiple Testing under Dependence”, Journal of the American Statistical Association, 104, pp.1406-1415.
    8.Gordon, A., Glazko, G., Qiu, X., and Yakovlev, A. (2007). “Control of the Mean Number of False Discoveries, Bonferroni and Stability of Multiple Testing”, The Annals of Applied Statistics, 1 (1), pp.179-190.
    9.Hsueh, H. M., Chen, J. J. and Kodell, R. L. (2003). “Comparison of Methods for Estimating the Number of True Null Hypotheses in Multiplicity Testing”, Journal of Biopharmaceutical Statistics, 13, pp.675-689.
    10.Liang, L. (2009). “On Simulation Methods for Two Component Normal Mixture Models under Bayesian Approach”, U.U.D.M. Project Report (2009), pp.17.
    http://www.diva-portal.org/smash/get/diva2:300849/FULLTEXT01.pdf
    11.Ma, M. C., Chao, W. C. (2011). “A Nonparametric Approach of Estimating the Number of True Null Hypotheses in Multiple Testing”, International Statistical Institute, August, Ireland, pp.4669-4674.
    12.Ma, M. C., Tsai, C. Y. (2011). “A Nonparametric Approach to Estimate the Number of True Null Hypotheses in Multiple Testing under Dependency”, Master's thesis, Department of Statistics, NCKU.
    13.Storey, J. D., Dai, J. Y., and Leek, J. T. (2007). “The Optimal Discovery Procedure for Large-scale Significance Testing, with Application to Comparative Microarray Experiments”, Biostatistics, 8, pp.414-432.
    14.Titterington, D. M., Smith, A. F. M., and Makov, U. E. (1985). “Statistical Analysis of Finite Mixture Distributions”, 1st Ed., Wiley, New York.

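    II. Chinese
    1.林育興 (2010). “Estimating the Proportion of True Null Hypotheses in Multiple Comparison Tests Using a Mixture Beta Model” (in Chinese), Master's thesis, Department of Statistics, National Taipei University.
    2.許乾柚 (2008). “Estimating the Number of True Null Hypotheses in Multiple Comparisons Using a Mixture Model” (in Chinese), Master's thesis, Department of Statistics, National Taipei University.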

    Full-text availability: on campus, open from 2018-07-01
    off campus, open from 2018-07-01