
Author: Chao, Kuo-Hsiu (趙國秀)
Thesis title: The Study of Transformation in Model Building and Assumption Assessment (模型建立與假設評估上的轉換研究)
Advisor: Lu, Chi-Hsien (路繼先)
Degree: Master
Department: College of Management - Department of Statistics
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 80
Chinese keywords: 加法性模式, 誤差變異齊一, 常態分配誤差, 殘差圖, 常態分位數圖
English keywords: Additive model, Homogeneous error variance, Normal error, Residual plot, Normal Q-Q plot

    The units in which data are recorded are usually chosen for convenience of measurement. They need not be the units in which the system that generates the data is modeled in its simplest possible form. To achieve simplicity, transformations may be applied to the response, to the explanatory variables, or to both. The purpose of a transformation is not only to build a simple and proper model but also to make the proposed model satisfy the various model assumptions. The Box and Cox (1964) version of the power transformation has been the most widely accepted one in practice. They also proposed a useful and practical criterion for choosing a power transformation of the response variable so that the model satisfies the various model assumptions.
    Modern computers and software offer faster computation, the ability to handle complicated programs, and sophisticated graphics, so this approach can be used more efficiently and can be further extended and generalized.

    We begin by revisiting some of the examples in Draper and Hunter (1969) and propose ways to improve their results. We then examine the approach proposed by Snee (1986) for handling multiple model structures, using efficient programming to deal with the complicated computation involved. Next, we discuss the diagnosis of three model assumptions: an additive model, homogeneous error variance, and normally distributed errors. A so-called P-value plot is proposed to unify the choice of transformation so that the proposed model satisfies the various assumptions simultaneously.
    Relevant but miscellaneous issues of transformation and model building are included thereafter.
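
    As a rough illustration of the Box and Cox (1964) idea summarized above, the sketch below picks the power λ for the response by maximizing the profile log-likelihood over a grid, for a straight-line model of the transformed response on x. It is not taken from the thesis: the function names (`box_cox`, `profile_loglik`, `best_lambda`), the grid, and the synthetic data are all assumptions made for this example, and only the Python standard library is used.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transform of a positive response: (y^lam - 1)/lam, or log(y) at lam = 0."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def rss_simple_regression(x, z):
    """Residual sum of squares of the least-squares line z = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    zbar = sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    b = sxz / sxx
    a = zbar - b * xbar
    return sum((zi - a - b * xi) ** 2 for xi, zi in zip(x, z))

def profile_loglik(x, y, lam):
    """Profile log-likelihood of lam, up to an additive constant.

    The second term is the log-Jacobian of the transform, which puts
    different values of lam on a comparable scale.
    """
    n = len(y)
    rss = rss_simple_regression(x, box_cox(y, lam))
    return -0.5 * n * math.log(rss / n) + (lam - 1.0) * sum(math.log(v) for v in y)

def best_lambda(x, y, grid):
    """Grid value of lam with the largest profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(x, y, lam))

# Synthetic data (an assumption for illustration): sqrt(y) is linear in x,
# perturbed by a small deterministic disturbance so the run is reproducible.
x = [float(i) for i in range(1, 21)]
noise = [0.02 * math.sin(3.0 * i) for i in range(1, 21)]
y = [(2.0 + 0.5 * xi + e) ** 2 for xi, e in zip(x, noise)]

grid = [k / 20.0 for k in range(-40, 41)]  # lam from -2.0 to 2.0 in steps of 0.05
lam_hat = best_lambda(x, y, grid)
print("selected lambda:", lam_hat)
```

    Because the synthetic response was built so that its square root is nearly linear in x, the selected λ should fall close to 0.5. A grid search is used here only for transparency; a numerical optimizer could replace it.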

    1 Introduction
      1.1 Power Transformation
    2 Transformation on Model Building
      2.1 Model y on x
        2.1.1 RSS Approach for Model y(λ) on x(α)
        2.1.2 ML Approach for Model y(λ) on x(α)
        2.1.3 Demonstration with Another Data Set: Braking
      2.2 Model y on Mean Function M
        2.2.1 RSS Approach for Model y[λ] on M[λ]
        2.2.2 ML Approach for Model y[λ] on M[λ]
    3 Assumption Assessment
      3.1 Additive Model
      3.2 Homogeneous Error Variance
      3.3 Normally Distributed Error
      3.4 Assessing Various Assumptions
        3.4.1 P-value Plot for Others
    4 Relevant Issues of Power Transformation
      4.1 A Simpler Version of Power Transformation
      4.2 Correlation as Criterion for Model Building
        4.2.1 Correlation on Scatterplot
        4.2.2 Correlation on Normal Q-Q Plot
    5 Concluding Remarks
    References
    Appendix

    Bartlett, M. S. (1947), “The use of transformations,” Biometrics, 3, 39–52.
    Box, G. E. P. (1953), “Non-normality and tests on variances,” Biometrika, 40, 318–335.
    Box, G. E. P., and Cox, D. R. (1964), “An analysis of transformations,” (with discussion), Journal of the Royal Statistical Society, Series B, 26, 211–246.
    Box, G. E. P., and Fung, C. A. (1995), “The importance of data transformation in designed experiments for life testing,” Quality Engineering, 7, 625–638.
    Box, G. E. P., Hunter, W. G., and Hunter, J. S. (2005), Statistics for Experimenters: Design, Innovation, and Discovery, New York: John Wiley & Sons.
    Breusch, T. S., and Pagan, A. R. (1979), “A simple test for heteroskedasticity and random coefficient variation,” Econometrica, 47, 1287–1294.
    Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A. (1983), Graphical Methods for Data Analysis, New York: Chapman & Hall.
    Conover, W. J., Johnson, M. E., and Johnson, M. M. (1981), “A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data,” Technometrics, 23, 351–361.
    Cook, R. D., and Weisberg, S. (1982), Residuals and Influence in Regression, London: Chapman & Hall.
    Cook, R. D., and Weisberg, S. (1983), “Diagnostics for heteroscedasticity in regression,” Biometrika, 70, 1–10.
    Cook, R. D., and Weisberg, S. (1999), Applied Regression Including Computing and Graphics, New York: John Wiley & Sons.
    Draper, N. R., and Hunter, W. G. (1969), “Transformations: Some examples revisited,” Technometrics, 11, 23–40.
    Ezekiel, M. (1941), Methods of Correlation Analysis, London: John Wiley & Sons.
    Gentleman, R., and Ihaka, R. (2000), “Lexical scope and statistical computing,” Journal of Computational and Graphical Statistics, 9, 491–508.
    Hald, A. (1960), Statistical Theory with Engineering Applications, New York: John Wiley.
    Hinkley, D. V. (1989), “Modified profile likelihood in transformed linear models,” Applied Statistics, 38, 495–506.
    Hinkley, D. V., and Runger, G. (1984), “The analysis of transformed data,” Journal of the American Statistical Association, 79, 302–320.
    Levene, H. (1960), Robust Tests for Equality of Variances, Stanford University Press.
    Meeker, W. Q., and Escobar, L. A. (1998), Statistical Methods for Reliability Data, New York: John Wiley & Sons.
    Milliken, G. A., and Graybill, F. A. (1970), “Extensions of the general linear hypothesis model,” Journal of the American Statistical Association, 65, 797–807.
    R Development Core Team (2006), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria: ISBN 3-900051-07-0, URL: http://www.R-project.org.
    Ryan, T. A., Joiner, B. L., and Ryan, B. F. (1976), The Minitab Student Handbook, North Scituate, MA: Duxbury Press.
    Shapiro, S. S., and Francia, R. S. (1972), “An approximate analysis of variance test for normality,” Journal of the American Statistical Association, 67, 215–216.
    Shapiro, S. S., and Wilk, M. B. (1965), “An analysis of variance test for normality (complete samples),” Biometrika, 52, 591–611.
    Shapiro, S. S., Wilk, M. B., and Chen, H. J. (1968), “A comparative study of various tests for normality,” Journal of the American Statistical Association, 63, 1343–1372.
    Snee, R. D. (1986), “An alternative approach to fitting models when re-expression of the response is useful,” Journal of Quality Technology, 18, 211–225.
    St. Laurent, R. (1990), “The equivalence of the Milliken-Graybill procedure and the score test,” The American Statistician, 44, 36–37.
    Stephens, M. A. (1974), “EDF statistics for goodness-of-fit and some comparisons,” Journal of the American Statistical Association, 69, 730–737.
    Tukey, J. W. (1949), “One degree of freedom for nonadditivity,” Biometrics, 5, 232–242.
    Tukey, J. W. (1977), Exploratory Data Analysis, Reading, MA: Addison-Wesley.
    Weisberg, S. (2005), Applied Linear Regression, 3rd ed., New Jersey: John Wiley & Sons.


    Made public: 2006-06-28