| Graduate student: | 陳奕誠 Chen, I-Chen |
|---|---|
| Thesis title: | Statistical Modeling of Secular Trends and Geographical Variations of Teen Fertility Rates in Relation to Socioeconomic Indicators in Taiwan: 1977-2005 (original title: 台灣地區未成年少女生育率時間和地區變化與社會經濟指標關係之統計模型探討:1977-2005年) |
| Advisor: | 王新台 Wang, Shan-Tair |
| Degree: | Master |
| Department: | College of Medicine - Department of Public Health |
| Year of publication: | 2007 |
| Academic year of graduation: | 95 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 90 |
| Keywords: | Marginal model, Random-effects model, Correlated count data, Poisson regression, Generalized estimating equations, Socioeconomic variable, Teen fertility rate |
In practice, fertility rates are often treated as continuous data or dichotomized when studying their relationship to predictor variables. Such approaches have serious drawbacks because the assumptions of linearity and normality are violated. In addition, fertility rates in ecological studies are usually collected over time, so serial correlation may exist in these cross-sectional time-series data, and ignoring it may lead to erroneous conclusions about the effects of predictor variables. In this study, we compared marginal and random-effects models for the analysis of correlated count data obtained from the Taiwan birth registry.

Taiwan has the highest teen (ages 15-19) fertility rate when compared with Japan, Korea, Singapore and Hong Kong, so it is important to identify its significant predictors. We analyzed the relationship between the secular trends and geographical variations of teen fertility rates and socioeconomic variables (population density, labor force participation rates of the entire population and of the female population, high school completion rates among females aged 15 and over and among teen females, unemployment rate, and average annual family income) across 23 counties and cities in Taiwan from 1977 to 2005, using marginal and random-effects models for count data.

Preliminary analysis with the Poisson regression model indicated extra-Poisson variation, which was taken into account in the subsequent analyses. The regression coefficient estimates and their 95% confidence intervals obtained when the dependency among serial observations was ignored differed from those of the generalized estimating equations (GEE) and random coefficient analyses. The effect of population density on the secular trends was reduced when serial correlation was accounted for. When the correlation was accounted for, the confidence intervals for the effects of the socioeconomic variables on geographical variations were wider, and those for their effects on secular trends were narrower. Both the Poisson and the negative binomial random-effects models have marginal means and variances similar to those of the negative binomial marginal model. The GEE and random coefficient analyses gave generally comparable results, but the marginal model has a more easily interpretable correlation structure. The selected models predicted teen fertility rates well for all counties and cities except Peng Hu and Ping Tung counties. For estimation and testing of the regression coefficients, however, bias corrections are needed for the sandwich variance estimators in the marginal models and for the maximum likelihood estimators in the random-effects models, because the number of observational units is small.

In conclusion, both extra-Poisson variation and serial correlation were present and must be considered in statistical modeling. The negative binomial marginal model was chosen as the final model because its correlation structure is more readily interpretable and it fits the main purpose of this study.
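As an illustrative aside that is not taken from the thesis, the idea of extra-Poisson variation and its link to the negative binomial marginal variance can be sketched with a small simulation. The sketch below assumes a gamma-mixed Poisson model with hypothetical parameters `mu` and `alpha`; the helper names `poisson_draw`, `simulate_counts` and `dispersion_ratio` are made up for this example, and no thesis data or estimates are used.

```python
import math
import random
from statistics import mean, variance


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(rng, n, mu, alpha):
    """Gamma-mixed Poisson counts: marginally negative binomial with
    mean mu and variance mu + alpha * mu**2."""
    counts = []
    for _ in range(n):
        # Gamma multiplier with mean 1 and variance alpha (shape 1/alpha, scale alpha).
        frailty = rng.gammavariate(1.0 / alpha, alpha)
        counts.append(poisson_draw(rng, mu * frailty))
    return counts


def dispersion_ratio(counts):
    """Pearson chi-square divided by its degrees of freedom for an
    intercept-only Poisson fit; values well above 1 suggest overdispersion."""
    m = mean(counts)
    pearson = sum((y - m) ** 2 / m for y in counts)
    return pearson / (len(counts) - 1)


rng = random.Random(2007)  # fixed seed so the sketch is repeatable
mu, alpha = 8.0, 0.5       # hypothetical values, not estimates from the thesis
mixed = simulate_counts(rng, 5000, mu, alpha)
pure = [poisson_draw(rng, mu) for _ in range(5000)]

print("gamma-mixed mean:", round(mean(mixed), 2))
print("gamma-mixed variance:", round(variance(mixed), 2))
print("negative binomial variance mu + alpha*mu^2:", mu + alpha * mu ** 2)
print("dispersion ratio, gamma-mixed:", round(dispersion_ratio(mixed), 2))
print("dispersion ratio, pure Poisson:", round(dispersion_ratio(pure), 2))
```

For an intercept-only model the ratio reduces to the sample variance divided by the sample mean. With covariates, the same statistic would be computed from the fitted means of a Poisson regression, which is the usual way a preliminary Poisson fit reveals extra-Poisson variation.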