
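Graduate student: 陳玟郿 (Chen, Wen-mei)
Thesis title: Statistical Analysis of MALDI Mass Spectrometry Data of Unfractionated Serum in Lung Cancer (肺癌血清MALDI實驗資料之統計分析)
Advisor: 馬瀰嘉 (Ma, Mi-Chia)
Degree: Master
Department: College of Management - Department of Statistics
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 29
Chinese keywords: 蛋白質晶片 (protein chip), 基準點 (baseline), 分類器 (classifier), 八次交叉驗證 (8-fold cross validation)
English keywords: baseline, 8-fold cross validation, mass spectrometry data, classifier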
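  • DNA microarrays can observe several thousand to tens of thousands of genes at once. The technique is therefore widely used in clinical studies, especially in the search for cancer-causing genes. Many such studies have been reported in recent years, but the genes they identify do not apply to most people. The main reason is that many researchers overlook potential, easily made mistakes, such as the way the data are collected and the homogeneity of baseline data across samples, so the results are called into question. This thesis therefore gives non-statisticians a set of basic data analysis steps and applies them to MALDI data (protein chips) to search for disease-related genes. Finally, we apply these steps to lung cancer data provided by Professor Yu Shyr (石瑜) of Vanderbilt University. After adjusting the expression values for baseline, we use four classifiers and 8-fold cross validation to select four genes that can distinguish lung cancer, X2016, X2035, X2056, and X2082, with a classification accuracy of 97%.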

    DNA microarrays make it possible to study the expression of several thousand genes simultaneously. The technique is highly promising and has broad applications, at present especially in prognostic prediction. The literature on prognostic markers for patients with cancer is therefore abundant, but most proposed markers are false alarms. Microarray data contain many potential pitfalls: for example, the method of data collection may be unclear, or the baseline of the data may be inconsistent. Most researchers do not take these pitfalls into account, so their results are often questioned. This thesis provides a fundamental statistical analysis of matrix-assisted laser desorption ionization (MALDI) mass spectrometry data provided by Dr. Yu Shyr of Vanderbilt University. We analyze the lung cancer data, using subject background data to adjust the mass spectrometry intensity data. Four features, X2016, X2035, X2056, and X2082, are found to distinguish lung cancer with a classification accuracy of 97% using the support vector machine classifier and 8-fold cross validation.
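    The evaluation scheme named in the abstract, 8-fold cross validation, can be illustrated with a minimal sketch. This is not the thesis code: the data below are synthetic, the function names are hypothetical, and a nearest-centroid classifier is swapped in for the support vector machine so that the example needs only the Python standard library.

```python
import random

def make_folds(n_samples, k=8, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Compute the mean feature vector of each class (stand-in for an SVM)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: dist(centroids[label]))

def cross_val_accuracy(X, y, k=8, seed=0):
    """Fraction of samples classified correctly when each is held out exactly once."""
    correct = 0
    for held_out in make_folds(len(X), k, seed):
        test_set = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in test_set]
        train_y = [y[i] for i in range(len(X)) if i not in test_set]
        model = fit_centroids(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in held_out)
    return correct / len(X)

# Synthetic stand-in for four adjusted intensity features (assumed values,
# not the X2016, X2035, X2056, X2082 measurements from the thesis).
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(40)]
X += [[rng.gauss(3.0, 1.0) for _ in range(4)] for _ in range(40)]
y = [0] * 40 + [1] * 40

accuracy = cross_val_accuracy(X, y, k=8)
print(f"8-fold cross-validated accuracy: {accuracy:.3f}")
```

    Each sample is used for testing exactly once and for training in the other seven folds, so the reported accuracy is never computed on data the classifier was fitted to.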

    List of Tables
    List of Figures
    Chapter 1 Introduction
    Chapter 2 Flow Chart and Methods
        2.1 Flow Chart
        2.2 Classification Methods
        2.3 Evaluation Approach
        2.4 Cox Regression
    Chapter 3 Real Example
        3.1 Experimental Setup
        3.2 Confirmation
        3.3 Cox Regression Analysis
    Chapter 4 Conclusions
    Reference

    [1] Bø T. and Jonassen I. (2002) “New feature subset selection procedures for classification of expression profiles.” Genome Biol 3: 0017.1–0017.11.
    [2] Dudoit S., Fridlyand J., Speed T. P. (2002) “Comparison of discrimination methods for the classification of tumors using gene expression data.” J Am Stat Assoc 97: 77–87.
    [3] Hedenfalk I., Duggan D., Chen Y., et al. (2001) “Gene-expression profiles in hereditary breast cancer.” N Engl J Med 344(8): 539–548.
    [4] Van 't Veer L. J., Dai H., van de Vijver M. J., et al. (2002) “Gene expression profiling predicts clinical outcome of breast cancer.” Nature 415: 530–536.
    [5] Yildiz P. B., Shyr Y., Rahman J. S. M., et al. (2007) “Diagnostic Accuracy of MALDI Mass Spectrometric Analysis of Unfractionated Serum in Lung Cancer.” J Thorac Oncol 2: 893–901.
    [6] Zhang H., Yu C. Y., Xiong M., et al. (2001) “Recursive partitioning for tumor classification with gene expression microarray data.” Proc Natl Acad Sci 98: 6730–6735.

    Full text public on campus from 2010-07-07
    Full text public off campus from 2010-07-07