National Cheng Kung University Electronic Theses and Dissertations System


Graduate student:	Lin, Chiao-Ling (林巧玲)
Thesis title:	Combine Expectation-Maximization Algorithm with Active Learning for Linear Discriminant Analysis Classifier
Advisor:	Chen, Ray-Bing (陳瑞彬)
Degree:	Master
Department:	College of Management - Department of Statistics
Year of publication:	2019
Academic year of graduation:	107 (ROC calendar)
Language:	English
Number of pages:	32
Keywords (Chinese):	active learning, linear discriminant analysis, expectation-maximization algorithm
Keywords (English):	Active Learning, LDA classifier, EM algorithm

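Active learning addresses the setting in which unlabeled samples are plentiful and cheap to obtain, whereas labeled samples are scarce and manual labeling is expensive. To avoid wasting cost, we use selection criteria to choose informative unlabeled samples for manual labeling. Our study shows that the prediction accuracy of the linear discriminant analysis (LDA) classifier can be improved by adding a large number of unlabeled samples. The criteria adopted in this thesis are mainly the empirical AUC and the influence function of the empirical AUC, which bring more information to the classifier and thereby improve it. In the process, the expectation-maximization (EM) algorithm is combined with the LDA classifier to evaluate the unlabeled samples and determine whether they help improve the classifier. Within the EM algorithm, too many unlabeled samples can harm classification accuracy, so a weight factor is applied to the unlabeled samples to improve accuracy. In addition, classification accuracy levels off once the training set grows to a certain size, so a stopping condition is set to avoid further waste.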
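
The empirical AUC named above as a query criterion can be illustrated with a minimal sketch. It uses the standard pairwise definition: the fraction of (positive, negative) pairs in which the positive sample receives the higher classifier score, with ties counted as one half. The function name `empirical_auc` and the toy scores are assumptions made for illustration; the record does not show how the thesis computes or applies the criterion.

```python
def empirical_auc(scores_pos, scores_neg):
    """Empirical AUC from classifier scores of the two classes.

    Counts the (positive, negative) pairs ranked correctly; a tie
    contributes 1/2. Hypothetical helper, not taken from the thesis.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    total = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                total += 1.0
            elif sp == sn:
                total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


# Toy scores (made up): 8 of the 9 pairs are ranked correctly.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
print(empirical_auc(pos, neg))
```

An active learner could, for example, compare this value before and after tentatively adding a candidate point, but that selection loop is outside the scope of this sketch.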

Our study shows that the accuracy of the linear discriminant analysis (LDA) classifier can be improved by augmenting the labeled training data with a pool of unlabeled data. We introduce several criteria, for example the empirical AUC and the influence function for the empirical AUC, to select unlabeled points for labeling. The goal is to sequentially identify the unlabeled points that improve classification accuracy. In addition, the EM algorithm is used to incorporate the unlabeled points into classifier learning. For a very large unlabeled data set, an augmented EM algorithm is used, in which a weight factor adjusts the information contributed by the unlabeled points. Furthermore, we propose a possible stopping criterion for the proposed active learning algorithm.
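
To make the idea of an EM algorithm with a weight factor concrete, here is a rough sketch under stated assumptions; it is not the thesis's algorithm. It assumes two classes, one-dimensional features, a shared variance (the univariate LDA model), and a single weight `lam` between 0 and 1 that scales the soft class responsibilities of every unlabeled point, in the spirit of the weighted EM of Nigam et al. (2000). The names `weighted_em_lda`, `lam`, and `n_iter`, and the toy data, are hypothetical.

```python
import math


def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def weighted_em_lda(x_lab, y_lab, x_unl, lam=0.1, n_iter=50):
    """EM for a two-class, 1-D LDA model with down-weighted unlabeled data.

    Labeled points keep their labels with weight 1. Each unlabeled point
    contributes soft responsibilities scaled by lam (an assumed scheme).
    Every class needs at least one labeled point.
    Returns (priors, means, shared_variance).
    """
    xs = list(x_lab) + list(x_unl)
    resp_lab = [[1.0 if y == k else 0.0 for k in (0, 1)] for y in y_lab]

    def m_step(resp_unl):
        # Weighted maximum-likelihood estimates of priors, means, variance.
        ws = resp_lab + [[lam * r for r in row] for row in resp_unl]
        mass = [sum(w[k] for w in ws) for k in (0, 1)]
        total = mass[0] + mass[1]
        priors = [m / total for m in mass]
        means = [sum(w[k] * x for w, x in zip(ws, xs)) / mass[k] for k in (0, 1)]
        var = sum(w[k] * (x - means[k]) ** 2
                  for w, x in zip(ws, xs) for k in (0, 1)) / total
        return priors, means, var

    # Initialise from the labeled data only (zero unlabeled responsibilities).
    priors, means, var = m_step([[0.0, 0.0] for _ in x_unl])
    for _ in range(n_iter):
        # E-step: posterior class probabilities of the unlabeled points.
        resp_unl = []
        for x in x_unl:
            dens = [priors[k] * normal_pdf(x, means[k], var) for k in (0, 1)]
            s = dens[0] + dens[1]
            resp_unl.append([d / s for d in dens] if s > 0.0 else [0.5, 0.5])
        # M-step: refit with labeled and down-weighted unlabeled points.
        priors, means, var = m_step(resp_unl)
    return priors, means, var


# Toy data (made up): four labeled and six unlabeled points.
x_lab, y_lab = [-1.2, -0.8, 1.1, 0.9], [0, 0, 1, 1]
x_unl = [-1.5, -1.0, -0.7, 0.6, 1.0, 1.4]
print(weighted_em_lda(x_lab, y_lab, x_unl, lam=0.1))
```

Setting `lam=0` reduces the fit to the labeled data alone, and `lam=1` treats unlabeled points on the same footing as labeled ones, so the weight interpolates between the two extremes.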

Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
Chapter 2. Literature Review
2.1. EM Algorithm
2.2. Linear Discriminant Analysis (LDA)
2.3. Empirical Area Under the Curve (eAUC)
2.4. Influence Function for Empirical AUC
Chapter 3. EM-Active Learning Algorithm with AUC-Type Query Criteria
3.1. Augmented EM Algorithm
3.2. Querying Informative and Representative Examples (QUIRE) Criterion
3.3. Stopping Criterion
3.4. The Proposed Active Learning Algorithm
Chapter 4. Simulation and Data Analysis
4.1. Simulation
4.2. Data Analysis
Chapter 5. Conclusion and Future Study
References
Appendix A. Simulation Results

Chang, Y.-c. I. and Chen, R.-B. (2019). Active learning with simultaneous subject and variable selections. Neurocomputing, 329:495–505.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–22.
Deng, X., Joseph, V. R., Sudjianto, A., and Wu, C. J. (2009). Active learning through sequential design, with applications to detection of money laundering. Journal of the American Statistical Association, 104(487):969–981.
Hampel, F. R. (1974). The influence curve and its role in robust estimation. Journal of the American Statistical Association, 69(346):383–393.
Ke, B.-S., Chiang, A. J., and Chang, Y.-c. I. (2018). Influence analysis for the area under the receiver operating characteristic curve. Journal of Biopharmaceutical Statistics, 28(4):722–734.
Nigam, K., McCallum, A. K., Thrun, S., and Mitchell, T. (2000). Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2-3):103–134.
Pepe, M. S. (2003). The statistical evaluation of medical tests for classification and prediction. Medicine.
Smith, J. W., Everhart, J., Dickson, W., Knowler, W., and Johannes, R. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Annual Symposium on Computer Application in Medical Care, page 261. American Medical Informatics Association.

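Off campus: not publicly available. Neither the electronic thesis nor the print copy has yet been authorized for public release.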
