| Author: | 黃建智 Huang, Jian-Zhi |
|---|---|
| Title: | 基於卷積神經網路辨識基底細胞於三倍頻顯微影像<br>Basal Cell Recognition based on Convolutional Neural Network in Third Harmonic Generation Microscopy Image |
| Advisor: | 李國君 Lee, Gwo-Giun |
| Degree: | Master |
| Department: | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2017 |
| Academic Year: | 105 |
| Language: | English |
| Pages: | 44 |
| Keywords (Chinese): | 基底細胞辨識、卷積神經網路、手調式初始化、非監督式預訓練、三倍頻顯微術 |
| Keywords (English): | basal cell recognition, convolutional neural network, hand-crafted initialization, unsupervised pre-training, Third Harmonic Generation Microscopy |
This thesis originates from an interdisciplinary cooperation project that develops a computer-aided medical image analysis system for skin disease diagnosis. Its background combines medical knowledge, optical imaging, and image analysis; the image analysis work is carried out by our team. The application of this thesis is basal cell recognition. Since some skin diseases change the appearance of cells, the thesis determines whether skin is diseased from the cell representation in medical images, and such information can be provided to physicians for further diagnosis.

Unlike traditional medical image analysis, which uses image processing techniques to extract features based on pathological characteristics, this thesis adopts the convolutional neural network (CNN), which performs outstandingly on pattern recognition and learns the data representation by training on plenty of data. However, because medical images are usually inconvenient to acquire, it is hard to collect sufficient data to train a neural network. We therefore investigate whether a good initial state of the CNN allows training to converge earlier on a smaller training set. Two methods are examined: hand-crafted initialization and unsupervised pre-training. Hand-crafted initialization determines the kernels of the convolutional layers of the CNN from prior knowledge of the data. In the unsupervised pre-training experiments, a convolutional auto-encoder first self-learns the data representation and is then fine-tuned with supervision. Experimental results show that both methods help the network training converge earlier. The recognition rate of hand-crafted initialization is slightly higher than that of unsupervised pre-training here, but unsupervised pre-training may be more helpful when the data are too complicated for suitable kernels to be chosen by hand. Hence the initialization method should be chosen according to how well the data are understood.
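Hand-crafted initialization, as described above, replaces the random kernels of the first convolutional layer with filters chosen from prior knowledge of the data. As a minimal illustrative sketch (not the thesis's actual kernels: the choice of a multi-scale Gaussian blob-detector bank is an assumption, motivated by cells appearing as roughly round structures in microscopy images), such a filter bank can be built with NumPy:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """A hand-designed blob-sensitive kernel: a zero-mean, unit-norm 2-D Gaussian."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    k -= k.mean()                 # zero mean: flat image regions give no response
    return k / np.linalg.norm(k)  # unit norm keeps filter responses comparable

def hand_crafted_bank(size=5, sigmas=(0.8, 1.2, 1.6, 2.0)):
    """A small bank of blob detectors at several scales, used to initialize
    the first convolutional layer instead of random weights."""
    return np.stack([gaussian_kernel(size, s) for s in sigmas])

bank = hand_crafted_bank()
print(bank.shape)  # (4, 5, 5): four 5x5 kernels, one per scale
```

In a deep learning framework these arrays would simply be copied into the first layer's weight tensor before training begins, after which all layers are trained as usual.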
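The unsupervised pre-training workflow above (self-learn a representation with an auto-encoder, then fine-tune with supervision) can be sketched in miniature. For brevity this sketch swaps the thesis's convolutional auto-encoder for a tied-weight fully connected auto-encoder on toy patch data; the data, sizes, and learning rate are all illustrative assumptions, and only the two-stage structure mirrors the method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unlabeled "image patch" data: 64 patches of 16 pixels each.
X = rng.normal(size=(64, 16))

# --- Stage 1: unsupervised pre-training of a tied-weight auto-encoder ---
# Encoder h = sigmoid(X W), decoder X_hat = h W^T; minimize reconstruction MSE.
W = rng.normal(scale=0.1, size=(16, 8))  # 16 inputs -> 8 hidden units

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def recon_loss(W):
    h = sigmoid(X @ W)
    return float(np.mean((h @ W.T - X) ** 2))

loss_before = recon_loss(W)
lr = 0.05
for _ in range(200):
    h = sigmoid(X @ W)
    err = h @ W.T - X                    # reconstruction error
    dh = (err @ W) * h * (1.0 - h)       # backprop through decoder and sigmoid
    gW = (X.T @ dh + err.T @ h) / len(X) # tied weights: encoder + decoder paths
    W -= lr * gW
loss_after = recon_loss(W)

# --- Stage 2: supervised fine-tuning starts from the pre-trained encoder ---
# The learned W would be copied into the classifier's first layer in place of
# a fresh random initialization, then trained further on the labeled set.
encoder_init = W.copy()
```

The point of the sketch is the hand-off: the weights that minimized reconstruction error on unlabeled data become the starting point of the supervised network, which is what lets training converge earlier with little labeled data.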
On-campus access: available from 2022-01-01.