
Graduate Student: Kung, Hen-Shu (孔亨書)
Thesis Title: Tracking of skin features---Comparison of five methods
Advisor: Nordling, Torbjörn (吳馬丁)
Degree: Master
Department: College of Engineering, Department of Mechanical Engineering
Year of Publication: 2024
Graduation Academic Year: 112 (ROC calendar)
Language: English
Number of Pages: 105
Keywords: Parkinson's disease, Machine learning, Deep learning, Feature tracking, Video flickering
  • Background: Parkinson's disease is a progressive neurodegenerative disorder whose main symptoms are progressive motor impairments, including tremor and muscle rigidity. At present, healthcare professionals usually rely on experience to diagnose Parkinson's disease, and postural tremor assessment is one of the commonly used motor tasks, included in the Unified Parkinson's Disease Rating Scale (UPDRS). Because postural tremor movements are subtle and difficult to observe with the naked eye, researchers have begun to apply computer vision and deep learning techniques to track features in images and analyze them quantitatively. However, when applying deep learning to skin feature tracking, we faced memory constraints and long computation times, which makes convenient and rapid diagnosis of Parkinson's disease a challenge.
    Objectives: Our first objective is to evaluate the ability of existing methods to track skin features in video recordings, together with their computational cost. Our second objective is to test principal component analysis (PCA) as a feature extractor for tracking.
    Methods: Our experimental procedure consists of four main steps: data collection, preprocessing, feature tracking, and pairwise comparison. First, videos of PTA were recorded at the National Cheng Kung University Hospital using three Samsung Galaxy S7 edge phones (1280x720 px at 240 FPS). Second, video flicker was removed by average brightness adjustment, which yields more uniform brightness than the duplicate-footage method. Third, five feature tracking methods were used to track two stickers, skin moles, wrinkles, and smooth skin patches: DFE, weighted DFE (WDFE), Lucas-Kanade classic optical flow (COF), bidirectional optical flow (BOF), and principal component analysis (PCA). In DFE and WDFE, the encoder part of an autoencoder, trained to reproduce crops of hand skin, is used to extract 128 features, while in the PCA method a standard PCA model fitted on all possible crops of the initial frame serves the same purpose. In these three methods, we compare the extracted features of the manually selected initial feature with those of every possible crop, and in each frame predict the feature location with the smallest sum of squared residuals (SSR). Fourth, we perform pairwise comparisons of the different methods, since no ground truth is available and manual labelling errors have been shown to be of the same size as the tracking errors of DFE.
    Results: We on average reduced the flicker-induced luminance variance in the videos from 5.54 to 0.85, providing better video data for our tracking algorithms. In most cases, the tracking results of BOF and PCA are similar to those of DFE. For the sticker features, skin moles, and wrinkles, the mean absolute difference (MAD) of BOF is less than 0.5 pixels. For the stickers and skin moles, the MAD of PCA is less than 0.4 and 0.6 pixels, respectively. We observed that the tracking error of the PCA method exceeds that of the other methods when the features are too similar. Finally, the PCA method reduces computation time by about 60%, and the BOF method by about 80%.
    Conclusion: For distinct sticker and skin features, the BOF and PCA algorithms provide results similar to DFE. In addition, they effectively reduce computation time compared to DFE.
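The average-brightness flicker removal described in the Methods can be sketched as below. This is a minimal illustration assuming grayscale frames and a simple per-frame gain toward the global mean luminance; the function name and scaling scheme are illustrative assumptions, not the thesis's exact implementation.

```python
import numpy as np

def remove_flicker(frames):
    """Rescale each frame so its mean luminance matches the video's
    global mean, suppressing frame-to-frame brightness flicker.

    frames: array of shape (n_frames, height, width), grayscale.
    """
    frames = np.asarray(frames, dtype=np.float64)
    target = frames.mean()                 # global mean luminance
    means = frames.mean(axis=(1, 2))       # per-frame mean luminance
    gains = target / means                 # per-frame correction gain
    corrected = frames * gains[:, None, None]
    return np.clip(corrected, 0.0, 255.0)
```

After this correction every frame has the same mean luminance, so the variance of the per-frame means (the flicker measure reported in the Results) drops to nearly zero.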

    Background: Parkinson's disease is a progressive neurodegenerative disorder with tremors and muscle rigidity as main symptoms. Healthcare professionals usually rely on experience to diagnose Parkinson's disease, in which postural tremor assessment (PTA) is one of the commonly used motor tasks included in the Unified Parkinson's Disease Rating Scale (UPDRS). Since postural tremor movements are subtle and difficult to observe with the human eye, researchers have begun to use computer vision and deep learning techniques to analyze video recordings quantitatively. In 2021, Chang and Nordling introduced the Deep feature encoder (DFE) and demonstrated tracking errors of distinctive skin features so small that, based on a χ2-test, they could stem from the manual labelling. However, DFE suffers from a large computational cost. How to quantify the motion accurately while keeping the computational cost down is still an open question.
    Objectives: Our first objective is to evaluate existing methods' ability to track skin features in videos of PTA and their computational cost. Our second objective is to test principal component analysis (PCA) as a feature extractor for tracking.
    Methods: Our experimental procedure consists of four main steps: data collection, preprocessing, feature tracking, and pairwise comparison. First, videos of PTA were recorded at the National Cheng Kung University Hospital using three Samsung Galaxy S7 edge phones (1280x720 px at 240 FPS). Second, video flicker was removed using average brightness adjustment, which yielded better results than the duplicate footage method. Third, five feature tracking methods were used to track two stickers, skin moles, wrinkles, and smooth skin patches: DFE, weighted DFE (WDFE), Lucas-Kanade classic optical flow (COF), bidirectional optical flow (BOF), and principal component analysis (PCA). In DFE and WDFE, the encoder part of an autoencoder, trained to reproduce hand skin crops, is used to extract 128 features, while a standard PCA of all possible crops of the initial frame is used to do the same in the PCA method. In these three methods, we then compare the extracted features of the manually selected initial feature and every possible crop, and predict the feature location at the point with the smallest sum of squared residuals in each frame. Fourth, we perform pairwise comparison of the different methods, since no ground truth is available and manual labelling errors have been shown to be of the same size as the tracking errors of DFE.
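The PCA extraction and SSR-based matching described above can be sketched as follows. This is a minimal NumPy illustration assuming grayscale crops and a small number of components for readability (the thesis extracts 128 features); all function names are assumptions for illustration, not the thesis's code.

```python
import numpy as np

def fit_pca(crops, n_components):
    """Fit a PCA basis on flattened crops from the initial frame.
    crops: (n_crops, h, w) array. Returns (mean, components)."""
    X = crops.reshape(len(crops), -1).astype(np.float64)
    mean = X.mean(axis=0)
    # SVD of the centred data matrix yields the principal axes in Vt.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def extract(crop, mean, components):
    """Project one crop onto the PCA basis to get its feature vector."""
    return components @ (crop.ravel().astype(np.float64) - mean)

def track(frame, ref_feat, mean, components, size):
    """Scan every possible crop of `frame`; return the top-left corner
    whose features minimise the sum of squared residuals (SSR) against
    the reference feature vector."""
    h, w = frame.shape
    best, best_ssr = None, np.inf
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            feat = extract(frame[y:y + size, x:x + size], mean, components)
            ssr = np.sum((feat - ref_feat) ** 2)
            if ssr < best_ssr:
                best, best_ssr = (y, x), ssr
    return best
```

The exhaustive scan over every crop is what makes pixel-scan matching expensive; PCA only replaces the autoencoder's encoder as the feature extractor, which is where the reported ~60% speed-up comes from.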
    Results: We on average reduced the luminance variance in the videos with flicker from 5.54 to 0.85, providing better video data for our tracking algorithms. In most cases, the tracking results of BOF and PCA are similar to those of DFE. For sticker features, skin moles, and wrinkles, the mean absolute difference (MAD) of BOF is less than 0.5 pixels. For stickers and skin moles, the MAD of PCA is less than 0.4 and 0.6 pixels, respectively. We observed instances where the tracking by the PCA method deviated from the others when the features were too similar. Meanwhile, the PCA method reduces the computation time by about 60%, while the BOF method reduces it by 80%.
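The pairwise comparison metric can be sketched as below, assuming MAD is the per-frame Euclidean distance between two tracked pixel trajectories, averaged over frames; the thesis's exact definition may differ (e.g. per coordinate).

```python
import numpy as np

def mean_absolute_difference(traj_a, traj_b):
    """Mean Euclidean distance in pixels between two trajectories,
    each of shape (n_frames, 2) holding (row, col) coordinates."""
    a = np.asarray(traj_a, dtype=np.float64)
    b = np.asarray(traj_b, dtype=np.float64)
    return float(np.linalg.norm(a - b, axis=1).mean())
```

Because no ground truth exists, each method's trajectory is compared against every other method's; a sub-pixel MAD between two methods indicates they agree to within the manual labelling error.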
    Conclusion: BOF and PCA algorithms provide results similar to DFE for distinct sticker and skin features. In addition, they effectively reduce the computation time compared to DFE.
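One common way a bidirectional tracker validates a track is a forward-backward consistency check. The sketch below assumes the forward and backward flow fields are given as callables; it illustrates the principle only, not the thesis's BOF implementation, which builds on pyramidal Lucas-Kanade optical flow.

```python
import numpy as np

def forward_backward_error(point, flow_fwd, flow_bwd):
    """Track a point forward one frame, then backward from the result;
    the distance between the start and the round-trip end flags
    unreliable tracks (0 means perfectly consistent flows)."""
    p = np.asarray(point, dtype=np.float64)
    q = p + flow_fwd(p)        # forward optical-flow step
    p_back = q + flow_bwd(q)   # backward step from the next frame
    return float(np.linalg.norm(p_back - p))
```

Tracks whose round-trip error exceeds a threshold can be rejected or re-estimated, which is one way bidirectional schemes trade a second flow computation for robustness.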

    Chinese abstract ... i
    Abstract ... iii
    Acknowledgment ... v
    Table of Contents ... vi
    List of Tables ... vii
    List of Figures ... viii
    List of symbols ... ix
    1 Introduction ... 1
      1.1 Computer vision for assessing Parkinson's disease ... 1
        1.1.1 Motion assessment ... 2
        1.1.2 Quantitative movement assessment ... 2
      1.2 Literature review of Visual Object Tracking ... 4
        1.2.1 Traditional feature extraction methods ... 4
        1.2.2 Deep learning method ... 5
      1.3 Problem statement and objective ... 11
      1.4 Organization of this thesis ... 11
    2 Theory and Methodology ... 13
      2.1 Process of experiment ... 15
        2.1.1 Experimental setup ... 15
      2.2 Flicker removal ... 16
        2.2.1 Video quality assessment ... 19
          2.2.1.1 Time domain analysis ... 19
          2.2.1.2 Frequency domain analysis ... 20
      2.3 Deep Feature Encoding algorithm ... 20
        2.3.1 Training Using Auto Encoder ... 21
        2.3.2 Pixel scan matching strategy ... 21
        2.3.3 Sub-pixel level prediction ... 23
        2.3.4 Edge pixel sensitivity ... 24
      2.4 Principal Component Analysis extraction algorithm ... 26
        2.4.1 PCA extractor model ... 27
        2.4.2 Singular value decomposition ... 29
      2.5 Bidirectional Optical Flow algorithm ... 31
        2.5.1 Enhance tracking strategy ... 34
        2.5.2 Image pyramid construction ... 35
        2.5.3 Reverse iterative optical flow correction ... 37
      2.6 Performance evaluation through pairwise comparison matrix ... 39
    3 Results and discussion ... 49
      3.1 Output of data preprocessing ... 49
        3.1.1 Video flickering removal ... 50
      3.2 Output of pairwise comparison of tracking methods ... 63
        3.2.1 Comparing tracking method with Mean Absolute Difference (MAD) ... 63
      3.3 Website and Mobile App ... 80
    4 Conclusions and future work ... 85
      4.1 Conclusions ... 85
      4.2 Future work ... 87
    References ... 89

    Ashyani, A., Wu, Y.-H., Hsu, H.-W., and Nordling, T. E. (2023). An analysis of ℙ-invariance and dynamical compensation properties from a control perspective. arXiv preprint arXiv:2303.10996.
    Bishop, C. M. and Nasrabadi, N. M. (2006). Pattern recognition and machine learning, volume 4. Springer.
    Bouguet, J.-Y. et al. (2001). Pyramidal implementation of the affine lucas kanade feature tracker description of the algorithm. Intel corporation, 5(1-10):4.
    Chang, J. R. and Nordling, T. E. (2021). Skin feature point tracking using deep feature encodings. arXiv preprint arXiv:2112.14159.
    Chang, S., Zhang, F., Huang, S., Yao, Y., Zhao, X., and Feng, Z. (2019). Siamese feature pyramid network for visual tracking. In 2019 IEEE/CIC International Conference on Communications Workshops in China (ICCC Workshops), pages 164–168. IEEE.
    Choi, H., Kang, B., and Kim, D. (2022). Moving object tracking based on sparse optical flow with moving window and target estimator. Sensors, 22(8):2878.
    Cubo, E., Mariscal, N., Solano, B., Becerra, V., Armesto, D., Calvo, S., Arribas, J., Seco, J., Martinez, A., Zorrilla, L., et al. (2017). Prospective study on cost-effectiveness of home-based motor assessment in parkinson's disease. Journal of telemedicine and telecare, 23(2):328–338.
    Danelljan, M., Hager, G., Shahbaz Khan, F., and Felsberg, M. (2015). Convolutional features for correlation filter based visual tracking. In Proceedings of the IEEE international conference on computer vision workshops, pages 58–66.
    Do, Y.-S. and Jeong, Y.-J. (2013). A new area efficient surf hardware structure and its application to object tracking. In 2013 IEEE International Conference of IEEE Region 10 (TENCON 2013), pages 1–4. IEEE.
    Goetz, C. G., Tilley, B. C., Shaftman, S. R., Stebbins, G. T., Fahn, S., Martinez-Martin, P., Poewe, W., Sampaio, C., Stern, M. B., Dodel, R., et al. (2008). Movement disorder society-sponsored revision of the unified parkinson’s disease rating scale (mds-updrs): scale presentation and clinimetric testing results. Movement disorders: official journal of the Movement Disorder Society, 23(15):2129–2170.
    Han, B., Roberts, W., Wu, D., and Li, J. (2007). Robust feature-based object tracking. In Algorithms for Synthetic Aperture Radar Imagery XIV, volume 6568, pages 250–261. SPIE.
    Hao-Wei-Tu et al. (2022). Experiment setup optimization and pre-processing of video to remove artifacts for improved quality and skin feature tracking.
    He, X., Zhao, L., and Chen, C. Y.-C. (2021). Variable scale learning for visual object tracking. Journal of Ambient Intelligence and Humanized Computing, pages 1–16.
    Kidziński, Ł., Yang, B., Hicks, J. L., Rajagopal, A., Delp, S. L., and Schwartz, M. H. (2020). Deep neural networks enable quantitative movement analysis using single-camera videos. Nature communications, 11(1):4054.
    Kovalenko, E., Shcherbak, A., Somov, A., Bril, E., Zimniakova, O., Semenov, M., and Samoylov, A. (2022). Detecting the parkinson’s disease through the simultaneous analysis of data from wearable sensors and video. IEEE Sensors Journal, 22(16):16430–16439.
    Krupicka, R., Szabo, Z., Viteckova, S., and Ruzicka, E. (2014). Motion capture system for finger movement measurement in parkinson disease. Radioengineering, 23(2):659–664.
    Laaroussi, K., Saaidi, A., Masrar, M., and Satori, K. (2018). Human tracking using joint color-texture features and foreground-weighted histogram. Multimedia Tools and Applications, 77:13947–13981.
    Lipsmeier, F., Taylor, K. I., Kilchenmann, T., Wolf, D., Scotland, A., Schjodt-Eriksen, J., Cheng, W.-Y., Fernandez-Garcia, I., Siebourg-Polster, J., Jin, L., et al. (2018). Evaluation of smartphone-based testing to generate exploratory outcome measures in a phase 1 parkinson's disease clinical trial. Movement Disorders, 33(8):1287–1297.
    Liu, S., Huang, L., Shi, X., and Sun, Y. (2021). Siamese networks with distance-iou loss for real-time visual tracking. In 2021 5th International Conference on Digital Signal Processing, pages 109–115.
    Lucas, B. D. and Kanade, T. (1981). An iterative image registration technique with an application to stereo vision. In IJCAI'81: 7th international joint conference on Artificial intelligence, volume 2, pages 674–679.
    Parkinson, J. (2002). An essay on the shaking palsy. The Journal of neuropsychiatry and clinical neurosciences, 14(2):223–236.
    Ranjan, A. and Black, M. J. (2017). Optical flow estimation using a spatial pyramid network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4161–4170.
    Shah, S. T. H. and Xuezhi, X. (2021). Traditional and modern strategies for optical flow: an investigation. SN Applied Sciences, 3:1–14.
    Tao, R., Gavves, E., and Smeulders, A. W. (2016). Siamese instance search for tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1420–1429.
    Vignoud, G., Desjardins, C., Salardaine, Q., Mongin, M., Garcin, B., Venance, L., and Degos, B. (2022). Video-based automated assessment of movement parameters consistent with mds-updrs iii in parkinson's disease. Journal of Parkinson's Disease, (Preprint):1–12.
    Wang, X., Wang, H., and Tian, C. (2021). Infrared image brightness correction for tir object tracking. In 2021 7th International Conference on Computer and Communications (ICCC), pages 704–708. IEEE.
    Yeh, T.-H. (2019). Current status of Parkinson's disease treatment (帕金森氏症治療現況). http://web.tccf.org.tw/lib/addon.php?act=post&id=4500, last visited on 2022-04-27.
    Young, J., Murthy, L., Westby, M., Akunne, A., and O’Mahony, R. (2010). Diagnosis, prevention, and management of delirium: summary of nice guidance. Bmj, 341.
    Zhang, A., De la Torre, F., and Hodgins, J. (2020). Comparing laboratory and in-the-wild data for continuous parkinson's disease tremor detection. In 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pages 5436–5441. IEEE.
    Zhou, H., Yuan, Y., and Shi, C. (2009). Object tracking using sift features and mean shift. Computer vision and image understanding, 113(3):345–352.

    Full text availability: immediately open access, both on and off campus.