| Author: | 杜浩瑋 Tu, Hao-Wei |
|---|---|
| Thesis title: | 最佳化實驗設計與影片前處理來增進資料之質量進而強化皮膚特徵追蹤之結果 (Experiment setup optimization and pre-processing of video to remove artifacts for improved quality and skin feature tracking) |
| Advisor: | 吳馬丁 Nordling, Torbjörn |
| Degree: | Master |
| Department: | College of Engineering - Department of Mechanical Engineering |
| Year of publication: | 2022 |
| Academic year: | 110 |
| Language: | English |
| Pages: | 66 |
| Keywords: | Parkinson's disease, Autoencoder, Feature tracking, Stereo vision, Video flickering |
Background: Parkinson's disease (PD) is a neurodegenerative disease that affects the ability to move and to perform even the most basic everyday routines. Diagnosis and assessment of progression rely on clinical observations. The Finger Tapping Task (FTT) is part of the Unified Parkinson's Disease Rating Scale (UPDRS) and is commonly used by physicians to quantify the severity and progression of motor symptoms. Its application relies on the physician's experience and is thus subjective. Advances in sensors and computer vision, in particular deep learning, have enabled computers to outperform humans in object recognition and other classification tasks. Several studies have shown encouraging progress when applying these techniques to movement analysis in PD.
Aim: Our objective is to automate and digitalize the FTT using low-cost sensors, such as RGB cameras and touch panels.
Method: We develop a standardized experiment setup for recording video of the subject performing the FTT in the air or against the touch screen of a tablet. The video contains flickering due to artificial lighting, so we implement two pre-processing methods for flicker removal: Duplicate and FlickerFree. To avoid artifacts due to differing backgrounds, we implement skin segmentation of the hand and remove the background before further image processing. We implement the Deep Feature Encodings by Chang and Nordling (2021) for tracking of skin features on the thumb and index finger and use triangulation to obtain the Euclidean distance between them in three dimensions. We extract eight features from the trajectory of the distance between the fingers and compare the FTT in the air and on the touch screen.
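The triangulation step can be sketched as a linear (DLT) triangulation in plain NumPy. The projection matrices and pixel coordinates below are illustrative stand-ins, not the calibration or tracking output of the thesis:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.

    P1, P2: 3x4 camera projection matrices; x1, x2: (u, v) pixel coordinates.
    Returns the 3D point in the first camera's frame.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenise

# Illustrative rectified stereo pair: identity rotation, 60 mm baseline.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), [[-60.0], [0.0], [0.0]]])

# Tracked 2D positions (pixels) of a thumb and an index-finger feature.
thumb = triangulate(P_left, P_right, (300.0, 250.0), (260.0, 250.0))
index = triangulate(P_left, P_right, (420.0, 260.0), (380.0, 260.0))

# Euclidean distance between the two features in 3D (mm in this setup).
distance = float(np.linalg.norm(thumb - index))
```

In practice the projection matrices come from stereo calibration (Zhang, 2000; OpenCV's calibration module), and OpenCV's `cv2.triangulatePoints` provides an equivalent routine.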
Results: The flicker removal made it possible to track distinct skin features on the thumb and index finger throughout the whole 15-second video. The tracking results of skin features on the same finger show a distance correlation above 0.94. The skin segmentation reduces the computation time of the tracking by 50%. The Pearson correlation between the FTT in the air and on the tablet was weak for all eight features, reaching only 0.64 for the average speed.
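As an illustration of how such a feature comparison could be computed, the sketch below extracts an average-speed feature from a finger-distance trajectory and correlates it across recordings. The `average_speed` definition and the per-recording numbers are hypothetical, not the thesis data:

```python
import numpy as np

def average_speed(distance, fps):
    """Mean absolute frame-to-frame change of a finger-distance trajectory,
    scaled by the frame rate (distance units per second)."""
    return float(np.mean(np.abs(np.diff(distance))) * fps)

def pearson(a, b):
    """Pearson correlation coefficient between two feature vectors."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Hypothetical average speeds (mm/s) for five recordings, measured once
# from video tracking and once from the tablet's touch log.
video_speed = [95.0, 120.0, 80.0, 150.0, 110.0]
tablet_speed = [100.0, 115.0, 90.0, 140.0, 105.0]
r = pearson(video_speed, tablet_speed)
```

The same `pearson` helper applies to each of the extracted features in turn; `scipy.stats.pearsonr` is a drop-in alternative that also reports a p-value.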
Conclusions: We have taken several steps towards our aim of digitalizing the FTT, demonstrating the feasibility of our video-based approach. Refinement of the methods to make them robust and, above all, a larger data set for validation and comparison to the physician's manual scoring are needed.
Keywords: Parkinson's disease, Autoencoder, Feature tracking, Stereo vision, Video flickering
References

Agarwal, A. and Suryavanshi, S. (2017). Real-time multiple object tracking (MOT) for autonomous navigation. Tech. Rep.
Albawi, S., Mohammed, T. A., and Al-Zawi, S. (2017). Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET), pages 1–6. IEEE.
Ashyani, A., Lin, C.-L., Roman, E., Yeh, T., Kuo, T., Tsai, W.-F., Lin, Y., Tu, R., Su, A., Wang, C.-C., Tan, C.-H., and Nordling, T. E. M. (2022). Digitization of updrs upper limb motor examinations towards automated quantification of symptoms of parkinson’s disease. Manuscript in preparation.
Bay, H., Tuytelaars, T., and Gool, L. V. (2006). Surf: Speeded up robust features. In European conference on computer vision, pages 404–417. Springer.
Bertinetto, L., Valmadre, J., Henriques, J. F., Vedaldi, A., and Torr, P. H. (2016). Fully-convolutional siamese networks for object tracking. In European conference on computer vision, pages 850–865. Springer.
Bradski, G. R. and Davis, J. W. (2002). Motion segmentation and pose recognition with motion history gradients. Machine Vision and Applications, 13(3):174–184.
Brennan, W. (2020). Semanticsegmentation. https://github.com/WillBrennan/ SemanticSegmentation. Last access: 2022-06-06.
Bromley, J., Guyon, I., LeCun, Y., Säckinger, E., and Shah, R. (1993). Signature verification using a "siamese" time delay neural network. Advances in neural information processing systems, 6.
Bu, F., Cai, Y., and Yang, Y. (2016). Multiple object tracking based on faster-rcnn detector and kcf tracker. Technical report [Online]. https://pdfs.semanticscholar.org.
Cao, Z., Simon, T., Wei, S.-E., and Sheikh, Y. (2017). Realtime multi-person 2d pose estimation using part affinity fields. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 7291–7299.
Chang, C.-M., Huang, Y.-L., Chen, J.-C., and Lee, C.-C. (2019). Improving automatic tremor and movement motor disorder severity assessment for parkinson's disease with deep joint training. In 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 3408–3411. IEEE.
Chang, J. R. and Nordling, T. E. (2021). Skin feature point tracking using deep feature encodings. arXiv preprint arXiv:2112.14159.
Dai, Y., Tang, Z., Wang, Y., et al. (2019). Data driven intelligent diagnostics for parkinson's disease. IEEE Access, 7:106941–106950.
Farnebäck, G. (2003). Two-frame motion estimation based on polynomial expansion. In Scandinavian conference on Image analysis, pages 363–370. Springer.
Ferraris, C., Nerino, R., Chimienti, A., Pettiti, G., Azzaro, C., Albani, G., Mauro, A., and Priano, L. (2020). Automated assessment of motor impairments in parkinson’s disease. The Clinical Neurologist International, 1:1009.
Ferraris, C., Nerino, R., Chimienti, A., Pettiti, G., Cau, N., Cimolin, V., Azzaro, C., Albani, G., Priano, L., and Mauro, A. (2018). A self-managed system for automated assessment of updrs upper limb tasks in parkinson's disease. Sensors, 18(10):3523.
Ferraris, C., Nerino, R., Chimienti, A., Pettiti, G., Cau, N., Cimolin, V., Azzaro, C., Priano, L., and Mauro, A. (2019). Feasibility of home-based automated assessment of postural instability and lower limb impairments in parkinson’s disease. Sensors, 19(5):1129.
Ferraris, C., Nerino, R., Chimienti, A., Pettiti, G., Pianu, D., Albani, G., Azzaro, C., Contin, L., Cimolin, V., and Mauro, A. (2014). Remote monitoring and rehabilitation for patients with neurological diseases. In Proceedings of the 9th International Conference on Body Area Networks, pages 76–82.
Fiaz, M., Mahmood, A., and Jung, S. K. (2018). Tracking noisy targets: A review of recent object tracking approaches. arXiv preprint arXiv:1802.03098.
Gilbert, G. T. (1991). Positive definite matrices and sylvester’s criterion. The American Mathematical Monthly, 98(1):44–46.
Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2015). Region-based convolutional networks for accurate object detection and segmentation. IEEE transactions on pattern analysis and machine intelligence, 38(1):142–158.
Guo, D., Wang, J., Cui, Y., Wang, Z., and Chen, S. (2020). Siamcar: Siamese fully convolutional classification and regression for visual tracking. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6269–6277.
Guo, Z., Zeng, W., Yu, T., Xu, Y., Xiao, Y., Cao, X., and Cao, Z. (2022). Vision-based finger tapping test in patients with parkinson’s disease via spatial-temporal 3d hand pose estimation. IEEE Journal of Biomedical and Health Informatics.
Hartley, R. and Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge university press.
Held, D., Thrun, S., and Savarese, S. (2016). Learning to track at 100 fps with deep regression networks. In European conference on computer vision, pages 749–765. Springer.
Henriques, J. F., Caseiro, R., Martins, P., and Batista, J. (2014). High-speed tracking with kernelized correlation filters. IEEE transactions on pattern analysis and machine intelligence, 37(3):583–596.
Hu, M.-K. (1962). Visual pattern recognition by moment invariants. IRE transactions on information theory, 8(2):179–187.
Hughes, A. J., Daniel, S. E., Kilford, L., and Lees, A. J. (1992). Accuracy of clinical diagnosis of idiopathic parkinson’s disease: a clinico-pathological study of 100 cases. Journal of neurology, neurosurgery & psychiatry, 55(3):181–184.
Khan, T., Nyholm, D., Westin, J., and Dougherty, M. (2014). A computer vision framework for finger-tapping evaluation in parkinson’s disease. Artificial intelligence in medicine, 60(1):27–40.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25.
Krupicka, R., Szabo, Z., Viteckova, S., and Ruzicka, E. (2014). Motion capture system for finger movement measurement in parkinson disease. Radioengineering, 23(2):659–664.
Langevin, R., Ali, M. R., Sen, T., Snyder, C., Myers, T., Dorsey, E. R., and Hoque, M. E. (2019). The park framework for automated analysis of parkinson’s disease characteristics. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 3(2):1–22.
Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., and Yan, J. (2019). Siamrpn++: Evolution of siamese visual tracking with very deep networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4282–4291.
Li, B., Yan, J., Wu, W., Zhu, Z., and Hu, X. (2018a). High performance visual tracking with siamese region proposal network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 8971–8980.
Li, P., Wang, D., Wang, L., and Lu, H. (2018b). Deep visual tracking: Review and experimental comparison. Pattern Recognition, 76:323–338.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. (2014). Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer.
Lindeberg, T. (2012). Scale invariant feature transform.
Long, J., Shelhamer, E., and Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3431–3440.
Ma, C., Huang, J.-B., Yang, X., and Yang, M.-H. (2015). Hierarchical convolutional features for visual tracking. In Proceedings of the IEEE international conference on computer vision, pages 3074–3082
Mathis, A., Mamidanna, P., Cury, K. M., Abe, T., Murthy, V. N., Mathis, M. W., and Bethge, M. (2018). Deeplabcut: markerless pose estimation of user-defined body parts with deep learning. Nature neuroscience, 21(9):1281–1289.
Mei, J., Desrosiers, C., and Frasnelli, J. (2021). Machine learning for the diagnosis of parkinson's disease: A review of literature. Frontiers in aging neuroscience, 13:184.
OAK-D (2020). OAK-D: Stereo camera with edge AI capabilities from Luxonis and OpenCV.
OpenCV (2013). Camera calibration and 3d reconstruction. https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html. Last access: 2022-06-06.
OpenCV (2016). Opencv: Color conversions. https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html. Accessed: 2022-06-06.
Pytorch Team (2020). Fully-convolutional network model with resnet-50 and resnet-101 backbones. https://pytorch.org/hub/pytorch_vision_fcn_resnet101/. Accessed: 2022-06-06.
Rubchinsky, L. L., Kuznetsov, A. S., Wheelock, V. L., and Sigvardt, K. A. (2007). Tremor. http://www.scholarpedia.org/w/index.php?title=Tremor&action=cite&rev=137204. Last access: 2022-06-02.
Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011). Orb: An efficient alternative to sift or surf. In 2011 International conference on computer vision, pages 2564–2571. IEEE.
Russell, B. C., Torralba, A., Murphy, K. P., and Freeman, W. T. (2008). Labelme: a database and web-based tool for image annotation. International journal of computer vision, 77(1):157–173.
Shin, J., Kim, H., Kim, D., and Paik, J. (2020a). Fast and robust object tracking using tracking failure detection in kernelized correlation filter. Applied Sciences, 10(2):713.
Shin, J. H., Ong, J. N., Kim, R., Park, S.-m., Choi, J., Kim, H.-J., and Jeon, B. (2020b). Objective measurement of limb bradykinesia using a marker-less tracking algorithm with 2d-video in pd patients. Parkinsonism & Related Disorders, 81:129–135.
Simonyan, K. and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Soleimanitaleb, Z., Keyvanrad, M. A., and Jafari, A. (2019). Object tracking methods: A review. In 2019 9th International Conference on Computer and Knowledge Engineering (ICCKE), pages 282–288. IEEE.
Stanford Medicine 25 (2018). Approach to the exam for parkinson's disease. https://stanfordmedicine25.stanford.edu/the25/parkinsondisease.html. Last access: 2022-06-02.
Su, A. (2022). Evaluation of motor examination in parkinson's disease using a continuous touch-based finger tapping test on tablets.
Tao, R., Gavves, E., and Smeulders, A. W. (2016). Siamese instance search for tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1420–1429.
Wang, X., Li, C., Luo, B., and Tang, J. (2018). Sint++: Robust visual tracking via adversarial positive instance generation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4864–4873.
Wang, X., Wang, H., and Tian, C. (2021). Infrared image brightness correction for tir object tracking. In 2021 7th International Conference on Computer and Communications (ICCC), pages 704–708. IEEE.
Wang, X., Zhao, Y., and Pourpanah, F. (2020). Recent advances in deep learning.
Wu, X. and Shi, Z. (2018). Utilizing multilevel features for cloud detection on satellite imagery. Remote Sensing, 10:1853.
Wu, Y., Lim, J., and Yang, M.-H. (2013). Online object tracking: A benchmark. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2411–2418.
Xie, R., Wen, J., Quitadamo, A., Cheng, J., and Shi, X. (2017). A deep auto-encoder model for gene expression prediction. BMC genomics, 18(9):39–49.
Xiong, F., Zhang, B., Xiao, Y., Cao, Z., Yu, T., Zhou, J. T., and Yuan, J. (2019). A2J: Anchor-to-joint regression network for 3d articulated pose estimation from a single depth image. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 793–802.
Zhang, Z. (2000). A flexible new technique for camera calibration. IEEE Transactions on pattern analysis and machine intelligence, 22(11):1330–1334.