| Graduate student: | 游旻軒 You, Min-Hsuan |
|---|---|
| Thesis title: | 利用具有語意自適應功能之 Transformer 增強二維材料語意分割 Enhancing Two-Dimensional Material Semantic Segmentation with Semantic-Adaptive Transformer |
| Advisor: | 陳奇業 Chen, Chi-Yeh |
| Degree: | 碩士 Master |
| Department: | 電機資訊學院 - 資訊工程學系 (College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering) |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | English |
| Pages: | 41 |
| Chinese keywords: | 二維材料辨識 (two-dimensional material recognition), 語意分割 (semantic segmentation), 變壓器模型 (Transformer model) |
| English keywords: | Two-dimensional materials, Semantic segmentation, Transformer model |
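In the field of two-dimensional materials, obtaining suitable thin-film materials is a highly challenging task. Chemical vapor deposition (CVD) and mechanical exfoliation are currently the two mainstream, widely used methods. Although CVD can precisely control film thickness and produce monolayer materials, it requires expensive equipment and high temperatures, limits the choice of substrates, may damage the substrate, and involves a complex reaction process that makes purity difficult to guarantee. In contrast, mechanical exfoliation involves no chemical reaction and yields higher material quality, which makes it particularly suitable for fundamental research on two-dimensional materials. However, mechanical exfoliation requires peeling the thin-film material by hand with tape or other tools, which easily introduces operator-dependent variation, and confirming that the resulting film is a monolayer is also challenging.
This study aims to use deep learning to accelerate the identification of two-dimensional materials. We adopt a Transformer-based model architecture and propose a semantic-adaptive module that lets semantic information and the multi-head self-attention mechanism work together, improving model performance. Through deeply supervised training on semantic information, semantic knowledge is passively incorporated into the model, improving its understanding of the dataset. We conducted detailed experiments and analysis on a two-dimensional materials dataset provided by the Department of Photonics at National Cheng Kung University and achieved higher accuracy than existing deep learning methods. This study shows that the collaboration between semantic information and multi-head attention has potential value in the field of two-dimensional material recognition.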
Obtaining suitable thin film materials in the field of two-dimensional materials is a highly challenging task.
Currently, chemical vapor deposition (CVD) and mechanical exfoliation are two widely used methods.
However, these methods have their limitations and challenges.
In this study, we propose a deep learning approach based on the transformer model to accelerate the identification process of two-dimensional materials.
We introduce a semantic adaptation module that enables semantic information and the multi-head self-attention mechanism to work together, enhancing the performance of the model.
Through deep supervision training and the incorporation of prior knowledge, our approach achieves higher accuracy than existing deep learning methods on a two-dimensional materials dataset provided by the Department of Photonics at National Cheng Kung University.
The results of this study demonstrate the potential value of the collaboration between semantic information and multi-head attention in the field of two-dimensional material recognition.
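The abstract mentions deep supervision training but gives no formula for it. As a rough illustration only, the sketch below shows one common form of deep supervision for segmentation: the cross-entropy loss on the final prediction is combined with weighted cross-entropy losses on auxiliary predictions from intermediate stages. The function names, the three-class setup, and the `aux_weight` value of 0.4 are assumptions made for this example and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_cross_entropy(logits_per_pixel, labels):
    """Mean cross-entropy over pixels.

    logits_per_pixel[i] holds one score per class for pixel i;
    labels[i] is the ground-truth class index of pixel i.
    """
    losses = []
    for logits, label in zip(logits_per_pixel, labels):
        probs = softmax(logits)
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)


def deeply_supervised_loss(final_logits, aux_logits_per_stage, labels,
                           aux_weight=0.4):
    """Main loss plus weighted auxiliary losses (hypothetical weighting).

    aux_logits_per_stage is a list with one set of per-pixel logits for
    each intermediate stage that emits its own semantic prediction.
    """
    main = pixel_cross_entropy(final_logits, labels)
    aux = sum(pixel_cross_entropy(stage, labels)
              for stage in aux_logits_per_stage)
    return main + aux_weight * aux


# Toy example: two pixels, three hypothetical classes.
labels = [0, 2]
final_logits = [[2.0, 0.1, -1.0], [0.0, 0.5, 3.0]]
stage_logits = [[[1.0, 0.2, 0.0], [0.1, 0.3, 1.5]]]  # one auxiliary stage
print(deeply_supervised_loss(final_logits, stage_logits, labels))
```

In a setup like this, the auxiliary terms push intermediate features to carry class-level information during training. The auxiliary predictions are typically not needed at inference time.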