| Graduate Student: | Nguyen Thi Huyen Tran (阮氏璇珍) |
|---|---|
| Thesis Title: | Multimodal Agentic RAG-based Assistant for Intelligent Customized Support in Manufacturing Systems Maintenance (基於多模態代理的 RAG 智慧客製化製造系統維護助手) |
| Advisor: | Hsieh, Yu-Ming (謝昱銘) |
| Co-Advisor: | Cheng, Fan-Tien (鄭芳田) |
| Degree: | Master |
| Department: | Academy of Innovative Semiconductor and Sustainable Manufacturing, Program on Semiconductor Packaging and Testing |
| Publication Year: | 2026 |
| Graduation Academic Year: | 114 (ROC) |
| Language: | English |
| Number of Pages: | 34 |
| Keywords: | Multimodal Retrieval-Augmented Generation, Agentic Orchestration, Knowledge Graph, Industrial AI, Virtual Maintenance Assistant, Semiconductor Manufacturing, xPPU, Visual Embedding, ColPali, Large Language Models |
The growing complexity of modern manufacturing has created strong demand for intelligent virtual assistants that support operators throughout the maintenance lifecycle. Large language models (LLMs) show promise for general reasoning but fall short in specialized industrial settings that require system-specific knowledge. Retrieval-Augmented Generation (RAG) helps bridge that gap, yet standard RAG struggles with the multimodal content, such as images, diagrams, and sensor data, that pervades technical manufacturing documentation.
This thesis introduces the Multimodal Agentic RAG-based (MMAR) framework, built for intelligent maintenance assistance in manufacturing. MMAR is organized around three core ideas: Modular Subsystem Design, Agentic Orchestration for Adaptive Retrieval, and Validation-Driven Refinement. Validation used the xPPU (extended Pick-and-Place Unit) system as a testbed, with domain experts from TU Munich providing ground-truth annotations. MMAR was benchmarked against a text-only V-RAG baseline and GraphRAG across three query-complexity levels, outperforming both on Response Relevancy, F1 Score, and Recall. These results support MMAR as a deployment-ready platform for next-generation industrial assistants.
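The abstract names ColPali-style visual embedding and agentic orchestration among its keywords but does not spell out the retrieval mechanics. The sketch below is a minimal, self-contained illustration of those two ingredients under stated assumptions: a late-interaction (MaxSim) relevance score over page-patch embeddings, as used by ColPali-style visual retrievers, and a toy keyword-based router standing in for the agentic orchestration layer. The function names, the routing rule, and the random embeddings are illustrative assumptions, not the MMAR implementation.

```python
# Hedged sketch: ColPali-style late-interaction ("MaxSim") scoring over page images,
# plus a toy query router. Embeddings here are random stand-ins; a real pipeline
# would obtain them from a vision-language encoder applied to scanned manual pages.
import numpy as np


def maxsim_score(query_tokens: np.ndarray, page_patches: np.ndarray) -> float:
    """Late-interaction relevance: for each query-token embedding, take its best
    match among the page's patch embeddings, then sum over query tokens."""
    # query_tokens: (n_q, d), page_patches: (n_p, d), both rows L2-normalized.
    sims = query_tokens @ page_patches.T          # (n_q, n_p) cosine similarities
    return float(sims.max(axis=1).sum())          # MaxSim aggregation


def route_query(query: str) -> str:
    """Toy 'agentic' routing rule (hypothetical): send diagram/figure questions to
    the visual index, everything else to the text index."""
    visual_cues = ("diagram", "figure", "schematic", "wiring", "photo")
    return "visual" if any(cue in query.lower() for cue in visual_cues) else "text"


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 128
    # Fake embeddings: one 6-token query and two manual pages with 32 patches each.
    q = rng.normal(size=(6, d))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    pages = []
    for _ in range(2):
        p = rng.normal(size=(32, d))
        p /= np.linalg.norm(p, axis=1, keepdims=True)
        pages.append(p)

    query = "Show the wiring diagram for the xPPU stack crane sensor"
    print("route:", route_query(query))
    scores = [maxsim_score(q, p) for p in pages]
    print("best page:", int(np.argmax(scores)), "scores:", [round(s, 2) for s in scores])
```

In a full system the router would be an LLM-driven planner rather than a keyword match, and the retrieved pages would be passed, together with any text or knowledge-graph evidence, to the generator for answer synthesis.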