
Graduate Student: Yang, Ming-Han (楊明翰)
Thesis Title: Development of Models and Enabling Technologies for Generative AI–Based Work Footprint Monitoring (生成式人工智慧為基之工作足跡監控模式與技術開發)
Advisor: Chen, Yuh-Min (陳裕民)
Degree: Master
Department: Institute of Manufacturing Information and Systems, College of Electrical Engineering and Computer Science
Year of Publication: 2026
Graduation Academic Year: 114 (ROC calendar)
Language: Chinese
Number of Pages: 111
Keywords (Chinese): 工作足跡監控、代理人、視覺語言模型、檢索增強生成、知識圖譜
Keywords (English): Work Footprint Monitoring, Agent, Vision Language Model, Retrieval-Augmented Generation, Knowledge Graph
Abstract:
    As enterprise digital transformation advances, operational management has been shifting from an experience-driven to a data-driven model. However, most current management practices still rely on traditional working-hour indicators, which cannot adequately reflect employees' actual work behavior or expose potential efficiency bottlenecks. Moreover, when faced with large volumes of complex work footprint data, managers often lack real-time, structured analytical support, so decisions depend heavily on personal experience, which increases the management burden and the risk of inconsistent decisions.
    This study constructs a work footprint monitoring model and technical architecture centered on Generative AI (GAI), focusing on the Check phase of the PDCA management cycle. Behavioral data collected from a Manufacturing Execution System (MES) are used to define and detect several categories of work anomalies, including late arrival, early departure, abnormal break frequency, and excessively long breaks.
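    To make these anomaly categories concrete, the sketch below shows one way such rule-based checks over MES clock and break events might look in Python. The record schema (worker_id, clock_in, clock_out, breaks) and the threshold values are illustrative assumptions, not the thesis's actual definitions.

    from dataclasses import dataclass, field
    from datetime import datetime, time, timedelta

    @dataclass
    class ShiftRecord:
        """One worker-day of MES behavioral data (hypothetical schema)."""
        worker_id: str
        clock_in: datetime
        clock_out: datetime
        breaks: list[tuple[datetime, datetime]] = field(default_factory=list)

    # Illustrative policy values; the thesis does not state its thresholds here.
    SHIFT_START = time(8, 0)
    SHIFT_END = time(17, 0)
    MAX_BREAKS = 3
    MAX_BREAK_MINUTES = 20

    def detect_anomalies(rec: ShiftRecord) -> list[str]:
        """Return the work-footprint anomaly labels triggered by one record."""
        anomalies = []
        if rec.clock_in.time() > SHIFT_START:
            anomalies.append("late_arrival")
        if rec.clock_out.time() < SHIFT_END:
            anomalies.append("early_departure")
        if len(rec.breaks) > MAX_BREAKS:
            anomalies.append("abnormal_break_count")
        if any(end - start > timedelta(minutes=MAX_BREAK_MINUTES)
               for start, end in rec.breaks):
            anomalies.append("excessive_break_duration")
        return anomalies

    # Example: a 08:12 clock-in with one 35-minute break triggers two anomalies.
    rec = ShiftRecord("W001",
                      clock_in=datetime(2025, 3, 3, 8, 12),
                      clock_out=datetime(2025, 3, 3, 17, 5),
                      breaks=[(datetime(2025, 3, 3, 10, 0),
                               datetime(2025, 3, 3, 10, 35))])
    print(detect_anomalies(rec))  # ['late_arrival', 'excessive_break_duration']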
    The implementation adopts a centralized multi-agent architecture with four agent types, a routing agent, a tutoring agent, an assistant agent, and a decision-support agent, which handle task dispatching, operational guidance, information summarization, and decision assistance, respectively. On the technical side, the framework integrates Retrieval-Augmented Generation (RAG) with a knowledge graph to convert implicit managerial experience into structured knowledge chains that support anomaly-cause inference and the generation of coaching recommendations. A Vision Language Model (VLM), Qwen2.5-VL, is further fine-tuned to help interpret key trend-chart features such as spikes and sharp turns, and a Chain-of-Thought (CoT) mechanism is incorporated to improve the transparency and explainability of the analysis process.
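    As a rough illustration of the centralized routing described above, the sketch below shows a router handing a manager's query to one of three specialized agents. The keyword triggers and canned responses are placeholders; in the thesis the routing decision and the agents' answers are produced by an LLM with RAG over the knowledge graphs rather than by string matching.

    from typing import Callable, Dict

    def tutoring_agent(query: str) -> str:
        # Would walk the platform knowledge graph to return step-by-step guidance.
        return f"[tutoring] operation guide for: {query}"

    def assistant_agent(query: str) -> str:
        # Would summarize the relevant work-footprint records for the manager.
        return f"[assistant] summary for: {query}"

    def decision_support_agent(query: str) -> str:
        # Would retrieve coaching knowledge (RAG over the coaching knowledge graph)
        # and draft a recommendation with chain-of-thought reasoning.
        return f"[decision-support] coaching recommendation for: {query}"

    # Trigger phrase -> agent; a stand-in for the LLM-based routing decision.
    AGENTS: Dict[str, Callable[[str], str]] = {
        "how do i": tutoring_agent,
        "summarize": assistant_agent,
        "why": decision_support_agent,
    }

    def routing_agent(query: str) -> str:
        """Central router: dispatch to the first agent whose trigger matches."""
        lowered = query.lower()
        for trigger, agent in AGENTS.items():
            if trigger in lowered:
                return agent(query)
        return assistant_agent(query)  # default hand-off

    print(routing_agent("Why did operator W001 take longer breaks this week?"))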
    By implementing the work footprint monitoring platform and its intelligent agents, the study demonstrates anomaly trend analysis, work footprint performance evaluation, and the computation of coaching effectiveness indicators, helping managers carry out systematic coaching.
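    The coaching effectiveness indicators themselves are defined in Section 3.2.4 of the thesis and are not reproduced here; as one plausible reading, such an indicator could compare a worker's anomaly rate before and after a coaching intervention, for example:

    def coaching_effectiveness(anomalies_before: int, shifts_before: int,
                               anomalies_after: int, shifts_after: int) -> float:
        """Relative reduction in anomaly rate after coaching (1.0 = fully resolved).

        Illustrative only; not the thesis's actual metric.
        """
        rate_before = anomalies_before / shifts_before
        rate_after = anomalies_after / shifts_after
        if rate_before == 0:
            return 0.0
        return (rate_before - rate_after) / rate_before

    # Example: 6 anomalies in 20 shifts before coaching, 2 in 20 after -> 0.67
    print(round(coaching_effectiveness(6, 20, 2, 20), 2))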

English Abstract:
    As enterprises undergo digital transformation, management practices are shifting from experience-based approaches to data-driven decision making. However, traditional working-hour indicators remain insufficient to reflect actual employee behaviors and efficiency issues, and managers often lack structured tools to analyze complex work footprint data.
    This study proposes a Generative AI–based Work Footprint Monitoring framework focusing on the Check phase of the PDCA cycle. Behavioral data from Manufacturing Execution Systems are analyzed to detect various work anomalies. A centralized multi-agent architecture is adopted to support task coordination, information summarization, and decision assistance. The framework integrates Retrieval-Augmented Generation with a Knowledge Graph to transform implicit managerial experience into structured knowledge for anomaly reasoning and coaching recommendations. In addition, a Vision Language Model is applied to assist in trend chart interpretation, enhancing analytical transparency and interpretability. The implemented system demonstrates effective anomaly analysis and performance evaluation, supporting systematic and data-driven managerial coaching.

Table of Contents:
    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
      1.1 Research Background
      1.2 Research Motivation
      1.3 Research Objectives
      1.4 Research Topics and Methods
      1.5 Research Procedure
    Chapter 2 Literature Review
      2.1 Domain Literature
        2.1.1 Research and Applications of Work Footprint Monitoring
        2.1.2 Anomaly Detection and Analysis
      2.2 Generative AI and Large Language Model Technologies
        2.2.1 Large Language Models
        2.2.2 Vision Language Models
        2.2.3 In-Context Learning
        2.2.4 Prompt Engineering
        2.2.5 Model Fine-Tuning Techniques
      2.3 Retrieval-Augmented Generation
        2.3.1 Retrieval-Augmented Generation Architecture
        2.3.2 Knowledge Graphs
      2.4 AI Agents
        2.4.1 Definition of AI Agents
        2.4.2 Multi-Agent Systems
        2.4.3 Agent Application Examples
      2.5 Summary of the Literature Review
    Chapter 3 Design of the Work Footprint Monitoring Model
      3.1 Personnel Footprint Monitoring Model
      3.2 Development of the Work Footprint Anomaly Detection and Analysis Mechanism
        3.2.1 Work Footprint Collection and Anomaly Detection Mechanism
        3.2.2 Work Footprint Anomaly Analysis and Evaluation
        3.2.3 Footprint Performance Analysis and Evaluation
        3.2.4 Coaching Effectiveness Analysis
      3.3 System Architecture of the Work Footprint Monitoring Platform
    Chapter 4 Agent Design and Technology Development
      4.1 Agent Design
        4.1.1 Agent Architecture
        4.1.2 Routing Agent
        4.1.3 Tutoring Agent
        4.1.4 Assistant Agent
        4.1.5 Decision-Support Agent
      4.2 Construction of the Vector Knowledge Retrieval Base
        4.2.1 Knowledge Data Processing and Construction Workflow
        4.2.2 Vector-Based Knowledge Retrieval Strategy
      4.3 Knowledge Graph Construction
        4.3.1 Coaching Knowledge Graph Construction
        4.3.2 Platform Knowledge Graph Construction
        4.3.3 Chart Knowledge Graph Construction
      4.4 Trend-Chart Vision Language Model Implementation
        4.4.1 Vision Language Model Selection
        4.4.2 Prompt Engineering
        4.4.3 Training Data Generation Workflow
        4.4.4 Model Fine-Tuning
    Chapter 5 Application Cases
      5.1 Analysis Function Applications
        5.1.1 Case Demonstration: Work Anomaly Analysis and Evaluation
        5.1.2 Case Demonstration: Footprint Performance Analysis and Evaluation
        5.1.3 Case Demonstration: Coaching Effectiveness Analysis
      5.2 Agent Applications
        5.2.1 Case Demonstration: Platform Operation Guidance
        5.2.2 Case Demonstration: Decision Support Assistance
        5.2.3 Case Demonstration: Chart-Assisted Interpretation
    Chapter 6 Conclusions, Research Limitations, and Future Work
      6.1 Conclusions
      6.2 Research Limitations
      6.3 Future Work
    References

