
Graduate student: 傅乙晟 (Fu, Yi-Cheng)
Thesis title: CoEvo:生成式人工智慧之多代理系統在建築設計創新流程的探討
CoEvo: Exploring Multi-Agent Generative AI Systems in the Innovative Architectural Design Process
Advisor: 鄭泰昇 (Jeng, Tay-Sheng)
Degree: Master
Department: College of Planning and Design, Department of Architecture
Year of publication: 2025
Graduation academic year: 113 (ROC calendar)
Language: Chinese
Number of pages: 210
Keywords (Chinese): 多代理系統, 生成式人工智慧, 建築設計流程, 設計探索模式, 空間智能, 人機協作
Keywords (English): Multi-Agent System, Generative AI, Architectural Design Process, Design Exploration Pattern, Spatial Intelligence, Human-AI Collaboration
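  • Despite the rapid rise of generative artificial intelligence (Generative AI), its application in architectural design still faces fragmented processes and difficulties in integrating knowledge. Existing AI tools are mostly applied in isolation and rarely reach the complex, tacit decision networks that run through the design process. The core goal of this research is therefore to build CoEvo, a generative-AI multi-agent collaborative system that formalizes the tacit knowledge and complex processes of architectural design into a traceable, editable node-based workflow carried out jointly by multiple AI agents. In this way, AI can collaborate more closely with humans and efficiently generate design proposals with architectural substance.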

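    Taking human design thinking as its basis, this research explores how generative AI and multi-agent systems can be integrated into the innovative process of architectural design, and distills two design exploration patterns that combine human design thinking with AI: the "goal-oriented optimization design process" and the "broad-exploration-oriented synchronized design process." To support these two patterns, the research designs, implements, and compares two corresponding multi-agent collaboration architectures: a sequential collaboration architecture that emphasizes stability and efficiency and suits in-depth refinement of a single scheme, and a hierarchical collaboration architecture that is highly flexible and dynamic and suits parallel multi-path exploration in the early design stage.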

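    Through case studies and comparative analysis on the CoEvo platform, the results show that the hierarchical collaboration architecture not only has a clear advantage in supporting complex broad exploration, but is also flexible enough to remain backward compatible with, and emulate, the sequential optimization process, which indicates considerable potential as a future general-purpose design platform. The sequential architecture, in turn, retains its efficiency value for highly standardized, specific tasks.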

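    The core contribution of this research lies not only in providing a workable system framework, but also in proposing a design-process methodology based on multi-agent systems and in revealing how multi-AI collaboration affects architects' creativity and their strategies for diverse exploration. The CoEvo implementation offers key theoretical foundations and practical insights for the future development of AI-assisted design agents that can work in depth with architects and reshape the core of the architect's value, with the aim of advancing collaborative efficiency and innovative potential in the architectural design process.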

    The integration of Generative AI (GenAI) into architectural design is often hampered by fragmented processes and difficulties in leveraging tacit knowledge. This research introduces CoEvo, a GenAI multi-agent collaborative system aimed at formalizing complex and implicit architectural design knowledge and workflows into an editable, traceable node-based process. Grounded in human design thinking, this study explores integrating GenAI and multi-agent systems into innovative architectural design processes, distilling two core design exploration patterns: "goal-oriented optimization" and "broad exploration-oriented synchronization." Correspondingly, sequential and hierarchical multi-agent architectures are implemented and tested within CoEvo to support these patterns.
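    The abstract describes formalizing design workflows into an editable, traceable node-based process but does not show what such a process looks like. As a minimal sketch only, and not CoEvo's actual implementation, the Python below models a workflow as an ordered list of nodes that write to a shared state and log each step. The names `Node`, `Workflow`, `brief`, `concept`, and `review`, as well as the sample state keys, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, str]


@dataclass
class Node:
    """One workflow step (hypothetical): reads the state, returns the keys it adds."""
    name: str
    run: Callable[[State], State]


@dataclass
class Workflow:
    """Ordered list of nodes plus a trace of what each node wrote."""
    nodes: List[Node] = field(default_factory=list)
    trace: List[dict] = field(default_factory=list)

    def execute(self, state: State) -> State:
        state = dict(state)  # work on a copy so the caller's input is untouched
        for node in self.nodes:
            update = node.run(state)
            state.update(update)
            # Traceable: record which node ran and which keys it produced.
            self.trace.append({"node": node.name, "wrote": sorted(update)})
        return state


# Stand-ins for AI agents; a real system would call a generative model here.
def brief(state: State) -> State:
    return {"brief": f"program for {state['site']}"}


def concept(state: State) -> State:
    return {"concept": f"concept from {state['brief']}"}


def review(state: State) -> State:
    return {"review": f"critique of {state['concept']}"}


wf = Workflow([Node("brief", brief), Node("concept", concept)])
# Editable: a step is added without changing the existing nodes.
wf.nodes.append(Node("review", review))

result = wf.execute({"site": "corner lot"})
for step in wf.trace:
    print(step)
```

    Keeping the trace separate from the state is one possible design choice: the state holds design content, while the trace records which node produced which part of it.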

    Case studies demonstrate the hierarchical architecture's superior flexibility and its capacity to encompass sequential optimization, highlighting its potential as a versatile AI-assisted design platform, while the sequential architecture retains efficiency for specific, standardized tasks. Beyond a functional system, this research proposes a multi-agent design methodology that reveals how AI collaboration shapes architectural creativity and exploration. CoEvo provides key theoretical and practical insights for developing future AI design agents that deeply integrate with architects, aiming to enhance efficiency and innovation in design workflows.
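    The claim that the hierarchical architecture can encompass sequential optimization can be illustrated with a small sketch. The Python below is an assumption-laden toy, not the CoEvo implementation: agents are plain functions that tag a string, `run_sequential` chains them on one scheme, and `run_hierarchical` plays the role of a supervisor that sends the same brief down several branches. All function names and the role labels are hypothetical.

```python
from typing import Callable, List

Agent = Callable[[str], str]


def make_agent(role: str) -> Agent:
    """Stand-in for a model-backed agent: appends its role to the text."""
    return lambda text: f"{text}>{role}"


def run_sequential(agents: List[Agent], brief: str) -> str:
    """Each agent refines the previous output: one scheme, developed in depth."""
    result = brief
    for agent in agents:
        result = agent(result)
    return result


def run_hierarchical(branches: List[List[Agent]], brief: str) -> List[str]:
    """A supervisor hands the brief to every branch and collects one proposal each.

    Branches run one after another here for simplicity; a real system
    could dispatch them concurrently.
    """
    return [run_sequential(branch, brief) for branch in branches]


concept = make_agent("concept")
massing = make_agent("massing")
facade = make_agent("facade")

# Goal-oriented optimization: a single chain.
deep = run_sequential([concept, massing, facade], "brief")

# Broad exploration: two alternative paths from the same brief.
wide = run_hierarchical([[concept, massing], [concept, facade]], "brief")

# A hierarchy with exactly one branch reproduces the sequential result.
emulated = run_hierarchical([[concept, massing, facade]], "brief")

print(deep)
print(wide)
print(emulated == [deep])
```

    In this toy model the sequential pipeline is simply the single-branch case of the hierarchical one, which is one way to read the relationship the abstract reports between the two architectures.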

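    Table of Contents
    Chapter 1 Introduction
        1.1 Research Background
            1.1.1 Evolution of architects' tools and changes in the design process
            1.1.2 Development of digital tools and challenges of integrated application
        1.2 Research Motivation
            1.2.1 Core challenge: the gap between AI and architects' "tacit knowledge"
            1.2.2 Research opportunity: multi-agent systems as a way to "formalize" tacit processes
        1.3 Research Objectives
            1.3.1 Formalizing tacit processes: building an executable multi-agent collaboration framework
            1.3.2 Proposing and comparing two formalized multi-agent design exploration architectures
            1.3.3 Exploring a new mode of human-AI collaboration with a transparent process
            1.3.4 Establishing an extensible application framework
        1.4 Research Methods
            1.4.1 Theoretical foundations and technology review
            1.4.2 CoEvo system planning and development
            1.4.3 Case testing and reflection
            1.4.4 Discussion of results and recommendations
    Chapter 2 Literature Review
        2.1 Basic Capabilities and Key Technologies of AI Agents
            2.1.1 Basic modules of an AI agent: profiling, perception, memory, planning, and action
            2.1.2 Key technologies driving AI agents
        2.2 Development and Application of Generative-AI Multi-Agent Systems
            2.2.1 Evolution of multi-agent systems: from rule-based to generative-AI-driven
            2.2.2 Technical mechanisms and collaboration modes of generative-AI multi-agent systems
        2.3 Integration of AI into Current Architectural Practice and Its Challenges
            2.3.1 Application cases of generative AI in architectural design
            2.3.2 Bottlenecks and pain points of applying generative AI in today's architecture firms
    Chapter 3 System Planning and Development
        3.1 System Planning
            3.1.1 Practical analysis of the design process in traditional firms
            3.1.2 Distilling two core design exploration patterns
            3.1.3 Theoretical framework of design thinking: the Double Diamond model
        3.2 System Development
            3.2.1 Establishing the multi-AI collaboration architecture
            3.2.2 Overview of system development
    Chapter 4 System Execution
        4.1 Sequential Architecture: Goal-Oriented Optimization Design Process
            4.1.1 Defining the design positioning
            4.1.2 Iterative development of design schemes
            4.1.3 Summary of results and recommendations
        4.2 Hierarchical Architecture: Integrated Design Process
            4.2.1 Broad exploration: early-to-middle-stage concept development
            4.2.2 Optimization exploration: late-stage concept refinement
            4.2.3 Human-AI collaborative scheme optimization
            4.2.4 Producing the scheme exploration report
        4.3 Analysis of Execution Results
            4.3.1 Evolution of exploration patterns and collaboration architectures
            4.3.2 Evaluating AI-generated results: can quantitative change trigger qualitative change?
            4.3.3 Manifestation and value of formalized tacit knowledge
    Chapter 5 Conclusions and Reflections
        5.1 Research Outcomes of CoEvo
            5.1.1 Formalizing the tacit design process and reshaping the architect's role
            5.1.2 Revealing the "superset" evolutionary relationship between design patterns and collaboration architectures
            5.1.3 Pioneering a multi-agent-based generative-AI design process framework
        5.2 Research Limitations and Recommendations
            5.2.1 Technical bottlenecks
            5.2.2 Practical challenges
            5.2.3 Suggestions for future research
    References
    Appendix 1 CoEvo Technical Operation Manual
        1.1 System Overview and Deployment Preparation
            1.1.1 Hardware requirements
            1.1.2 Software environment configuration
            1.1.3 Dependency installation
        1.2 Project Structure and Core Files
            1.2.1 Project directory structure
            1.2.2 Basics of workflow graph definition
            1.2.3 Core configuration files
        1.3 AI Toolset and Integrated Systems
            1.3.1 Toolset overview
            1.3.2 Image and 3D generation toolset (ComfyUI & Trellis)
            1.3.3 MCP toolset
        1.4 Basic Operation After Starting the AI System
            1.4.1 Quick start
            1.4.2 Interface introduction
        1.5 Design and Composition of the Multi-AI Collaboration Architecture
            1.5.1 Goal-oriented optimization design process: sequential collaboration architecture
            1.5.2 Broad-exploration-oriented synchronized design process: hierarchical collaboration architecture
            1.5.3 Generative AI model specifications
    Appendix 2 Related Resource Links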

