| Graduate student: | 施柏安 Shih, Po-An |
|---|---|
| Thesis title: | 一種利用多模態大型語言模型推理的高效率自動駕駛系統設計空間探索方法 A Multimodal LLM-Driven Design Space Exploration Approach for Highly Efficient Autonomous Driving Systems |
| Advisor: | 涂嘉恆 Tu, Chia-Heng |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2025 |
| Academic year of graduation: | 114 |
| Language: | English |
| Number of pages: | 57 |
| Keywords: | Autonomous Driving Systems, Design Space Exploration, Large Language Models, Multi-Agent Systems, Performance Bottleneck Analysis, Co-simulation |
The design of high-efficiency autonomous driving systems is hindered by the complexity involved in exploring vast hardware and software combinations across diverse driving scenarios. Traditional Design Space Exploration (DSE) methods are limited by their insufficient ability to handle multimodal data and their heavy reliance on manual intervention to verify the correctness of autonomous driving behaviors. This thesis proposes a software framework based on Multi-agent Large Language Models (LLMs) that automates the DSE workflow. By integrating multimodal reasoning, autonomous vehicle simulation, and performance analysis tools, our system can automatically interpret execution results to guide more efficient design space exploration. This allows the system to identify superior HW/SW design points within a limited search time budget. Specifically, the framework utilizes an LLM agent architecture to coordinate and automate various DSE stages: from receiving user requirements and generating candidate design points to adjusting hardware/software parameters and monitoring performance. By analyzing both visual and textual execution outputs, the LLMs can identify overall performance bottlenecks of a given design point without human intervention. This insight enables the LLMs to converge on more optimized design points in subsequent DSE iterations. We validated our prototype framework using two autonomous driving scenarios: a Robotaxi use case and an SAE Level 4 Automated Valet Parking mode. Experimental results indicate that under the same DSE time budget, our framework discovers more Pareto-optimal and cost-effective design points compared to traditional genetic algorithms. Our findings demonstrate the significant potential of LLM-based methodologies in autonomous driving system design, representing a foundational step toward the full automation of autonomous driving system development.
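The abstract compares exploration methods by how many Pareto-optimal design points each one finds. As a minimal sketch of what that criterion means, the Python snippet below filters a set of candidate design points down to its Pareto front. It is not the thesis's implementation: the `DesignPoint` fields, the two objectives (cost and latency, both minimized), and the sample values are all assumptions chosen for illustration.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    """Hypothetical HW/SW design point; the fields are illustrative assumptions."""
    name: str
    cost: float     # assumed cost metric, lower is better
    latency: float  # assumed end-to-end latency metric, lower is better


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """Return True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a.cost <= b.cost and a.latency <= b.latency
    strictly_better = a.cost < b.cost or a.latency < b.latency
    return no_worse and strictly_better


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up sample values, not results from the thesis.
candidates = [
    DesignPoint("A", cost=100.0, latency=80.0),
    DesignPoint("B", cost=150.0, latency=50.0),
    DesignPoint("C", cost=160.0, latency=90.0),  # worse than A in both objectives
    DesignPoint("D", cost=220.0, latency=45.0),
]

front = pareto_front(candidates)
print([p.name for p in front])
```

With these sample values, point C is dropped because A is both cheaper and faster, while A, B, and D each represent a different cost/latency trade-off and remain on the front. A search method that yields more such non-dominated points within the same time budget gives the designer more trade-offs to choose from.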