
Graduate Student: Lin, Yu-Heng (林渝恆)
Thesis Title: Development of Lidar-Based Motion Prediction and Camera-Based Action Recognition Fusion Algorithm and Its Application to Occupancy Grid Map (結合光達之運動預估和相機之行為辨識的融合算法發展及其於佔據柵格地圖之應用)
Advisor: Juang, Jyh-Ching (莊智清)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Graduation Academic Year: 108 (2019-2020)
Language: English
Number of Pages: 94
Keywords (Chinese): 自動駕駛車輛、感測器融合、運動預估、行為辨識、佔據柵格地圖
Keywords (English): Autonomous Vehicle, Sensor Fusion, Motion Prediction, Action Recognition, Occupancy Grid Map
Governments around the world have actively invested in the research and development of autonomous driving technology. For example, the United States set up the Mcity test site in Michigan, tightly coupling the research capacity of the University of Michigan with the infrastructure planning of the public sector to realize the benefits of industry-government-academia collaboration; K-city in South Korea and Jtown in Japan are other prominent examples. Against this background, Taiwan enacted the "Unmanned Vehicles Technology Innovative Experimentation Act" in 2018 to sustain innovation and substantively promote the development of autonomous driving technology, and Taiwan CAR Lab was completed in the Shalun Smart Green Energy Science City in 2019.
The experimental data for this thesis were collected at Taiwan CAR Lab. We first obtain pedestrian action recognition results from camera images, then obtain predicted pedestrian trajectories from the Lidar point cloud, and finally feed both into the perception fusion architecture and display the integrated result on an occupancy grid map, so that the autonomous vehicle can perform subsequent path planning based on it. Perception represents the autonomous vehicle's understanding of its surroundings; an insufficient or erroneous understanding of the environment poses a severe threat to vehicle safety. Because this thesis detects and tracks pedestrians and extends this to predicting their direction of motion and recognizing their actions, the integrated information can be added to a semantic map, enabling the system to respond to rush-out pedestrians and give-way pedestrians.
The hardware required for autonomous driving includes the vehicle, on-board computers, sensors, and networking equipment. To ensure real-time sensor data, convenient sensor-data preprocessing, and a unified format definition, this thesis develops the perception fusion algorithm of the autonomous vehicle on the Robot Operating System (ROS). ROS provides a modular development environment and a rich, diverse set of libraries, giving researchers greater flexibility and shortening algorithm development time.
To recognize the actions of pedestrians ahead with the camera, this thesis applies the Pyramidal Implementation of the Lucas-Kanade Feature Tracker, Yolo v3, and ShuffleNet v2; to predict pedestrian motion with the Lidar, it applies the Interacting Multiple Model, the Unscented Kalman Filter, and Probabilistic Data Association. Since camera images and Lidar point clouds each have their own signal characteristics, fusing them according to their respective strengths yields a more complete understanding of pedestrian intention.
The NCKU autonomous vehicle is currently undergoing closed-course testing at Taiwan CAR Lab. We hope to keep developing a more efficient and robust system and to gradually move the NCKU autonomous vehicle from closed test fields onto open roads.

Many governments have actively invested in the research and development of autonomous vehicles. For example, the United States established the Mcity test facility in Michigan, linking the research resources of the University of Michigan with the infrastructure planning of the public sector to realize the benefits of a university-industry-government partnership. K-city in South Korea and Jtown in Japan are other notable examples. Against this background, Taiwan enacted the "Unmanned Vehicles Technology Innovative Experimentation Act" in 2018 to keep innovating and to concretely drive the development of autonomous vehicles, and Taiwan CAR Lab was established in the Shalun Smart Green Energy Science City in 2019.
The experimental data in this thesis were collected at Taiwan CAR Lab. We first obtain the pedestrian's action recognition result from the camera image, then obtain the pedestrian's predicted trajectory from the Lidar point cloud, and finally feed both into the sensor fusion architecture. The fusion result is integrated into the occupancy grid map so that the autonomous vehicle can perform subsequent path planning based on it. Perception represents the vehicle's understanding of the environment; if that understanding is inadequate or incorrect, it severely endangers vehicle safety. The algorithm in this thesis detects and tracks pedestrians, predicts their motion, and identifies the actions they are taking, so the integrated information can be used to adjust the semantic map accordingly. This makes the architecture capable of dealing with rush-out pedestrians and give-way pedestrians.
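The occupancy grid integration described above (mapping the predicted path, and painting an augmented circle for a waving, give-way pedestrian) can be illustrated with a short sketch. The code below is a minimal, hypothetical example assuming a nav_msgs/OccupancyGrid-style row-major grid (0 = free, 100 = occupied); the helper names, cell resolution, and circle radii are illustrative assumptions, not the thesis's actual parameters.

```python
import numpy as np

def world_to_cell(x, y, origin_x, origin_y, resolution):
    """Convert a map-frame coordinate (meters) into grid row/column indices."""
    col = int((x - origin_x) / resolution)
    row = int((y - origin_y) / resolution)
    return row, col

def paint_pedestrian(grid, predicted_path, action, origin, resolution=0.2):
    """Mark a pedestrian's predicted path on the grid; paint a larger
    (augmented) circle when the camera branch reports a 'WAVE' action."""
    origin_x, origin_y = origin
    radius_m = 1.5 if action == "WAVE" else 0.5      # assumed radii in meters
    radius = int(round(radius_m / resolution))
    rows, cols = grid.shape
    for x, y in predicted_path:                      # predicted (x, y) waypoints
        r0, c0 = world_to_cell(x, y, origin_x, origin_y, resolution)
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                r, c = r0 + dr, c0 + dc
                if 0 <= r < rows and 0 <= c < cols and dr * dr + dc * dc <= radius * radius:
                    grid[r, c] = 100                 # mark the cell as occupied
    return grid

# Example: a pedestrian predicted to walk across the road ahead of the ego vehicle.
grid = np.zeros((200, 200), dtype=np.int8)           # 40 m x 40 m map at 0.2 m/cell
path = [(5.0 + 0.5 * k, 2.0) for k in range(10)]     # straight predicted trajectory
grid = paint_pedestrian(grid, path, action="WAVE", origin=(-20.0, -20.0))
print("occupied cells:", int((grid == 100).sum()))
```

In the actual system the grid would come from the mapping stack and the waypoints from the Lidar tracker; the sketch only shows the painting step that feeds the planner.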
The hardware of the autonomous driving system comprises the vehicle, on-board computers, sensors, and network equipment. To maintain the real-time performance of the sensor data and the convenience of data preprocessing under a unified format, this thesis develops the multi-sensor fusion algorithm on the Robot Operating System (ROS), which provides a modular development environment and a variety of libraries. ROS also gives researchers greater flexibility and shortens the development time of the algorithm.
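As a rough illustration of how such a fusion node can be laid out in ROS 1, the sketch below subscribes to a camera image topic and a Lidar point cloud topic, time-synchronizes the two streams, and publishes an occupancy grid. The topic names, grid parameters, and the placeholder callback body are assumptions for demonstration only and do not reproduce the thesis's actual node.

```python
#!/usr/bin/env python
# Minimal rospy sketch of a camera/Lidar fusion node (illustrative only).
import rospy
import message_filters
from sensor_msgs.msg import Image, PointCloud2
from nav_msgs.msg import OccupancyGrid

def fused_callback(image_msg, cloud_msg):
    # Camera branch: action recognition; Lidar branch: motion prediction.
    # Here we only publish an empty grid to show the data flow.
    grid = OccupancyGrid()
    grid.header.stamp = rospy.Time.now()
    grid.header.frame_id = "map"
    grid.info.resolution = 0.2                      # 0.2 m per cell (assumed)
    grid.info.width = 200
    grid.info.height = 200
    grid.data = [0] * (grid.info.width * grid.info.height)
    grid_pub.publish(grid)

if __name__ == "__main__":
    rospy.init_node("perception_fusion_node")
    grid_pub = rospy.Publisher("/fusion/occupancy_grid", OccupancyGrid, queue_size=1)

    image_sub = message_filters.Subscriber("/camera/image_raw", Image)
    cloud_sub = message_filters.Subscriber("/lidar/points_raw", PointCloud2)
    # Approximate time synchronization keeps the two sensor streams aligned.
    sync = message_filters.ApproximateTimeSynchronizer(
        [image_sub, cloud_sub], queue_size=10, slop=0.1)
    sync.registerCallback(fused_callback)
    rospy.spin()
```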
In this thesis, we apply the Pyramidal Implementation of the Lucas-Kanade Feature Tracker, Yolo v3, and ShuffleNet v2 to obtain the action recognition result for the pedestrian in front of the ego vehicle, and we apply the Interacting Multiple Model, the Unscented Kalman Filter, and Joint Probabilistic Data Association to obtain the motion prediction of the pedestrian. Since camera and Lidar data each have their own characteristics, we fuse the aspects each sensor handles best to gain a better understanding of the pedestrian's intention.
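For the camera branch, OpenCV ships an implementation of Bouguet's pyramidal Lucas-Kanade tracker, which the following sketch uses to follow feature points between two frames; in the actual pipeline such points would lie inside a Yolo v3 pedestrian bounding box. The file names, window size, and pyramid depth are illustrative assumptions rather than the settings used in the thesis.

```python
import cv2

# Two consecutive grayscale camera frames (file names are placeholders).
prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# Shi-Tomasi corners as the features to track.
pts = cv2.goodFeaturesToTrack(prev, maxCorners=100, qualityLevel=0.01, minDistance=7)

# Pyramidal Lucas-Kanade: 21x21 search window, 3 pyramid levels.
next_pts, status, err = cv2.calcOpticalFlowPyrLK(
    prev, curr, pts, None,
    winSize=(21, 21), maxLevel=3,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

# Keep only points that were tracked successfully; their displacement
# approximates the pedestrian's image-plane motion between the two frames.
ok = status.flatten() == 1
flow = (next_pts[ok] - pts[ok]).reshape(-1, 2)
print("mean optical flow (pixels):", flow.mean(axis=0))
```

Per the table of contents, such motion features sit in the camera pipeline alongside HRNet-based pose estimation and the ShuffleNet v2 action classifier.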
Currently, the NCKU autonomous vehicle has been tested in the closed field at Taiwan CAR Lab. We expect the robustness and efficiency of the system to be enhanced gradually so that it can eventually be tested on open roads.

Table of Contents:
    Abstract (Chinese)
    Abstract
    Acknowledgment
    Contents
    List of Tables
    List of Figures
    List of Acronyms
    Chapter 1  Introduction
        1.1  Motivation and Objectives
        1.2  Literature Review
        1.3  Contributions
        1.4  Thesis Structures
    Chapter 2  Autonomous Vehicle System
        2.1  Levels of Vehicle Autonomy
        2.2  NCKU Vehicle Configuration
        2.3  Sensors Introduction
        2.4  On-board Computer Configuration
        2.5  Coordinate Transformation and Systems
            2.5.1  Coordinate Transformation
            2.5.2  Coordinate Systems
    Chapter 3  Methodology
        3.1  Algorithms Overview
        3.2  Camera Pipeline
            3.2.1  YOLO: Real-time Object Detection
            3.2.2  Deep High-Resolution Net: Human Pose Estimation
            3.2.3  Pyramidal Imp. of the Lucas-Kanade Feature Tracker
            3.2.4  ShuffleNet: Action Recognition
        3.3  Lidar Pipeline
            3.3.1  Unscented Kalman Filter
            3.3.2  Multi-Object Tracking System
        3.4  Range Vision Fusion
    Chapter 4  System Development and Simulation
        4.1  Robot Operating System
        4.2  System Architecture
        4.3  Point Cloud Preprocessing
        4.4  Occupancy Grid Map Integration
            4.4.1  Predicted Path Mapping
            4.4.2  Action “WAVE” Painting with Augmented Circle
        4.5  Simulation and Analysis
            4.5.1  Scenario A: Give-way Pedestrian
            4.5.2  Scenario B: Rush-out Pedestrian
            4.5.3  Scenario C: Two Pedestrians Walking Ahead
    Chapter 5  Conclusions and Future Work
    Bibliography


Full-text availability: on campus from 2023-12-31; off campus from 2023-12-31.