| Graduate Student: | 趙宣 (Jhao, Syuan) |
|---|---|
| Thesis Title: | Object Recognition and Single Camera Distance Estimation System of Electric Scooter using Deep Learning |
| Advisor: | 戴政祺 (Tai, Cheng-Chi) |
| Co-advisor: | 羅錦興 (Luo, Ching-Hsing) |
| Degree: | Master |
| Department: | Department of Electrical Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2017 |
| Academic Year: | 105 |
| Language: | English |
| Pages: | 61 |
| Keywords: | Real-time Object Recognition, Distance Estimation, Deep Learning, Embedded System, Single Webcam, Electric Scooter |
| Access Count: | Views: 151, Downloads: 19 |
Taiwan will soon face an aging society. According to statistics from the Ministry of the Interior, the elderly will exceed 20% of the total population by 2025. Demand for electric mobility scooters will therefore grow, and with it the importance of their on-road safety. Traditionally, the most common and effective approaches to road safety have relied on ultrasonic modules or LIDAR, but these conventional methods have limitations. Ultrasonic sensors are low-cost and accurate, but their measurement range is short (typically 2 cm to 400 cm); LIDAR can only scan a single plane and is comparatively expensive. The most significant shortcoming of both is that neither provides the user with an intuitive graphical output. For these reasons, this study combines a single camera with deep learning to perform object recognition and distance estimation, a system applicable to on-road driving of electric mobility scooters.
Object recognition is based on the Single Shot MultiBox Detector (SSD), which uses a single deep neural network to detect objects in an image. SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales at each feature map location. This method was chosen because SSD achieves good accuracy even with small input images, and because it entirely eliminates the proposal-generation and subsequent pixel-resampling stages, encapsulating all computation in a single network. These properties make SSD easy to train and straightforward to integrate into systems that require object detection.
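To illustrate the default-box idea, the following sketch generates SSD-style default boxes for one square feature map: one box per aspect ratio, centered at each cell. This is a simplified, hypothetical reconstruction; the function name, scale value, and aspect-ratio list are illustrative and not taken from the thesis.

```python
import itertools
import math

def default_boxes(fmap_size, scale, ratios):
    """Generate SSD-style default boxes (cx, cy, w, h), normalized to [0, 1],
    for a square feature map of fmap_size x fmap_size cells.
    Each cell center gets one box per aspect ratio; the sqrt keeps
    the box area fixed at scale**2 while the ratio skews its shape."""
    boxes = []
    for i, j in itertools.product(range(fmap_size), repeat=2):
        cx = (j + 0.5) / fmap_size  # center of cell, normalized
        cy = (i + 0.5) / fmap_size
        for r in ratios:
            boxes.append((cx, cy, scale * math.sqrt(r), scale / math.sqrt(r)))
    return boxes

# 3x3 feature map, 3 aspect ratios -> 27 default boxes
boxes = default_boxes(3, 0.2, [1.0, 2.0, 0.5])
```

In the full SSD model this is repeated over several feature maps of decreasing resolution, which is what lets a single network handle objects at multiple scales.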
Experimental results confirm that, trained on the PASCAL VOC dataset, the system can effectively detect common on-road objects such as pedestrians and vehicles. With a 300-pixel by 300-pixel input, the SSD model was trained for 160 hours on an Nvidia GTX 1050 Ti discrete GPU, reaching 24 frames per second with a recognition accuracy of 92.4%. After object recognition, the next step is distance estimation. This study provides two methods: the first performs a geometric conversion based on a vehicle's width in the image and in reality; the second uses the vehicle's vertical position in the image, exploiting the proportional relationship between distance and vertical position. Both methods reach a maximum measurement distance of 40 meters.
In addition, the system can connect to a home computer over the network and stream the single camera's frames, so that the follow-up object-recognition computation is performed on the computer. After the computation, a real-time processed image with the object recognition and distance estimation results is obtained. Thus, not only the user on the electric mobility scooter but also a caregiver at home can watch the live driving view over the network.
This thesis combines a recently popular deep learning model with an ordinary single camera to achieve real-time object recognition and distance estimation, aiming to improve the driving safety of future electric mobility scooters, for our elders and for our future selves.
In the future, Taiwan will face the issue of an aging population. According to statistics from the Ministry of the Interior, the elderly will exceed 20% of the total population by 2025. Thus, demand for electric scooters is growing, and so is the issue of their on-road safety. Traditionally, ultrasonic modules or light detection and ranging (LIDAR) have been popular and efficient solutions for safe driving, but they still have limitations. The advantages of ultrasonic sensors are low cost and high precision, while the disadvantage is a short measurement range (usually 2 cm to 400 cm). LIDAR can only scan a single plane and is expensive if real-time computation is required. The most significant disadvantage is that neither offers a graphical output, the form most readily accepted by human beings. For these reasons, this study provides object recognition and distance estimation with a single webcam, which can be applied to an on-road driving system for electric scooters.
Object recognition is based on the Single Shot MultiBox Detector (SSD). This approach uses a single deep neural network to detect objects in images. SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales at each feature map location. This gives SSD much better accuracy even with a smaller input image size, because it completely eliminates proposal generation and the subsequent pixel-resampling stage, encapsulating all computation in a single network. These properties make SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC dataset confirm that the system can effectively detect objects on the road such as pedestrians and vehicles. For a 300×300 input, SSD achieves 92.4% accuracy on the VOC2007 test set at 24 FPS with an Nvidia GTX 1050 Ti after 160 hours of model training. After object recognition, the next step is to implement distance estimation. This research provides two methods: the first is based on a vehicle's width in the picture and in reality; the second uses the relationship between distance and the vehicle's vertical position in the image. Both methods reach a maximum detection distance of 40 meters.
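The two distance-estimation ideas can be sketched as follows under a standard pinhole-camera model. Both functions are simplified formulations assumed for illustration: the focal length, camera height, and horizon row are hypothetical calibration values, and the thesis's exact formulas may differ. The width-based method converts a known real-world vehicle width and its measured pixel width into a distance by similar triangles; the vertical-position sketch uses the bounding box's bottom edge under a flat-road assumption, where a farther object's bottom edge sits closer to the horizon row.

```python
def distance_from_width(focal_px, real_width_m, pixel_width):
    """Width-based method: similar triangles give D = f * W / w, where
    f is the focal length in pixels, W the real object width in meters,
    and w the object's width in the image in pixels."""
    return focal_px * real_width_m / pixel_width

def distance_from_vertical_position(focal_px, cam_height_m, y_bottom, y_horizon):
    """Vertical-position method (flat-road assumption): with camera height
    H and the object's bottom edge y_bottom below the horizon row y_horizon,
    projective geometry gives D = f * H / (y_bottom - y_horizon)."""
    return focal_px * cam_height_m / (y_bottom - y_horizon)

# Hypothetical example: f = 700 px, a 1.8 m wide car imaged 63 px wide
d = distance_from_width(700, 1.8, 63)  # 20.0 meters
```

In practice the focal length in pixels comes from a one-time camera calibration, and the assumed real-world width (e.g. an average car width) is the main source of error in the first method.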
In addition, the system can connect to a computer at home via the internet and transfer the webcam frames for the follow-up object-recognition computation. After the computation, the system obtains the real-time processed image with the results of object recognition and distance estimation, so not only the user sitting on the electric scooter but also a caregiver at home can see the real-time recognition results.
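The thesis does not detail the transport used to send webcam frames to the home computer. A minimal sketch of one plausible scheme is length-prefixed JPEG frames over a TCP-style byte stream, so the receiver knows where each variable-sized frame ends; this framing format is an assumption, not the thesis's actual protocol.

```python
import struct

def pack_frame(jpeg_bytes):
    """Prepend a 4-byte big-endian payload size to the JPEG data, so the
    receiver can split the continuous byte stream back into frames."""
    return struct.pack(">I", len(jpeg_bytes)) + jpeg_bytes

def unpack_frame(packet):
    """Inverse of pack_frame: read the size header, then slice out exactly
    that many payload bytes (ignoring anything beyond this frame)."""
    (size,) = struct.unpack(">I", packet[:4])
    return packet[4:4 + size]
```

On the scooter side each captured frame would be JPEG-encoded, packed, and written to the socket; the home computer reads the 4-byte header, then reads exactly that many bytes before decoding and running detection.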
This research combines a deep learning model, a low-cost single camera, and an embedded system, and can be extended to other assistive products for personal mobility. We expect this research to enhance safety for drivers of electric scooters.
References
[1] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu and Alexander C. Berg. “SSD: Single Shot MultiBox Detector”. Computer Vision – ECCV 2016. ECCV 2016. Lecture Notes in Computer Science, vol 9905, 2016.
[2] Yu-Chih Cho, Jau-Yuan Shiao, Chuan-Chi Shiao, Fang-Dian Tsai, Pei-Chung Chen and Ching-Hsing Luo. “Embedded Driving Assistive System of Electrical Power Scooter for People with Parkinson’s Disease and the Elderly”. Taiwan Rehabilitation Engineering and Assistive Technology Society, 2015.
[3] Sungji Han, Youngjoon Han and Hernsoo Hahn. “Vehicle Detection Method using Haar-like Feature on Real Time System”. World Academy of Science, Engineering and Technology 59, 2009.
[4] Giseok Kim and Jae-Soo Cho. “Vision-based Vehicle Detection and Inter-Vehicle Distance Estimation”. 12th International Conference on Control, Automation and System, 2012.
[5] Qilong Zhang and Robert Pless. “Extrinsic Calibration of a Camera and Laser Range Finder (improves camera calibration)”. IEEE/RSJ International Conference on Intelligent Robots and Systems, 2004.
[6] Huieun Kim, Youngwan Lee, Byeounghak Yim, Eunsoo Park and Hakil Kim. “On-road object detection using Deep Neural Network”. IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia), 2016.
[7] Huei-Yung Lin, Li-Qi Chen and Yu-Hsiang Lin. “Lane Departure and Front Collision Warning Using a Single Camera”. IEEE International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS 2012) November 4-7, 2012.
[8] Masataka, H., Tetsuo, S., Satoshi, M., Masanori, S., Shunsuke, K. and Takashi, S. “Development of an intelligent mobility scooter”. International Conference on Mechatronics and Automation, 2012.
[9] Joseph Redmon, Santosh Divvala, Ross Girshick and Ali Farhadi. “You Only Look Once: Unified, Real-Time Object Detection”. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[10] Ross Girshick. “Fast R-CNN”. IEEE International Conference on Computer Vision (ICCV), 2015.
[11] Dumitru Erhan, Christian Szegedy, Alexander Toshev and Dragomir Anguelov. “Scalable Object Detection using Deep Neural Networks”. Computer Vision and Pattern Recognition, 2014.
[12] Seokmok Park, Daehee Kim, Sanpil Han, Min-jae Kim and Joonki Paik. “Monocular Fisheye Lens Model-Based Distance Estimation for Forward Collision Warning Systems”. IEEE, 2016.
[13] Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton. “ImageNet Classification with Deep Convolutional Neural Networks”. Neural Information Processing Systems (NIPS). 2012.
[14] Christian Szegedy, Alexander Toshev and Dumitru Erhan. “Deep Neural Networks for Object Detection”. Neural Information Processing Systems (NIPS). 2013.
[15] RS Components. “RPi Users Guide”. 2012.
[16] 趙永科 (Zhao, Yongke). “深度學習-21天實戰Caffe” (Deep Learning: 21 Days of Hands-on Caffe). 2016.
[17] Alexander Mordvintsev and Abid K. “OpenCV-Python Tutorials Documentation”. 2014.