Graduate Student: | 李泓哲 Lee, Hung-Che |
Thesis Title: | 單目場景深度預測演算法之硬體實現 (Hardware Implementation of Monocular Depth Estimation Algorithm) |
Advisor: | 陳培殷 Chen, Pei-Yin |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of Publication: | 2020 |
Academic Year: | 108 |
Language: | English |
Number of Pages: | 24 |
Keywords: | field-programmable gate array (FPGA), monocular depth estimation, block matching, very-large-scale integration (VLSI) |
Depth information has found many applications in recent years, such as 3D reconstruction, mobile video effects, object detection, and advanced driver assistance systems (ADAS). Among traditional image-processing approaches, binocular stereo vision has many mature algorithms, a number of which have been implemented in hardware; in monocular depth estimation, however, current algorithms cannot deliver both real-time processing and high accuracy. This thesis therefore proposes a monocular depth estimation algorithm suited to hardware implementation.
The proposed monocular depth estimation algorithm is based primarily on block matching: block matching is first used to estimate the motion vectors between two consecutive frames, and those motion vectors are then converted into depth information. The algorithm has two further features: (1) We propose two hardware architectures, one based on two-layer block matching and one based on one-layer block matching; the former achieves higher accuracy, while the latter uses fewer hardware resources and runs faster. (2) We propose a constant depth map used for smoothing and for filling in background information, which improves accuracy.
Finally, compared with other algorithms implemented in hardware, the experimental results show that our design achieves higher accuracy and lower hardware cost on the KITTI dataset while reaching real-time processing speed. This thesis uses root mean square error (RMSE) as the depth evaluation metric.
Depth information is used in various applications, such as 3D reconstruction, mobile video effects, object detection, and advanced driver assistance systems (ADAS). In recent studies of depth estimation based on image processing, binocular stereo vision has many mature algorithms implemented on hardware architectures. In monocular depth estimation, however, existing algorithms cannot provide both real-time processing and high accuracy. This thesis therefore presents a monocular depth estimation algorithm suitable for hardware implementation.
The monocular depth estimation algorithm proposed in this thesis is based on the block matching method. We first use block matching to estimate the motion vectors between two consecutive frames, and then convert those motion vectors into depth information. The algorithm has two notable features: (1) We propose two hardware architectures, one based on two-layer block matching and the other on one-layer block matching; the former achieves better accuracy, while the latter uses fewer hardware resources and runs faster. (2) We propose a constant depth map that smooths the result and supplies background information, improving accuracy.
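To illustrate the core idea, the following is a minimal software sketch of exhaustive SAD block matching between two consecutive frames, followed by a simple inverse-magnitude mapping from motion to depth. This is an assumed illustration only: it does not reproduce the thesis's two-layer/one-layer hardware architectures, and the `scale`/`eps` parameters of the depth mapping are hypothetical, not the thesis's actual conversion.

```python
import numpy as np

def block_matching(prev, curr, block=8, search=4):
    """Estimate a motion vector (dy, dx) for each block of `curr` by
    exhaustively searching a +/-`search` window in `prev` and picking
    the candidate with the smallest sum of absolute differences (SAD)."""
    h, w = prev.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=np.int32)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            ref = curr[by:by + block, bx:bx + block].astype(np.int32)
            best, best_sad = (0, 0), np.inf
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue  # candidate block falls outside the frame
                    cand = prev[y:y + block, x:x + block]
                    sad = np.abs(ref - cand).sum()
                    if sad < best_sad:
                        best_sad, best = sad, (dy, dx)
            vectors[by // block, bx // block] = best
    return vectors

def motion_to_depth(vectors, scale=1.0, eps=1e-3):
    """Hypothetical conversion: treat depth as inversely proportional
    to motion magnitude (nearer objects move more between frames)."""
    mag = np.linalg.norm(vectors.astype(np.float64), axis=-1)
    return scale / (mag + eps)
```

For example, shifting a frame by a known offset and running `block_matching` on the pair should recover that offset as the motion vector of each interior block.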
Compared with other algorithms implemented in hardware, the experimental results show that the proposed design achieves better accuracy and lower hardware resource usage on the KITTI dataset while sustaining real-time processing speed. This thesis uses root mean square error (RMSE) to measure accuracy.
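RMSE compares a predicted depth map against ground truth pixel by pixel. A minimal sketch of the metric follows; the optional `valid` mask is an assumption reflecting the common practice of evaluating only pixels with ground-truth depth (KITTI's LiDAR ground truth is sparse), not necessarily the thesis's exact protocol.

```python
import numpy as np

def rmse(pred, gt, valid=None):
    """Root mean square error between predicted and ground-truth depth.
    `valid` optionally restricts the evaluation to pixels that actually
    have ground-truth depth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if valid is None:
        valid = np.ones_like(gt, dtype=bool)
    diff = pred[valid] - gt[valid]
    return np.sqrt(np.mean(diff ** 2))
```

A perfect prediction yields an RMSE of zero; larger values indicate larger average depth errors in the same units as the depth maps (meters for KITTI).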