| Graduate Student: | 嚴聖閎 (Yen, Sheng-Hung) |
|---|---|
| Thesis Title: | 快速SIFT特徵點描述器之硬體設計 (Hardware Implementation for Fast SIFT Keypoint Descriptor) |
| Advisor: | 陳培殷 (Chen, Pei-Yin) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
| Year of Publication: | 2010 |
| Academic Year of Graduation: | 98 |
| Language: | Chinese |
| Pages: | 54 |
| Chinese Keywords: | 尺度不變特徵轉換 (scale invariant feature transform), 物件辨識 (object recognition), VLSI 架構 (VLSI architecture) |
| Foreign Keywords: | scale invariant feature transform (SIFT), object recognition, VLSI architecture |
| Access Count: | Views: 91, Downloads: 6 |
Object matching and image recognition play an important role in many modern computer-vision applications, such as video surveillance, autonomous vehicle navigation, and mobile intelligent robots. A great number of techniques have been proposed over the years, among which local keypoint detection and description algorithms are the most common. Such methods involve two steps: first, locating keypoints in scale space and image coordinates; second, computing a descriptor for each keypoint. The descriptor must be distinctive, yet robust to the appearance changes that viewpoint and illumination impose on objects in an image, so that invariance is maintained.
The scale invariant feature transform (SIFT) is an efficient keypoint detection and description algorithm that is widely used in object recognition and computer vision, being robust and invariant to image rotation, scaling, and illumination changes. However, SIFT has a high computational complexity, so a hardware implementation is needed to raise its speed. In general, SIFT consists of four stages: scale-space extrema detection, keypoint localization, orientation assignment, and keypoint description. Our analysis shows that the descriptor computation accounts for roughly 65% of the overall computational complexity, so we target the descriptor stage and propose an efficient VLSI architecture to accelerate it.
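The first of the four SIFT stages can be illustrated with a minimal software sketch. This is a generic illustration of the standard algorithm (following Lowe's formulation), not the thesis circuit: a sample in the difference-of-Gaussians (DoG) pyramid is a keypoint candidate only if it is an extremum among its 26 neighbours across space and adjacent scales.

```python
# Minimal sketch of SIFT scale-space extrema detection (illustrative
# software model only; the thesis implements the descriptor stage in
# hardware, not this stage).

def is_extremum(dog, s, y, x):
    """Return True if dog[s][y][x] is a strict minimum or maximum of
    its 26 neighbours in a 3x3x3 block of the DoG pyramid. `dog` is a
    list of 2-D images (nested lists) at consecutive scales."""
    centre = dog[s][y][x]
    neighbours = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if not (ds == 0 and dy == 0 and dx == 0)
    ]
    return all(centre > v for v in neighbours) or \
           all(centre < v for v in neighbours)
```

In the full algorithm, candidates passing this test are then refined and filtered in the keypoint-localization stage before a descriptor is computed.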
In the design, we first perform software simulations of the coordinate rotation, Gaussian weighting, trilinear interpolation, and vector normalization operations within the descriptor computation, then propose suitable hardware circuits for them, and finally speed up the circuit with a pipelined design. The circuit is implemented in the TSMC 0.13 μm process: its gate count is 155.13k, the post-layout core size including memory is 1437×1431 μm², and the average power consumption is 72.35 mW. With the pipelined design, the circuit operates at up to 200 MHz and processes 72,385 keypoints per second, an average 35× throughput improvement over the software implementation, while preserving the high matching rate of the SIFT algorithm.
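Three of the four descriptor operations named above can be sketched compactly in software. The following is a hypothetical Python model of coordinate rotation, Gaussian weighting, and vector normalization as defined in Lowe's SIFT paper (including the standard 0.2 clipping threshold); it is an illustration of the reference algorithm, not the thesis RTL.

```python
import math

def rotate(dx, dy, theta):
    """Rotate a sample offset into the keypoint's orientation frame,
    which makes the descriptor invariant to image rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return c * dx - s * dy, s * dx + c * dy

def gaussian_weight(dx, dy, sigma):
    """Weight that down-weights gradient samples far from the
    keypoint centre."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def normalize(vec, clip=0.2):
    """Normalize the (non-negative) descriptor histogram to unit
    length, clip large components to limit the influence of strong
    gradients, then renormalize (Lowe's scheme)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    vec = [min(v / norm, clip) for v in vec]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]
```

The remaining operation, trilinear interpolation, distributes each weighted gradient sample among the eight surrounding bins of the 4×4×8 descriptor histogram in proportion to its distance from each bin centre; it is omitted here for brevity.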
The scale invariant feature transform (SIFT) provides an efficient tool to extract and describe distinctive invariant features from images and finds many applications in image analysis and computer vision. The generation of the SIFT descriptor is computationally intensive, so the design and implementation of a high-speed hardware accelerator is desirable. In this thesis, we present a VLSI architecture to compute the SIFT descriptor efficiently. To achieve this objective, we first evaluate the workload of the SIFT algorithm and show that the SIFT descriptor computation demands the highest cost. We then design proper hardware to realize this process accordingly and adopt a pipelining technique to further enhance the speed of the design. The circuit of the proposed SIFT descriptor computational unit contains 155.13k gates with a core size of 1437×1431 μm² using the cell library in TSMC 0.13 μm technology. It operates at a clock rate of 200 MHz with an average power consumption of 72.35 mW.
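As a back-of-the-envelope consistency check on the reported figures, the 200 MHz clock and the reported throughput of 72,385 keypoints per second (assuming both numbers refer to the same pipelined configuration) imply the per-keypoint cycle budget computed below:

```python
# Rough cycle-budget check from the two reported figures; this is
# simple arithmetic on the numbers above, not a claim about the
# internal pipeline structure.
clock_hz = 200e6           # reported operating frequency
keypoints_per_sec = 72385  # reported descriptor throughput

cycles_per_keypoint = clock_hz / keypoints_per_sec
print(round(cycles_per_keypoint))  # -> 2763
```

That is, the accelerator completes one 128-element descriptor roughly every 2,763 clock cycles on average.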