| Student: | 尤宏恩 Yu, Hung-En |
|---|---|
| Thesis Title: | 以FPGA實現一即時臺灣車牌字元之卷積神經辨識器 (An FPGA Implementation of a Real-Time CNN Discriminator for Taiwan License Plate Characters) |
| Advisors: | 陳進興 Chen, Chin-Hsing; 張名先 Chang, Ming-Xian |
| Degree: | Master |
| Department: | Institute of Computer & Communication Engineering, College of Electrical Engineering & Computer Science |
| Publication Year: | 2025 |
| Graduation Academic Year: | 113 |
| Language: | English |
| Pages: | 83 |
| Keywords: | Field-Programmable Gate Array, Real-time, Image Processing, License Plate Character Recognition, Convolutional Neural Network |
In recent years, with the rapid advancement of technology and the progress of industrialization, many services and devices have tended toward automation, and image recognition has naturally become one of the key areas of development. In addition to traditional recognition methods, the explosive growth of data in the Internet era and the maturity of graphics processing unit (GPU) technology have led to the increasing adoption of deep learning for recognition tasks, among which license plate character recognition has emerged as a particularly popular research topic.
This thesis employs a field-programmable gate array (FPGA) as the implementation platform and adopts a convolutional neural network (CNN), a deep-learning model, as the recognition method to realize a real-time license plate character recognition system. The input layer accepts a 30×30 grayscale image, and the output layer produces 34 categories, representing all characters that may appear on a plate. The implementation also integrates a camera and a VGA display. Guided by prompts on the VGA display, the user aligns the target license plate character with the center of the camera's view, after which the captured character image is fed into the CNN model for recognition. The recognition result is then shown in the upper-right corner of the VGA display. The entire process operates in real time, thereby simulating continuous recognition scenarios in real-world applications.
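The pipeline above can be sketched as a single CNN forward pass. The 34 classes plausibly correspond to the ten digits plus the 24 uppercase letters excluding I and O (omitted on Taiwan plates to avoid confusion with 1 and 0), but that class set, and the layer configuration below (one 3×3 convolution with 8 filters, one 2×2 max-pool, one dense layer), are illustrative assumptions — the thesis's actual network may differ:

```python
import numpy as np

# Assumed class set: 10 digits + 24 letters (I and O excluded on Taiwan plates).
CLASSES = list("0123456789ABCDEFGHJKLMNPQRSTUVWXYZ")
assert len(CLASSES) == 34

rng = np.random.default_rng(0)

def conv2d(img, kernels):
    """'Valid' 2-D convolution: (H, W) image, (K, K, F) kernels -> (H-K+1, W-K+1, F)."""
    k = kernels.shape[0]
    h, w = img.shape[0] - k + 1, img.shape[1] - k + 1
    out = np.empty((h, w, kernels.shape[2]))
    for i in range(h):
        for j in range(w):
            # Dot each KxK patch with every filter at once.
            out[i, j] = np.tensordot(img[i:i+k, j:j+k], kernels,
                                     axes=([0, 1], [0, 1]))
    return out

def maxpool2(x):
    """Non-overlapping 2x2 max-pooling over the spatial dimensions."""
    h, w, f = x.shape
    return x[:h//2*2, :w//2*2].reshape(h//2, 2, w//2, 2, f).max(axis=(1, 3))

def forward(img, k1, w_fc, b_fc):
    x = np.maximum(conv2d(img, k1), 0)   # conv + ReLU: 30x30   -> 28x28x8
    x = maxpool2(x)                      # 2x2 max-pool:        -> 14x14x8
    z = x.reshape(-1) @ w_fc + b_fc      # dense: 1568          -> 34
    e = np.exp(z - z.max())
    return e / e.sum()                   # softmax over the 34 classes

img = rng.random((30, 30))               # stand-in for one grayscale character
k1 = rng.standard_normal((3, 3, 8)) * 0.1
w_fc = rng.standard_normal((14 * 14 * 8, 34)) * 0.01
b_fc = np.zeros(34)
probs = forward(img, k1, w_fc, b_fc)
print(CLASSES[int(probs.argmax())])      # most likely character class
```

With trained weights in place of the random arrays, `argmax` over the softmax output gives the recognized character.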
The parameters of the CNN were trained in software, achieving a recognition accuracy of 83.49%. In addition, real license plate character images were used to simulate the hardware data flow, yielding a recognition accuracy of 81.46%. Finally, real-time testing was conducted on hardware using the camera to capture live images, resulting in a recognition accuracy of 77.97%. These results demonstrate that the proposed system achieves a satisfactory level of performance in license plate character recognition.
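Each reported rate is a plain correct-over-total ratio on that stage's test set. A minimal helper (the function name is hypothetical) makes the metric explicit:

```python
def accuracy(predicted, actual):
    """Fraction of character predictions that match the ground truth."""
    if len(predicted) != len(actual):
        raise ValueError("prediction/label lists must be the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# e.g. the hardware camera test's 77.97% means roughly 78 of every 100
# captured characters were classified correctly.
print(accuracy(list("ABCD"), list("ABCE")))  # 0.75
```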