National Cheng Kung University Master's and Doctoral Thesis System


Graduate student:	Hsu, Pai-Kuan (徐百寬)
Thesis title:	A Study on Caption Detection, Tracking and Segmentation in Video (影片中字幕偵測、追蹤及切割方法之研究)
Advisor:	Wang, Ming-Shi (王明習)
Degree:	Master
Department:	College of Engineering - Department of Engineering Science
Year of publication:	2003
Academic year of graduation:	91 (ROC calendar)
Language:	Chinese
Number of pages:	61
Chinese keywords:	文字切割、文字追蹤、文字偵測
English keywords:	Text Segmentation, Text Tracking, Text Detection

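Captions in video frames help us understand a video's content. If the caption text can be extracted, it supports tasks such as annotating and indexing the contents of a video database. This thesis proposes a system that automatically detects, tracks, and segments caption text in video files to obtain information about what the frames show.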
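The text detection method applies edge detection operators to obtain the vertical and horizontal edge maps of each frame. Vertical edge information is first used to find candidate text regions, and horizontal edge information is then used to eliminate falsely detected noise regions. Because the same caption persists in a video for some time, the detected text regions are then tracked: feature-point-based image matching identifies the same stationary caption region across consecutive frames. Experimental results show that text detection combined with text tracking can process video in real time.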
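The tracking step can be illustrated with a minimal sketch. It assumes grayscale text regions stored as lists of rows, picks feature points where the horizontal intensity step is largest, and treats two regions as the same caption when the mean absolute intensity difference at those points is small. The function names, the feature-point criterion, and the tolerance `max_mean_diff` are hypothetical; the abstract does not specify the feature points or matching rule actually used in the thesis.

```python
def select_feature_points(region, count):
    """Return the (y, x) positions of the `count` largest horizontal intensity steps.

    Assumption: strong steps mark caption strokes; this is an illustrative
    criterion, not the one defined in the thesis.
    """
    scored = []
    for y, row in enumerate(region):
        for x in range(len(row) - 1):
            scored.append((abs(row[x + 1] - row[x]), y, x))
    # Strongest steps first; ties broken by position so the result is deterministic.
    scored.sort(key=lambda s: (-s[0], s[1], s[2]))
    return [(y, x) for _, y, x in scored[:count]]


def same_caption(prev_region, next_region, points, max_mean_diff=20):
    """Compare two equally sized regions only at the feature points.

    Returns True when the mean absolute difference is within the
    (hypothetical) tolerance, meaning the caption is assumed unchanged.
    """
    if not points:
        return False
    diffs = [abs(prev_region[y][x] - next_region[y][x]) for y, x in points]
    return sum(diffs) / len(diffs) <= max_mean_diff
```

Comparing only a handful of points rather than every pixel keeps the per-frame cost low, which fits the real-time result the abstract reports, though the actual speed depends on details not given here.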
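The text segmentation method uses multi-frame integration to increase the difference between the text and the background within a text region, and then applies image binarization to extract the text. Experimental results show that when general-purpose Chinese character recognition software designed for still images is applied to video frames, the recognition accuracy rises from 34% to about 80%.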

The captions in a video frame can help us understand the content of the video. If the captions in a video can be detected, segmented, and recognized automatically, they become valuable for video annotation and indexing. This thesis proposes an automatic caption detection, tracking, and segmentation system for obtaining information about video content.
The text detection method applies edge detection operators to the video frames to obtain their vertical and horizontal edge maps. The vertical edge information is used to detect candidate text areas; the horizontal edge information is then applied to eliminate false candidates. Because the same caption in a video stays on screen for several successive frames, the detected areas are tracked into the next frame, using a feature-based image matching technique to follow the same text area. Experimental results show that the proposed text detection and tracking scheme can process video in real time.
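The two-stage detection idea can be sketched as follows. This is a simplified illustration, not the thesis's algorithm: it assumes a grayscale frame stored as a list of rows, uses plain forward differences as the edge operator, and groups rows by edge density instead of using the connected-component step listed in the table of contents. All function names and thresholds are hypothetical.

```python
def vertical_edge_map(img, thresh):
    """Mark pixels whose right-hand neighbour differs strongly (vertical edges)."""
    w = len(img[0])
    return [[1 if x < w - 1 and abs(row[x + 1] - row[x]) >= thresh else 0
             for x in range(w)] for row in img]


def horizontal_edge_map(img, thresh):
    """Mark pixels whose lower neighbour differs strongly (horizontal edges)."""
    h, w = len(img), len(img[0])
    return [[1 if y < h - 1 and abs(img[y + 1][x] - img[y][x]) >= thresh else 0
             for x in range(w)] for y in range(h)]


def candidate_row_bands(edge_map, min_density):
    """Group consecutive rows with enough edge pixels into (top, bottom) bands."""
    bands, start = [], None
    w = len(edge_map[0])
    for y, row in enumerate(edge_map):
        dense = sum(row) / w >= min_density
        if dense and start is None:
            start = y
        elif not dense and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(edge_map) - 1))
    return bands


def detect_caption_bands(img, thresh=50, min_v_density=0.3, min_h_density=0.3):
    """Stage 1: candidates from vertical edges. Stage 2: keep bands that
    also contain enough horizontal edges (thresholds are illustrative)."""
    v_map = vertical_edge_map(img, thresh)
    h_map = horizontal_edge_map(img, thresh)
    w = len(img[0])
    kept = []
    for top, bottom in candidate_row_bands(v_map, min_v_density):
        area = (bottom - top + 1) * w
        h_count = sum(sum(h_map[y]) for y in range(top, bottom + 1))
        if h_count / area >= min_h_density:
            kept.append((top, bottom))
    return kept
```

In this sketch, a band of vertical stripes (a fence, for example) passes the first stage because it is rich in vertical edges, but it is discarded in the second stage because it has few horizontal edges, whereas character strokes produce both.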
A multi-frame integration method is used to enhance the difference between the text and the background of the text area. The enhanced image is then binarized to extract the text, which segments out its characters. Commercial Chinese OCR software was applied to recognize the segmented characters, and the recognition rate rose from 34% to about 80%.
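A minimal sketch of multi-frame integration followed by binarization is given here under stated assumptions: the caption is lighter than the background, the frames are already aligned crops of the same text region with 8-bit integer pixels, integration is a per-pixel minimum, and the threshold comes from Otsu's method. The abstract names neither the integration rule nor the thresholding method, so both are stand-ins, and the function names are hypothetical.

```python
def integrate_min(frames):
    """Per-pixel minimum over aligned frames.

    A stationary light caption keeps its value, while a changing
    background is pushed toward its darkest appearance.
    """
    h, w = len(frames[0]), len(frames[0][0])
    return [[min(f[y][x] for f in frames) for x in range(w)] for y in range(h)]


def otsu_threshold(img):
    """Return the 8-bit threshold that maximizes between-class variance."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]            # pixels with value <= t
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0          # pixels with value > t
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(img, t):
    """Label pixels brighter than t as text (1) and the rest as background (0)."""
    return [[1 if p > t else 0 for p in row] for row in img]
```

For example, if the background behind a caption is bright in one frame and dark in another, the integrated image keeps the dark background value at every non-text pixel, so a single global threshold separates the characters more easily than it would on any one frame.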

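Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures

Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Organization

Chapter 2 Review of Related Literature
2.1 Text Detection Methods
2.1.1 Color-Similarity-Based Methods
2.1.2 Edge-Based Methods
2.1.3 Texture-Classification-Based Methods
2.2 Text Tracking Methods
2.2.1 Fixed-Position-Based Methods
2.2.2 Methods Based on Pixel-Value Comparison
2.2.3 Methods Based on Comparison of Special Statistics
2.3 Text Segmentation Methods

Chapter 3 Caption Text Detection, Tracking, and Segmentation Algorithms
3.1 System Overview
3.2 Text Detection Method
3.2.1 Edge Map Generator
3.2.2 Connected-Component Method
3.2.3 Verification of Text Regions
3.3 Text Tracking Method
3.3.1 Selection of Feature Points
3.3.2 Feature-Point-Based Image Matching Method
3.4 Text Segmentation Method
3.4.1 Image Enhancement: Multi-Frame Integration
3.4.2 Image Binarization

Chapter 4 System Implementation and Experimental Results
4.1 System Environment and User Interface
4.2 Experimental Results
4.2.1 Text Detection
4.2.2 Text Tracking
4.2.3 Text Segmentation

Chapter 5 Conclusions and Future Research Directions
5.1 Conclusions
5.2 Future Research Directions

References
Appendix 1 Statistics on the Duration of the Same Caption in Video
Appendix 2 Result Images of Caption Text Detection and Segmentation in Video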
                                    


Publicly available since 2004-07-10
