| Graduate student: | 徐百寬 Hsu, Pai-Kuan |
|---|---|
| Thesis title: | 影片中字幕偵測、追蹤及切割方法之研究 A Study on Caption Detection, Tracking and Segmentation in Video |
| Advisor: | 王明習 Wang, Ming-Shi |
| Degree: | 碩士 Master |
| Department: | 工學院 - 工程科學系 College of Engineering - Department of Engineering Science |
| Year of publication: | 2003 |
| Academic year of graduation: | 91 |
| Language: | Chinese |
| Number of pages: | 61 |
| Chinese keywords: | 文字切割、文字追蹤、文字偵測 |
| English keywords: | Text Segmentation, Text Tracking, Text Detection |
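Captions in video frames help us understand the content of a video. If the text of the captions can be obtained, it will help with tasks such as content annotation and indexing for video databases. This thesis proposes a system that automatically performs caption text detection, tracking, and segmentation on video files in order to obtain information about the content of the video frames.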
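The text detection method uses edge detection operators to compute the vertical and horizontal edge maps of each video frame. Vertical edge information is first used to find possible text regions, and horizontal edge information is then used to eliminate falsely detected noise regions. Because the same caption persists in a video for a period of time, the detected text regions are tracked: feature-point image matching is used to find the same stationary caption text region across consecutive frames. Experimental results show that text detection combined with text tracking can process video in real time.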
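The two-stage edge test can be illustrated with a minimal, dependency-free Python sketch. It is not the thesis implementation: the forward-difference edge operator, the row-band grouping, the thresholds, and every function name (`vertical_edge_map`, `horizontal_edge_map`, `detect_text_rows`) are assumptions made for illustration. In particular, the abstract does not say how horizontal edges are used to reject noise; the sketch assumes that real text contains strokes in both directions, so a band with many vertical edges but few horizontal ones is discarded.

```python
def vertical_edge_map(gray):
    """Vertical edges: strength of the intensity change along each row."""
    w = len(gray[0])
    return [[abs(row[x + 1] - row[x]) if x < w - 1 else 0 for x in range(w)]
            for row in gray]


def horizontal_edge_map(gray):
    """Horizontal edges: strength of the intensity change along each column."""
    h, w = len(gray), len(gray[0])
    return [[abs(gray[y + 1][x] - gray[y][x]) if y < h - 1 else 0
             for x in range(w)]
            for y in range(h)]


def detect_text_rows(gray, edge_thresh=50, row_density=0.2, min_h_density=0.3):
    """Return (top, bottom) row bands that look like caption text.

    All thresholds are illustrative assumptions, not values from the thesis.
    """
    h, w = len(gray), len(gray[0])
    v_edges = vertical_edge_map(gray)
    h_edges = horizontal_edge_map(gray)

    # Step 1: a row is a candidate if enough of its pixels are vertical edges.
    dense = [sum(e > edge_thresh for e in row) / w >= row_density
             for row in v_edges]

    # Group consecutive candidate rows into bands.
    bands, start = [], None
    for y, flag in enumerate(dense + [False]):
        if flag and start is None:
            start = y
        elif not flag and start is not None:
            bands.append((start, y - 1))
            start = None

    # Step 2: reject bands that lack horizontal-edge support (assumed rule).
    kept = []
    for top, bottom in bands:
        area = (bottom - top + 1) * w
        strong = sum(e > edge_thresh
                     for y in range(top, bottom + 1)
                     for e in h_edges[y])
        if strong / area >= min_h_density:
            kept.append((top, bottom))
    return kept
```

The input `gray` is a grayscale frame stored as a list of rows of integer intensities. The sketch only localizes rows; a fuller version would also bound each band horizontally.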
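The text segmentation method uses multi-frame integration to increase the difference between the text and the background within a text region, and then applies image binarization to extract the text. Experimental results show that when Chinese character recognition software designed for general images is applied to video frames, the recognition accuracy rises from 34% to about 80%.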
Captions in video frames help viewers understand the content of a video. If the captions can be detected, segmented, and recognized automatically, they become valuable for video annotation and indexing. This thesis proposes an automatic caption detection, tracking, and segmentation system for extracting information about video content.
The text detection method applies edge detection operators to each video frame to obtain its vertical and horizontal edge maps. The vertical edge information is used to locate candidate text areas, and the horizontal edge information is then used to eliminate false candidates. Because the same caption stays on screen for several successive frames, each detected area is tracked into the following frames, using feature-based image matching to identify the same text area. Experimental results show that the proposed detection and tracking scheme can process video in real time.
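The tracking step can be sketched in the same spirit. The following Python is a rough illustration under stated assumptions, not the method used in the thesis: feature points are taken to be the pixels with the strongest horizontal gradient inside the detected box, the caption is assumed to be stationary, and two frames are treated as showing the same caption when the mean absolute intensity difference at those points is small. The names `pick_feature_points`, `same_caption`, and `track_caption`, the box layout, and the threshold are all hypothetical.

```python
def pick_feature_points(gray, box, count=16):
    """Pick the `count` pixels with the strongest gradient inside the box.

    `box` is (top, bottom, left, right), all inclusive (assumed layout).
    """
    top, bottom, left, right = box
    scored = []
    for y in range(top, bottom + 1):
        for x in range(left, right):  # x + 1 must stay inside the box
            scored.append((abs(gray[y][x + 1] - gray[y][x]), y, x))
    scored.sort(reverse=True)
    return [(y, x) for _, y, x in scored[:count]]


def same_caption(prev, curr, box, max_mean_diff=20, count=16):
    """Compare intensities at the feature points of `prev` in both frames."""
    points = pick_feature_points(prev, box, count)
    if not points:
        return False
    diff = sum(abs(prev[y][x] - curr[y][x]) for y, x in points) / len(points)
    return diff <= max_mean_diff


def track_caption(frames, start, box, **kwargs):
    """Return the index of the last frame that still shows the caption
    detected in frames[start] at the fixed position `box`."""
    last = start
    for i in range(start + 1, len(frames)):
        if not same_caption(frames[start], frames[i], box, **kwargs):
            break
        last = i
    return last
```

Because only a handful of points are compared per frame, a check of this kind is cheap, which is consistent with the real-time behavior reported in the abstract, although the sketch itself makes no performance claim.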
A multi-frame integration method enhances the contrast between the text and the background within the text area. The enhanced image is then binarized to extract the text, so that its characters can be segmented. Commercial Chinese OCR software was applied to the segmented characters, and the recognition rate rose from 34% to about 80%.
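The abstract does not state which integration operator or binarization rule is used, so the sketch below substitutes two common choices and should be read as an assumption: a per-pixel minimum over the tracked frames (suitable when the caption is brighter than a changing background) followed by Otsu's global threshold. The function names `integrate_min`, `otsu_threshold`, and `binarize` are hypothetical.

```python
def integrate_min(regions):
    """Per-pixel minimum over aligned text regions from tracked frames.

    Bright, stationary caption pixels stay bright, while a changing
    background tends to be dark in at least one frame (assumption).
    """
    h, w = len(regions[0]), len(regions[0][0])
    return [[min(r[y][x] for r in regions) for x in range(w)]
            for y in range(h)]


def otsu_threshold(img):
    """Otsu's method on 8-bit intensities: maximize between-class variance."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(img):
    """Return a 0/1 mask where 1 marks pixels brighter than the threshold."""
    t = otsu_threshold(img)
    return [[1 if v > t else 0 for v in row] for row in img]
```

A typical use would be `binarize(integrate_min(regions))`, where `regions` holds the same caption box cropped from each tracked frame. For dark text on a bright background, the per-pixel maximum would be the analogous choice.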