| Author: | 王偉丞 Wang, Wei-Cheng |
|---|---|
| Thesis title: | 基於時空一致性之監視影片標記演算法 Spatiotemporal Coherence based Annotation Placement for Surveillance Videos |
| Advisor: | 詹寶珠 Chung, Pau-Choo |
| Degree: | Master |
| Department: | College of Electrical Engineering & Computer Science - Institute of Computer & Communication Engineering |
| Year of publication: | 2016 |
| Graduation academic year: | 104 |
| Language: | English |
| Number of pages: | 56 |
| Chinese keywords: | 監視器影片, 特徵描述, 馬可夫隨機場 |
| English keywords: | Video surveillance, Feature representation, Markov random fields |
In this thesis, we propose to exploit the temporal and spatial coherence between foreground objects and their annotations to present information about foreground objects in surveillance videos. We formulate annotation placement as a nonlinear optimization problem and solve it using Markov random fields. To the best of our knowledge, this is the first work that studies and computes annotation positions at the level of foreground-object trajectories, based on the relationships between foreground objects and their annotations. As shown in the experiments, the proposed method places annotations in the video according to the trajectories of foreground objects while avoiding overlap between annotations and foreground objects. Our results are superior to those of comparable existing work, both qualitatively and quantitatively.
In this paper, we propose a novel annotation placement approach for revealing information about foreground objects in surveillance videos. To arrange the positions of annotations, the spatiotemporal coherence between annotations and foreground objects is exploited. The annotation placement problem is formulated as an optimization problem with respect to the spatiotemporal coherence between annotations and foreground objects, and this optimization problem is effectively solved using Markov random fields (MRFs). To the best of our knowledge, this is the first work to discuss and solve the annotation placement problem for surveillance videos by considering the relationships between annotations and foreground objects along their trajectories. As shown in the experiments, the proposed approach arranges annotations based on the moving trajectories of foreground objects and prevents occlusions between annotations and foreground objects. It also achieves better quantitative and qualitative results than state-of-the-art approaches.
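The abstract only summarizes the formulation; the full energy terms are defined in the thesis itself. As a rough, hedged illustration of the general idea (not the author's actual model), placing one object's annotation over time can be cast as a chain-structured MRF: each frame has a set of candidate annotation positions, a data term penalizes overlap between the annotation box and foreground objects, and a pairwise term penalizes large jumps between consecutive frames. For a single trajectory such a chain can be minimized exactly with dynamic programming. All function names, candidate sets, and cost weights below are illustrative assumptions.

```python
# Illustrative sketch only: chain-structured MRF for placing one object's
# annotation over T frames. Candidate positions, cost weights, and the
# overlap measure are assumptions for demonstration, not the thesis' terms.
import numpy as np

def overlap_area(box_a, box_b):
    """Overlap area of two axis-aligned boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    return ix * iy

def place_annotations(object_boxes, candidates, label_size, smooth_weight=1.0):
    """
    object_boxes: list over frames; each entry is a list of foreground boxes.
    candidates:   list over frames; each entry is a list of candidate (x, y)
                  anchor points for the annotation in that frame.
    Returns one candidate index per frame minimizing
        sum_t data(t, p_t) + smooth_weight * sum_t ||p_t - p_{t-1}||.
    """
    T = len(candidates)
    w, h = label_size

    def data_cost(t, p):
        # Penalize covering any foreground object in frame t.
        x, y = candidates[t][p]
        return sum(overlap_area((x, y, w, h), fg) for fg in object_boxes[t])

    def pair_cost(t, p, q):
        # Temporal coherence: penalize jumps between consecutive frames.
        (x1, y1), (x2, y2) = candidates[t - 1][p], candidates[t][q]
        return smooth_weight * np.hypot(x1 - x2, y1 - y2)

    # Viterbi-style dynamic programming over the chain.
    cost = [data_cost(0, p) for p in range(len(candidates[0]))]
    back = []
    for t in range(1, T):
        new_cost, new_back = [], []
        for q in range(len(candidates[t])):
            totals = [cost[p] + pair_cost(t, p, q)
                      for p in range(len(candidates[t - 1]))]
            best = int(np.argmin(totals))
            new_cost.append(totals[best] + data_cost(t, q))
            new_back.append(best)
        cost, back = new_cost, back + [new_back]

    # Trace back the minimizing sequence of candidate indices.
    path = [int(np.argmin(cost))]
    for t in range(T - 1, 0, -1):
        path.append(back[t - 1][path[-1]])
    return path[::-1]
```

Because each object's annotation forms a chain over frames, this toy version is solvable exactly; when multiple annotations must also avoid each other (as in the thesis), the graph is no longer a chain and approximate MRF inference such as graph cuts or belief propagation would typically be used instead.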