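Graduate student: | 林士勛 Lin, Shih-Syun |
---|---|
Thesis title: | 以內容感知為主最佳化於圖像與影片之應用 Content-aware Optimization for Image and Video Applications |
Advisor: | 李同益 Lee, Tong-Yee |
Degree: | Doctor |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2015 |
Academic year of graduation: | 103 (ROC calendar) |
Language: | English |
Number of pages: | 101 |
Keywords (Chinese): | 非線性縮放 (non-linear resizing), 網格變形 (mesh warping), 最佳化 (optimization), 資訊視覺化 (information visualization), QR碼 (QR code) |
Keywords (English): | Content-aware media retargeting, mesh warping, cropping, optimization, map generation, information visualization, QR code |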
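In recent years, the rise of display devices of different sizes, such as TVs, smartphones, tablets, computer monitors, and stereoscopic displays, has driven the development of (stereoscopic) image and (stereoscopic) video retargeting techniques. Most non-linear resizing methods detect the important regions of an image and preserve their shapes, at the cost of larger distortion in the unimportant regions. These non-linear warping approaches ignore the consistency of object deformation, which can cause visibly distorted objects or structure lines, stereoscopic disparity distortion, and temporal incoherence. This thesis therefore proposes a three-dimensional, object-oriented warping and cropping method that resolves inconsistent object deformation and preserves object shape, stereoscopic disparity, and temporal coherence. In the optimization design, every object in the image undergoes a similarity transformation to maintain consistency and temporal coherence: salient content keeps its original proportions, while objects of low importance are warped as close to a linear transformation as possible. In addition, flexible boundary constraints are incorporated into the optimization so that salient objects stay inside the retargeted video, and the regions outside it are cropped away. The method applies to the resizing of single images, videos, and stereoscopic media. Qualitative and quantitative analyses show that its results are better than those of recent state-of-the-art methods.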
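The technique is also applied to maps. Tourist maps and destination maps are special-purpose maps, generally designed around a specific theme, in which the topology of the road network is usually more important than the geometric accuracy of the roads. The goal of the road network warping method in this thesis is therefore to make it easier to produce various special-purpose maps and to change the theme a map represents. The idea is to deform the road network according to the user's mental map, using an optimization procedure to carry out the warping. The proposed method includes a novel algorithm for estimating road significance, together with geometric, aesthetic, and user-specified constraints that drive the deformation. The resulting maps can serve as tourist and destination maps, so they not only support intuitive route planning and navigation but also present a thematic map visually. The tourist and destination maps generated in the experiments show that the system can effectively produce a variety of special-purpose maps.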
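QR codes are commonly used to embed messages, and people can conveniently read them with a decoding device to obtain the information. With the rapid rise of QR codes in recent years, they appear in more and more forms in daily life: besides the usual black-and-white pattern, artists and designers render them in other colors or patterns, for example turning text, logos, or images into visually appealing QR codes. In the last few years, researchers have added aesthetic elements to generate attractive QR codes automatically, but the codes produced by their methods are still not attractive enough. This thesis therefore proposes a two-stage method for generating high-quality, visually appealing QR codes. In the first stage, Gauss-Jordan elimination is used to modify the original QR code into a rough baseline QR code. In the second stage, an optimized rendering mechanism blends the image with the baseline QR code into a high-quality result, while maintaining the readability of the QR code and preserving the features of the original color image. The experimental results show that the method improves the appearance of QR codes and that the optimization runs in real time.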
In recent years, (stereoscopic) image and video retargeting has rapidly become an important technology because display devices now come in many resolutions: TVs, smartphones, tablet PCs, 3D displays, and so on. Most non-linear warping methods for image and video resizing preserve the aspect ratios of prominent regions and distort unimportant content. However, because they ignore the consistency of object deformation, these methods sometimes cause significant distortion of objects or structure lines, depth distortion, and unnatural temporal motion. To address these problems, we propose volumetric warping with floating boundaries combined with object-aware cropping. In the proposed scheme, visually salient objects in the space-time domain are deformed as rigidly and as consistently as possible by using information from matched objects and content-aware boundary constraints. During warping, the content-aware boundary constraints retain visually salient content in a fixed region with the desired resolution and aspect ratio, called the critical region. Volumetric cropping with the fixed critical region is then performed to adjust stereoscopic videos to the desired aspect ratio. Warping and cropping with floating boundaries and spatiotemporal constraints allow our method to preserve, as consistently as possible, the temporal motion and spatial shapes of visually salient volumetric objects in the left and right videos, which leads to good content-aware retargeting. In addition, because it considers shape, motion, and disparity preservation, the proposed scheme can be applied to various media, including images, stereoscopic images, videos, and stereoscopic videos. Qualitative and quantitative analyses on stereoscopic videos with diverse camera movements and considerable motion demonstrate a clear superiority of the proposed method over related methods in terms of retargeting quality.
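The trade-off described above, keeping salient content close to its original shape while unimportant content absorbs the distortion, can be illustrated with a deliberately simplified one-dimensional sketch. This is not the thesis's volumetric method: it assumes an image reduced to columns of unit width with one positive saliency weight each, and the function name `retarget_column_widths` is hypothetical. It minimizes the weighted deviation sum_i w_i (s_i - 1)^2 subject to sum_i s_i = target width, which has a closed-form solution through a Lagrange multiplier.

```python
def retarget_column_widths(saliency, target_width):
    """Return new column widths s_i that minimize
    sum_i w_i * (s_i - 1)^2 subject to sum_i s_i = target_width.

    Assumption (illustration only): every source column has width 1 and
    every saliency weight w_i is positive.
    """
    if any(w <= 0 for w in saliency):
        raise ValueError("saliency weights must be positive")
    n = len(saliency)
    inverse = [1.0 / w for w in saliency]
    total_inverse = sum(inverse)
    delta = target_width - n  # total width change to distribute
    # Each column takes a share of delta proportional to 1 / w_i,
    # so highly salient columns change the least.
    return [1.0 + delta * v / total_inverse for v in inverse]


# Six columns, the two middle ones salient; shrink total width from 6 to 4.5.
widths = retarget_column_widths([1.0, 1.0, 10.0, 10.0, 1.0, 1.0], 4.5)
print([round(w, 3) for w in widths])
```

In this toy setting the salient columns stay near width 1 while the others shrink. Note that a very aggressive target width could drive some widths negative; this sketch does not guard against that, whereas a practical warp would add constraints such as non-negativity or cropping.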
In addition to the resizing method, we extend our warping technique to maps. Tourist and destination maps are thematic maps, each designed to represent a specific theme. In these maps, the topology of the road network is generally more important than the geometric accuracy of the roads. A road network warping method is proposed to facilitate map generation and improve theme representation in maps. The basic idea is to deform a road network to match a user-specified mental map, while an optimization process propagates the distortions that the warping introduces. To generate a map, the proposed method includes algorithms for estimating road significance and for deforming a road network according to various geometric and aesthetic constraints. The proposed method can produce an iconic mark of a theme from a road network and meet a user-specified mental map. The resulting map can therefore serve as a tourist or destination map that not only provides visual aids for route planning and navigation tasks, but also visually emphasizes the presentation of a theme for the purpose of advertising. In the experiments, the generated examples show that our method enables map generation systems to produce deformed tourist and destination maps efficiently.
QR codes are generally used for embedding messages so that people can conveniently capture a code with a mobile device and acquire the information through a QR code reader. In the past, QR code generators were designed only to achieve high decodability, and the codes they produced usually looked like random black-and-white patterns without visual semantics. In recent years, researchers have tried to endow QR codes with aesthetic elements, and QR code beautification has been formulated as an optimization problem that minimizes the visual perception distortion subject to an acceptable decoding rate. However, the visual quality of the QR codes generated by existing methods still leaves much to be desired. In this work, we propose a two-stage approach to generate QR codes with high-quality visual content. In the first stage, a baseline QR code with reliable decodability but poor visual quality is synthesized using the Gauss-Jordan elimination procedure. In the second stage, a rendering mechanism is designed to improve the visual quality without affecting the decodability of the QR code. The experimental results show that the proposed method substantially enhances the appearance of the QR code and that the processing runs in near real time.
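The abstract does not spell out how Gauss-Jordan elimination is applied in the first stage. As a minimal, generic sketch, and assuming that the relevant arithmetic is over GF(2) because QR modules are bits, the following solves a binary linear system A x = b with XOR as addition. The function name `solve_gf2` and the example system are hypothetical and are not taken from the thesis.

```python
def solve_gf2(rows, rhs):
    """Solve A x = b over GF(2) by Gauss-Jordan elimination.

    rows: list of equal-length lists of 0/1 (matrix A).
    rhs:  list of 0/1 (vector b).
    Returns one solution with all free variables set to 0,
    or None if the system is inconsistent.
    """
    n_cols = len(rows[0])
    # Work on an augmented copy so the inputs are not modified.
    aug = [list(r) + [b] for r, b in zip(rows, rhs)]
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a 1 in column c.
        p = next((i for i in range(r, len(aug)) if aug[i][c]), None)
        if p is None:
            continue  # no pivot here: column c is a free variable
        aug[r], aug[p] = aug[p], aug[r]
        # Clear column c in every other row (XOR is addition in GF(2)).
        for i in range(len(aug)):
            if i != r and aug[i][c]:
                aug[i] = [a ^ b for a, b in zip(aug[i], aug[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(aug):
            break
    # An all-zero coefficient row with right-hand side 1 has no solution.
    if any(row[-1] and not any(row[:-1]) for row in aug):
        return None
    x = [0] * n_cols
    for i, c in enumerate(pivot_cols):
        x[c] = aug[i][-1]
    return x


A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
b = [1, 0, 1]
x = solve_gf2(A, b)
print(x)  # one solution: [1, 0, 0]
```

The free variables are set to 0 here for simplicity. In a beautification setting, one could presumably choose them to match a target image instead, but that use is an assumption and is not confirmed by the abstract.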