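Graduate student: 王昱舜
Wang, Yu-Shuen
Thesis title: 基於內容為主之圖片與影像縮放最佳化研究
Content Aware Image and Video Resizing
Advisor: 李同益
Lee, Tong-Yee
Degree: Doctor
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2010
Academic year of graduation: 98
Language: English
Number of pages: 73
Keywords (Chinese, translated): non-uniform scaling, cropping, optimization techniques
Keywords (English): Content Aware, Image and Video Resizing, Warping, Temporal Coherence, SIFT features, Optimization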
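  • Because displays such as conventional televisions, digital televisions, and mobile phones differ in resolution and aspect ratio, research on image and video resizing has become increasingly popular in recent years. The traditional approach interpolates the image content to change its resolution. However, scaling all content uniformly in this way easily alters the proportions of people or structured objects in the image, producing resizing distortion. Some researchers have therefore turned to cropping, removing the less important regions around the image border to change its size. They use automatic methods to detect the importance of each region and then search for the cropping window that contains the most information, minimizing the loss of content.

    Cropping has its limits, however: if the important objects all lie close to the image border, any crop will remove some of them. Content-aware image resizing techniques were proposed next. These methods remove or squeeze the less important content in the image to hide the artifacts caused by non-uniform scaling. For example, squashing the clouds in the sky causes no visual discomfort. Based on this observation, our goal is to deform each region of the image non-uniformly in order to reduce artifacts. Compared with previous methods, we optimize the size and aspect ratio of each region, so that the resized image looks more natural.

    We also apply this technique to video. Besides preserving the proportions of important objects, we must keep the scaling of the same object consistent across time; otherwise the object grows and shrinks from frame to frame, producing a temporal artifact. To achieve this, we detect the motion trajectories of objects, find the correspondence of each object across time, and constrain the deformations applied to corresponding objects to be consistent. Finally, we use optimization to find a balance between the temporal and spatial requirements, minimizing visual distortion.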
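The trade-off between spatial fidelity and temporal consistency described above can be illustrated with a deliberately small sketch. It is not the thesis's formulation, which operates on mesh vertices with motion-aware constraints; it only assumes that each frame has a preferred scale factor for one tracked object and smooths those factors over time by least squares. The function name `smooth_scales` and the weight `lam` are hypothetical.

```python
import numpy as np

def smooth_scales(desired, lam=10.0):
    """Minimize sum_t (s_t - d_t)^2 + lam * sum_t (s_{t+1} - s_t)^2.

    desired: per-frame scale factors d_t that a purely spatial
             optimization would pick (assumed input, not from the thesis).
    lam:     weight of the temporal term; larger means steadier scales.
    """
    d = np.asarray(desired, dtype=float)
    n = d.size
    # Forward-difference operator D, shape (n-1, n): (D s)[t] = s[t+1] - s[t]
    D = np.diff(np.eye(n), axis=0)
    # Normal equations of the quadratic energy: (I + lam * D^T D) s = d
    A = np.eye(n) + lam * (D.T @ D)
    return np.linalg.solve(A, d)

# An object whose per-frame preferred scale flips between 1.0 and 0.6
desired = np.array([1.0, 0.6, 1.0, 0.6, 1.0, 0.6])
smoothed = smooth_scales(desired, lam=10.0)
```

With `lam = 0` the result equals the per-frame preference (the object pulses in size); as `lam` grows the scales approach a constant. The temporal term here compares frame `t` with frame `t+1` directly, which is only reasonable once the object has been tracked across frames.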

    Research on automatic resizing of images and videos is becoming ever more important with the proliferation of display devices, such as televisions, notebooks, PDAs, and cell phones, which come in different aspect ratios and resolutions. To present images and videos in full, we introduce a content-aware technique that takes the interior pixels into account while resizing. Specifically, we represent an image or frame with a grid mesh and then warp the mesh based on a saliency measure. Unlike previous methods, which strove to keep prominent objects untouched, our method allows them to be scaled uniformly, so that distortion can propagate in multiple directions. Beyond static images, we extend our resizing technique to videos. The most important issue in this extension is temporal coherence, since the interior content keeps changing as the video plays. Because of camera and object motion, simply enforcing consistent resizing of temporally adjacent pixels cannot achieve temporal coherence and thus results in flickering or waving artifacts. To solve this problem, we detect the camera motion based on SIFT features and then decompose the scene into foreground and background regions. The background motion depends on the camera, while the foreground motion is arbitrary. Because of their different natures, we introduce different constraints to preserve the temporal coherence of each. All the criteria are formulated as energy terms, and we solve for the resized images and videos by minimizing the objective function.
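To make the idea of saliency-driven warping by energy minimization concrete, here is a minimal one-dimensional sketch. It is a simplification and not the thesis's 2D scale-and-stretch energy: it assumes the image is split into unit-width columns, each with an importance weight, and picks new column widths that stay close to the original where importance is high. The name `resize_columns` and the clamp `eps` are hypothetical.

```python
import numpy as np

def resize_columns(importance, target_width, eps=1e-3):
    """New widths for unit-width columns, summing to target_width.

    Minimizes sum_i w_i * (x_i - 1)^2 subject to sum_i x_i = target_width,
    where w_i is the importance of column i. Solving with a Lagrange
    multiplier gives x_i = 1 - (n - target_width) * (1/w_i) / sum_j (1/w_j).
    """
    # Clamp weights away from zero so that 1/w stays finite
    w = np.maximum(np.asarray(importance, dtype=float), eps)
    n = w.size
    inv = 1.0 / w
    # Each column absorbs a share of the total shrinkage proportional to 1/w_i
    return 1.0 - (n - target_width) * inv / inv.sum()

# Six columns; the middle two hold the salient object
importance = [0.1, 0.1, 1.0, 1.0, 0.1, 0.1]
widths = resize_columns(importance, target_width=4.0)
```

In this example the low-importance columns absorb most of the shrinkage while the salient columns stay near their original width. The closed form does not enforce positive widths, so an extreme target could fold a column over; a fuller solver would add that constraint.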

    Chinese Abstract
    Abstract
    Acknowledgements
    List of Figures
    1 Introduction
    2 Related Work
    3 Optimized Scale-and-Stretch for Image Resizing
      3.1 Overview
      3.2 Arbitrary image resizing
        3.2.1 Quad significance
        3.2.2 Mesh-based image resizing
        3.2.3 Initial guess
        3.2.4 Significance-aware initial mesh
      3.3 Results and discussion
    4 Motion-Aware Temporal Coherence for Video Resizing
      4.1 Overview
      4.2 Video Importance Map
        4.2.1 Frame Alignment
        4.2.2 Aligned Importance Map Blending
      4.3 Grid-based Resizing Optimization
        4.3.1 Spatial Content Preservation Energies
        4.3.2 Temporal Coherence Energies
        4.3.3 Minimization of Energy Functions
        4.3.4 Scalability
      4.4 Results and Discussion
    5 Application: Focus+Context Visualization with Distortion Minimization
      5.1 Overview
      5.2 Focus+Context Visualization
        5.2.1 Space Deformation
        5.2.2 Model Reconstruction
      5.3 Results and Discussion
        5.3.1 Distortion Measurement
        5.3.2 Soft Constraint
        5.3.3 Pros and Cons
    6 Conclusion
    Reference
    Vita

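    On campus: open access from 2011-02-04
    Off campus: not available
    The electronic thesis has not been authorized for public release; for the print copy, please consult the library catalog.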