Author: Morace, Charles C. (查爾斯)
Thesis title: Learning a Perceptual Manifold for Animation Video Resequencing
Advisor: Lee, Tong-Yee (李同益)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2020
Academic year of graduation: 108
Language: English
Number of pages: 32
Keywords: Computer Graphics, Video Re-sequencing, Transfer Learning, Manifold Learning
  • This work proposes a framework for animation video re-sequencing using deep learning and optimal graph traversal techniques. The proposed system produces new animation sequences by reordering a collection of animation images or existing animation video. To maintain temporal coherence in the generated animation sequences, a perceptual distance is utilized so that adjacent frames in the re-sequenced animations are as perceptually similar as possible. To measure perceptual distance, we extract image features using activations of deep convolutional neural networks and learn a perceptual distance by training these activation features on a small network with data comprised of human perceptual judgments. With this perceptual metric and graph-based manifold learning techniques, the framework can produce smooth and visually appealing animation results for a variety of animation styles. In contrast to previous work on animation re-sequencing, the proposed framework applies to a broader range of image styles and does not require hand-crafted feature extraction, background subtraction, or feature correspondence. The framework has additional applications to sequencing unstructured collections of images.
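    The reordering idea in the abstract (adjacent frames should be as similar as possible under some distance) can be illustrated with a minimal sketch. This is not the thesis's method: the Euclidean distance on made-up 2-D feature vectors stands in for the learned perceptual distance, and a greedy nearest-neighbor walk stands in for the graph traversal the thesis uses. All function names and the sample data are hypothetical.

```python
import math


def pairwise_distances(features):
    """Euclidean distance matrix (assumed stand-in for a learned perceptual distance)."""
    n = len(features)
    return [[math.dist(features[i], features[j]) for j in range(n)] for i in range(n)]


def greedy_resequence(dist, start=0):
    """Order frames by repeatedly stepping to the nearest unvisited frame.

    This is a simple heuristic, not an optimal traversal.
    """
    order = [start]
    remaining = set(range(len(dist))) - {start}
    while remaining:
        current = order[-1]
        # Break distance ties by frame index so the result is deterministic.
        nxt = min(remaining, key=lambda j: (dist[current][j], j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def path_cost(dist, order):
    """Sum of distances between consecutive frames in the given order."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))


# Hypothetical features for five shuffled frames lying along a line.
features = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0), (4.0, 0.0), (2.0, 0.0)]
dist = pairwise_distances(features)
order = greedy_resequence(dist, start=0)
print(order, path_cost(dist, order))
```

    On this toy input the walk visits the frames in order of their position along the line. A greedy walk can get stuck with long jumps on harder inputs, which is one reason to prefer a global traversal method on real data.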

    Abstract
    Acknowledgements
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1. Introduction
    Chapter 2. Related Work
    Chapter 3. System Overview
    Chapter 4. Method
    Chapter 5. Results
    Chapter 6. Conclusion
    References
    Appendix


    Full text, on campus: public from 2021-07-29
    Full text, off campus: public from 2023-10-15