
Graduate student: Chang, Shang-Wei (張尚瑋)
Thesis title: Frame Resequencing System (影格重新編排系統)
Advisor: Lee, Tong-Yee (李同益)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: English
Number of pages: 40
Keywords: Nonlinear Dimension Reduction, Manifold Learning, Autoencoder, Animation Sequencing
    We introduce a data-driven solution to the problem of general 2D animation resequencing. Given an unordered collection of images, the proposed method can create new "as-smooth-as-possible" animations of arbitrary length, or select suitable in-between images for a set of key-frames. Our framework involves two phases. First, a denoising autoencoder is trained to extract a lower-dimensional representation of each image, so that the temporal coherence between images can be measured effectively. Then, the trained encoding network maps a new collection of images to their lower-dimensional embeddings, where we generate a variety of animations by traversing an approximated animation manifold. We describe the autoencoder's network architecture and training procedure in detail and give two path-finding algorithms, one for key-frame in-between selection and another for automatic animation synthesis. In contrast to previous work, the proposed technique does not require fine-tuning of network parameters per style and applies to a variety of image styles. Experimental evaluation shows that the proposed method can generate appealing results.
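    The second phase described in the abstract can be illustrated with a small sketch. The abstract does not specify the two path-finding algorithms, so everything below is an assumption made for illustration: the function names (`knn_graph`, `inbetween_path`, `explore_path`) are hypothetical, the k-nearest-neighbour graph stands in for the "approximated animation manifold", Dijkstra's algorithm stands in for key-frame in-between selection, and a greedy walk stands in for animation synthesis. Squared Euclidean edge weights are also an assumption, chosen so that many small steps are preferred over a few large jumps. The toy 2D points stand in for embeddings that a trained encoder would produce.

```python
import heapq
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_graph(points, k):
    """Link each embedded frame to its k nearest neighbours (made symmetric)."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted((sq_dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def inbetween_path(graph, start, goal):
    """Dijkstra shortest path between two key-frames, as a list of frame indices."""
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        for nxt, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if goal not in best:
        return []  # the key-frames are not connected in the graph
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def explore_path(graph, start, length):
    """Greedy walk: keep stepping to the closest neighbour not used so far."""
    path, used = [start], {start}
    while len(path) < length:
        options = [(w, j) for j, w in graph[path[-1]].items() if j not in used]
        if not options:
            break  # dead end: no unused neighbour left
        _, nxt = min(options)
        path.append(nxt)
        used.add(nxt)
    return path


# Six toy "embeddings" sampled along a curve and stored in shuffled order.
embeddings = [(0.0, 0.0), (2.5, 1.5), (1.0, 0.0), (2.5, 2.5), (2.0, 0.5), (2.0, 3.5)]
graph = knn_graph(embeddings, k=2)

keyframe_sequence = inbetween_path(graph, start=0, goal=5)
print(keyframe_sequence)  # [0, 2, 4, 1, 3, 5]

free_sequence = explore_path(graph, start=0, length=4)
print(free_sequence)  # [0, 2, 4, 1]
```

    In this toy case the shortest path between frames 0 and 5 visits every frame in the order in which the points lie along the curve, which is the kind of reordering the abstract describes. How well that carries over to real data depends on the quality of the learned embedding and on the choice of k, neither of which is given in this record.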

    Abstract (in Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    Chapter 1. Introduction
    Chapter 2. Related Work
        2.1. Non-linear Dimension Reduction
        2.2. Sequential Ordering of Images
    Chapter 3. System Overview
    Chapter 4. Method
        4.1. Autoencoder of deep learning
            4.1.1. Network Architecture
            4.1.2. Training
            4.1.3. Encoding Network
        4.2. Animation Sequencing
            4.2.1. Key-frame Pathfinding
            4.2.2. Path Exploration
    Chapter 5. Experiment Results
        5.1. Autoencoder Evaluation
        5.2. Animation Synthesis Results
            5.2.1. Running times
            5.2.2. Key-frame Results
            5.2.3. Path Exploration Results
            5.2.4. Limitations
            5.2.5. Additional Applications
    Chapter 6. Conclusion
    References


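    Off-campus access: not publicly available
    Neither the electronic thesis nor the print thesis has been authorized for public release.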