| Graduate Student: | 彭建瑋 Peng, Jian-Wei |
|---|---|
| Thesis Title: | 透過模仿學習之序列生成 Sequence Generation Through Imitation Learning |
| Advisor: | 朱威達 Chu, Wei-Ta |
| Co-advisor: | 胡敏君 Hu, Min-Chun |
| Degree: | Doctoral (博士) |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of Publication: | 2023 |
| Academic Year of Graduation: | 111 |
| Language: | English |
| Number of Pages: | 67 |
| Keywords (Chinese): | 模仿學習、強化學習、序列生成、語句生成、動作合成 |
| Keywords (English): | Imitation Learning, Reinforcement Learning, Sequence Generation, Sentence Generation, Motion Synthesis |
Humans communicate with each other in many different ways, most of which are sequential in nature. Each sequence of human behavior carries its own subtle pattern and personality. How can we develop an interactive artificial agent that generates such human-like sequences in order to communicate with humans? Sequence generation still faces several inherent challenges in machine learning. In this dissertation, we present imitation learning frameworks for sequence generation, in which our models learn to produce human-like sequences from expert demonstrations. We analyze the advantages and drawbacks of different types of imitation learning and develop models that address both discrete and continuous sequence generation problems. To further assess the effectiveness of the proposed models, we conduct experiments on sentence generation and motion synthesis, showing that our models synthesize sequences that more closely resemble human behavior.
We start with a comparison of popular imitation learning methods and a review of how imitation learning has been developed for sequence generation applications. Next, we propose a more efficient and robust imitation learning framework for the sentence generation task, in which the trade-off between the quality and the diversity of the output sentences can be controlled during training. Finally, we propose another framework for multi-modal trajectory generation and apply it to the motion synthesis task. Our framework successfully separates the different modes present in the demonstrations and generates higher-quality, longer-term trajectories than other state-of-the-art methods.
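The imitation-learning setting described in the abstract, learning to produce sequences from expert demonstrations, can be illustrated with a minimal behavioral-cloning sketch for discrete sequences. This toy count-based model is only an illustration of the general setting, not the dissertation's actual adversarial frameworks; the names (`fit_bc_policy`, `rollout`) and the toy demonstrations are hypothetical.

```python
import random
from collections import defaultdict, Counter

def fit_bc_policy(demos):
    """Behavioral cloning for discrete sequences: estimate the expert's
    next-token distribution pi(a | s) conditioned on the current token."""
    counts = defaultdict(Counter)
    for seq in demos:
        for s, a in zip(seq, seq[1:]):
            counts[s][a] += 1
    # Normalize transition counts into conditional probabilities.
    return {s: {a: n / sum(c.values()) for a, n in c.items()}
            for s, c in counts.items()}

def rollout(policy, start, length, rng):
    """Generate a sequence by sampling the cloned policy step by step."""
    seq = [start]
    for _ in range(length - 1):
        dist = policy.get(seq[-1])
        if not dist:  # state never observed in the demonstrations
            break
        tokens, probs = zip(*dist.items())
        seq.append(rng.choices(tokens, weights=probs, k=1)[0])
    return seq

# Two toy expert demonstrations of token sequences.
demos = [["<s>", "the", "cat", "sat", "</s>"],
         ["<s>", "the", "dog", "sat", "</s>"]]
policy = fit_bc_policy(demos)
print(policy["<s>"]["the"])  # "the" always follows "<s>": 1.0
print(rollout(policy, "<s>", 5, random.Random(0)))
```

Behavioral cloning like this treats imitation as supervised learning over expert transitions; the adversarial approaches surveyed in the dissertation instead learn a reward signal that distinguishes generated sequences from demonstrations.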