| Graduate student: | 黃粲富 Huang, Tsan-Fu |
|---|---|
| Thesis title: | Generating NFOV Video from 360° Video Based on Variant RRT Algorithm |
| Advisor: | 李同益 Lee, Tong-Yee |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of publication: | 2018 |
| Academic year of graduation: | 106 |
| Language: | English |
| Number of pages: | 59 |
| Keywords (Chinese): | 360° 影片、快速搜索隨機樹 |
| Keywords (English): | 360° Video, Rapidly-Exploring Random Trees |
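360-degree videos have become popular in recent years. At every moment such a video contains all viewing angles, so viewers can freely choose where to look. However, not every viewing angle contains content or targets of interest to the viewer, and a viewer who does not know the video's content in advance may have to watch the same video repeatedly to find the interesting parts.
We present a new variant of the RRT (Rapidly-Exploring Random Trees) algorithm for 360-degree videos that converts them into normal field-of-view (NFOV) videos, so that users can see the important content in a single viewing. In addition, users can pick the content they want to watch from the results of image recognition on the video; our system then automatically plans a suitable sequence of viewing angles and outputs a normal field-of-view video that can be watched directly on a flat-screen device.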
360-degree videos, which have emerged in recent years, contain every viewing direction at each moment, allowing viewers to freely choose where to look. However, not every view contains content that interests the viewer. To find interesting content without knowing where it is in the video, the viewer may have to watch the same video multiple times.
We propose a new variant of the RRT (Rapidly-Exploring Random Trees) algorithm for 360-degree video that converts a 360-degree video into a normal field-of-view (NFOV) video, allowing users to view the important content in a single pass. In addition, users can select what they want to watch from the results of image recognition on the video. Our system then automatically plans a suitable sequence of viewing angles and outputs it as a normal field-of-view video that the user can watch directly on a flat screen.
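The abstract does not say how the proposed variant differs from standard RRT, so the following is only a sketch of the textbook algorithm applied to viewing directions on the sphere, not the thesis's method. It assumes a viewing direction is a (longitude, latitude) pair in radians and that there are no obstacles; it ignores video content, time, and image recognition entirely. All names and parameters (`rrt_view_path`, `steer`, `max_step`, `goal_bias`) are hypothetical.

```python
import math
import random


def to_vec(lon, lat):
    """Viewing direction (longitude, latitude in radians) -> unit vector."""
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))


def to_lonlat(v):
    """Unit vector -> (longitude, latitude) in radians."""
    x, y, z = v
    return (math.atan2(y, x), math.asin(max(-1.0, min(1.0, z))))


def angle_between(a, b):
    """Great-circle angle in radians between two unit vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def random_direction(rng):
    """Sample a direction uniformly on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(-math.pi, math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def steer(a, b, max_step):
    """Move from a toward b along the great circle by at most max_step radians."""
    theta = angle_between(a, b)
    if theta <= max_step:
        return b
    s = math.sin(theta)
    if s < 1e-9:  # nearly antipodal: the great circle is ambiguous, skip this sample
        return None
    wa = math.sin(theta - max_step) / s  # spherical interpolation weights
    wb = math.sin(max_step) / s
    return tuple(wa * p + wb * q for p, q in zip(a, b))


def rrt_view_path(start, goal, max_step=0.2, goal_bias=0.1, max_iters=5000, seed=0):
    """Textbook RRT from start to goal; returns a list of (lon, lat) or None."""
    rng = random.Random(seed)
    goal_v = to_vec(*goal)
    nodes, parents = [to_vec(*start)], [-1]
    for _ in range(max_iters):
        # Occasionally sample the goal itself to pull the tree toward it.
        target = goal_v if rng.random() < goal_bias else random_direction(rng)
        nearest = min(range(len(nodes)), key=lambda i: angle_between(nodes[i], target))
        new = steer(nodes[nearest], target, max_step)
        if new is None:
            continue
        nodes.append(new)
        parents.append(nearest)
        if angle_between(new, goal_v) <= max_step:
            if new != goal_v:  # connect the final short hop to the goal
                nodes.append(goal_v)
                parents.append(len(nodes) - 2)
            path, i = [], len(nodes) - 1
            while i != -1:  # walk parent links back to the root
                path.append(to_lonlat(nodes[i]))
                i = parents[i]
            return path[::-1]
    return None
```

Calling `rrt_view_path((0.0, 0.0), (math.pi / 2, 0.3))` returns a list of viewing directions in which consecutive entries are at most `max_step` radians apart. A real system would additionally need a cost or constraint tied to the video content, which this sketch deliberately leaves out.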
On campus: publicly available from 2023-08-31