
Graduate Student: Chen, Tzu-Chuan (陳子權)
Thesis Title: Real-Time Streaming Synchronization of Multiple IP Cameras Based on RTSP Protocol and Gigapixel Image Viewing
Advisors: Lien, Jenn-Jier James (連震杰); Hsu, Yi-Yu Alan (徐禕佑)
Degree: Master
Department: MS Degree Program on Intelligent Technology Systems, Miin Wu School of Computing
Publication Year: 2024
Graduation Academic Year: 112 (ROC calendar, 2023-2024)
Language: Chinese
Pages: 80
Keywords: Multi-Camera Synchronization, RTSP, Real-Time Streaming, Ring Buffer, Gigapixel
As multi-camera technology sees increasingly wide use in live sports broadcasting, surveillance systems, and event recording, real-time synchronization across multiple cameras has become a significant challenge. Traditional synchronization methods, particularly hardware-based ones, can in theory provide precise time alignment, but they face many practical difficulties. Hardware synchronization typically requires precision timing devices and dedicated hardware, which not only raises system construction costs substantially but also complicates the system architecture. Its reliability is also tied to physical connections: when cameras are distributed over a large area or a complex venue, long cable runs can introduce signal attenuation or interference that degrades synchronization accuracy. Moreover, setting up and maintaining a hardware synchronization system requires specialist technicians, which is impractical for applications with limited resources or little technical support.
To address these problems of hardware synchronization, this thesis proposes a real-time streaming synchronization system for multiple IP cameras based on the RTSP protocol. The system performs synchronization in software over existing network infrastructure, avoiding the complexity and cost of hardware synchronization while remaining flexible across different network environments. It uses a ring buffer together with UTC absolute timestamps to ensure precise time alignment across cameras, and combines this with GPU-accelerated decoding to significantly improve overall processing efficiency. The design targets high-frame-rate, multi-camera applications, and is especially valuable in scenarios such as live sports broadcasting that demand high synchronization accuracy and low latency.
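To make the synchronization idea concrete, below is a minimal Python sketch of the two mechanisms named above: assigning each decoded frame a UTC absolute timestamp, and grouping the frame from each camera's ring buffer that lies nearest a common reference time into a synchronized packet. All names and parameter values here (RingBuffer, FrameSynchronizer, the capacity and tolerance) are illustrative assumptions, not the thesis implementation.

```python
# A minimal sketch, assuming one decoder thread per camera and camera
# clocks pre-aligned via NTP. Names and defaults are illustrative.
import threading
from collections import deque

class RingBuffer:
    """Fixed-capacity, thread-safe store of (utc_timestamp, frame) pairs.
    When full, the oldest entry is overwritten, so decoders never block."""
    def __init__(self, capacity=256):
        self._buf = deque(maxlen=capacity)  # deque drops the oldest when full
        self._lock = threading.Lock()

    def push(self, utc_ts, frame):
        with self._lock:
            self._buf.append((utc_ts, frame))

    def closest(self, target_ts):
        """Return the (ts, frame) entry whose timestamp is nearest target_ts."""
        with self._lock:
            if not self._buf:
                return None
            return min(self._buf, key=lambda e: abs(e[0] - target_ts))

class FrameSynchronizer:
    """Group one frame per camera into a synchronized packet, keyed by a
    common UTC reference time."""
    def __init__(self, num_cameras, fps=120.0, tolerance_s=0.010):
        self.buffers = [RingBuffer() for _ in range(num_cameras)]
        self.frame_interval = 1.0 / fps
        self.tolerance = tolerance_s  # max accepted per-camera deviation

    def utc_timestamp(self, stream_start_utc, frame_index):
        # Absolute (Unix) time of a decoded frame: stream start + index / fps.
        return stream_start_utc + frame_index * self.frame_interval

    def make_packet(self, target_ts):
        """For each camera, pick the buffered frame closest to target_ts;
        a camera that is empty or out of tolerance contributes None."""
        packet = []
        for buf in self.buffers:
            entry = buf.closest(target_ts)
            if entry is None or abs(entry[0] - target_ts) > self.tolerance:
                packet.append(None)
            else:
                packet.append(entry[1])
        return packet

# Usage: each decoder thread calls sync.buffers[i].push(ts, frame) as frames
# arrive; a consumer thread calls sync.make_packet(t) at reference time t.
```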
In addition, this thesis presents a gigapixel image viewing technique based on a tile-fetching algorithm, which lets users display and browse ultra-high-resolution images efficiently. The design dynamically fetches and renders only the tiles the user needs, so that browsing remains smooth even over very large datasets.
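The tile-fetching idea can be sketched as follows: given the full image size, a zoom level, and the user's viewport, compute only the (column, row) indices of the Deep Zoom Image (DZI) tiles that must be loaded. The level geometry below follows the standard Deep Zoom layout (each level halves the previous one down to 1x1); the function names are illustrative assumptions, not the thesis code.

```python
# Hedged sketch of viewport-driven DZI tile selection.
import math

def dzi_max_level(width, height):
    # Deep Zoom numbers levels so that level 0 is 1x1 and the top level
    # holds the full-resolution image.
    return math.ceil(math.log2(max(width, height)))

def level_dimensions(width, height, level, max_level):
    scale = 2 ** (max_level - level)
    return math.ceil(width / scale), math.ceil(height / scale)

def visible_tiles(width, height, level, viewport, tile_size=254):
    """Yield (col, row) for every tile at `level` intersecting `viewport`,
    where viewport = (x, y, w, h) in full-resolution pixel coordinates."""
    max_level = dzi_max_level(width, height)
    lw, lh = level_dimensions(width, height, level, max_level)
    scale = 2 ** (max_level - level)
    x, y, w, h = viewport
    # Map the viewport into this level's coordinate space, clamped to it.
    x0, y0 = max(x, 0) / scale, max(y, 0) / scale
    x1, y1 = min((x + w) / scale, lw), min((y + h) / scale, lh)
    col0, row0 = int(x0 // tile_size), int(y0 // tile_size)
    col1 = int(math.ceil(x1 / tile_size)) - 1
    row1 = int(math.ceil(y1 / tile_size)) - 1
    for row in range(row0, row1 + 1):
        for col in range(col0, col1 + 1):
            yield col, row

# Example: a 100,000 x 50,000 px image viewed through a 1920x1080 window
# at level 14 needs only 4 tiles, not the whole image.
tiles = list(visible_tiles(100_000, 50_000, level=14,
                           viewport=(20_000, 10_000, 1920, 1080)))
```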
Experimental results show that the system achieves an average synchronization error of about 10 frames at 120 FPS, performing well in both synchronization accuracy and latency while adapting readily to different venues. Because the system relies mainly on the RTSP protocol for software synchronization, it can synchronize stably and reliably wherever the cameras can transmit frames over RTSP, indoors or outdoors. The only site requirements are therefore stable network connectivity and adequate computer hardware (CPU, GPU, and SSD write speed), so the system can run in many application environments without being tied to the physical conditions of a specific venue.
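(For context, an average error of 10 frames at 120 FPS corresponds to 10 x (1/120) s, i.e. roughly 83 ms of temporal misalignment between streams.)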
The system shows clear advantages in multi-camera synchronization, especially in applications with stringent accuracy and latency requirements. Future work will focus on further optimizing memory management and decoding strategy to reduce resource usage and improve performance and scalability in large-scale deployments. As the technology continues to develop, we expect the system to play an important role in more industries and application scenarios, providing users with more precise and efficient multi-camera synchronization and gigapixel image viewing.

This paper introduces a multi-IP camera real-time streaming synchronization system that overcomes the challenges of traditional hardware-based methods by leveraging the Real Time Streaming Protocol (RTSP), circular buffers, and Coordinated Universal Time (UTC) absolute timestamps to achieve precise time alignment across multiple cameras, with GPU-accelerated decoding enhancing processing efficiency. The system is particularly effective in high-frame-rate, multi-camera applications like sports broadcasting, where synchronization precision and low latency are critical. Additionally, the paper presents a gigapixel image viewing technology designed to address the challenges of handling ultra-high-resolution images by implementing a tile-fetching algorithm and the Deep Zoom Image (DZI) format, which dynamically loads only the necessary image tiles, thus reducing resource consumption and improving user experience. Experimental results show the system achieves an average synchronization error of approximately 10 frames at 120 FPS, demonstrating its effectiveness and adaptability in various environments. The system's reliance on the RTSP protocol allows it to perform reliably under different network conditions, making it suitable for diverse applications, both indoors and outdoors. As the technology evolves, its application is expected to expand, offering precise and efficient multi-camera synchronization and seamless handling of ultra-high-resolution images in real-time scenarios.

Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 Motivation and Objectives
  1.2 Thesis Organization
  1.3 Related Work
  1.4 Contributions
Chapter 2 System Setup
  2.1 Venue Information
  2.2 System Setup
Chapter 3 Real-Time Streaming Synchronization Using a Ring Buffer and UTC Absolute Timestamps
  3.1 Real-Time Streaming Synchronization Architecture
    3.1.1 NTP Synchronization to a Common Reference Timestamp
    3.1.2 Frame Decoding and UTC Absolute Timestamp Computation
    3.1.3 Generating Synchronized Frame Packets with a Ring Buffer
    3.1.4 Saving Synchronized Frame Packets to Separate Folders
    3.1.5 Real-Time Applications of Synchronized Streams
  3.2 Frame Decoding and UTC Absolute Timestamp Computation to Create Decoded Frames
    3.2.1 NVIDIA Video Codec SDK Framework
    3.2.2 Timestamp Format (Unix Time)
  3.3 Synchronized Frame Packet Generation
    3.3.1 Ring Buffer
    3.3.2 Thread Safety
  3.4 Execution Time Analysis
Chapter 4 Gigapixel Image Viewing Architecture Based on an Image Tile-Fetching Algorithm
  4.1 Image Tile-Fetching Algorithm Architecture
    4.1.1 Gigapixel Image Generation
    4.1.2 Generating Deep Zoom Images (DZI)
    4.1.3 Displaying the User-Selected Region
Chapter 5 Data Collection and Experimental Results
  5.1 Data Collection
  5.2 Experimental Results
  5.3 Experimental Analysis
Chapter 6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
References


Full-text availability: on campus from 2029-08-23; off campus from 2029-08-23.
The electronic thesis has not yet been authorized for public release; for the print copy, please consult the library catalog.