
Author: Chen, Xin-Jia (陳信嘉)
Thesis Title: A Region-Adaptive Color-Space Transformation Model for Low-Light Image Enhancement
Advisor: Tai, Shen-Chuan (戴顯權)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2026
Academic Year of Graduation: 114
Language: English
Number of Pages: 74
Chinese Keywords: low-light image enhancement, HVI color space, Mamba module, hyperedge feature fusion
Foreign-Language Keywords: LLIE, HVI color space, Mamba, Hyperedge feature fusion
Low-light image enhancement (LLIE) aims to recover visually pleasing and information-rich images from severely degraded scenes. Most existing deep learning-based LLIE models rely on attention-heavy architectures, which leads to high computational cost and makes them ill-suited to devices with limited computing resources.

    To address these problems, this thesis proposes a region-adaptive low-light image enhancement model based on the HVI color space, combining a lightweight Mamba state-space backbone with a hyperedge feature fusion mechanism. The Mamba module replaces conventional attention blocks, reducing the number of model parameters by about 46% while retaining strong feature-modeling capability. In addition, the hyperedge fusion module aggregates multi-scale features across the skip connections, and the proposed region-adaptive color transformation module converts the original RGB color space into the HVI color space more flexibly according to the input image, improving the color quality of the reconstructed images.

    Experimental results show that, on the LOLv1 and LOLv2-synthetic datasets, the proposed model outperforms the original HVI-CID Net architecture and several representative existing methods in both quantitative metrics and visual quality, demonstrating a good balance between enhancement performance and computational efficiency.

Low-light image enhancement (LLIE) aims to restore visually pleasing and information-rich images from severely degraded scenes. Most existing deep learning-based LLIE models rely heavily on attention-driven architectures, leading to high computational cost and limiting their applicability to resource-constrained devices.

To address these challenges, this thesis proposes a region-adaptive low-light image enhancement model built upon the HVI color space, integrating a lightweight Mamba state-space backbone with a hyperedge-based feature fusion mechanism. By replacing conventional attention modules, the Mamba block reduces the number of model parameters by approximately 46% while preserving strong feature-modeling capability. In addition, the Hyperedge Fusion Module effectively aggregates multi-scale features across skip connections, and the proposed region-adaptive color-space transformation module flexibly converts RGB images into the HVI domain based on the input content, thereby improving the color fidelity of the reconstructed images.
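    The Parallel Vision Mamba block itself is not reproduced here, but the idea it builds on, replacing quadratic-cost self-attention with a linear-time state-space scan over flattened feature tokens, can be illustrated with a minimal sketch. The snippet below is a plain, non-selective NumPy recurrence under assumed shapes; the function name ssm_scan and all toy parameters are illustrative, not the thesis's actual module.

    import numpy as np

    def ssm_scan(x, A, B, C):
        """Illustrative sketch only: a minimal, non-selective state-space scan.

        x : (L, D)  input features flattened from an image (L tokens, D channels)
        A : (N,)    diagonal state-transition coefficients
        B : (N, D)  input projection into the hidden state
        C : (D, N)  readout from the hidden state back to features
        Returns y : (L, D), computed in O(L) time rather than the O(L^2) of attention.
        """
        L, D = x.shape
        N = A.shape[0]
        h = np.zeros(N)              # hidden state carried along the scan
        y = np.empty_like(x)
        for t in range(L):
            h = A * h + B @ x[t]     # recurrent state update
            y[t] = C @ h             # per-token output
        return y

    # Toy usage: a 64x64 feature map with 16 channels and 8 hidden state dimensions.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((64 * 64, 16))
    A = np.exp(-rng.uniform(0.1, 1.0, size=8))   # stable decay factors in (0, 1)
    B = rng.standard_normal((8, 16)) * 0.1
    C = rng.standard_normal((16, 8)) * 0.1
    print(ssm_scan(x, A, B, C).shape)            # (4096, 16)

    Because the cost grows linearly with the number of tokens, such a scan scales to high-resolution feature maps where full self-attention becomes prohibitively expensive, which is consistent with the parameter and computation savings claimed above.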

Experimental results on the LOLv1 and LOLv2-synthetic datasets show that the proposed model outperforms not only the original HVI-CID Net framework but also several representative existing methods in both quantitative metrics and visual quality, demonstrating a favorable balance between enhancement performance and computational efficiency.
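    For reference, the two evaluation metrics listed in Chapter 4, PSNR and SSIM, can be computed as in the following generic sketch. This is not the thesis's evaluation code: the random toy arrays merely stand in for a ground-truth/enhanced image pair, and SSIM comes from scikit-image's structural_similarity rather than any implementation used in the thesis.

    import numpy as np
    from skimage.metrics import structural_similarity as ssim

    def psnr(reference, restored, data_range=255.0):
        """Peak Signal-to-Noise Ratio: 10 * log10(MAX^2 / MSE)."""
        mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
        if mse == 0:
            return float("inf")
        return 10.0 * np.log10((data_range ** 2) / mse)

    # Toy data standing in for a ground-truth / enhanced pair from a LOL-style dataset.
    rng = np.random.default_rng(0)
    gt = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
    pred = np.clip(gt.astype(np.int32) + rng.integers(-10, 11, size=gt.shape), 0, 255).astype(np.uint8)

    print("PSNR:", psnr(gt, pred))
    print("SSIM:", ssim(gt, pred, channel_axis=-1, data_range=255))

    Higher values indicate better fidelity for both metrics; PSNR is unbounded (in dB), while SSIM lies in [0, 1] for non-negative images.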

Table of Contents:
    Chinese Abstract
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    1 Introduction
    2 Related Works
      2.1 Retinex Theory
      2.2 Traditional Low-Light Image Enhancement Methods
      2.3 U-Net
        2.3.1 Basic U-Net
        2.3.2 U-Net Variants
      2.4 Vision Transformer
        2.4.1 Basic Vision Transformer
        2.4.2 Lightweight Vision Transformer
      2.5 State Space Models and Mamba
      2.6 HVI-CID Net
    3 The Proposed Method
      3.1 Proposed Network Architecture
      3.2 Region-Adaptive HVI Color Transformation
      3.3 LCA: Lighten Cross-Attention
      3.4 PVM: Parallel Vision Mamba
      3.5 Hypergraph-based Adaptive Correlation Enhancement
      3.6 Loss Function
      3.7 Algorithm Flow
        3.7.1 Training Phase
        3.7.2 Testing Phase
    4 Performance Evaluation
      4.1 Experimental Dataset
      4.2 Experimental Settings
      4.3 Experimental Evaluation Metrics
        4.3.1 Peak Signal-to-Noise Ratio
        4.3.2 Structural Similarity Index Measure
      4.4 Experimental Results
        4.4.1 Quantitative Results on the LOLv1 and LOLv2-synthetic Datasets
        4.4.2 Visual Results on the LOLv1 and LOLv2-synthetic Datasets
      4.5 Ablation Study
    5 Conclusions
      5.1 Conclusions
      5.2 Future Work
    References

