Graduate student: Prawiro, Herman (梁榮發)
Thesis title: Abnormal Event Detection in Surveillance Videos using Two-Stream Decoder (使用二串流解碼之監控影片異常事件偵測)
Advisor: Hu, Min-Chun (胡敏君)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 36
Keywords: Abnormal Event Detection, Surveillance Video, Auto-Encoder, Two-Stream Decoder
  • Abnormal event detection in surveillance videos refers to the identification of events that deviate from the normal pattern. An autoencoder can learn the normal pattern from normal videos, and its reconstruction error can then indicate the presence of abnormalities. Because surveillance cameras are usually static, surveillance videos can be divided into two components: dynamic objects and a static background. Since the background does not change, we can assume that abnormalities originate from the objects. In this work, we propose a two-stream decoder model to tackle the abnormal event detection problem in surveillance videos. The two-stream decoder consists of a background stream that models the static background and a foreground stream that models the dynamic objects. We also use a two-stream encoder to learn from optical flow, which contains motion information, and skip connections to improve the detail of the output frames. Different constraints and adversarial training are also applied to train a more robust model for the abnormal event detection task. Experiments on publicly available datasets validate the effectiveness of our model.
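The idea of using reconstruction error as an abnormality indicator can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the use of mean squared error, and the per-video min-max normalization are assumptions made for illustration, and the "reconstructions" are hand-written toy values standing in for the output of a trained autoencoder.

```python
from typing import List, Sequence

# A frame flattened to a sequence of pixel intensities in [0, 1] (assumption).
Frame = Sequence[float]


def reconstruction_error(frame: Frame, reconstruction: Frame) -> float:
    """Mean squared error between a frame and its reconstruction."""
    if len(frame) == 0 or len(frame) != len(reconstruction):
        raise ValueError("frames must be non-empty and of equal size")
    return sum((x - y) ** 2 for x, y in zip(frame, reconstruction)) / len(frame)


def abnormality_scores(errors: Sequence[float]) -> List[float]:
    """Min-max normalize the per-frame errors of one video to [0, 1]."""
    lo, hi = min(errors), max(errors)
    if hi == lo:
        # All frames are reconstructed equally well: no frame stands out.
        return [0.0 for _ in errors]
    return [(e - lo) / (hi - lo) for e in errors]


# Toy video of three 4-pixel frames. The last frame contains an object
# that the (hypothetical) model, trained on normal data, fails to rebuild.
frames = [
    [0.2, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2],
    [0.2, 0.9, 0.9, 0.2],
]
reconstructions = [
    [0.2, 0.2, 0.2, 0.2],
    [0.2, 0.3, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2],
]

errors = [reconstruction_error(f, r) for f, r in zip(frames, reconstructions)]
scores = abnormality_scores(errors)
```

In this toy case the third frame receives the highest score because it is reconstructed worst. A frame would be flagged as abnormal when its score exceeds a chosen threshold.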

    Abstract (English)
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1. Introduction
    Chapter 2. Related Work
        2.1 Abnormal Event Detection with Hand-Crafted Features
        2.2 Abnormal Event Detection with Deep Features
    Chapter 3. The Proposed Model
        3.1 Autoencoder
        3.2 Two-Stream Decoder
            3.2.1 Background Decoder
            3.2.2 Foreground Decoder
            3.2.3 One-Stream Decoder as Baseline
        3.3 Encoder
        3.4 Skip Connection
        3.5 Future Prediction
        3.6 Objective Function
        3.7 Adversarial Training
        3.8 Implementation and Training
        3.9 Abnormal Event Detection
    Chapter 4. Experimental Results
        4.1 Datasets
            4.1.1 UCSD Pedestrian Dataset
            4.1.2 CUHK Avenue Dataset
        4.2 Evaluation Metric
        4.3 Comparison with One-Stream Decoder
        4.4 Impact of Two-Stream Encoder and Skip Connection
        4.5 Impact of Future Prediction
        4.6 Impact of Upsampling Method
        4.7 Impact of Constraints
        4.8 Impact of Adversarial Learning
        4.9 Impact of Background Estimation Methods
        4.10 Comparison with Existing Methods
    Chapter 5. Conclusion
    References


    Full-text availability: on campus, public from 2024-08-22; off campus, public from 2024-08-22