| Graduate student: | 梁榮發 (Prawiro, Herman) |
|---|---|
| Thesis title: | 使用二串流解碼之監控影片異常事件偵測 (Abnormal Event Detection in Surveillance Videos using Two-Stream Decoder) |
| Advisor: | 胡敏君 (Hu, Min-Chun) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 (ROC calendar) |
| Language: | English |
| Number of pages: | 36 |
| Keywords: | Abnormal Event Detection, Surveillance Video, Auto-Encoder, Two-Stream Decoder |
Abnormal event detection in surveillance videos refers to the identification of events that deviate from the normal pattern. An autoencoder can learn the normal pattern from normal videos, and its reconstruction error can then indicate the presence of abnormalities. Because surveillance cameras are usually static, surveillance videos can be divided into two components: dynamic objects and a static background. Since the background is static, we assume that abnormalities originate from the objects. In this work, we propose a two-stream decoder model to tackle the abnormal event detection problem in surveillance videos. The two-stream decoder consists of a background stream that models the static background and a foreground stream that models the dynamic objects. We also use a two-stream encoder to learn from optical flow, which contains motion information, and skip connections to improve the detail of the output frames. Different constraints and adversarial training are also applied to train a more robust model for the abnormal event detection task. Several experiments on publicly available datasets validate the effectiveness of our model.
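The idea of using reconstruction error as an abnormality indicator can be illustrated with a minimal sketch. It is not the thesis implementation: the autoencoder is replaced by precomputed reconstructions, the function names `reconstruction_error` and `anomaly_scores` are hypothetical, and both the mean squared error and the per-video min-max normalisation are assumptions that the abstract does not specify.

```python
from typing import List, Sequence

# A grayscale frame as rows of pixel intensities (illustrative type alias).
Frame = Sequence[Sequence[float]]


def reconstruction_error(frame: Frame, reconstruction: Frame) -> float:
    """Mean squared error between a frame and its reconstruction."""
    total = 0.0
    count = 0
    for row, rec_row in zip(frame, reconstruction):
        for pixel, rec_pixel in zip(row, rec_row):
            total += (pixel - rec_pixel) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    return total / count


def anomaly_scores(errors: Sequence[float]) -> List[float]:
    """Min-max normalise per-frame errors to [0, 1] within one video.

    Assumption: a higher score means a frame is more likely abnormal.
    """
    lo, hi = min(errors), max(errors)
    if hi == lo:
        # All frames were reconstructed equally well; nothing stands out.
        return [0.0 for _ in errors]
    return [(e - lo) / (hi - lo) for e in errors]


# Toy video: three 1x2 frames and the reconstructions a model might return.
frames = [[[0.0, 0.0]], [[1.0, 1.0]], [[1.0, 0.0]]]
reconstructions = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 0.0]]]

errors = [reconstruction_error(f, r) for f, r in zip(frames, reconstructions)]
scores = anomaly_scores(errors)
```

In this toy video the second frame is reconstructed worst, so it receives the highest score. A model trained only on normal videos is expected to behave similarly on frames containing events it has never seen.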