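School of Information and Electronic Engineering, Shandong Technology and Business University, Yantai 264005, Shandong, China
DING Xinmiao (born 1979), female, professor, E-mail: [email protected]
WANG Jiaxing (born 1998), male, master's student at Shandong Technology and Business University, E-mail: [email protected]
Print publication date: 2024-01-20
Online publication date: 2023-08-29
Received: 2022-10-31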
Xinmiao DING, Jiaxing WANG, Wen GUO. Three-dimensional attention-enhanced algorithm for violence scene detection[J]. Journal of Xidian University, 2024, 51(1): 114-124. DOI: 10.19665/j.issn1001-2400.20230206.
To strengthen content-security screening of Internet multimedia and filter objectionable material effectively, this paper proposes a video violence detection algorithm based on three-dimensional attention enhancement. The algorithm uses 3D-DenseNet as its backbone network. First, P3D is used to extract low-level spatio-temporal features. Second, the SimAM attention module is introduced to compute channel-spatial attention, enhancing the features of key regions in each frame. Third, a transition layer strengthened with temporal attention is designed to highlight key temporal information. Together these form channel-spatial-temporal (three-dimensional) attention, which improves violent scene detection. In the violence detection experiments, the algorithm reaches accuracies of 98.75% and 100% on Hockey and Movies, two small datasets with homogeneous content, and 89.25% on RWF-2000, a large dataset with diverse content; its overall performance exceeds that of comparable algorithms, which verifies its effectiveness. In the violent content localization experiment on long videos, the algorithm also outperforms comparable algorithms on the VSD2014 dataset, demonstrating its generalization ability in violent content detection.
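The channel-spatial attention step can be illustrated with a minimal NumPy sketch of SimAM-style weighting. This follows the publicly known parameter-free SimAM formulation and is not the authors' implementation: the function name `simam`, the `(N, C, T, H, W)` tensor layout, the choice to compute statistics over each frame's H x W map, and the default `e_lambda` are all assumptions made for illustration.

```python
import numpy as np


def simam(x, e_lambda=1e-4):
    """SimAM-style parameter-free attention on a video feature tensor.

    x has shape (N, C, T, H, W). Statistics are computed over each H x W
    map (per sample, channel, and frame), which is an assumption of this
    sketch. Requires H * W > 1.
    """
    x = np.asarray(x, dtype=np.float64)
    h, w = x.shape[-2:]
    n = h * w - 1  # number of neurons in a map, excluding the target neuron

    # Squared deviation of every neuron from the mean of its map.
    mu = x.mean(axis=(-2, -1), keepdims=True)
    d = (x - mu) ** 2

    # Variance estimate of the map; e_lambda regularizes the division.
    var = d.sum(axis=(-2, -1), keepdims=True) / n

    # Inverse of the minimal energy: larger for neurons that stand out.
    inv_energy = d / (4.0 * (var + e_lambda)) + 0.5

    # A sigmoid turns the inverse energy into per-neuron weights.
    weights = 1.0 / (1.0 + np.exp(-inv_energy))
    return x * weights


# Hypothetical feature tensor: 2 clips, 4 channels, 8 frames, 16 x 16 maps.
rng = np.random.default_rng(0)
feats = rng.standard_normal((2, 4, 8, 16, 16))
out = simam(feats)
```

Because the weights come from a sigmoid of a value that is at least 0.5, every activation keeps its sign and is scaled by a factor between roughly 0.62 and 1, so neurons that deviate strongly from their map's mean are preserved while near-average neurons are attenuated. No learnable parameters are introduced.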
Keywords: violence detection; deep learning; attention mechanism; pattern recognition; P3D; 3D-DenseNet