Key Laboratory of Opto-electronics Information Technology, Ministry of Education, School of Precision Instruments and Opto-electronics Engineering, Tianjin University, Tianjin 300072, China
[ "刘博翀(1998—),男,天津大学硕士研究生,E-mail:[email protected];" ]
蔡怀宇(1965—),女,教授,E-mail:[email protected]
[ "汪毅(1981—),女,副教授,E-mail:[email protected];" ]
[ "陈晓冬(1975—),男,教授,E-mail:[email protected]" ]
Received: 2022-10-25
Published online: 2023-08-22
Published in print: 2024-01-20
Bochong LIU, Huaiyu CAI, Yi WANG, et al. Self-supervised contrastive representation learning for semantic segmentation[J]. Journal of Xidian University, 2024,51(1):125-134. DOI: 10.19665/j.issn1001-2400.20230304.
To improve the accuracy of semantic segmentation models and reduce the labor and time costs of pixel-wise annotation for large-scale semantic segmentation datasets, this paper studies pre-training methods based on self-supervised contrastive representation learning and, drawing on the characteristics of the semantic segmentation task, designs the Global-Local Cross Contrastive Learning (GLCCL) method. The method feeds the global image and a series of locally cropped image patches into the network to encode global and local visual representations respectively, and guides training with a loss function comprising global contrast, local contrast, and global-local cross contrast terms, so that the network learns both global and local visual representations as well as cross-region semantic correlations. When BiSeNet is pre-trained with this method and transferred to the semantic segmentation task, it achieves mean intersection over union (MIoU) gains of 0.24% and 0.9% over existing self-supervised contrastive representation learning and supervised pre-training methods, respectively. Experimental results show that the method can improve segmentation performance by training the semantic segmentation model on unlabeled data, and therefore has practical value.
Keywords: semantic segmentation; self-supervised representation learning; contrastive learning; deep learning
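The three loss terms named in the abstract (global contrast, local contrast, and global-local cross contrast) can be sketched in code. This is a minimal illustrative sketch, not the authors' implementation: the InfoNCE form of each term, the per-patch pairing, the equal weighting of the three terms, and all toy embeddings below are assumptions, and a real setup would operate on encoder features from the two augmented views rather than hand-written vectors.

```python
import math

def _cos(u, v):
    # Cosine similarity between two plain-list embeddings.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce(anchor, positive, negatives, tau=0.1):
    # InfoNCE: pull the positive pair together, push negatives apart.
    logits = [_cos(anchor, positive) / tau] + [_cos(anchor, n) / tau for n in negatives]
    m = max(logits)  # log-sum-exp shift for numerical stability
    return -(logits[0] - (m + math.log(sum(math.exp(x - m) for x in logits))))

def glccl_loss(g1, g2, patches1, patches2, negatives, tau=0.1):
    # Global term: two augmented global views of the same image.
    l_global = info_nce(g1, g2, negatives, tau)
    # Local term: corresponding patch embeddings across the two views
    # (assumes patches1[i] and patches2[i] cover the same region).
    l_local = sum(info_nce(p, q, negatives, tau)
                  for p, q in zip(patches1, patches2)) / len(patches1)
    # Cross term: each global view contrasted against the other view's patches.
    l_cross = sum(info_nce(g1, q, negatives, tau) for q in patches2) / len(patches2)
    l_cross += sum(info_nce(g2, p, negatives, tau) for p in patches1) / len(patches1)
    return l_global + l_local + l_cross

# Toy 2-D embeddings: two aligned views and two negatives from other images.
g1, g2 = [1.0, 0.1], [0.9, 0.2]
p1 = [[0.8, 0.3], [0.7, 0.4]]
p2 = [[0.85, 0.25], [0.72, 0.38]]
negs = [[-1.0, 0.5], [0.2, -0.9]]
loss = glccl_loss(g1, g2, p1, p2, negs)
```

A well-aligned pair of views yields a lower combined loss than a mismatched one, which is the gradient signal that makes the encoder map global and local views of the same content to nearby representations.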
DENG J, DONG W, SOCHER R, et al. ImageNet: A Large-Scale Hierarchical Image Database[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009:248-255.
LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO:Common Objects in Context[C]//European Conference on Computer Vision. Berlin:Springer, 2014:740-755.
CORDTS M, OMRAN M, RAMOS S, et al. The Cityscapes Dataset for Semantic Urban Scene Understanding[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE, 2016:3213-3223.
ZHOU Peng, YANG Jun. Remote Sensing Image Segmentation Method Using Neural Network Architecture Search[J]. Journal of Xidian University, 2021, 48(5):47-57.
LIU X, ZHANG F, HOU Z, et al. Self-Supervised Learning:Generative or Contrastive[J]. IEEE Transactions on Knowledge and Data Engineering, 2021, 35(1):857-876.
LI Y, HU P, LIU Z, et al. Contrastive Clustering[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(10):8547-8555.
LIU S, LI Z, SUN J. Self-EMD: Self-Supervised Object Detection without ImageNet(2020)[J/OL]. [2022-01-01]. https://arxiv.org/abs/2011.13677v3.
WEI F, GAO Y, WU Z, et al. Aligning Pretraining for Detection via Object-Level Contrastive Learning[J]. Advances in Neural Information Processing Systems, 2021, 34:22682-22694.
CARON M, MISRA I, MAIRAL J, et al. Unsupervised Learning of Visual Features by Contrasting Cluster Assignments[J]. Advances in Neural Information Processing Systems, 2020, 33:9912-9924.
SHI Jiahui, HAO Xiaohui, LI Yanni. A Highly Efficient Self-Supervised Meta Transfer Small Sample Learning Algorithm[J]. Journal of Xidian University, 2021, 48(6):48-56.
VAHDAT A, KAUTZ J. NVAE:A Deep Hierarchical Variational Autoencoder[J]. Advances in Neural Information Processing Systems, 2020, 33:19667-19679.
GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative Adversarial Networks[J]. Communications of the ACM, 2020, 63(11):139-144.
WANG Junjun, SUN Yue, LI Yin. A Remote Sensing Image Declouding Method for Generating Adversarial Networks[J]. Journal of Xidian University, 2021, 48(5):23-29.
XU Yin, LIU Shuai, SHAO Meng, et al. A Multi-Scale GAN-Low Dose CT Super-Resolution Reconstruction Method[J]. Journal of Xidian University, 2022, 49(2):228-236.
CHEN T, KORNBLITH S, NOROUZI M, et al. A Simple Framework for Contrastive Learning of Visual Representations[C]//International Conference on Machine Learning. San Diego: ICML, 2020:1597-1607.
HE K, FAN H, WU Y, et al. Momentum Contrast for Unsupervised Visual Representation Learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020:9726-9735.
TIAN Y, KRISHNAN D, ISOLA P. Contrastive Multiview Coding[C]//European Conference on Computer Vision. Berlin: Springer, 2020:776-794.
CHEN X, FAN H, GIRSHICK R, et al. Improved Baselines with Momentum Contrastive Learning(2020)[J/OL]. [2022-01-01]. https://arxiv.org/abs/2003.04297.
GRILL J B, STRUB F, ALTCHÉ F, et al. Bootstrap Your Own Latent:A New Approach to Self-Supervised Learning[J]. Advances in Neural Information Processing Systems, 2020, 33:21271-21284.
CHEN X, HE K. Exploring Simple Siamese Representation Learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021:15745-15753.
YU C, WANG J, PENG C, et al. BiSeNet: Bilateral Segmentation Network for Real-Time Semantic Segmentation[C]//Proceedings of the European Conference on Computer Vision(ECCV). Berlin: Springer, 2018:334-349.
WU T, TANG S, ZHANG R, et al. CGNet:A Light-Weight Context Guided Network for Semantic Segmentation[J]. IEEE Transactions on Image Processing, 2021, 30:1169-1179.
LIU Bochong, CAI Huaiyu, YANG Shiyuan, et al. Lightweight Semantic Segmentation Network for Automatic Driving Scenarios[J]. Journal of Xidian University, 2023, 50(1):118-128.
LOSHCHILOV I, HUTTER F. SGDR: Stochastic Gradient Descent with Warm Restarts(2017)[J/OL]. [2022-01-01]. https://arxiv.org/abs/1608.03983v3.
LOSHCHILOV I, HUTTER F. Decoupled Weight Decay Regularization(2017)[J/OL]. [2022-01-01]. https://arxiv.org/abs/1711.05101v1.