arXiv:2406.10519 [cs.CV]

Self Pre-training with Topology- and Spatiality-aware Masked Autoencoders for 3D Medical Image Segmentation

Pengfei Gu, Yejia Zhang, Huimin Li, Hongxiao Wang, Yizhe Zhang, Chaoli Wang, Danny Z. Chen

Published 2024-06-15, Version 1

Masked Autoencoders (MAEs) have been shown to be effective in pre-training Vision Transformers (ViTs) for natural and medical image analysis problems. By reconstructing the missing pixel/voxel information of masked patches from the visible ones, a ViT encoder can aggregate contextual information for downstream tasks. However, existing MAE pre-training methods, which were developed specifically for the ViT architecture, lack the ability to capture geometric shape and spatial information, which is critical for medical image segmentation tasks. In this paper, we propose a novel extension of known MAEs for self pre-training (i.e., models pre-trained on the same target dataset) for 3D medical image segmentation. (1) We propose a new topological loss that preserves geometric shape information by computing topological signatures of both the input and reconstructed volumes. (2) We introduce a pretext task that predicts the positions of the centers and eight corners of 3D crops, enabling the MAE to aggregate spatial information. (3) We extend the MAE pre-training strategy to a hybrid state-of-the-art (SOTA) medical image segmentation architecture and co-pretrain it alongside the ViT. (4) We develop a fine-tuned model for downstream segmentation tasks by complementing the pre-trained ViT encoder with our pre-trained SOTA model. Extensive experiments on five public 3D segmentation datasets show the effectiveness of our new approach.
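
As a rough illustration of the pretext task in item (2), the sketch below computes regression targets for one 3D crop: its center followed by its eight corners. The abstract does not say how positions are encoded, so normalizing each coordinate by the full volume extent is an assumption, and the function name `crop_position_targets` and its parameters are hypothetical rather than taken from the paper.

```python
from itertools import product


def crop_position_targets(crop_start, crop_size, volume_shape):
    """Return 9 points (z, y, x): the crop center, then its eight corners.

    Assumption (not stated in the abstract): each coordinate is expressed
    as a fraction of the full volume extent along that axis.
    """
    if not (len(crop_start) == len(crop_size) == len(volume_shape) == 3):
        raise ValueError("expected 3D start, size, and volume shape")
    for start, size, extent in zip(crop_start, crop_size, volume_shape):
        if start < 0 or size <= 0 or start + size > extent:
            raise ValueError("crop must lie inside the volume")

    # Exclusive end coordinate of the crop along each axis.
    crop_end = [start + size for start, size in zip(crop_start, crop_size)]

    # Center: midpoint of each axis range, normalized by the volume extent.
    center = tuple(
        (start + end) / 2 / extent
        for start, end, extent in zip(crop_start, crop_end, volume_shape)
    )

    # Corners: every (start or end) combination over the three axes, 2**3 = 8.
    corners = [
        tuple(coord / extent for coord, extent in zip(corner, volume_shape))
        for corner in product(*zip(crop_start, crop_end))
    ]
    return [center] + corners


# A 64^3 crop taken at offset (16, 32, 32) from a 128^3 volume.
targets = crop_position_targets((16, 32, 32), (64, 64, 64), (128, 128, 128))
```

In a pre-training setup along these lines, a small prediction head on the encoder output would regress these nine points for each crop; how the paper actually parameterizes and supervises the positions is not specified here.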

Categories: cs.CV, cs.AI

Keywords: 3d medical image segmentation, spatiality-aware masked autoencoders, self pre-training, preserve geometric shape information, segmentation tasks

Related articles:

arXiv:2011.09608 [cs.CV] (Published 2020-11-19)

Bidirectional RNN-based Few Shot Learning for 3D Medical Image Segmentation

Soopil Kim, Sion An, Philip Chikontwe, Sang Hyun Park

arXiv:2302.05615 [cs.CV] (Published 2023-02-11)

Anatomical Invariance Modeling and Semantic Alignment for Self-supervised Learning in 3D Medical Image Segmentation

Yankai Jiang, Mingze Sun, Heng Guo, Ke Yan, Le Lu, Minfeng Xu

arXiv:2307.12004 [cs.CV] (Published 2023-07-22)

COLosSAL: A Benchmark for Cold-start Active Learning for 3D Medical Image Segmentation

Han Liu et al.
