{ "id": "2409.07793", "version": "v1", "published": "2024-09-12T06:52:46.000Z", "updated": "2024-09-12T06:52:46.000Z", "title": "Lagrange Duality and Compound Multi-Attention Transformer for Semi-Supervised Medical Image Segmentation", "authors": [ "Fuchen Zheng", "Quanjun Li", "Weixuan Li", "Xuhang Chen", "Yihang Dong", "Guoheng Huang", "Chi-Man Pun", "Shoujun Zhou" ], "comment": "5 pages, 4 figures, 3 tables", "categories": [ "cs.CV", "cs.AI" ], "abstract": "Medical image segmentation, a critical application of semantic segmentation in healthcare, has seen significant advancements through specialized computer vision techniques. While deep learning-based medical image segmentation is essential for assisting in medical diagnosis, the lack of diverse training data causes the long-tail problem. Moreover, most previous hybrid CNN-ViT architectures have limited ability to combine various attentions in different layers of the Convolutional Neural Network. To address these issues, we propose a Lagrange Duality Consistency (LDC) Loss, integrated with Boundary-Aware Contrastive Loss, as the overall training objective for semi-supervised learning to mitigate the long-tail problem. Additionally, we introduce CMAformer, a novel network that synergizes the strengths of ResUNet and Transformer. The cross-attention block in CMAformer effectively integrates spatial attention and channel attention for multi-scale feature fusion. Overall, our results indicate that CMAformer, combined with the feature fusion framework and the new consistency loss, demonstrates strong complementarity in semi-supervised learning ensembles. We achieve state-of-the-art results on multiple public medical image datasets. 
Example code is available at: \\url{https://github.com/lzeeorno/Lagrange-Duality-and-CMAformer}.", "revisions": [ { "version": "v1", "updated": "2024-09-12T06:52:46.000Z" } ], "analyses": { "keywords": [ "semi-supervised medical image segmentation", "compound multi-attention transformer", "lagrange duality", "effectively integrates spatial attention", "learning-based medical image segmentation" ], "note": { "typesetting": "TeX", "pages": 5, "language": "en", "license": "arXiv", "status": "editable" } } }