Conducting patch contrastive learning with mixture of experts on mixed datasets for medical image segmentation

Jiazhe Wang*, Osamu Yoshie, Yuya Ieiri

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Medical image segmentation is critical for accurate diagnosis, treatment planning, and surgical navigation. In recent years, large multitask segmentation models have often struggled due to the limited size of datasets and significant variability in target structures, image resolutions, and annotation standards. These variations can introduce task competition during multitask model training, which hinders effective feature learning. To address these challenges, we propose PatchMoE, a unified framework designed to compensate for resolution discrepancies across datasets and to mitigate the feature conflicts that arise in mixed-dataset training. PatchMoE is the first to introduce patch-based contrastive learning into medical image segmentation tasks, dividing images into equal-sized patches represented in a 3D coordinate space. This approach allows mixed datasets with varying resolutions to be trained in a unified manner, preserving spatial relationships and enhancing contextual understanding. PatchMoE also incorporates a mixture of experts (MoE) mechanism into the decoder, which dynamically selects dataset-specific expert combinations. This design mitigates parameter conflicts through network sparsification, effectively resolving optimization conflicts across multitask datasets. The effectiveness of the proposed method was demonstrated on four independent segmentation tasks: retinal vessel (DRIVE), near-infrared blurred vessel (HVNIR), abdominal multiorgan (Synapse), and polyp segmentation (Kvasir-SEG). We compared performance using multiple metrics, including Dice score, Intersection over Union (IoU), and Hausdorff distance (HD). Compared with the state-of-the-art (SOTA) GCASCADE model, PatchMoE achieved an improvement of 3.04% in the mean Dice score across all tasks. The proposed method also achieved an average Dice score improvement of 0.88% over four independently trained SOTA models, one per task.
In summary, PatchMoE combines patch-based contrastive learning with dataset-informed expert gating to provide promising solutions for dataset conflicts in large transformer-based medical segmentation models.
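As a rough illustration of the patch representation described above, the sketch below places every patch of an image into a shared (row, col, scale) coordinate frame so that images of different resolutions align spatially. The abstract only states that patches are "represented in 3D coordinate space"; the specific normalization and the use of patch coverage as the third axis are illustrative assumptions, not the paper's exact scheme.

```python
# Hedged sketch: assign each fixed-size patch a 3D coordinate (row, col, scale),
# all normalized to [0, 1], so datasets with different resolutions share one
# coordinate frame. The exact coordinate scheme is an assumption for illustration.
def patch_coords(height, width, patch_size=16):
    """Return one (row, col, scale) coordinate per patch."""
    rows = height // patch_size
    cols = width // patch_size
    # The scale axis records how much of the image one patch covers, so the
    # same anatomy patched at two resolutions gets matching (row, col)
    # coordinates but a different scale coordinate.
    scale = patch_size / max(height, width)
    return [(r / rows, c / cols, scale)
            for r in range(rows)
            for c in range(cols)]

# Two resolutions of the "same" image: patch grids differ in size,
# but normalized (row, col) positions remain comparable across them.
lo = patch_coords(64, 64, patch_size=16)    # 4x4 grid -> 16 patches
hi = patch_coords(128, 128, patch_size=16)  # 8x8 grid -> 64 patches
```

Under this scheme, a contrastive objective could treat patches with nearby (row, col) coordinates as positives across resolutions, which is one plausible reading of how the unified coordinate space supports mixed-resolution training.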
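The dataset-informed expert gating can likewise be sketched in miniature. The abstract says the decoder "dynamically selects dataset-specific expert combinations" via sparse routing; the expert count, top-k routing, fixed gating logits, and toy linear experts below are all illustrative assumptions standing in for learned decoder components.

```python
import math

# Hedged sketch: a mixture-of-experts layer gated by dataset identity.
# Expert count, top-k value, and the hand-set gating logits are assumptions;
# in a real model the gate and experts would be learned.
NUM_EXPERTS = 4
TOP_K = 2

# Toy experts: each just scales its input (stand-ins for decoder sub-networks).
experts = [lambda x, s=s: [s * v for v in x] for s in (0.5, 1.0, 1.5, 2.0)]

# Per-dataset gating logits (hypothetical values for two of the four datasets).
gate_logits = {
    "DRIVE":   [2.0, 0.1, 0.1, 1.0],
    "Synapse": [0.1, 2.0, 1.0, 0.1],
}

def moe_forward(x, dataset_id):
    """Route x through the top-k experts for this dataset, softmax-weighted."""
    logits = gate_logits[dataset_id]
    # Keep only the top-k experts: the network is sparsified per dataset,
    # so datasets stop competing for the same parameters.
    top = sorted(range(NUM_EXPERTS), key=lambda i: logits[i], reverse=True)[:TOP_K]
    z = [math.exp(logits[i]) for i in top]
    weights = [v / sum(z) for v in z]
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        for j, v in enumerate(experts[i](x)):
            out[j] += w * v
    return out, top
```

Because each dataset activates only its own top-k subset, gradients from one task update mostly disjoint experts, which is one simple way sparsification can ease the optimization conflicts the abstract describes.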

Original language: English
Pages (from-to): 14189-14216
Number of pages: 28
Journal: Neural Computing and Applications
Volume: 37
Issue number: 19
DOIs
Publication status: Published - 2025 Jul

Keywords

  • Medical image segmentation
  • Mixture of experts
  • Patch embedding
  • Position embedding
  • Vision transformer

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
