A Transformer-based Semantic Segmentation Model for Street Fashion Images

Dingjie Peng*, Wataru Kameyama

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Semantic segmentation is a pixel-level classification problem in computer vision: every pixel is assigned a class label, so that pixels of the same class are grouped together and the image can be interpreted at the pixel level. Within this field, semantic segmentation of street fashion images is challenging because clothing items appear with wide variations in fabric, layering, occlusion, and viewpoint. To better understand street fashion images, we propose a lightweight Semantic Context Aware Transformer (SCAT) for this segmentation task; it integrates semantic context into the encoding and models the relationship between multi-level outputs from the transformer layers. Extensive experiments and comparisons show that the proposed model achieves state-of-the-art results on the ModaNet dataset with a relatively small model size, improving on Shunted Transformer by over 1.1 points and surpassing other CNNs and Transformers by a large margin of over 2 points in mIoU.
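The abstract reports its results in mIoU (mean intersection over union). As a rough illustration of how that metric is commonly computed, the following is a minimal sketch in plain Python. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example; the paper's exact evaluation protocol is not stated here.

```python
from collections import Counter


def mean_iou(pred, target, num_classes):
    """Average per-class IoU over two flat sequences of integer class labels.

    Hypothetical helper for illustration only, not the authors' evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    pred_count = Counter(pred)      # pixels predicted as each class
    target_count = Counter(target)  # pixels labelled as each class
    intersection = Counter()        # pixels where prediction and label agree
    for p, t in zip(pred, target):
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union == 0:
            # Assumption: a class absent from both maps is left out of the mean.
            continue
        ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps, flattened row by row (made-up values).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2. A gain of "1.1 points" in the abstract presumably refers to a difference of 1.1 on the percentage scale of this score.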

Original language: English
Title of host publication: International Workshop on Advanced Imaging Technology, IWAIT 2023
Editors: Masayuki Nakajima, Jae-Gon Kim, Kwang-deok Seo, Toshihiko Yamasaki, Jing-Ming Guo, Phooi Yee Lau, Qian Kemao
Publisher: SPIE
ISBN (Electronic): 9781510663084
Publication status: Published - 2023
Event: 2023 International Workshop on Advanced Imaging Technology, IWAIT 2023 - Jeju, Korea, Republic of
Duration: 2023 Jan 9 to 2023 Jan 11

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 12592
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 2023 International Workshop on Advanced Imaging Technology, IWAIT 2023
Country/Territory: Korea, Republic of
City: Jeju
Period: 23/1/9 to 23/1/11

Keywords

  • Semantic Context
  • Semantic Segmentation
  • Street Fashion Images
  • Transformer

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering
