Multi-modal Embedding for Main Product Detection in Fashion

Long Long Yu, Edgar Simo-Serra, Francesc Moreno-Noguer, Antonio Rubio

Research output: Conference contribution

11 Citations (Scopus)

Abstract

We present an approach to detect the main product in fashion images by exploiting the textual metadata associated with each image. Our approach is based on a Convolutional Neural Network and learns a joint embedding of object proposals and textual metadata to predict the main product in the image. We additionally use several complementary classification and overlap losses in order to improve training stability and performance. Our tests on a large-scale dataset taken from eight e-commerce sites show that our approach outperforms strong baselines and is able to accurately detect the main product in a wide diversity of challenging fashion images.

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2236-2242
Number of pages: 7
ISBN (Electronic): 9781538610343
DOI
Publication status: Published - Jul 1 2017
Externally published: Yes
Event: 16th IEEE International Conference on Computer Vision Workshops, ICCVW 2017 - Venice, Italy
Duration: Oct 22 2017 to Oct 29 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Computer Vision Workshops, ICCVW 2017
2018-January

Other

Other: 16th IEEE International Conference on Computer Vision Workshops, ICCVW 2017
Country/Territory: Italy
City: Venice
Period: 17/10/22 to 17/10/29

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
