TY - GEN
T1 - Learning and association of synaesthesia phenomenon using deep neural networks
AU - Yamaguchi, Yuki
AU - Noda, Kuniaki
AU - Nishide, Shun
AU - Okuno, Hiroshi G.
AU - Ogata, Tetsuya
N1 - Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2013
Y1 - 2013
N2 - Robots are required to process multimodal information because information in the real world arrives through various modal inputs. However, only a few robots integrate multimodal information. Humans recognize the environment effectively through cross-modal processing. We focus on modeling the synaesthesia phenomenon, which is known as a form of cross-modal perception in humans. Recently, deep neural networks (DNNs) have gained attention and have been successfully applied to processing high-dimensional data composed not only of a single modality but also of multimodal information. We introduce DNNs to construct a multimodal association model that can reconstruct one modality from the other. Our model is composed of two DNNs: one for image compression and the other for audio-visual sequential learning. We attempted to reproduce the synaesthesia phenomenon by training our model with multimodal data acquired from a psychological experiment. A cross-modal association experiment showed that our model can reconstruct the same or similar images from sound as synaesthetes, i.e., people who experience synaesthesia. Analysis of the middle layers of the DNNs, which represent multimodal features, implied that the DNNs self-organized the differences in perception between individual synaesthetes.
AB - Robots are required to process multimodal information because information in the real world arrives through various modal inputs. However, only a few robots integrate multimodal information. Humans recognize the environment effectively through cross-modal processing. We focus on modeling the synaesthesia phenomenon, which is known as a form of cross-modal perception in humans. Recently, deep neural networks (DNNs) have gained attention and have been successfully applied to processing high-dimensional data composed not only of a single modality but also of multimodal information. We introduce DNNs to construct a multimodal association model that can reconstruct one modality from the other. Our model is composed of two DNNs: one for image compression and the other for audio-visual sequential learning. We attempted to reproduce the synaesthesia phenomenon by training our model with multimodal data acquired from a psychological experiment. A cross-modal association experiment showed that our model can reconstruct the same or similar images from sound as synaesthetes, i.e., people who experience synaesthesia. Analysis of the middle layers of the DNNs, which represent multimodal features, implied that the DNNs self-organized the differences in perception between individual synaesthetes.
UR - http://www.scopus.com/inward/record.url?scp=84902491868&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84902491868&partnerID=8YFLogxK
U2 - 10.1109/sii.2013.6776750
DO - 10.1109/sii.2013.6776750
M3 - Conference contribution
AN - SCOPUS:84902491868
SN - 9781479926268
T3 - 2013 IEEE/SICE International Symposium on System Integration, SII 2013
SP - 659
EP - 664
BT - 2013 IEEE/SICE International Symposium on System Integration, SII 2013
PB - IEEE Computer Society
T2 - 2013 6th IEEE/SICE International Symposium on System Integration, SII 2013
Y2 - 15 December 2013 through 17 December 2013
ER -