TY - GEN
T1 - Morphology-specific convolutional neural networks for tactile object recognition with a multi-fingered hand
AU - Funabashi, Satoshi
AU - Yan, Gang
AU - Geier, Andreas
AU - Schmitz, Alexander
AU - Ogata, Tetsuya
AU - Sugano, Shigeki
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
AB - Distributed tactile sensors on multi-fingered hands can provide high-dimensional information for grasping objects, but it is not clear how to optimally process such abundant tactile information. This paper explores the possibility of using a morphology-specific convolutional neural network (MS-CNN). uSkin tactile sensors are mounted on an Allegro Hand, providing 720 force measurements (15 patches of uSkin modules with 16 triaxial force sensors each) in addition to 16 joint angle measurements. Consecutive layers in the CNN receive input from parts of one finger segment, one finger, and the whole hand. Since the sensors provide 3D (x, y, z) force vectors, the first layer uses inputs with 3 channels (x, y, and z), analogous to the 3-channel RGB input of camera images. The layers are combined to build a tactile map based on the relative positions of the tactile sensors on the hand. Seven combination variants were evaluated, and an object recognition rate of over 95% with 20 objects was achieved, even though only one random time instance from a repeated squeezing motion of an object in an unknown pose within the hand was used as input.
UR - http://www.scopus.com/inward/record.url?scp=85071479203&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071479203&partnerID=8YFLogxK
DO - 10.1109/ICRA.2019.8793901
M3 - Conference contribution
AN - SCOPUS:85071479203
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 57
EP - 63
BT - 2019 International Conference on Robotics and Automation, ICRA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 International Conference on Robotics and Automation, ICRA 2019
Y2 - 20 May 2019 through 24 May 2019
ER -