TY - GEN
T1 - PostMe
T2 - 2022 IEEE International Conference on Big Data, Big Data 2022
AU - Yanagisawa, Ryo
AU - Saito, Susumu
AU - Nakano, Teppei
AU - Kobayashi, Tetsunori
AU - Ogawa, Tetsuji
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Even after more than a decade of crowdsourcing research, we still have no standard framework for low-cost quality assurance in crowdsourced data annotation. This paper proposes an unsupervised learning method for dynamic microtask posting, which allows each microtask to adjust its own number of collected responses based on the difficulty of the data. Since crowdsourced data labels are likely to contain errors, researchers often employ majority voting, which aggregates responses from multiple workers to calculate a final label. This technique, however, involves a trade-off between label accuracy and cost. This paper presents a dynamic microtask posting model that reduces the total number of collected responses while maintaining labeling accuracy; we also aim to obtain the model with an 'unsupervised' approach, which does not require training through experience of microtask posting for data labeled with ground truths. Our simulation of annotating livestock surveillance images demonstrated that our approach achieved i) learning performance comparable to that of the supervised approach, which requires model training with labeled data, and ii) a significant cost reduction without degrading accuracy in comparison to simple majority voting.
AB - Even after more than a decade of crowdsourcing research, we still have no standard framework for low-cost quality assurance in crowdsourced data annotation. This paper proposes an unsupervised learning method for dynamic microtask posting, which allows each microtask to adjust its own number of collected responses based on the difficulty of the data. Since crowdsourced data labels are likely to contain errors, researchers often employ majority voting, which aggregates responses from multiple workers to calculate a final label. This technique, however, involves a trade-off between label accuracy and cost. This paper presents a dynamic microtask posting model that reduces the total number of collected responses while maintaining labeling accuracy; we also aim to obtain the model with an 'unsupervised' approach, which does not require training through experience of microtask posting for data labeled with ground truths. Our simulation of annotating livestock surveillance images demonstrated that our approach achieved i) learning performance comparable to that of the supervised approach, which requires model training with labeled data, and ii) a significant cost reduction without degrading accuracy in comparison to simple majority voting.
KW - crowdsourcing
KW - dynamic microtask posting
KW - quality control
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85147945381&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147945381&partnerID=8YFLogxK
U2 - 10.1109/BigData55660.2022.10020590
DO - 10.1109/BigData55660.2022.10020590
M3 - Conference contribution
AN - SCOPUS:85147945381
T3 - Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
SP - 4049
EP - 4054
BT - Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
A2 - Tsumoto, Shusaku
A2 - Ohsawa, Yukio
A2 - Chen, Lei
A2 - Van den Poel, Dirk
A2 - Hu, Xiaohua
A2 - Motomura, Yoichi
A2 - Takagi, Takuya
A2 - Wu, Lingfei
A2 - Xie, Ying
A2 - Abe, Akihiro
A2 - Raghavan, Vijay
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 December 2022 through 20 December 2022
ER -