Volume 15, Number 1
Exploring Crowdsourced Worker Evaluation Methods in Open-Ended Tasks
Authors
Ryuya Itano, Honoka Tanitsu, Motoki Bamba, Ryota Noseyama, Akihito Kohiga and Takahiro Koita, Doshisha University, Japan
Abstract
Crowdsourcing assumes a transient relationship between task requesters and workers, which makes it hard for workers to improve their skills. In addition, crowdsourcing tasks are shifting from simple to more complex and open-ended, highlighting the importance of training workers to handle such tasks. Although various methods have been proposed to train workers, a method for evaluating their skill levels in open-ended tasks has not yet been established. Direct evaluation by requesters is desirable, but it makes scaling up tasks difficult because of the requesters' heavy workload. This study explores methods for evaluating workers without increasing requesters' workload, comparing and verifying a peer-based method and an LLM-based automated method. The experiment investigated the alignment between evaluations from the two methods and those from requesters, thereby clarifying the characteristics of each method. The experimental results demonstrated the applicability of LLMs to evaluating workers in open-ended tasks, revealing both their strengths in consistency and their limitations in capturing subtle human judgments.
Keywords
Crowdsourcing, Worker Training, Worker Evaluation, Amazon Mechanical Turk, LLM-as-a-Judge
