Training Set Selection - Speech

Metric

Our evaluation metric is described in the task's README. There we also list the number of samples used for training and evaluation.


Offline evaluation (MLCube)

To begin working, head to the task's main page on Dynabench.


To participate, you'll be prompted to create an account on Dynabench. You only need an email address, to which we will send information about your submissions. Once your account is created, we suggest you head directly to the task's README. There you will find the dataset, along with instructions on how to run a selection algorithm and evaluate it.


We've included three baseline examples of a selection algorithm in the files: (1) a random selection algorithm, (2) a cross-fold algorithm, and (3) a baseline provided by Cleanlab. You can use these implementations to guide your own and to understand how to begin developing a solution. In short, MLCube lets you perform three actions: download, select, and evaluate. 'Download' retrieves the dataset and stores it in your local workspace, 'select' runs your selection algorithm to produce a training set, and 'evaluate' scores the resulting selection.
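To give a feel for what the 'select' step produces, here is a minimal sketch of a random-selection baseline in Python. The file names (candidates.json, train.json as a flat list of sample IDs) are illustrative assumptions only; the shipped baselines and the task's README define the actual input and output formats.

```python
import json
import random

def random_selection(candidate_ids, max_samples, seed=0):
    """Randomly pick up to max_samples IDs from the candidate pool.

    Mirrors the spirit of the provided random-selection baseline;
    the real baseline in the task repository defines the exact I/O.
    """
    rng = random.Random(seed)
    return rng.sample(candidate_ids, min(max_samples, len(candidate_ids)))

if __name__ == "__main__":
    # 'candidates.json' and the flat list-of-IDs schema are assumptions
    # for illustration; consult the task's README for the real layout.
    with open("candidates.json") as f:
        candidate_ids = json.load(f)
    selected = random_selection(candidate_ids, max_samples=60)
    with open("train.json", "w") as f:
        json.dump(selected, f)
```

Fixing the random seed, as above, keeps the baseline reproducible across runs, which makes it easier to compare your own algorithm against it.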



Online evaluation (Dynabench)


To receive an official score and a place on the leaderboard, users must submit their generated training sets to Dynabench. The Training Set Selection - Speech benchmark has 6 leaderboards, one for each combination of language and training set size. The benchmark currently supports three languages (English, Indonesian, and Portuguese) and two limits on the maximum number of training samples (25 and 60). Participants are encouraged to submit to all 6 leaderboards.


To submit for online evaluation, head to the task's main page and go to 'Submit Train Files'. You will be prompted to name your submission (in the Model Name field) and to upload a train.json file.
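Before uploading, you may want to sanity-check that your train.json respects the sample limit of the leaderboard you are targeting. The sketch below assumes the flat list-of-IDs layout from the earlier example; adapt the counting logic to the schema the task actually uses.

```python
import json

MAX_SAMPLES = 25  # or 60, depending on the target leaderboard

# Assumes train.json is a flat list of sample IDs, as in the earlier
# sketch; adjust if the task's README specifies a different schema.
with open("train.json") as f:
    selected = json.load(f)

assert len(selected) <= MAX_SAMPLES, (
    f"train.json has {len(selected)} samples; limit is {MAX_SAMPLES}"
)
print(f"OK: {len(selected)} samples (limit {MAX_SAMPLES})")
```

A quick local check like this avoids waiting hours for an online evaluation only to find the submission exceeded the sample budget.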


Once you've chosen a file, click the upload button. If done correctly, you just have to wait for the results, which can take anywhere from 10 minutes to a couple of hours. You don't have to keep checking: we'll send you an email once your submission has been processed.


For your submission to appear on the leaderboard, click your user icon in the top-right corner of Dynabench, then click on Models. All of your model submissions are listed there: click the name of the one you want to publish, then click the publish button. If you like, you can add more information using the 'edit' button.