About DataPerf

We, researchers from Coactive.AI, ETH Zurich, Google, Harvard University, Landing.AI, Meta, Stanford University, and TU Eindhoven, are announcing DataPerf, a new benchmark suite for machine learning datasets and data-centric algorithms. We presented DataPerf at the NeurIPS Data-centric AI Workshop. Going forward, we invite you to join us in defining and developing the benchmark suite in the DataPerf Working Group hosted by the MLCommons® Association. If you are interested in using the DataPerf benchmarks or participating in leaderboards and challenges based on DataPerf in 2023, please sign up for DataPerf-announce (click the link, then select "Ask to Join").

Introduction. DataPerf is a benchmark suite for ML datasets and data-centric algorithms. Historically, ML research has focused primarily on models, often simply using the largest available dataset for a common ML task without considering its breadth, difficulty, or fidelity to the underlying problem. This under-emphasis on data has led to a range of issues, from data cascades in real applications to saturation of existing dataset-driven benchmarks for model quality, which impedes research progress. To catalyze increased research focus on data quality and foster data excellence, we created DataPerf: a suite of benchmarks that evaluate the quality of training and test data, as well as the algorithms for constructing or optimizing such datasets (for example, coreset selection or labeling-error debugging), across a range of common ML tasks such as image classification. We plan to leverage the DataPerf benchmarks through challenges and leaderboards.

Inspiration. We are motivated by a number of prior efforts, including: efforts to develop adversarial data, such as Cats4ML and Dynabench; efforts to develop specific benchmarks or similar suites, such as the DCAI competition and DCBench; and the MLPerf™ benchmarks for ML speed. We aim to provide clear evaluation criteria and to encourage the kind of rapid innovation showcased at venues such as the NeurIPS Datasets and Benchmarks track. As with the MLPerf effort, we have brought together the leaders of these motivating efforts to build DataPerf.

Goals. DataPerf has these goals:

General Approach. Our general approach is to define benchmark types for training sets, test sets, and a range of data-centric algorithms, then define specific benchmarks by applying a benchmark type to a common ML task such as image classification or spoken keyword identification. In this way, the DataPerf benchmark suite is defined as the cross product of { benchmark types } x { ML tasks }. 
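For intuition, the cross-product construction described above can be sketched in a few lines of code. The specific benchmark-type and task names below are illustrative placeholders, not the suite's definitive contents:

```python
from itertools import product

# Hypothetical benchmark types and ML tasks, for illustration only;
# the actual DataPerf suite defines its own lists.
benchmark_types = [
    "training-set",
    "test-set",
    "selection-algorithm",
    "debugging-algorithm",
]
ml_tasks = [
    "image-classification",
    "spoken-keyword-identification",
]

# The suite is the cross product { benchmark types } x { ML tasks }:
# every benchmark type is applied to every common ML task.
suite = [f"{btype}/{task}" for btype, task in product(benchmark_types, ml_tasks)]

for benchmark in suite:
    print(benchmark)
```

With four benchmark types and two tasks, this yields eight specific benchmarks; adding a new task or benchmark type grows the suite multiplicatively rather than requiring each combination to be designed from scratch.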

Benchmark Types and Metrics. The DataPerf suite includes the benchmark types listed below. Each benchmark type uses a different metric, though in principle each metric rewards either the efficacy of a training set or the breadth and difficulty of a test set.
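As a minimal sketch of what a training-set metric of this kind could look like, consider scoring each submitted training set by training a fixed model on it and measuring held-out accuracy. The function names and the toy majority-class model below are illustrative assumptions, not DataPerf's actual evaluation harness:

```python
from collections import Counter

def score_training_set(train_fixed_model, submitted_training_set, heldout_set):
    """Return the fixed model's held-out accuracy after training on the submission.

    Higher is better: the score is a proxy for the efficacy of the training data.
    """
    model = train_fixed_model(submitted_training_set)
    correct = sum(1 for x, y in heldout_set if model(x) == y)
    return correct / len(heldout_set)

def train_fixed_model(training_set):
    # Toy stand-in for the benchmark's fixed model: a majority-class predictor.
    majority = Counter(y for _, y in training_set).most_common(1)[0][0]
    return lambda x: majority

# Two candidate training sets scored against the same held-out data.
heldout = [(0, "a"), (1, "a"), (2, "b")]
better = [(10, "a"), (11, "a")]  # induces majority class "a"
worse = [(10, "b"), (11, "b")]   # induces majority class "b"
```

Because the model and held-out data are held fixed, any score difference is attributable to the submitted training data alone, which is the essence of a data-centric benchmark.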

Leaderboards and Challenges. In 2023, we will launch leaderboards and challenges based on the DataPerf benchmarks to encourage constructive competition, identify best-of-breed ideas, and inspire the next generation of concepts for building and optimizing datasets. We will operate the leaderboards and challenges using a platform based on Dynabench.

Organization. The DataPerf benchmarks, leaderboards, challenges, and platform will be hosted by the MLCommons Association, which also hosts the MLPerf Benchmarks. The MLCommons Association is a non-profit engineering consortium with over 50 members including large tech companies, startups, and academics. The MLCommons Association’s mission is to make ML better for everyone through benchmarks, public datasets, best practices, and research.

Learn more. You can read a full description of DataPerf in the whitepaper.