Benchmarks and competitions: How do they help us evaluate AI?

Feb 22, 2022 | 12:45 PM - 1:45 PM

Description

As AI continues to develop, evaluating what systems can and cannot do has become both challenging and necessary for understanding AI’s impact on our societies and for guiding future policies. The session will describe some of the ways that computer scientists have evaluated AI systems. It will first consider competitions and benchmarks that have been used in the field, including the well-known Turing Test, work on games such as chess and Go, as well as more specialised datasets. The session will then discuss the more formal evaluation campaigns of the United States National Institute of Standards and Technology (NIST) and the French Laboratoire National de Métrologie et d’Essais (LNE). The speakers will discuss the insights and limitations of these different ways of evaluating AI.

Moderator: José Hernández-Orallo, Professor, Universitat Politècnica de València, Spain; Senior Research Fellow, Leverhulme Centre for the Future of Intelligence, University of Cambridge, UK

Panellists:
• Anthony Cohn, Professor of Automated Reasoning, University of Leeds
• Guillaume Avrin, Manager, LNE
• Lucy Cheke, Lecturer, Department of Psychology, University of Cambridge

The live session time above reflects your computer's local time zone. The session will be recorded and available on replay the day after the live stream.

Presented by

Anthony G. Cohn

University of Leeds

Guillaume Avrin

LNE - French National...

José Hernández-Orallo

Universitat Politècnica de...

Lucy Cheke

University of Cambridge

More sessions of interest

3:45 PM - 4:45 PM

AI for labour market matching

One of the key ingredients of a well-functioning labour market is the efficiency with which workers are matched to vacancies. One of the reasons this matters is...

Nicolas Blanc (CFE-CGC, French Confederation of Management - General Confederation of Executives); Anna Banczyk (European Commission: DG Employment, Social Affairs and Inclusion); Emma Nelson (Journalist); Wim Adriaens (VDAB - Flemish Public Employment Service); Glen Cathey (Randstad); Anna Milanez (OECD); Matissa Hollister (McGill University); Stijn Broecke (OECD)

3:30 PM - 5:00 PM

Ethics of AI in the workplace: How should policy respond? 2nd OECD expert meeting on AI in the...

As part of its project studying ethical concerns about the use of AI in the workplace, the OECD will convene a second session of its experts to discuss potential policy...

Stefano Scarpetta (OECD); Angelica Salvi Del Pero (OECD)

3:00 PM - 4:00 PM

AI and social partners

Social dialogue has a fundamental role to play in easing transitions and spreading good practices regarding AI adoption in the labour market. Many social partners...

Sandrine Cazes (OECD); Maureen Hick (UNI Europa Finance); Miriam Pinto (Spanish Confederation of Business Organizations (CEOE)); David Barnes (IBM); Christina Colclough (Why Not Lab); Emma Nelson (Journalist)

1:45 PM - 2:45 PM

High-level session - The future of AI: Stakeholder perspectives

The potential impacts of AI span almost every sector of economies and societies, putting a premium on open dialogue and debate between policy makers and stakeholder...

Andrew Wyckoff (OECD, Director, Directorate for Science, Technology and Innovation); Stefano Scarpetta (OECD); Mary Towers (Trades Union Congress); Andreas Schleicher (OECD); Pam Dixon (World Privacy Forum); Carolyn Nguyen (Microsoft); Nicolas Miailhe (The Future Society); Clara Neppel (Institute of Electrical and Electronics Engineers (IEEE))