Registration of Intent

To make your interest in the MRP 2020 task known and to receive updates on data and software, please self-subscribe to the (moderated) mailing list for MRP announcements.  The mailing list archives are publicly available.  To obtain the training data for the task, please (a) make sure your team is subscribed to the above mailing list and (b) fill in the no-cost license agreement for the task and return it to the LDC.  A more formal registration of participating teams will be required in early July, as the evaluation period nears (please see the task schedule and below).

System Development

The task operates as what is at times called a closed track.  Beyond the training and ‘companion’ data provided by the co-organizers, participants are restricted in which additional data and pre-trained models are legitimate to use in system development.  These constraints are imposed to improve comparability of results and overall fairness: beyond resources explicitly ‘white-listed’ for the task, no additional data or other knowledge sources may be used in system development, training, or tuning.

The initial white-list of additional resources is inherited from the 2019 predecessor shared task.  Prospective participants are encouraged to nominate additional data or pre-trained models for white-listing; please email nominations to the task co-organizers as early as possible.  The closing date for the 2020 white-list is Monday, June 15, 2020; no additional nominations can be considered after that date.

Evaluation Data

Parsers will be evaluated on unseen, held-out data for which the gold-standard target graphs will not be available to participants before the end of the evaluation period (please see below).  For some of the English parser inputs used in evaluation, target annotations will be available in multiple frameworks, yielding a shared sub-set of inputs with parallel annotations in all frameworks. 

The MRP evaluation data will be distributed in the same file format as the training graphs, but without the nodes, edges, and tops values (essentially presenting empty graphs, which participants are expected to fill in).  Thus, the input property on each evaluation ‘graph’ provides the string to be parsed, and an additional top-level property targets indicates which output framework(s) to predict.  The evaluation data will be packaged as a single file per language, e.g. english.mrp, but in principle each sentence can be processed in isolation.  For each parser input and each of its targets values, participating systems are invited to output one complete semantic graph in MRP format.  The MRP evaluation data will be bundled with some of the same ‘companion’ resources as the training data, viz. state-of-the-art morpho-syntactic dependency parses (as a separate file udpipe.mrp).  Unlike for the training data, however, companion AMR ‘alignments’ (i.e. partial anchorings, in MRP terms) cannot be provided for the evaluation data, seeing as these would presume knowledge of the gold-standard AMR graphs.
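
The expansion of evaluation inputs into per-framework output graphs can be sketched as follows.  This is a minimal illustration only, assuming that the MRP interchange format stores one JSON object per line; the helper name expand_targets, the use of a framework property on each output graph, and the sample sentence are assumptions made for this example, and the actual parsing step is left out entirely.

```python
import json

def expand_targets(lines):
    """Yield one empty graph skeleton per (input, target framework) pair.

    Assumption: each non-empty line holds one JSON object with at least
    the properties 'input' and 'targets', as described in the text above.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        item = json.loads(line)
        for target in item.get("targets", []):
            # Copy everything except 'targets'; record the framework to predict
            # (storing it under 'framework' is an assumption of this sketch).
            graph = {key: value for key, value in item.items() if key != "targets"}
            graph["framework"] = target
            # A real parser would fill these three values in.
            graph["tops"] = []
            graph["nodes"] = []
            graph["edges"] = []
            yield graph

# Hypothetical evaluation item; real data would be read from e.g. english.mrp.
sample = ['{"id": "example-1", "input": "The dog barked.", "targets": ["eds", "amr"]}']
skeletons = list(expand_targets(sample))
for graph in skeletons:
    print(json.dumps(graph))
```

Because each sentence can be processed in isolation, a generator of this kind can stream through the file without loading it into memory as a whole.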

System Ranking

The primary evaluation metric for the task will be cross-framework MRP F1 scores.  Participating parsers will be evaluated separately in the two tracks of the shared task, i.e. (a) the cross-framework English track, and (b) the multi-lingual track.  In each track, submissions will be ranked by average F1 across all evaluation data and target frameworks.  For broader comparison, additional, per-framework scores will be published, both in the MRP metric and in applicable framework-specific metrics.  Albeit not the primary goal of the task, ‘partial’ submissions are possible: participants are welcome to submit parser outputs for only one of the two tracks, or to provide parser outputs for only a sub-set of the target frameworks in each track.  The training and evaluation setup in MRP 2020 differs from previous single-framework tasks (e.g. at the Semantic Evaluation Exercises between 2014 and 2018; SemEval); thus, single-framework submissions can help make connections to previously published results.
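
Ranking by average F1 can be illustrated with a small sketch.  It assumes an unweighted mean over per-framework F1 scores within one track; the official scorer may aggregate differently, and how frameworks missing from a partial submission enter the average is not specified here.  The system names and scores below are invented placeholders, not results.

```python
from statistics import mean

def rank_by_average_f1(scores):
    """Rank systems by the unweighted mean of their per-framework F1 scores.

    Assumption: 'scores' maps a system name to a dict of framework -> F1,
    and every system has a score for every framework in the track.
    """
    averaged = {
        system: mean(per_framework.values())
        for system, per_framework in scores.items()
    }
    # Highest average first.
    return sorted(averaged.items(), key=lambda pair: pair[1], reverse=True)

# Invented toy numbers for two hypothetical systems.
toy_scores = {
    "system-a": {"amr": 0.70, "eds": 0.90},
    "system-b": {"amr": 0.80, "eds": 0.85},
}
ranking = rank_by_average_f1(toy_scores)
for system, average in ranking:
    print(f"{system}: {average:.3f}")
```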

Evaluation Period

The evaluation period of the task will run from Monday, July 20, to Monday, August 3, 2020, 12:00 noon in Central Europe (CEST).  At the start of the evaluation period, the evaluation data will again be distributed through the Linguistic Data Consortium (LDC), as a new archive available for download by registered participants who have previously obtained the MRP training data from the LDC.

Participants will be expected to prepare their submission by processing all parser inputs using the same general parsing system.  All parser outputs have to be serialized in the MRP common interchange format, as multiple, separate graphs for each input string that calls for predicting multiple target frameworks.  Participants must agree to putting their submitted parser outputs into the public domain, such that all submissions can be made available for general download after completion of the evaluation period.
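
Before submitting, it can be useful to check which requested graphs are still missing from a system's output.  The following sketch assumes, again, one JSON object per line, and further assumes that inputs and outputs can be matched on an id property and that each output graph names its framework in a framework property; the helper name missing_outputs and the sample data are hypothetical.  Since partial submissions are permitted, a non-empty result is informative rather than an error.

```python
import json

def missing_outputs(input_lines, output_lines):
    """Return the (id, framework) pairs requested in the inputs
    for which no output graph was produced."""
    expected = set()
    for line in input_lines:
        if line.strip():
            item = json.loads(line)
            expected.update((item["id"], target) for target in item.get("targets", []))
    produced = set()
    for line in output_lines:
        if line.strip():
            graph = json.loads(line)
            produced.add((graph["id"], graph["framework"]))
    return expected - produced

# Hypothetical data: two frameworks requested, only one graph produced.
inputs = ['{"id": "example-1", "input": "The dog barked.", "targets": ["eds", "amr"]}']
outputs = ['{"id": "example-1", "framework": "eds", "tops": [], "nodes": [], "edges": []}']
missing = missing_outputs(inputs, outputs)
print(sorted(missing))
```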

