Registration of Intent

To make your interest in the MRP 2019 task known and to receive updates on data and software, please self-subscribe to the mailing list for (moderated) MRP announcements.  The mailing list archives are available publicly.  To obtain the training data for the task, please (a) make sure your team is subscribed to the above mailing list and (b) fill in and return to the LDC the no-cost license agreement for the task.  A more formal registration of participating teams will be required in early July, as the evaluation period nears (please see the task schedule and below).

System Development

The task operates as what is at times called a closed track.  Beyond the training and ‘companion’ data provided by the co-organizers, participants are restricted in which additional data and pre-trained models are legitimate to use in system development.  These constraints are imposed to improve comparability of results and overall fairness: beyond resources explicitly ‘white-listed’ for the task, no additional data or other knowledge sources may be used in system development, training, or tuning.

Evaluation Data

Parsers will be evaluated on unseen, held-out data for which the gold-standard target graphs will not be available to participants before the end of the evaluation period (please see below).  For some of the parser inputs used in evaluation, target annotations are available in multiple frameworks; a shared subset of 100 sentences has been annotated with gold-standard target graphs in all five frameworks.

Text Type     mixed     mixed     mixed     mixed     mixed
Sentences     3,359     3,359     3,359     1,131     1,998
Tokens       64,853    64,853    64,853    21,647    39,520

The MRP evaluation data will be distributed in the same file format as the training graphs, but without the nodes, edges, and tops values (essentially presenting empty graphs, which participants are expected to fill in).  Thus, the input property on each evaluation ‘graph’ provides the string to be parsed, and an additional top-level property targets indicates which output framework(s) to predict.  The evaluation data will be packaged as a single file input.mrp (containing a total of 6,288 strings to be parsed), but in principle each sentence can be processed in isolation.  For each parser input and each of its targets values, participating systems are expected to output one complete semantic graph in MRP format (for a total of 13,206 predicted graphs in a complete submission).  The MRP evaluation data will be bundled with some of the same ‘companion’ resources as the training data, viz. state-of-the-art morpho-syntactic dependency parses (as a separate file udpipe.mrp).  Unlike for the training data, however, companion AMR ‘alignments’ (i.e. partial anchorings, in MRP terms) cannot be provided for the evaluation data, seeing as these would presume knowledge of the gold-standard AMR graphs.
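As a rough illustration of how the targets property drives the number of graphs to predict, the following Python sketch expands each parser input into one work item per target framework.  It assumes that input.mrp holds one JSON object per line and that each object carries an id property; the function name expected_outputs, the sample sentences, and the framework identifiers in the sample are made up for illustration.

```python
import json


def expected_outputs(lines):
    """Yield one (id, framework, input string) triple per graph to predict."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        graph = json.loads(line)
        # One output graph is expected for each value of 'targets'.
        for framework in graph["targets"]:
            # 'id' is an assumed property name; 'input' and 'targets'
            # are the properties described in the text above.
            yield graph.get("id"), framework, graph["input"]


# Two made-up parser inputs, for illustration only.
sample = [
    '{"id": "1", "input": "The dog barked.", "targets": ["dm", "amr"]}',
    '{"id": "2", "input": "It rained.", "targets": ["ucca"]}',
]

todo = list(expected_outputs(sample))
for identifier, framework, text in todo:
    print(identifier, framework, text)
```

Because each work item is independent, a list like todo could be split across processes, as long as all resulting graphs end up in one output file.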

Evaluation Period

The evaluation period of the task will run from Monday, July 8, to Thursday, July 25, 2019, 12:00 noon in Central Europe (CEST).  At the start of the evaluation period, the data will again be distributed through the Linguistic Data Consortium (LDC), as a new archive available for download by registered participants who have previously obtained the MRP training data from the LDC.  The LDC expects to enable download of the evaluation data starting at 10:00 in the morning on the US East Coast (EST) on July 8, 2019.

Participants will be expected to prepare their submission by processing all parser inputs using the same general parsing system.  All parser outputs have to be serialized in the MRP common interchange format, as multiple, separate graphs for each input string that calls for predicting multiple target frameworks.  Team registration and submissions will be hosted on the CodaLab service, where basic validation will be applied to each submission using the mtool validator.  Access to CodaLab will require at least one team member to self-register for the task (called a ‘competition’ in CodaLab terms), but it should be possible for multiple CodaLab users to jointly form a team.  Participants must agree to putting their submitted parser outputs into the public domain, such that all submissions can be made available for general download after completion of the evaluation period.

System Submission

To make a submission, participants need to obtain the evaluation data package via the LDC (see above) and process parser inputs (from the MRP file input.mrp) according to the instructions in the README.txt file included with the data.  While it is possible to process individual parser inputs separately (for example to parallelize parsing), all parser outputs to be submitted must be concatenated into a single MRP file (for example output.mrp) and compressed into a ZIP archive prior to uploading the submission to the CodaLab site.  On a Un*x system, for example, an archive file suitable for upload to CodaLab can be created as follows:

  $ zip output.zip output.mrp
    adding: output.mrp (deflated 88%)
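
For teams that parse inputs in parallel, the concatenation and compression steps can also be scripted.  The following is a minimal Python sketch, not part of the official instructions: the function name package_submission, the part file names, and the archive name output.zip are all assumptions, and it presumes each part file holds one graph per line.

```python
import os
import zipfile


def package_submission(part_paths, mrp_path="output.mrp", zip_path="output.zip"):
    """Concatenate per-part MRP files into one file and compress it as ZIP."""
    with open(mrp_path, "w", encoding="utf-8") as out:
        for part in part_paths:
            with open(part, encoding="utf-8") as stream:
                for line in stream:
                    # Skip blank lines and make sure every graph
                    # ends in exactly one newline.
                    if line.strip():
                        out.write(line.rstrip("\n") + "\n")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file under its base name, without directory components.
        archive.write(mrp_path, arcname=os.path.basename(mrp_path))
    return zip_path
```

A call such as package_submission(["part1.mrp", "part2.mrp"]) would write output.mrp and output.zip to the current directory.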

Validation of parser outputs in MRP serialization is supported in mtool (the Swiss Army Knife of Meaning Representation), and it is strongly recommended that participants validate their graphs prior to submission to CodaLab.
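
Independently of mtool, a quick local sanity check can catch gross problems such as malformed lines or duplicate graphs before running the full validator.  The Python sketch below is not the mtool validator and does not replace it; it assumes one JSON object per line and assumes that each graph is identified by id and framework properties.  The function name check_outputs is hypothetical.

```python
import json

# Assumed property names identifying a graph; not taken from the text above.
REQUIRED = ("id", "framework")


def check_outputs(lines):
    """Return (line number, problem) pairs; an empty list means none found."""
    problems = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            graph = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, "invalid JSON: " + error.msg))
            continue
        if not isinstance(graph, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [key for key in REQUIRED if key not in graph]
        if missing:
            problems.append((number, "missing " + ", ".join(missing)))
            continue
        # Only one graph is expected per parser input and target framework.
        key = (graph["id"], graph["framework"])
        if key in seen:
            problems.append((number, "duplicate graph for %s/%s" % key))
        seen.add(key)
    return problems
```

Applied to the lines of output.mrp, an empty result only means that these basic checks passed; structural validation of the graphs themselves remains the job of mtool.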

It is possible to make multiple submissions throughout the evaluation period.  For each team, only the most recent submission made on or before the closing date of the evaluation period (Thursday, July 25, 2019, 12:00 noon in Central Europe, CEST) will be considered for scoring; evaluation of multiple, different ‘runs’ (or system configurations) will not be possible during the official evaluation period.

System Ranking

The primary evaluation metric for the task will be cross-framework MRP F1 scores.  Participating parsers will be ranked based on average F1 across all evaluation data and target frameworks.  For broader comparison, additional, per-framework scores will be published, both in the MRP and applicable framework-specific metrics.  Albeit not the primary goal of the task, ‘partial’ submissions are possible, in the sense of not providing parser outputs for all target frameworks.  The training and evaluation setup in MRP 2019 differs from previous tasks for all frameworks involved; thus, single-framework submissions can help make connections to previously published results.
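
To make the notion of an averaged F1 concrete, the following Python sketch computes F1 from precision and recall and takes an unweighted mean over frameworks.  This is an illustration under assumptions, not the official scorer: the text above does not say whether the average is weighted or how frameworks missing from a partial submission are treated, and the framework labels and numbers below are made up.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_f1(per_framework):
    """Unweighted mean of per-framework F1 (an assumed averaging scheme).

    per_framework maps a framework label to a (precision, recall) pair.
    """
    scores = [f1(p, r) for p, r in per_framework.values()]
    return sum(scores) / len(scores)


# Made-up precision and recall values for two hypothetical frameworks.
example = {"A": (0.8, 0.8), "B": (0.5, 1.0)}
print(round(average_f1(example), 4))
```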

Publication of Results

Last updated: 2019-07-24 (15:07)