Publication of Results
The task received submissions from eight teams, six of which participate in the official ranking (one submission was made after the closing deadline, and another provides a framework-specific point of reference contributed by one of the co-organizers). An overview of participating teams, their approaches to meaning representation parsing, detailed result statistics, and in-depth system descriptions for each submission are presented in the MRP 2020 proceedings volume, which is published through the ACL Anthology.
Thursday, November 19, 2020 (Chairs: Omri Abend and Nianwen Xue; initially plenary on Zoom, later individually in Gather.Town)
10:45–11:15 UTC | MRP 2020: The Second Shared Task on Cross-Framework and Cross-Lingual Meaning Representation Parsing | Stephan Oepen, Omri Abend, Lasha Abzianidze, Johan Bos, Jan Hajič, Daniel Hershcovich, Bin Li, Tim O’Gorman, Nianwen Xue, Daniel Zeman
11:15–11:20 UTC | DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs | Lasha Abzianidze, Johan Bos, Stephan Oepen
11:20–11:25 UTC | FGD at MRP 2020: Prague Tectogrammatical Graphs | Daniel Zeman, Jan Hajič
11:25–11:28 UTC | Hitachi at MRP 2020: Text-to-Graph-Notation Transducer | Hiroaki Ozaki, Gaku Morio, Yuta Koreeda, Terufumi Morishita, Toshinori Miyoshi
11:28–11:31 UTC | ÚFAL at MRP 2020: Permutation-invariant Semantic Parsing in PERIN | David Samuel, Milan Straka
11:31–11:34 UTC | HIT-SCIR at MRP 2020: Transition-based Parser and Iterative Inference Parser | Longxu Dou, Yunlong Feng, Yuqiu Ji, Wanxiang Che, Ting Liu
11:34–11:37 UTC | HUJI-KU at MRP 2020: Two Transition-based Neural Parsers | Ofir Arviv, Ruixiang Cui, Daniel Hershcovich
11:37–11:40 UTC | JBNU at MRP 2020: AMR Parsing Using a Joint State Model for Graph-Sequence Iterative Inference | Seung-Hoon Na, Jinwoo Min
11:40–12:10 UTC | Final Discussion: Beyond MRP 2020 | (Everyone)
12:15–12:45 UTC | Poster-Like Session on Gather.Town | (Everyone)
Evaluation Data
Parsers will be evaluated on unseen, held-out data for which the gold-standard target graphs will not be available to participants before the end of the evaluation period (please see below). For some of the English parser inputs used in evaluation, target annotations will be available in multiple frameworks, yielding a shared sub-set of inputs with parallel annotations in all frameworks.
The MRP evaluation data will be distributed in the same file format as the training graphs, but without the nodes, edges, and tops values (essentially presenting empty graphs, which participants are expected to fill in). Thus, the input property on each evaluation ‘graph’ provides the string to be parsed, and an additional top-level property targets indicates which output framework(s) to predict. The evaluation data will be packaged as a single file per language, e.g. eng.mrp (for the cross-framework track), but in principle each sentence can be processed in isolation. For each parser input and each of its targets values, participating systems are invited to output one complete semantic graph in MRP format.
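For concreteness, a minimal Python sketch along the following lines (file name eng.mrp as above; field names as in the MRP serialization of the training graphs) could be used to read the evaluation inputs and group them by requested target framework:

import json
from collections import defaultdict

# Collect the evaluation 'graphs' (empty except for metadata) per target framework.
inputs_by_framework = defaultdict(list)
with open("eng.mrp", encoding="utf-8") as stream:
    for line in stream:
        graph = json.loads(line)
        for framework in graph.get("targets", []):
            # graph["input"] is the raw string to be parsed;
            # graph["id"] identifies the item in the submission.
            inputs_by_framework[framework].append((graph["id"], graph["input"]))

for framework, items in inputs_by_framework.items():
    print(framework, len(items), "parser inputs")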
The MRP evaluation data will be bundled with some of the same ‘companion’ resources as the training data, viz. state-of-the-art morpho-syntactic dependency parses (as a separate file udpipe.mrp). Unlike for the training data, however, companion AMR and DRG reference anchoring (or ‘alignments’, in AMR terms) cannot be provided for the evaluation data, seeing as these would presume knowledge of the gold-standard graphs.
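Assuming that the companion graphs carry the same identifiers as the corresponding parser inputs (as they do for the training data), pairing the two files could look roughly as follows; this is an informal sketch, not part of the official tooling:

import json

def load_mrp(path):
    """Read an MRP (JSON Lines) file into a dictionary keyed by graph identifier."""
    graphs = {}
    with open(path, encoding="utf-8") as stream:
        for line in stream:
            graph = json.loads(line)
            graphs[graph["id"]] = graph
    return graphs

inputs = load_mrp("input.mrp")      # empty evaluation 'graphs'
companion = load_mrp("udpipe.mrp")  # morpho-syntactic dependency parses

for identifier, graph in inputs.items():
    analysis = companion.get(identifier)
    if analysis is not None:
        # the companion nodes provide tokenization, lemmas, parts of speech, and
        # dependency edges for the string in graph["input"]
        print(identifier, len(analysis.get("nodes", [])), "companion nodes")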
System Ranking
The primary evaluation metric for the task will be cross-framework MRP F1 scores. Participating parsers will be evaluated separately in the two tracks of the shared task, i.e. (a) the cross-framework English track and (b) the cross-lingual track. In each track, submissions will be ranked by average F1 across all evaluation data and target frameworks. For broader comparison, additional per-framework scores will be published, both in the MRP and applicable framework-specific metrics. Albeit not the primary goal of the task, ‘partial’ submissions are possible: participants are welcome to submit parser outputs for only one of the two tracks, or to provide parser outputs for only a sub-set of the target frameworks in each track. The training and evaluation setup in MRP 2020 differs from previous single-framework tasks (e.g. at the Semantic Evaluation Exercises between 2014 and 2018; SemEval); thus, single-framework submissions can help make connections to previously published results.
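As a simplified illustration of the ranking criterion (the per-framework values below are made up, and the official scores are of course computed by the task infrastructure rather than by this snippet), the track-level score for a submission is the average of its per-framework MRP F1 values:

# Hypothetical per-framework MRP F1 scores for one cross-framework submission.
per_framework_f1 = {"eds": 0.91, "ptg": 0.88, "ucca": 0.76, "amr": 0.80, "drg": 0.85}

# Submissions in a track are ranked by the average F1 over its target frameworks.
average_f1 = sum(per_framework_f1.values()) / len(per_framework_f1)
print(f"cross-framework average F1: {average_f1:.4f}")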
Evaluation Period
The evaluation period of the task will run from Monday, July 27, to Monday, August 10, 2020, 12:00 noon in Central Europe (CEST). At the start of the evaluation period, the data will be distributed, again, through the Linguistic Data Consortium (LDC), as a new archive available for download by registered participants who have previously obtained the MRP training data from the LDC. The LDC expects to enable download of the evaluation data starting at 12:00 noon US Eastern time on July 27, 2020.
Participants will be expected to prepare their submission by processing all parser inputs using the same general parsing system. All parser outputs have to be serialized in the MRP common interchange format, as multiple, separate graphs for each input string that calls for predicting multiple target frameworks. Submissions to the cross-framework or the cross-lingual track (or both) can be made in the same file, seeing as graph identifiers (for each framework) are unique across the two tracks. Team registration and submissions will be hosted on the CodaLab service, where basic validation will be applied to each submission using the mtool validator. Access to CodaLab will require at least one team member to self-register for the task (called a ‘competition’ in CodaLab terms), but it should be possible for multiple CodaLab users to jointly form a team. Participants must agree to putting their submitted parser outputs into the public domain, such that all submissions can be made available for general download after completion of the evaluation period.
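Because graph identifiers are unique across the two tracks, combining per-track output files into one submission file amounts to simple concatenation; a minimal sketch (the file names are just examples, matching those used in the submission instructions below):

# Concatenate per-track parser outputs into a single submission file;
# graph identifiers are unique across the cross-framework and cross-lingual tracks.
with open("output.mrp", "w", encoding="utf-8") as combined:
    for part in ("cf.mrp", "cl.mrp"):
        with open(part, encoding="utf-8") as stream:
            for line in stream:
                if line.strip():
                    combined.write(line.rstrip("\n") + "\n")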
Registration of Intent
To make your interest in the MRP 2020 task known and to receive updates on data and software, please self-subscribe to the mailing list for (moderated) MRP announcements. The mailing list archives are available publicly. To obtain the training data for the task, please (a) make sure your team is subscribed to the above mailing list and (b) fill in and return to the LDC the no-cost license agreement for the task. A more formal registration of participating teams will be required in early July, as the evaluation period nears (please see the task schedule and below).
System Development
The task operates as what is at times called a closed track. Beyond the training and ‘companion’ data provided by the co-organizers, participants are restricted in which additional data and pre-trained models are legitimate to use in system development. These constraints are imposed to improve comparability of results and overall fairness: beyond resources explicitly ‘white-listed’ for the task, no additional data or other knowledge sources may be used in system development, training, or tuning.
The initial white-list of additional resources was inherited from the 2019 predecessor shared task. Prospective participants were encouraged to nominate additional data or pre-trained models for white-listing (by emailing nominations to the task co-organizers as early as possible). The closing date for the 2020 white-list was Monday, June 15, 2020; no additional nominations can be considered after that date.
System Submission
To make a submission, participants need to obtain the evaluation data package via the LDC (see above; available on or after Monday, July 27, 2020) and process the parser inputs (from the MRP file input.mrp) according to the instructions in the README.txt file included with the data.
While it is possible to process individual parser inputs separately (for example to parallelize parsing), all parser outputs to be submitted must be concatenated into a single MRP file (for example cf.mrp or cl.mrp, for the cross-framework or cross-lingual track, respectively, or simply output.mrp) and compressed into a ZIP archive prior to uploading the submission to the CodaLab site. On a Un*x system, for example, an archive file submission.zip suitable for upload to CodaLab can be created as follows:
$ zip submission.zip output.mrp
  adding: output.mrp (deflated 88%)
Validation of parser outputs in MRP serialization is supported in mtool (the Swiss Army Knife of Meaning Representation), and it is strongly recommended that participants validate their graphs prior to submission to CodaLab.
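As a rough pre-check (merely a stand-in for, not a replacement of, the mtool validation; field names as in the MRP serialization of the training graphs), something along the following lines can catch gross serialization errors before uploading:

import json

def rough_check(path):
    """Flag lines that are not valid JSON or lack core MRP fields (a coarse pre-check only)."""
    problems = []
    with open(path, encoding="utf-8") as stream:
        for number, line in enumerate(stream, start=1):
            try:
                graph = json.loads(line)
            except json.JSONDecodeError as error:
                problems.append(f"line {number}: invalid JSON ({error})")
                continue
            for field in ("id", "framework", "nodes", "edges"):
                if field not in graph:
                    problems.append(f"line {number}: missing '{field}' field")
    return problems

for problem in rough_check("output.mrp"):
    print(problem)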
It is possible to make multiple submissions throughout the evaluation period. For each team, only the most recent submission made before the closing date (Monday, August 10, 2020, 12:00 noon in summer-time Central Europe, CEST) will be considered for scoring; evaluation of multiple, different ‘runs’ (or system configurations) will not be possible during the official evaluation period.