This Page Pertains to the (Now Closed) 2019 Edition of the MRP Shared Task

Publication of Results

The task received submissions from eighteen teams, of which two involved one of the co-organizers, and another three do not participate in the official ranking because they arrived after the closing deadline or made use of extra training data (beyond the list of white-listed resources for the task). An overview of participating teams, their approaches to meaning representation parsing, detailed result statistics, and in-depth system descriptions for each of the submissions are presented in the MRP 2019 proceedings volume, which is published through the (Anthology of the) Association for Computational Linguistics.

Sunday, November 3, 2019
14:00–14:30	MRP 2019: Cross-Framework Meaning Representation Parsing Stephan Oepen, Omri Abend, Jan Hajic, Daniel Hershcovich, Marco Kuhlmann, Tim O’Gorman, Nianwen Xue, Jayeol Chun, Milan Straka and Zdenka Uresova
14:30–14:33	TUPA at MRP 2019: A Multi-Task Baseline System Daniel Hershcovich and Ofir Arviv
14:33–14:36	The ERG at MRP 2019: Radically Compositional Semantic Dependencies Stephan Oepen and Dan Flickinger
~~14:36–14:39~~	~~SJTU-NICT at MRP 2019: Multi-Task Learning for End-to-End Uniform Semantic Graph Parsing~~ ~~Zuchao Li, Hai Zhao, Zhuosheng Zhang, Rui Wang, Masao Utiyama and Eiichiro Sumita~~
14:39–14:42	ShanghaiTech at MRP 2019: Sequence-to-Graph Transduction with Second-Order Edge Inference for Cross-Framework Meaning Representation Parsing ~~Xinyu Wang, Yixian Liu, Zixia Jia, Chengyue Jiang and Kewei Tu~~
14:42–14:45	Saarland at MRP 2019: Compositional parsing across all graphbanks Lucia Donatelli, Meaghan Fowlie, Jonas Groschwitz, Alexander Koller, Matthias Lindemann, Mario Mina and Pia Weißenhorn
14:45–14:48	HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding Wanxiang Che, Longxu Dou, Yang Xu, Yuxuan Wang, Yijia Liu and Ting Liu
~~14:48–14:51~~	SJTU at MRP 2019: A Transition-Based Multi-Task Parser for Cross-Framework Meaning Representation Parsing ~~Hongxiao Bai and Hai Zhao~~
14:51–14:54	JBNU at MRP 2019: Multi-level Biaffine Attention for Semantic Dependency Parsing Seung-Hoon Na, Jinwoon Min, Kwanghyeon Park, Jong-Hun Shin and Young-Kil Kim
14:54–14:57	CUHK at MRP 2019: Transition-Based Parser with Cross-Framework Variable-Arity Resolve Action Sunny Lai, Chun Hei Lo, Kwong Sak Leung and Yee Leung
14:57–15:00	Hitachi at MRP 2019: Unified Encoder-to-Biaffine Network for Cross-Framework Meaning Representation Parsing Yuta Koreeda, Gaku Morio, Terufumi Morishita, Hiroaki Ozaki and Kohsuke Yanai
15:00–15:03	ÚFAL MRPipe at MRP 2019: UDPipe Goes Semantic in the Meaning Representation Parsing Shared Task Milan Straka and Jana Straková
15:03–15:06	Amazon at MRP 2019: Parsing Meaning Representations with Lexical and Phrasal Anchoring ~~Jie Cao, Yi Zhang, Adel Youssef and Vivek Srikumar~~
15:06–15:09	SUDA-Alibaba at MRP 2019: Graph-Based Models with BERT Yue Zhang, Wei Jiang, Qingrong Xia, Junjie Cao, Rui Wang, Zhenghua Li and Min Zhang
15:09–15:12	ÚFAL-Oslo at MRP 2019: Garage Sale Semantic Parsing Kira Droganova, Andrey Kutuzov, Nikita Mediankin and Daniel Zeman
15:12–15:15	Peking at MRP 2019: Factorization- and Composition-Based Parsing for Elementary Dependency Structures Yufei Chen, Yajie Ye and Weiwei Sun
15:15–15:30	Final Discussion: Towards MRP 2020 (Everyone)
16:30–18:00	Poster Session (All Participating Teams)

Evaluation Data

Parsers will be evaluated on unseen, held-out data for which the gold-standard target graphs will not be available to participants before the end of the evaluation period (please see below). For some of the parser inputs used in evaluation, target annotations are available in multiple frameworks; a shared sub-set of 100 sentences has been annotated with gold-standard target graphs in all five frameworks.

	DM	PSD	EDS	UCCA	AMR
Text Type	mixed	mixed	mixed	mixed	mixed
Sentences	3,359	3,359	3,359	1,131	1,998
Tokens	64,853	64,853	64,853	21,647	39,520

The MRP evaluation data will be distributed in the same file format as the training graphs, but without the nodes, edges, and tops values (essentially presenting empty graphs, which participants are expected to fill in). Thus, the input property on each evaluation ‘graph’ provides the string to be parsed, and an additional top-level property targets indicates which output framework(s) to predict. The evaluation data will be packaged as a single file input.mrp (containing a total of 6,288 strings to be parsed), but in principle each sentence can be processed in isolation. For each parser input and each of its targets values, participating systems are expected to output one complete semantic graph in MRP format (for a total of 13,206 predicted graphs in a complete submission). The MRP evaluation data will be bundled with some of the same ‘companion’ resources as the training data, viz. state-of-the-art morpho-syntactic dependency parses (as a separate file udpipe.mrp). Unlike for the training data, however, companion AMR ‘alignments’ (i.e. partial anchorings, in MRP terms) cannot be provided for the evaluation data, seeing as these would presume knowledge of the gold-standard AMR graphs.
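
As a rough illustration of the input layout described above, the following Python sketch expands each evaluation ‘graph’ into one empty output graph per requested target. It assumes one JSON object per line and the property names id and framework, neither of which is spelled out on this page; the README.txt in the data package remains the authority on the actual format. The function name expand_targets and the sample record are hypothetical. The last lines merely check that the per-framework sentence counts in the table add up to the stated number of predicted graphs.

```python
import json

def expand_targets(lines):
    """Yield one empty output graph per (parser input, target framework) pair."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        graph = json.loads(line)  # assumption: one JSON object per line
        for framework in graph.get("targets", []):
            # Copy everything except the list of targets, then add placeholders
            # for the values that the parser is expected to fill in.
            output = {key: value for key, value in graph.items() if key != "targets"}
            output["framework"] = framework  # assumed property name
            output["tops"] = []
            output["nodes"] = []
            output["edges"] = []
            yield output

# Hypothetical record, for illustration only.
sample = ['{"id": "example-1", "input": "The dog barked.", "targets": ["dm", "amr"]}']
skeletons = list(expand_targets(sample))
for skeleton in skeletons:
    print(json.dumps(skeleton))

# Consistency check against the table above: sentences per framework.
sentences = {"DM": 3359, "PSD": 3359, "EDS": 3359, "UCCA": 1131, "AMR": 1998}
total_graphs = sum(sentences.values())
print(total_graphs)
```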

System Ranking

The primary evaluation metric for the task will be cross-framework MRP F₁ scores. Participating parsers will be ranked based on average F₁ across all evaluation data and target frameworks. For broader comparison, additional, per-framework scores will be published, both in the MRP and applicable framework-specific metrics. Albeit not the primary goal of the task, ‘partial’ submissions are possible, in the sense of not providing parser outputs for all target frameworks. The training and evaluation setup in MRP 2019 differs from previous tasks for all frameworks involved; thus, single-framework submissions can help make connections to previously published results.
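
The ranking arithmetic can be sketched as follows. This is an illustration only: it assumes a plain unweighted mean of per-framework F₁ scores and counts a framework without parser outputs as zero, neither of which is specified on this page; the official scores are those computed by the task scorer. The function names and all numbers are made up.

```python
FRAMEWORKS = ("dm", "psd", "eds", "ucca", "amr")

def f1(precision, recall):
    """Harmonic mean of precision and recall (zero if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def average_f1(scores):
    """Unweighted mean F1 over all frameworks.

    scores maps a framework name to a (precision, recall) pair; frameworks
    missing from a 'partial' submission contribute zero (an assumption).
    """
    total = sum(f1(*scores[fw]) if fw in scores else 0.0 for fw in FRAMEWORKS)
    return total / len(FRAMEWORKS)

# Made-up numbers: a complete submission and a single-framework one.
complete = {fw: (0.8, 0.8) for fw in FRAMEWORKS}
partial = {"dm": (0.9, 0.9)}
print(round(average_f1(complete), 4), round(average_f1(partial), 4))
```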

System Development

The task operates as what is at times called a closed track. Beyond the training and ‘companion’ data provided by the co-organizers, participants are restricted in which additional data and pre-trained models are legitimate to use in system development. These constraints are imposed to improve comparability of results and overall fairness: beyond resources explicitly ‘white-listed’ for the task, no additional data or other knowledge sources may be used in system development, training, or tuning.

Evaluation Period

The evaluation period of the task will run from Monday, July 8, to Thursday, July 25, 2019, 12:00 noon in Central Europe (CEST). At the start of the evaluation period, the data will be distributed, again, through the Linguistic Data Consortium (LDC), as a new archive available for download by registered participants who have previously obtained the MRP training data from the LDC. The LDC expects to enable download of the evaluation data starting at 10:00 in the morning at the US East Coast (EDT) on July 8, 2019.
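
For participants in other time zones, the two points in time can be converted with a few lines of Python. The sketch assumes fixed offsets of UTC+2 for Central European Summer Time and UTC−4 for the US East Coast (where daylight saving time is in effect in July); the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets, assumed valid for July 2019.
CEST = timezone(timedelta(hours=2), "CEST")
US_EAST = timezone(timedelta(hours=-4), "EDT")

release = datetime(2019, 7, 8, 10, 0, tzinfo=US_EAST)   # expected start of download
deadline = datetime(2019, 7, 25, 12, 0, tzinfo=CEST)    # close of evaluation period

release_utc = release.astimezone(timezone.utc)
deadline_utc = deadline.astimezone(timezone.utc)
window = deadline - release  # time available between release and deadline

print(release_utc.isoformat())
print(deadline_utc.isoformat())
print(window)
```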

Participants will be expected to prepare their submission by processing all parser inputs using the same general parsing system. All parser outputs have to be serialized in the MRP common interchange format, as multiple, separate graphs for each input string that calls for predicting multiple target frameworks. Team registration and submissions will be hosted on the CodaLab service, where basic validation will be applied to each submission using the mtool validator. Access to CodaLab will require at least one team member to self-register for the task (called a ‘competition’ in CodaLab terms), but it should be possible for multiple CodaLab users to jointly form a team. Participants must agree to putting their submitted parser outputs into the public domain, such that all submissions can be made available for general download after completion of the evaluation period.

System Submission

To make a submission, participants need to obtain the evaluation data package via the LDC (see above) and process parser inputs (from the MRP file input.mrp) according to the instructions in the README.txt file included with the data. While it is possible to process individual parser inputs separately (for example to parallelize parsing), all parser outputs to be submitted must be concatenated into a single MRP file (for example output.mrp) and compressed into a ZIP archive prior to uploading the submission to the CodaLab site. On a Un*x system, for example, an archive file submission.zip suitable for upload to CodaLab can be created as follows:

  $ zip submission.zip output.mrp 
    adding: output.mrp (deflated 88%)
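
Where inputs have been parsed in parallel, the concatenation and compression steps can also be scripted. The following Python sketch uses only the standard library; it assumes that each partial file holds one graph per line, and the function name build_submission and the part file names are hypothetical. The demonstration at the end works in a temporary directory with placeholder content.

```python
import tempfile
import zipfile
from pathlib import Path

def build_submission(part_files, output="output.mrp", archive="submission.zip"):
    """Concatenate partial MRP files into one file and compress it as a ZIP archive."""
    with open(output, "w", encoding="utf-8") as out:
        for part in part_files:
            text = Path(part).read_text(encoding="utf-8")
            out.write(text)
            if text and not text.endswith("\n"):
                out.write("\n")  # keep graphs from adjacent files on separate lines
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store the file under its bare name, without directory components.
        zf.write(output, arcname=Path(output).name)
    return archive

# Demonstration with placeholder content (not valid MRP graphs).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "part1.mrp").write_text('{"id": "a"}\n', encoding="utf-8")
    (tmp / "part2.mrp").write_text('{"id": "b"}', encoding="utf-8")
    archive = build_submission(
        [tmp / "part1.mrp", tmp / "part2.mrp"],
        output=tmp / "output.mrp",
        archive=tmp / "submission.zip",
    )
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        content = zf.read("output.mrp").decode("utf-8")
```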

Validation of parser outputs in MRP serialization is supported in mtool (the Swiss Army Knife of Meaning Representation), and it is strongly recommended that participants validate their graphs prior to submission to CodaLab.
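
Independently of mtool, and not as a substitute for its validator, a quick completeness check can catch missing graphs before upload. The Python sketch below compares the requested targets in the input against the graphs present in the output; it assumes one JSON object per line and the property names id and framework, which are not spelled out on this page. The function name missing_graphs and the sample records are hypothetical.

```python
import json

def missing_graphs(input_lines, output_lines):
    """Return the (id, framework) pairs requested in the input but absent from the output."""
    expected = set()
    for line in input_lines:
        if line.strip():
            graph = json.loads(line)
            for target in graph.get("targets", []):
                expected.add((graph["id"], target))
    produced = set()
    for line in output_lines:
        if line.strip():
            graph = json.loads(line)  # raises ValueError on malformed JSON
            produced.add((graph["id"], graph["framework"]))
    return sorted(expected - produced)

# Hypothetical records: the AMR graph for the only input is missing.
inputs = ['{"id": "example-1", "input": "The dog barked.", "targets": ["dm", "amr"]}']
outputs = ['{"id": "example-1", "framework": "dm", "tops": [], "nodes": [], "edges": []}']
missing = missing_graphs(inputs, outputs)
print(missing)
```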

It is possible to make multiple submissions throughout the evaluation period. For each team, only the most recent submission made by the close of the evaluation period (Thursday, July 25, 2019, 12:00 noon in Central Europe, CEST) will be considered for scoring; evaluation of multiple, different ‘runs’ (or system configurations) will not be possible during the official evaluation period.

Registration of Intent

To make your interest in the MRP 2019 task known and to receive updates on data and software, please self-subscribe to the mailing list for (moderated) MRP announcements. The mailing list archives are available publicly. To obtain the training data for the task, please (a) make sure your team is subscribed to the above mailing list and (b) fill in and return to the LDC the no-cost license agreement for the task. A more formal registration of participating teams will be required in early July, as the evaluation period nears (please see the task schedule and below).