Syntaxfest Sofia 2021

The SyntaxFest brings together four events in syntactic research

The individual conference sites:

Quasy website

Depling 21 website

TLT 2021 website

UDW 21 website


SyntaxFest Homepage

SyntaxFest 2021 in Sofia was held ONLINE from 21 March to 24 March 2022.

4 events for 1 Fest of Empirical Syntax

The second edition of the SyntaxFest again brought together four events with partially overlapping research topics including empirical syntax, linguistic annotation, statistical language analysis, and Natural Language Processing: Quasy, Depling, TLT, and UDW.

IMPORTANT

SyntaxFest took place on Gather, a great platform for which we gladly pay a (small) fee per day per participant; the organisers are happy to offer free registrations within the limits of the budget to anyone wishing to attend who cannot (for some reason) get funded by their institution. In that case please register as Guest.


Program

All talks and the poster session will take place on Gather. The link will be shared with registered participants shortly.

21 March 2022

Time (UTC) Session
11:40 - 12:00 SyntaxFest welcome
12:00 - 13:00 SyntaxFest/TLT invited talk
  Treebanking and Parsing for Irish
  Jennifer Foster
  Teresa Lynn
  Session chair: Kilian Evang
13:00 - 16:00 Gather practice and social

22 March 2022

Time (UTC) Session
09:50 - 10:00 TLT Opening
  Kilian Evang
10:00 - 11:40 TLT talks
  Talks are 15 minutes + 5 minutes for discussion
  Session chair: Daniel Dakota
  The RigVeda goes “universal”: annotation and analysis of equative constructions in Vedic and beyond
  Erica Biagetti
  [paper]
  Annotation guidelines of UD and SUD treebanks for spoken corpora: A proposal
  Sylvain Kahane, Bernard Caron, Kim Gerdes and Emmett Strickland
  [paper]
  How Universal is Genre in Universal Dependencies?
  Max Müller-Eberstein, Rob van der Goot and Barbara Plank
  [paper]
  Discourse Tree Structure and Dependency Distance in EFL Writing
  Jingting Yuan, Qiuhan Lin and John S. Y. Lee
  [paper]
  Asia Minor Greek in Contact (AMGiC): Towards a dialectal Treebank comprising contact-induced grammatical changes
  Konstantinos Sampanis and Prokopis Prokopidis
  [paper]
11:40 - 12:00 Coffee break
12:00 - 13:00 TLT invited talk
  Widely Interpretable Semantic Representation: Frameless Meaning Representation for Broader Applicability
  Jinho Choi
  Session chair: Kilian Evang
13:00 - 15:00 Joint SyntaxFest poster session
  The following TLT papers will be presented as posters. All posters will be on display on both 22 and 23 March.
  We encourage authors to present them on both days.
  A morph-based and a word-based treebank for Beja
  Sylvain Kahane, Martine Vanhove, Rayan Ziane and Bruno Guillaume
  [paper]
  Is Old French tougher to parse?
  Loïc Grobol, Sophie Prévost and Benoit Crabbé
  [paper]
  Typological Approach to Improve Dependency Parsing for Croatian Language
  Diego Fernando Válio Antunes Alves, Božo Bekavac and Marko Tadić
  [paper]
  Parsing with Pretrained Language Models, Multiple Datasets, and Dataset Embeddings
  Rob van der Goot and Miryam de Lhoneux
  [paper]
  Towards Building a Modern Written Tamil Treebank
  Parameswari Krishnamurthy and Kengatharaiyer Sarveswaran
  [paper]
  The following Depling papers will be presented as posters. All posters will be on display on both 22 and 23 March.
  We encourage authors to present them on both days.
  Number agreement, dependency length, and word order in Finnish traditional dialects
  Kaius Sinnemäki and Akira Takaki
  [paper]
  A Dependency Treebank for Classical Arabic Poetry
  Sharefah Alghamdi, Hend Alkhalifa and Abdulmalik Al-Salman
  [paper]
  On auxiliary verb in Universal Dependencies: untangling the issue and proposing a systematized annotation strategy
  Magali Duran, Adriana Pagano, Amanda Rassi and Thiago Pardo
  [paper]
  Drawing the syntactic space: choices in diagrammatic reasoning
  Nicolas Mazziotta
  [paper]
  BINGO: A Dependency Grammar Framework to Understand Hardware Specifications Written in English
  Rahul Krishnamurthy and Michael Hsiao
  [paper]
  The following Quasy papers will be presented as posters. All posters will be on display on both 22 and 23 March.
  We encourage authors to present them on both days.
  The Linear Arrangement Library. A new tool for research on syntactic dependency structures
  Lluís Alemany-Puig, Juan Luis Esteban and Ramon Ferrer-i-Cancho
  [paper]
  Attributivity and Subjectivity in Contemporary Written Czech
  Miroslav Kubát, Radek Čech and Xinying Chen
  [paper]
  The Menzerath-Altmann law in syntactic structure revisited
  Ján Mačutek, Radek Čech and Marine Courtin
  [paper]
  The following UDW papers will be presented as posters. All posters will be on display on both 22 and 23 March.
  We encourage authors to present them on both days.
  Universal Dependencies for Old Turkish
  Mehmet Oguz Derin
  [paper]
  Word Delimitation Issues in UD Japanese
  Mai Omura, Aya Wakasa and Masayuki Asahara
  [paper]
  UDWiki: guiding the creation of new UD treebanks
  Maarten Janssen
  [paper]
  A Universal Dependencies corpus for Ligurian
  Stefano Lusito and Jean Maillard
  [paper]
  Towards Universal Dependencies for Bribri
  Rolando Coto-Solano, Sofía Flores-Solórzano and Sharid Loáiciga
  [paper]
  For the Purpose of Curry: A UD Treebank for Ashokan Prakrit
  Adam Farris and Aryaman Arora
  [paper]
  Numerals and what counts
  Jack Rueter, Niko Partanen and Flammie Pirinen
  [paper]
  Date and Time in Universal Dependencies
  Daniel Zeman
  [paper]
15:00 - 18:20 Depling talks
15:00 - 15:10 Introduction
15:10 - 16:00 Depling Invited Talk
  An Information-Theoretic Perspective on Dependency Trees
  Richard Futrell
  Session chair: TBD
16:00 - 16:20 Coffee break
16:20 - 16:40 A monarchy without subjects: on Brassai’s (almost) subject-free dependency grammar
  András Imrényi
  [paper]
16:40 - 17:00 Mutual dependency and Word Grammar: headedness in the noun phrase
  Nikolas Gisborne
  [paper]
17:00 - 17:20 Is one head enough? Mention heads in coreference annotations compared with UD-style heads
  Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský and Daniel Zeman
  [paper]
17:20 - 17:40 Starting a new treebank? Go SUD!
  Kim Gerdes, Bruno Guillaume, Sylvain Kahane and Guy Perrier
  [paper]
17:40 - 18:00 Enhanced Universal Dependencies and semantic interpretation
  Dag T. T. Haug and Jamie Y. Findlay
  [paper]
18:00 - 18:20 Causation (and Some Other) Paraphrasing Patterns in L1 English. A Case Study
  Jasmina Milićević
  [paper]

23 March 2022

Time (UTC) Session
09:50 - 13:00 Quasy talks
09:50 - 10:00 Quasy Opening
10:00 - 11:40 Talks are 15 minutes + 5 minutes for discussion.
  Session chair: Xinying Chen
  Dependency distance minimization predicts compression
  Ramon Ferrer-i-Cancho and Carlos Gómez-Rodríguez
  [paper]
  Corpus-based language universals analysis using Universal Dependencies
  Hee-Soo Choi, Bruno Guillaume and Karën Fort
  [paper]
  Successes and failures of Menzerath’s law at the syntactic level
  Aleksandrs Berdicevskis
  [paper]
  The properties of rare and complex syntactic constructions in English. A corpus-based comparative study
  Ruochen Niu, Yaqin Wang and Haitao Liu
  [paper]
  A Quantitative Approach towards German Experiencer-Object Verbs
  Johanna M. Poppek, Simon Masloch, Amelie Robrecht and Tibor Kiss
  [paper]
11:40 - 12:00 Coffee break
12:00 - 13:00 Quasy invited talk
  Quantitative studies on tree bank collections: Complexity, universals, and typological signature
  Sylvain Kahane
  Session chair: Xinying Chen
13:00 - 15:00 Joint SyntaxFest poster session
  All posters will be on display on both 22 and 23 March. We encourage authors to present them on both days. See above for all posters.
15:00 - 18:00 UDW talks
15:00 - 15:45 Talks session 1
  Session chair: Miryam de Lhoneux
  UD on Software Requirements: Application and Challenges (15 min)
  Pierre André Ménard, Naïma Hassert and Edith Galy
  [paper]
  Bootstrapping Role and Reference Grammar Treebanks via Universal Dependencies (15 min)
  Kilian Evang, Tatiana Bladier, Laura Kallmeyer and Simon Petitjean
  [paper]
  Validation of Universal Dependencies by regeneration (15 min)
  Guy Lapalme
  [paper]
15:45 - 16:00 Coffee break
16:00 - 16:45 Talks session 2
  Session chair: Reut Tsarfaty
  Towards a consistent annotation of nominal person in Universal Dependencies (10 min)
  Georg Höhn
  [paper]
  Minor changes make a difference: a case study on the consistency of UD-based dependency parsers (10 min)
  Dmytro Kalpakchi and Johan Boye
  [paper]
  Formae reformandae: for a reorganisation of verb form annotation in Universal Dependencies illustrated by the specific case of Latin (10 min)
  Flavio Massimiliano Cecchini
  [paper]
  Mischievous nominal constructions in Universal Dependencies (15 min)
  Nathan Schneider and Amir Zeldes
  [paper]
16:45 - 17:00 Coffee break
17:00 - 18:00 UDW invited talk
  Incorporating Compositionality and Morphology into End-to-End Model
  Emily Pitler
  Session chair: Reut Tsarfaty

24 March 2022

Time (UTC) Session
09:00 - 10:00 Joint Discussion (Towards UD guidelines v.3, the role of SUD, the role of the document genre and other domain properties, etc.)
10:00 - 11:00 SyntaxFest business meeting
11:00 - 12:00 Business meetings per workshop: Quasy/Depling/TLT/UDW

Invited talks

Widely Interpretable Semantic Representation: Frameless Meaning Representation for Broader Applicability
Jinho Choi

In this talk, I will present a new semantic representation, WISeR, that overcomes challenges for Abstract Meaning Representation (AMR). Despite its strengths, AMR is not easily applied to languages or domains without predefined semantic frames, and its use of numbered arguments results in semantic role labels not directly interpretable and semantically overloaded for parsers. We examine the numbered arguments of predicates in AMR and convert them into thematic roles which do not require reference to semantic frames. We create a new corpus of 1K English dialogue sentences annotated in both WISeR and AMR. WISeR shows stronger inter-annotator agreement for beginner and experienced annotators, with beginners becoming proficient in WISeR annotation more quickly. Finally, we train a state-of-the-art parser on the AMR 3.0 corpus and a WISeR corpus converted from AMR 3.0. The parser is evaluated on these corpora and our dialogue corpus. The WISeR model exhibits higher accuracy than its AMR counterpart across the board, demonstrating that WISeR is easier for parsers to learn.

Treebanking and Parsing for Irish
Jennifer Foster, Teresa Lynn

In this talk I will discuss recent work on treebanking and parsing for the Irish language, carried out by researchers in the Natural Language Processing group in the School of Computing in DCU. The first part of the talk will be devoted to TwittIrish, a treebank of tweets annotated according to the Universal Dependencies (UD) guidelines. I will discuss phenomena associated with this particular language/genre pair which can make the annotation process challenging, including the way in which the language found in Irish tweets differs from standard Irish, and the effect of English. I will present and analyse the results of parsing TwittIrish using state-of-the-art neural dependency parsers trained on the Irish Universal Dependencies treebank. This will lead on to the second part of the talk in which an Irish BERT language model, gaBERT, will be presented. Design decisions taken in training gaBERT will be presented, and the model will be compared to multilingual BERT on the task of UD dependency parsing and using a manual cloze-test evaluation.

An Information-Theoretic Perspective on Dependency Trees
Richard Futrell

I give an overview of some recent work taking a corpus-based, information-theoretic view on problems of dependency grammar. First, I argue for a connection between syntactic dependencies and the information-theoretic notion of mutual information, a measure of how strongly two words constrain each other, which allows us to quantify the “strength” of the link between a dependent and its head. Next, I present theoretical motivations and empirical evidence for information locality: a generalization of dependency length minimization which holds that words are under a pressure to be close to each other in word order in proportion to their mutual information. Finally I present evidence that crosslinguistic word orders reflect optimization for recoverability of dependency relations from strings of words.

Quantitative studies on tree bank collections: Complexity, universals, and typological signature
Sylvain Kahane

Thanks to the Universal Dependencies database, we now have collections of treebanks annotated according to common guidelines. Such collections allow us to verify properties supposedly common to all languages of the world, but also to contrast the functioning of different languages. In this talk, we will address three points. First, we will focus on syntactic complexity by showing that the study of dependency flux (the set of concomitant dependencies in every inter-word position) allows us to give an alternative interpretation to dependency distance minimization (Liu 2008) and constraints on self-embedding. The asymmetry in potential flux between head-final and head-initial configurations could explain why head-initial languages are rarer and more constrained than head-final languages. (Collaboration with Chunxiao Yan.) We will then see how treebanks allow us to extract quantified grammatical information and to approach typological studies from a new angle, which we call typometrics. We will focus on different properties of word order and their visualization by scatterplots. The quantitative approach allows us to verify and extend the categorical universals of language, but also to propose new kinds of universals. (Collaboration with Kim Gerdes and Xinying Chen.) We will conclude by mentioning a new research project that aims at extracting from treebanks the main constructions of each language, thus giving a typological signature of the language. The goal is to extract specific properties of each language, unlike previous works where the same property is studied on all languages and might not be relevant for some languages.

Incorporating Compositionality and Morphology into End-to-End Model
Emily Pitler

Many neural end-to-end systems today do not rely on syntactic parse trees, as much of the information that parse trees provide is encoded in the parameters of pretrained models. Lessons learned from parsing technologies and from taking a multilingual perspective, however, are still relevant even for end-to-end models. This talk will describe work that relies on compositionality in semantic parsing and in reading comprehension requiring numerical reasoning. We’ll then describe a released dataset that requires advances in multilingual modeling, and some approaches designed to better model morphology than off-the-shelf subword models that make some progress on these challenges.

List of accepted papers

Quasy

Lluís Alemany-Puig, Juan Luis Esteban and Ramon Ferrer-i-Cancho
The Linear Arrangement Library. A new tool for research on syntactic dependency structures

Aleksandrs Berdicevskis
Successes and failures of Menzerath’s law at the syntactic level

Hee-Soo Choi, Bruno Guillaume and Karën Fort
Corpus-based language universals analysis using Universal Dependencies

Ramon Ferrer-i-Cancho and Carlos Gómez-Rodríguez
Dependency distance minimization predicts compression

Miroslav Kubát, Radek Čech and Xinying Chen
Attributivity and Subjectivity in Contemporary Written Czech

Ján Mačutek, Radek Čech and Marine Courtin
The Menzerath-Altmann law in syntactic structure revisited

Ruochen Niu, Yaqin Wang and Haitao Liu
The properties of rare and complex syntactic constructions in English. A corpus-based comparative study

Johanna M. Poppek, Simon Masloch and Tibor Kiss
A Quantitative Approach towards German Experiencer-Object Verbs

Depling

Sharefah Alghamdi, Hend Alkhalifa and Abdulmalik Al-Salman
A Dependency Treebank for Classical Arabic Poetry

Magali Duran, Adriana Pagano, Amanda Rassi and Thiago Pardo
On auxiliary verb in Universal Dependencies: untangling the issue and proposing a systematized annotation strategy

Jamie Y. Findlay and Dag T. T. Haug
Enhanced Universal Dependencies and semantic interpretation

Kim Gerdes, Bruno Guillaume, Sylvain Kahane and Guy Perrier
Starting a new treebank? Go SUD!

Nikolas Gisborne
Loops—or mutual dependency and Word Grammar: headedness in the noun phrase

András Imrényi
A monarchy without subjects: on Brassai’s (almost) subject-free dependency grammar

Rahul Krishnamurthy and Michael Hsiao
BINGO: A Dependency Grammar Framework to Understand Hardware Specifications Written in English

Nicolas Mazziotta
Drawing the syntactic space: choices in diagrammatic reasoning

Jasmina Milićević
Causation (and Some Other) Paraphrasing Patterns in L1 English. A Case Study

Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský and Daniel Zeman
Is one head enough? Mention heads in coreference annotations compared with UD-style heads

Kaius Sinnemäki and Akira Takaki
Number agreement, dependency length, and word order in Finnish traditional dialects

TLT

Diego Fernando Válio Antunes Alves, Božo Bekavac and Marko Tadić
Typological Approach to Improve Dependency Parsing for Croatian Language

Erica Biagetti
The RigVeda goes “universal”: annotation and analysis of equative constructions in Vedic and beyond

Loïc Grobol, Sophie Prévost and Benoit Crabbé
Is Old French tougher to parse?

Sylvain Kahane, Bernard Caron, Emmett Strickland and Kim Gerdes
Annotation guidelines of UD and SUD treebanks for spoken corpora

Sylvain Kahane, Martine Vanhove and Rayan Ziane
A morpheme-based treebank for Beja

Parameswari Krishnamurthy and Kengatharaiyer Sarveswaran
Towards Building a Modern Written Tamil Treebank

Max Müller-Eberstein, Rob van der Goot and Barbara Plank
How Universal is Genre in Universal Dependencies?

Konstantinos Sampanis and Prokopis Prokopidis
Asia Minor Greek in Contact (AMGiC): A dialectal Treebank comprising contact-induced grammatical changes

Rob van der Goot and Miryam de Lhoneux
Dataset Embeddings for Polyglot Language Model-based Parsers

Jingting Yuan, Qiuhan Lin and John S. Y. Lee
Discourse Complexity Measures for EFL Writing

UDW

Flavio Massimiliano Cecchini
Formae reformandae: for a reorganisation of verb form annotation in Universal Dependencies illustrated by the specific case of Latin

Rolando Coto-Solano, Sharid Loáiciga and Sofía Flores-Solórzano
Towards Universal Dependencies for Bribri

Kilian Evang, Tatiana Bladier, Laura Kallmeyer and Simon Petitjean
Bootstrapping Role and Reference Grammar Treebanks via Universal Dependencies

Adam Farris and Aryaman Arora
For the Purpose of Curry: A UD Treebank for Ashokan Prakrit

Naïma Hassert, Pierre André Ménard and Edith Galy
UD on Software Requirements: Application and Challenges

Georg F.K. Höhn
Towards a consistent annotation of nominal person in Universal Dependencies

Maarten Janssen
UDWiki: guiding the creation of new UD treebanks

Dmytro Kalpakchi and Johan Boye
Minor changes make a difference: a case study on the consistency of UD-based dependency parsers

Guy Lapalme
Validation of Universal Dependencies by regeneration

Stefano Lusito and Jean Maillard
A Universal Dependencies corpus for Ligurian

Mehmet Oguz Derin and Takahiro Harada
Universal Dependencies for Old Turkish

Mai Omura, Aya Wakasa and Masayuki Asahara
Word Delimitation Issues in UD Japanese

Jack Rueter, Niko Partanen and Flammie Pirinen
Numerals and what counts

Nathan Schneider and Amir Zeldes
Mischievous nominal constructions in Universal Dependencies

Daniel Zeman
Date and Time in Universal Dependencies

Registration

SyntaxFest Registration Link is OPEN.

For any questions related to payments, please write to: syntaxfest@acl-bg.org

Please note that, due to the pandemic, SyntaxFest is being held fully ONLINE.

For this reason, ONLY the online fee of 50 EUR, which covers the whole event, has to be paid upon registration.

For the cancellation policy that applies to other participants, please see the information in the registration form.

Please also note that the payment system is external to the registration, so you will receive the confirmation document later, possibly after a few days.

Modality

The second edition will be held online on Gather Town, with Zoom as a backup only.

Proceedings

Although SyntaxFest will be held in March 2022, the proceedings will be published well ahead of time in the ACL Anthology. Preliminary versions of all four volumes can already be downloaded here:

Important dates

Attendees are encouraged but not obliged to participate in the whole SyntaxFest.

Paper submission information

Submission page

Papers must be submitted in PDF format exclusively through the SyntaxFest joint submission page.

Paper length

We invite two types of submissions: long papers and short papers.

Style guidelines

All submissions should follow the common SyntaxFest 2021 stylesheet (based on the one-column COLING 2020 style guidelines). Stylesheets are provided as a LaTeX style file and Microsoft Word templates (templates might be subject to slight modifications for compatibility reasons). The files can be downloaded from the Depling site; see the section “Style guidelines”.

Double-blind reviews

Reviewing of papers will be double-blind. Therefore, the paper must not include the authors’ names and affiliations. Furthermore, self-references that reveal the author’s identity, e.g., “We previously showed (Zeng, 2018) …”, must be avoided. Instead, use citations such as “Zeng (2018) previously showed …”. Papers that do not conform to these requirements will be rejected without review.

Shared reviewing process

On the submission site, authors submit their paper only once for the whole SyntaxFest, composed of 4 conferences, but they can uncheck conferences they do not wish their paper to be considered for. If the paper is deemed appropriate for more than one of the selected conferences, the SyntaxFest joint organization committee decides on the final placement of the paper, which implies the day of the presentation and the proceedings the paper will appear in.

Chairs

The chairs for each event are:

Quasy

Depling

TLT

UDW

Program committee

The list of our PC members is available on the Depling site; see the section “SyntaxFest 2021 Program committee”.

Local organizing committee

Venue

The event will be held ONLINE but with the flavour of Sofia University “St. Kl. Ohridski”. Photo by Petya Osenova.
