Paper Review Process

For the review process at NAACL-HLT 2021, we largely adopted the best practices of other recent virtual conferences, while modifying the format somewhat to treat all papers equally, as described in more detail below. Our paper review process followed the recent trend of a hierarchical organization, with senior area chairs (SACs) who organized coherent research tracks and area chairs (ACs) who shepherded smaller batches of papers within each track.

NAACL-HLT 2021 received 1797 submissions, a record for our conference! We accepted 477 papers, including 350 long and 127 short, for an overall acceptance rate of 26%. The acceptance rate for long papers was higher than for short papers (28% vs. 23%), although this gap was smaller than in other recent conferences, at least in part due to minor but explicit rebalancing done by the PC chairs. The following graphs show how these numbers break down in comparison to previous years.

[Figures: Acceptance rates; Number of long papers submitted and accepted; Number of short papers submitted and accepted]

From the accepted papers, and based on nominations from SACs and review by the best paper committee, we selected the best papers in the long and short paper categories, as well as a small number of outstanding papers in each category. The best paper selection process is described here. NAACL-HLT 2021 will also feature 18 papers that were published in Transactions of the Association for Computational Linguistics (TACL) and 4 papers from the journal Computational Linguistics (CL).

When designing the call for papers, we made an effort to balance the size of the tracks. Recent NLP conferences have had many tracks that received well over 200 submissions, making them mini-conferences of their own. To make the senior area chairs’ jobs more manageable, we split the machine learning track by areas (into “Classification and Structured Prediction Models” and “Language Modeling and Sequence to Sequence Models”) and separated Machine Translation from Multilingual. We also wrote a brief guide to authors, including descriptions of areas, to help them decide which track was most appropriate for their work. Overall, we felt that this did make the organization more manageable and that other tracks, including NLP applications, could possibly benefit from further splitting in future conferences. Otherwise, we followed recent traditions in track selection, including keeping some of the smaller, more recent additions (e.g., Green NLP and Ethics). The following graph shows the paper submission and acceptance numbers per track.

[Figure: Papers by track]

We also had a special theme for the conference, which we called “New Challenges in NLP: Tasks, Methods, Positions.” This theme was selected to recognize that we have made significant progress in NLP over the last five years, and that the community could benefit from thinking about the new problems and upcoming challenges we should focus on next. Despite the general applicability of the unsupervised pre-training/fine-tuning paradigm, many problems are still very challenging for current models. At the same time, given the recent progress, there are likely broad new classes of problems that can now be studied for the first time. Hence, the special theme targeted papers focusing on two questions: “What tasks or capabilities should we focus on next?” and “What new classes of models should we be investigating?” We envisioned this theme including, but not limited to, (1) empirical and dataset papers that propose new challenges that bring us closer to human-level language understanding and generation, and (2) position papers framing an important direction or highlighting an understudied research problem.

We recruited reviewers through a centralized process, designed to minimize workload for senior area chairs (SACs) without sacrificing review quality. We collected a list of likely qualified reviewers based on the reviewer and author pool of other recent NLP conferences. These candidates were invited to sign up to review, and were required to fill out a profile that allowed us to better assess their potential area fit and experience levels. We were fortunate to have more volunteers than we needed, and were able to bias the final selection towards more senior reviewers, although many junior and first-time reviewers remained in the pool. The final program committee included 54 SACs, 267 ACs, and 1941 committee members. We greatly appreciate the incredible amount of work they all did, and also thank all of the volunteers who were not selected in the end.

In hindsight, there was one part of this year's review process that we would likely not repeat. We coordinated the review process so that authors could see their reviews and withdraw their paper to resubmit to ACL 2021 if they were not happy with the quality. Hundreds of papers were withdrawn, and anecdotal evidence suggests that many of them would have been accepted if they had been left in the pool. We hope that initiatives like the ACL Rolling Review will provide more robust solutions to these types of problems in the future, and otherwise suspect overlapping review periods will just become the norm as conferences continue to grow.

Finally, for reference, here is a full table with all the per-track paper acceptance statistics.

Area	Long Paper Submissions	Long Papers Accepted	Short Paper Submissions	Short Papers Accepted	All Paper Submissions	All Papers Accepted
Computational Social Science and Social Media	44	11	17	3	61	14
Dialogue and Interactive Systems	90	30	27	5	117	35
Discourse and Pragmatics	19	7	14	3	33	10
Ethics, Bias, and Fairness	15	4	19	6	34	10
Green NLP	8	5	4	3	12	8
Language Generation	73	20	18	1	91	21
Information Extraction	95	15	27	1	122	16
Information Retrieval and Text Mining	38	7	22	7	60	14
Interpretability and Analysis of Models for NLP	70	16	40	12	110	28
Language Grounding to Vision, Robotics and Beyond	45	13	17	8	62	21
Language Resources and Evaluation	53	26	20	2	73	28
Linguistic Theories, Cognitive Modeling and Psycholinguistics	13	3	6	2	19	5
Machine Learning for NLP: Classification and Structured Prediction Models	84	21	41	8	125	29
Machine Learning for NLP: Language Modeling and Sequence to Sequence Models	58	11	27	8	85	19
Machine Translation	60	19	35	9	95	28
Multilinguality	34	10	18	4	52	14
NLP Applications	109	35	53	9	162	44
Phonology, Morphology and Word Segmentation	10	5	6	2	16	7
Question Answering	68	20	31	7	99	27
Semantics: Lexical Semantics	29	9	10	1	39	10
Semantics: Sentence-level Semantics and Textual Inference	66	20	24	8	90	28
Sentiment Analysis and Stylistic Analysis	56	15	16	2	72	17
Speech	10	3	11	2	21	5
Summarization	53	22	22	5	75	27
Syntax: Tagging, Chunking, and Parsing	20	7	9	2	29	9
Special Theme	34	12	10	3	44	15
Total	1254	366	544	123	1798	489
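
For readers who want per-track acceptance rates rather than raw counts, the table can be converted with a few lines of Python. This is only a minimal sketch, not part of the original analysis: the names `ROWS` and `acceptance_rates` are illustrative, and just a handful of rows are copied from the table to keep the example short.

```python
# A few rows copied from the table above (tab-separated):
# area, long submitted, long accepted, short submitted, short accepted.
# ROWS and acceptance_rates are illustrative names, not from the original post.
ROWS = """\
Green NLP\t8\t5\t4\t3
Information Extraction\t95\t15\t27\t1
NLP Applications\t109\t35\t53\t9
Total\t1254\t366\t544\t123
"""


def acceptance_rates(line):
    """Return (area, rates) where rates are accepted / submitted fractions."""
    area, *counts = line.split("\t")
    long_sub, long_acc, short_sub, short_acc = map(int, counts)
    return area, {
        "long": long_acc / long_sub,
        "short": short_acc / short_sub,
        "all": (long_acc + short_acc) / (long_sub + short_sub),
    }


rates = dict(acceptance_rates(line) for line in ROWS.splitlines())

for area, r in rates.items():
    print(f"{area}: long {r['long']:.1%}, short {r['short']:.1%}, all {r['all']:.1%}")
```

Note that the Total row of the table reports slightly different counts from those quoted earlier in this post (for example, 489 accepted papers versus 477), so rates computed from the table may differ by a point or so from the quoted percentages.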


Program Chairs
