Resources are grouped into four categories: corpora, tools, events and other.

Corpora

The most popular annotated and unannotated corpora created or used by researchers in the field of automatic grammatical error correction.

  1. The NUS Corpus of Learner English — A corpus of about 1,400 student essays on a wide range of topics, such as environmental pollution and healthcare, annotated by professional English instructors with error tags and corrections across 28 error categories. Annotated: yes. Size: about 1,400 essays.
  2. The WikEd error corpus ver. 1.0 — A publicly available large corpus of corrective edits extracted from English Wikipedia. Annotated: no information. Size: about 55M sentences.

Tools

Useful tools that have been used to develop various grammatical error correction systems, for example NLP tools, machine learning toolkits, and evaluation scripts.

  1. Moses — A statistical machine translation system that allows you to automatically train translation models for any language pair. Check the fscorer branch for scripts designed for automatic grammatical error correction.