OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification

Date
2018-01-01
Authors
Vajjala, Sowmya
Lucic, Ivana
Authors
Person
Vajjala, Sowmya
Graduate Student
Organizational Units
Organizational Unit
English

The Department of English seeks to provide all university students with the skills of effective communication and critical thinking, and to impart knowledge of literature, creative writing, linguistics, speech, and technical communication to students within and outside the department.

History
The Department of English and Speech was formed in 1939 from the merger of the Department of English and the Department of Public Speaking. In 1971 its name changed to the Department of English.

Dates of Existence
1939-present

Historical Names

  • Department of English and Speech (1939-1971)

Department
English
Abstract

This paper describes the collection and compilation of the OneStopEnglish corpus of texts written at three reading levels, and demonstrates its usefulness through two applications: automatic readability assessment and automatic text simplification. The corpus consists of 189 texts, each in three versions (567 in total). The corpus is now freely available under a CC BY-SA 4.0 license, and we hope that it will foster further research on readability assessment and text simplification.

Comments

This proceeding is published as Vajjala, Sowmya, and Ivana Lucic. "OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification." In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications (2018): 297-304.

Copyright
2018