Graphic-prompt tasks for assessment of academic English writing ability: An argument-based approach to investigating validity

Date
2018-01-01
Authors
Choi, Yundeok
Major Professor
Carol A. Chapelle
Department
English
Abstract

The graphic-prompt writing task, in which one or more graphs are provided as source material, is a type of integrated writing task that assesses test takers’ ability to incorporate information from the source(s) into their writing. Compared with reading-to-write and reading-listening-to-write tasks, the graphic-prompt writing task has been used and researched to a limited extent, even though it holds promise for facilitating multimodal literacy (Jewitt, 2005, 2008) and avoiding construct underrepresentation (Messick, 1989) in L2 academic writing assessment (Lim, 2009). The present dissertation study therefore aims to validate score interpretations on the graphic-prompt writing task by taking an argument-based approach, which provides systematic guidance on conducting research to collect validity evidence (Kane, 2001, 2006, 2013).

To examine the score interpretations, computer-mediated graphic-prompt writing tasks and a five-point analytic rating scale were developed for English Placement Test purposes under the Evidence-Centered Design framework (e.g., Mislevy & Haertel, 2006), and an interpretation/use argument (Kane, 2013) for the test was constructed. Various types of evidence were sought, through a mixed-methods research design, to justify three inferences relevant to the test score interpretations: evaluation, generalization, and explanation. A range of data (graphic-prompt writing test scores, test essays, stimulated recall protocols, standardized English writing test scores, and responses to Graph Familiarity and Test Mode Preference questionnaires) was collected from 101 ESL students studying at a large Midwestern university in the U.S.

The three inferences were generally well supported by evidence from the quantitative analyses, conducted using descriptive statistics, Multi-Faceted Rasch Measurement, Generalizability Theory, disattenuated correlation, and multiple regression, and from a qualitative analysis of the stimulated recalls and the discourse features of the test essays. The evaluation inference was upheld by findings that the raters showed neither central tendency nor halo effects in their ratings, that the rating scale for the graphic-prompt writing test met five of the six criteria for a quality rating scale, and that scores on the graphic-prompt writing test were widely spread across test takers’ different levels of graphic-prompt writing ability. The generalization inference was supported by results showing that variance in test takers’ graphic-prompt writing ability (the object of measurement) contributed more to composite and analytic score variances than did the sources of error variance, that test score dependability for the three-rater, two-task test design was ≥ .7 for the composite score and for three analytic rating criteria (Graph Description, Organization, and Grammar/Vocabulary), and that the numbers of tasks and raters required for Φ ≥ .7 varied depending on the score reporting method (composite versus analytic) and the analytic rating criterion, with a design of two tasks and two raters reporting composite scores appearing the most desirable. The explanation inference was backed by findings that the writing processes elicited by the graphic-prompt writing test and most discourse features of the test essays differed across test takers’ levels of graphic-prompt writing ability, that the construct of the graphic-prompt writing test had a moderately strong positive correlation with the construct of standardized English writing tests, and that only test takers’ academic writing ability (a construct-relevant factor) contributed significantly to graphic-prompt writing test scores, while their graph familiarity and test mode preference (construct-irrelevant factors) did not.
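For reference, the dependability index Φ cited above is the standard Generalizability Theory coefficient, and the disattenuated correlation is the standard correction for attenuation; the sketch below assumes a fully crossed persons × tasks × raters (p × t × r) design consistent with the multi-task, multi-rater design described, though it may not match the exact specification used in the study.

\[
\Phi = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\Delta},
\qquad
\sigma^2_\Delta = \frac{\sigma^2_t}{n_t} + \frac{\sigma^2_r}{n_r} + \frac{\sigma^2_{pt}}{n_t} + \frac{\sigma^2_{pr}}{n_r} + \frac{\sigma^2_{tr}}{n_t n_r} + \frac{\sigma^2_{ptr,e}}{n_t n_r}
\]

Here \(\sigma^2_p\) is the variance attributable to test takers (the object of measurement), and \(n_t\) and \(n_r\) are the numbers of tasks and raters in a given test design. The disattenuated correlation corrects an observed correlation \(r_{xy}\) between two tests for unreliability in both measures:

\[
r_{\text{disattenuated}} = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}
\]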

The present study has implications for the construct underlying the graphic-prompt writing test, the dependability and validity of its score interpretations, test administration designs for the graphic-prompt writing test, and further validation of the test.

Copyright
August 1, 2018