TECO: An Eye-Tracking Corpus of Japanese L2 English Learners’ Text Reading

Shingo Nahatame*, Tomoko Ogiso, Yukino Kimura, Yuji Ushiro

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Eye-tracking corpora are invaluable resources for understanding human text processing. In recent years, some corpora have been developed that incorporate second-language (L2) English readers’ eye-movement recordings. However, these corpora face limitations such as small data sizes, the absence of multiple and longer text sources, and a scarcity of data from learners whose first languages are linguistically distinct from English. Addressing these gaps, this study introduces the Tsukuba Eye-tracking Corpus (TECO), a dataset of eye-tracking records from Japanese L2 English learners engaged in text reading. TECO encompasses eye-tracking data for over 410,000 tokens, collected from 41 Japanese students who each read 30 English passages ranging in length from 300 to 400 words. In this article, we detail the design of TECO and report on the reliability of commonly used eye-tracking measures (e.g., skipping, first fixation duration, and regression) along with their descriptive statistics and distribution. We also validate the corpus by illustrating the impact of several lexical and reader factors (e.g., word length and reading proficiency) on some eye-tracking measures. TECO will serve as a valuable resource for researchers who are keen on exploring the cognitive processes involved in L2 reading. The corpus is freely accessible at the Open Science Framework [https://osf.io/wrvj3/], and we are committed to its continuous expansion by adding participants from diverse backgrounds and incorporating more detailed text information.
Original language: English
Journal: Research Methods in Applied Linguistics
Volume: 3
Issue number: 100123
State: Published - 2024/06/07
