TY - JOUR
T1 - TECO: An Eye-Tracking Corpus of Japanese L2 English Learners’ Text Reading
AU - Nahatame, Shingo
AU - Ogiso, Tomoko
AU - Kimura, Yukino
AU - Ushiro, Yuji
PY - 2024/6/7
Y1 - 2024/6/7
AB - Eye-tracking corpora are invaluable resources for understanding human text processing. In recent years, some corpora have been developed that incorporate second-language (L2) English readers’ eye-movement recordings. However, these face limitations such as small data sizes, the absence of multiple and longer text sources, and a scarcity of data from learners whose first languages are linguistically distinct from English. Addressing these gaps, this study introduces Tsukuba Eye-tracking Corpus (TECO), a dataset of eye-tracking records from Japanese L2 English learners engaged in text reading. TECO encompasses eye-tracking data for over 410,000 tokens, collected from 41 Japanese students who each read 30 English passages ranging in length from 300–400 words. In this article, we detail the design of TECO and report on the reliability of commonly used eye-tracking measures (e.g., skipping, first fixation duration, and regression) along with their descriptive statistics and distribution. We also validate the corpus by illustrating the impact of several lexical and reader factors (e.g., word length and reading proficiency) on some eye-tracking measures. TECO will serve as a valuable resource for researchers who are keen on exploring the cognitive processes involved in L2 reading. The corpus is freely accessible at the Open Science Framework [https://osf.io/wrvj3/], and we are committed to its continuous expansion by adding participants from diverse backgrounds and incorporating more detailed text information.
KW - second language
KW - reading
KW - eye tracking
KW - corpus
U2 - 10.1016/j.rmal.2024.100123
DO - 10.1016/j.rmal.2024.100123
M3 - Academic article
SN - 2772-7661
VL - 3
JO - Research Methods in Applied Linguistics
JF - Research Methods in Applied Linguistics
IS - 100123
ER -