TECO: An Eye-Tracking Corpus of Japanese L2 English Learners’ Text Reading

Shingo Nahatame; Tomoko Ogiso; Yukino Kimura; Yuji Ushiro

doi:10.1016/j.rmal.2024.100123

Shingo Nahatame^*, Tomoko Ogiso, Yukino Kimura, Yuji Ushiro

^*Corresponding author for this work

Joint Institute of Teacher Education

Research output: Contribution to journal › Article › peer-review

Abstract

Eye-tracking corpora are invaluable resources for understanding human text processing. In recent years, some corpora have been developed that incorporate second-language (L2) English readers’ eye-movement recordings. However, these face limitations such as small data sizes, the absence of multiple and longer text sources, and a scarcity of data from learners whose first languages are linguistically distinct from English. Addressing these gaps, this study introduces the Tsukuba Eye-tracking Corpus (TECO), a dataset of eye-tracking records from Japanese L2 English learners engaged in text reading. TECO encompasses eye-tracking data for over 410,000 tokens, collected from 41 Japanese students who each read 30 English passages ranging in length from 300 to 400 words. In this article, we detail the design of TECO and report on the reliability of commonly used eye-tracking measures (e.g., skipping, first fixation duration, and regression) along with their descriptive statistics and distribution. We also validate the corpus by illustrating the impact of several lexical and reader factors (e.g., word length and reading proficiency) on some eye-tracking measures. TECO will serve as a valuable resource for researchers who are keen on exploring the cognitive processes involved in L2 reading. The corpus is freely accessible at the Open Science Framework (https://osf.io/wrvj3/), and we are committed to its continuous expansion by adding participants from diverse backgrounds and incorporating more detailed text information.

Original language	English
Journal	Research Methods in Applied Linguistics
Volume	3
Article number	100123
DOIs	https://doi.org/10.1016/j.rmal.2024.100123
State	Published - 2024/06/07

Cite this

@article{b3ed1af5baab4fcd9485ee64a359a734,

title = "TECO: An Eye-Tracking Corpus of Japanese L2 English Learners{\textquoteright} Text Reading",

abstract = "Eye-tracking corpora are invaluable resources for understanding human text processing. In recent years, some corpora have been developed that incorporate second-language (L2) English readers{\textquoteright} eye-movement recordings. However, these face limitations such as small data sizes, the absence of multiple and longer text sources, and a scarcity of data from learners whose first languages are linguistically distinct from English. Addressing these gaps, this study introduces Tsukuba Eye-tracking Corpus (TECO), a dataset of eye-tracking records from Japanese L2 English learners engaged in text reading. TECO encompasses eye-tracking data for over 410,000 tokens, collected from 41 Japanese students who each read 30 English passages ranging in length from 300–400 words. In this article, we detail the design of TECO and report on the reliability of commonly used eye-tracking measures (e.g., skipping, first fixation duration, and regression) along with their descriptive statistics and distribution. We also validate the corpus by illustrating the impact of several lexical and reader factors (e.g., word length and reading proficiency) on some eye-tracking measures. TECO will serve as a valuable resource for researchers who are keen on exploring the cognitive processes involved in L2 reading. The corpus is freely accessible at the Open Science Framework [https://osf.io/wrvj3/], and we are committed to its continuous expansion by adding participants from diverse backgrounds and incorporating more detailed text information.",

keywords = "second language, reading, eye tracking, corpus",

author = "Shingo Nahatame and Tomoko Ogiso and Yukino Kimura and Yuji Ushiro",

year = "2024",

month = jun,

day = "7",

doi = "10.1016/j.rmal.2024.100123",

language = "English",

volume = "3",

journal = "Research Methods in Applied Linguistics",

issn = "2772-7661",

number = "100123",

}

TY - JOUR

T1 - TECO: An Eye-Tracking Corpus of Japanese L2 English Learners’ Text Reading

AU - Nahatame, Shingo

AU - Ogiso, Tomoko

AU - Kimura, Yukino

AU - Ushiro, Yuji

PY - 2024

Y1 - 2024/06/07

AB - Eye-tracking corpora are invaluable resources for understanding human text processing. In recent years, some corpora have been developed that incorporate second-language (L2) English readers’ eye-movement recordings. However, these face limitations such as small data sizes, the absence of multiple and longer text sources, and a scarcity of data from learners whose first languages are linguistically distinct from English. Addressing these gaps, this study introduces Tsukuba Eye-tracking Corpus (TECO), a dataset of eye-tracking records from Japanese L2 English learners engaged in text reading. TECO encompasses eye-tracking data for over 410,000 tokens, collected from 41 Japanese students who each read 30 English passages ranging in length from 300–400 words. In this article, we detail the design of TECO and report on the reliability of commonly used eye-tracking measures (e.g., skipping, first fixation duration, and regression) along with their descriptive statistics and distribution. We also validate the corpus by illustrating the impact of several lexical and reader factors (e.g., word length and reading proficiency) on some eye-tracking measures. TECO will serve as a valuable resource for researchers who are keen on exploring the cognitive processes involved in L2 reading. The corpus is freely accessible at the Open Science Framework [https://osf.io/wrvj3/], and we are committed to its continuous expansion by adding participants from diverse backgrounds and incorporating more detailed text information.

KW - second language

KW - reading

KW - eye tracking

KW - corpus

U2 - 10.1016/j.rmal.2024.100123

DO - 10.1016/j.rmal.2024.100123

M3 - Article

SN - 2772-7661

VL - 3

JO - Research Methods in Applied Linguistics

JF - Research Methods in Applied Linguistics

IS - 100123

ER -