On Degrees of Freedom in Defining and Testing Natural Language Understanding

Saku Sugawara, Shun Tsugita

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

2 citations (Scopus)

Abstract

Natural language understanding (NLU) studies often exaggerate or underestimate the capabilities of systems, thereby limiting the reproducibility of their findings. These erroneous evaluations can be attributed to the difficulty of defining and testing NLU adequately. In this position paper, we reconsider this challenge by identifying two types of researcher degrees of freedom. We revisit Turing's original interpretation of the Turing test and indicate that an NLU test does not provide an operational definition; it merely provides inductive evidence that the test subject understands the language sufficiently well to meet stakeholder objectives. In other words, stakeholders are free to arbitrarily define NLU through their objectives. To use the test results as inductive evidence, stakeholders must carefully assess if the interpretation of test scores is valid or not. However, designing and using NLU tests involve other degrees of freedom, such as specifying target skills and defining evaluation metrics. As a result, achieving consensus among stakeholders becomes difficult. To resolve this issue, we propose a validity argument, which is a framework comprising a series of validation criteria across test components. By demonstrating that current practices in NLU studies can be associated with those criteria and organizing them into a comprehensive checklist, we prove that the validity argument can serve as a coherent guideline for designing credible test sets and facilitating scientific communication.

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics, ACL 2023
Publisher: Association for Computational Linguistics (ACL)
Pages: 13625-13649
Number of pages: 25
ISBN (electronic): 9781959429623
DOI
Publication status: Published - 2023
Event: Findings of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: 2023/07/09 to 2023/07/14

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (print): 0736-587X

Conference

Conference: Findings of the Association for Computational Linguistics, ACL 2023
Country/Territory: Canada
City: Toronto
Period: 2023/07/09 to 2023/07/14

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
