TY - GEN
T1 - How Many Data Samples is an Additional Instruction Worth?
AU - Puri, Ravsehaj Singh
AU - Mishra, Swaroop
AU - Parmar, Mihir
AU - Baral, Chitta
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
AB - The recently introduced instruction paradigm empowers non-expert users to leverage NLP resources by defining a new task in natural language. Instruction-tuned models have significantly outperformed multitask learning models (without instructions); however, they are far from state-of-the-art task-specific models. Conventional approaches to improving model performance, such as creating datasets with a large number of task instances or making architectural changes to the model, may not be feasible for non-expert users. However, such users can write alternate instructions to represent an instruction task. Is instruction augmentation helpful? We augment a subset of tasks in the expanded version of NATURAL INSTRUCTIONS with additional instructions and find that it significantly improves model performance (up to 35%), especially in the low-data regime. Our results indicate that an additional instruction can be equivalent to ~200 data samples on average across tasks.
UR - http://www.scopus.com/inward/record.url?scp=85159853589&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85159853589&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85159853589
T3 - EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023
SP - 1012
EP - 1027
BT - EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023
PB - Association for Computational Linguistics (ACL)
T2 - 17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 - Findings of EACL 2023
Y2 - 2 May 2023 through 6 May 2023
ER -