MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering

Authors

Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang

Publication date

2020/9/18

Conference

EMNLP 2020

Description

While progress has been made on the visual question answering leaderboards, models often exploit spurious correlations and priors in datasets under the i.i.d. setting. As such, evaluation on out-of-distribution (OOD) test samples has emerged as a proxy for generalization. In this paper, we present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input, to improve OOD generalization on benchmarks such as the VQA-CP challenge. Under this paradigm, models use a consistency-constrained training objective to understand the effect of semantic changes in the input (question-image pair) on the output (answer). Unlike existing methods on VQA-CP, MUTANT does not rely on knowledge of the nature of the train and test answer distributions. MUTANT improves on prior results to establish a new state-of-the-art accuracy on VQA-CP. Our work opens up avenues for the use of semantic input mutations for OOD generalization in question answering.

Total citations

Cited by 135

2020: 4, 2021: 31, 2022: 31, 2023: 44, 2024: 25

Scholar articles

Mutant: A training paradigm for out-of-distribution generalization in visual question answering

T Gokhale, P Banerjee, C Baral, Y Yang - arXiv preprint arXiv:2009.08566, 2020

Combatting Cybersecurity Issues Embedded in the Fourth Industrial Revolution*

C Gokhale - JOURNAL OF BANKING, INSURANCE AND …, 2020
