Exploration of transfer learning capability of multilingual models for text classification
Date Issued
28-07-2023
Author(s)
Abstract
The use of multilingual models for natural language processing is becoming increasingly popular in industrial and business applications, particularly in multilingual societies. In this study, we investigate the transfer learning capabilities of multilingual language models such as mBERT and XLM-R across several Indian languages. We study the performance characteristics of a classifier model with mBERT/XLM-R as the front-end, trained in only one language, on two tasks: text categorization of news articles and sentiment analysis of product reviews. News articles on the same event but in different languages are representative of what may be termed 'inherently parallel' data, i.e. data that exhibits similar content across multiple languages, though not necessarily in parallel sentences. Other examples of such data are customer inquiries/reviews about the same product, social media activity pertaining to the same topic, and so on. After training in one language, we study the performance characteristics of this classifier model when applied to other languages. Our experiments reveal that, by exploiting the inherently parallel nature of the data, XLM-R performs remarkably well when adapted to any of the Indian language datasets. Further, our study reveals the importance of fine-tuning multilingual models with in-domain data from one language so that their cross-lingual and domain transfer learning abilities are exercised together.
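To make the described setup concrete, the sketch below shows one plausible way to implement this kind of zero-shot cross-lingual transfer with Hugging Face Transformers: fine-tune an XLM-R classifier on labelled data in a single source language, then evaluate it unchanged on a target language. It is an illustration under assumed data loaders, label counts, and placeholder variable names (hindi_texts, tamil_texts, etc.), not the authors' actual code or datasets.

```python
# Minimal sketch: fine-tune XLM-R on one language, evaluate zero-shot on another.
# Dataset variables and label counts are hypothetical placeholders.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "xlm-roberta-base"   # or "bert-base-multilingual-cased" for mBERT
NUM_LABELS = 4                    # e.g. number of news categories; task-dependent

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)

class TextDataset(Dataset):
    """Wraps (text, label) pairs and tokenizes them on the fly."""
    def __init__(self, texts, labels):
        self.texts, self.labels = texts, labels
    def __len__(self):
        return len(self.texts)
    def __getitem__(self, idx):
        enc = tokenizer(self.texts[idx], truncation=True, padding="max_length",
                        max_length=128, return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

def train(model, dataset, epochs=3, lr=2e-5):
    """Fine-tune the classifier on the source-language training set only."""
    loader = DataLoader(dataset, batch_size=16, shuffle=True)
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optim.zero_grad()
            out = model(**batch)   # loss is computed because "labels" is present
            out.loss.backward()
            optim.step()

@torch.no_grad()
def evaluate(model, dataset):
    """Zero-shot evaluation: apply the source-trained model to target-language text."""
    loader = DataLoader(dataset, batch_size=32)
    model.eval()
    correct = total = 0
    for batch in loader:
        labels = batch.pop("labels")
        preds = model(**batch).logits.argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total

# Hypothetical usage with in-domain corpora in two Indian languages:
# train(model, TextDataset(hindi_texts, hindi_labels))
# print("Zero-shot Tamil accuracy:", evaluate(model, TextDataset(tamil_texts, tamil_labels)))
```

The key point the sketch illustrates is that nothing in the evaluation step is language-specific: the same fine-tuned checkpoint is applied directly to the target language, relying on the multilingual encoder's shared representations and the inherently parallel nature of the in-domain data.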