Labeled data is still not available in the amounts required to feed data-hungry neural models, and in some domains and languages even unlabeled data is scarce. In addition, variation across domains makes it difficult to apply machine learning models trained on one domain to data from another. Together, these factors considerably reduce the portability of many NLP models. To address this challenge, various domain adaptation methods have been proposed and adapted for many natural language processing applications.
Domain adaptation is a sub-discipline of transfer learning that deals with scenarios in which a statistical or neural model trained on a source distribution is used in the context of a different (but related) target distribution. When this happens, we usually speak of a domain shift.
Technically, domain shift is a violation of the general principle in (supervised) machine learning that the training dataset should be drawn from the same distribution as the previously unseen data instances on which the trained classifier will make predictions. In simple terms, this means that the training and test sets must be sufficiently similar to each other.
For instance, if the goal is to predict the sentiment of tweets using an ML algorithm, the algorithm should be trained on tweets in the first place rather than on some other kind of text, such as news articles or movie reviews. When this principle is violated, the performance of learning algorithms can be expected to drop significantly, because they generalize poorly beyond the distribution of their training data.
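To make the idea of "sufficiently similar" more concrete, here is a minimal sketch that compares the word distributions of two text collections using the Jensen-Shannon divergence. The choice of this measure, the helper names `unigram_distribution` and `js_divergence`, and the toy sentences are all assumptions made for illustration; whitespace tokenization is a deliberate simplification.

```python
import math
from collections import Counter


def unigram_distribution(texts):
    """Return a dict mapping each lowercased token to its relative frequency."""
    counts = Counter(token for text in texts for token in text.lower().split())
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two token distributions.

    0 means identical distributions; 1 means no shared vocabulary at all.
    """
    divergence = 0.0
    for token in set(p) | set(q):
        p_t, q_t = p.get(token, 0.0), q.get(token, 0.0)
        m_t = 0.5 * (p_t + q_t)  # mixture of the two distributions
        if p_t > 0:
            divergence += 0.5 * p_t * math.log2(p_t / m_t)
        if q_t > 0:
            divergence += 0.5 * q_t * math.log2(q_t / m_t)
    return divergence


# Toy corpora, invented for illustration only.
tweets = ["omg this phone is so good lol", "ugh worst update ever smh"]
more_tweets = ["lol this update is so bad", "omg best phone ever"]
news = ["the company reported quarterly earnings on tuesday",
        "shares fell after the central bank raised interest rates"]

tweet_dist = unigram_distribution(tweets)
shift_within = js_divergence(tweet_dist, unigram_distribution(more_tweets))
shift_across = js_divergence(tweet_dist, unigram_distribution(news))
print(f"tweets vs. tweets: {shift_within:.3f}")
print(f"tweets vs. news:   {shift_across:.3f}")
```

In this toy setup the two tweet samples share much of their vocabulary, while the tweets and the news sentences share none, so the second divergence is larger. A real comparison would need far more text and a proper tokenizer.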
It is important to note here that the word “domain” is rather loosely defined within the NLP community. Most often it refers to some coherent kind of text collection, such as texts that can be grouped together according to topic, style, genre, or linguistic register. In this article, we are not going to attempt to define the word “domain” any further, but rather rely on this existing, loose definition. In addition, the concepts “source domain” and “target domain” usually refer to the domain on which a given ML model is trained and the domain with a different distribution on which it is tested, respectively.
Domain adaptation approaches fall into three categories according to the level of supervision used during training. This mirrors the standard division of machine learning models into supervised, semi-supervised, and unsupervised along the same axis.
Domain adaptation approaches can also be grouped according to the method used to transfer knowledge from the source to the target domain: they are either model-centric, data-centric, or hybrid.
Model-centric approaches achieve domain adaptation by redesigning parts of the model. They include feature-centric and loss-centric methods.
The following is a list of the most prominent feature-based approaches.
Loss-centric methods focus on altering the loss function of the model in some way:
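As a rough illustration of what altering the loss function can look like, the sketch below adds a domain penalty to an ordinary task loss. The penalty used here (the squared distance between the mean feature vectors of a source batch and a target batch) is only one simple, assumed choice, and the function names, the `weight` parameter, and the toy numbers are all hypothetical.

```python
import math


def cross_entropy(probabilities, labels):
    """Mean negative log-likelihood of the gold labels (the usual task loss)."""
    return -sum(math.log(p[y]) for p, y in zip(probabilities, labels)) / len(labels)


def mean_vector(features):
    """Component-wise mean of a batch of equal-length feature vectors."""
    return [sum(column) / len(features) for column in zip(*features)]


def mean_discrepancy(source_features, target_features):
    """Squared Euclidean distance between the two batch means.

    A very simple stand-in for a domain discrepancy measure (an assumption).
    """
    mu_s, mu_t = mean_vector(source_features), mean_vector(target_features)
    return sum((s - t) ** 2 for s, t in zip(mu_s, mu_t))


def adapted_loss(probabilities, labels, source_features, target_features, weight=0.1):
    """Task loss on labeled source data plus a weighted domain penalty."""
    task = cross_entropy(probabilities, labels)
    penalty = mean_discrepancy(source_features, target_features)
    return task + weight * penalty


# Toy batch: predicted class probabilities and gold labels for source examples,
# plus feature vectors for source and (unlabeled) target examples.
probabilities = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 1]
source_features = [[1.0, 0.0], [3.0, 2.0]]
target_features = [[0.0, 1.0], [2.0, 1.0]]

print(adapted_loss(probabilities, labels, source_features, target_features))
```

Minimizing such a loss pushes the model to fit the labeled source data while keeping source and target representations close. Only the source examples need labels, since the penalty term uses features alone.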
On the other hand, data-centric methods make use of certain aspects of the data rather than changing the model architecture or its loss function.
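One way to picture a data-centric method is data selection: keeping only the source examples that look most like the target domain. The sketch below ranks labeled source sentences by how many of their tokens also occur in unlabeled target text. The scoring rule, the function names, and the toy sentences are assumptions for illustration, not a method prescribed above.

```python
def target_vocabulary(target_texts):
    """Set of lowercased tokens observed in the (unlabeled) target texts."""
    return {token for text in target_texts for token in text.lower().split()}


def overlap_score(text, vocabulary):
    """Fraction of a text's tokens that also appear in the target vocabulary."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in vocabulary for token in tokens) / len(tokens)


def select_source_examples(source_examples, target_texts, keep):
    """Return the `keep` labeled source examples most similar to the target."""
    vocabulary = target_vocabulary(target_texts)
    ranked = sorted(
        source_examples,
        key=lambda example: overlap_score(example[0], vocabulary),
        reverse=True,
    )
    return ranked[:keep]


# Toy data, invented for illustration only.
source_examples = [
    ("the battery life of this phone is great", "pos"),
    ("the plot of the film was dull", "neg"),
    ("this phone has a terrible screen", "neg"),
]
target_texts = ["my new phone has a great screen", "battery died after one day"]

selected = select_source_examples(source_examples, target_texts, keep=2)
for text, label in selected:
    print(label, text)
```

Here the film review is dropped because it shares no tokens with the target texts, while the two phone reviews are kept for training. The model itself is left unchanged; only its training data is.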
Another possibility is to combine training data from multiple source domains, which can increase the chances that a model will perform well on a different target domain. This approach is known as multi-source domain adaptation.
Finally, hybrid approaches combine model-centric and data-centric methods, and they are currently being studied extensively.
Domain adaptation offers a large variety of techniques that can help increase the performance of NLP models in scenarios where little or no training data is available for the target domain. By bridging the gap between source and target domains, these methods are increasingly being used to build more effective NLP applications.
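In its simplest form, combining several source domains amounts to pooling their labeled examples into one training set. The sketch below does this while recording which domain each example came from. The `pool_sources` helper, its optional `max_per_domain` cap, and the toy data are hypothetical and only meant to illustrate the idea.

```python
def pool_sources(sources, max_per_domain=None):
    """Merge labeled examples from several source domains into one list.

    `sources` maps a domain name to a list of (text, label) pairs. Each pooled
    item keeps its domain name so later steps can still tell the sources apart.
    `max_per_domain` optionally caps how many examples each domain contributes,
    so that one large source does not dominate the pool.
    """
    pooled = []
    for domain, examples in sources.items():
        kept = examples if max_per_domain is None else examples[:max_per_domain]
        pooled.extend((text, label, domain) for text, label in kept)
    return pooled


# Toy data, invented for illustration only.
sources = {
    "news": [("markets rallied today", "pos"), ("profits collapsed", "neg"),
             ("the merger was approved", "pos")],
    "reviews": [("a wonderful film", "pos"), ("a dull and tedious plot", "neg")],
}

training_set = pool_sources(sources, max_per_domain=2)
print(len(training_set), "pooled examples")
```

Whether pooling helps depends on how closely each source resembles the target domain, which is why it is often combined with some form of selection or weighting.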
Anne-Maj van der Meer is a marketing professional with over 10 years of experience in event organization and management. She has a BA in English Language and Culture from the University of Amsterdam and a specialization in Creative Writing from Harvard University. Before her position at TAUS, she was a teacher at primary schools in regular as well as special needs education. Anne-Maj started her career at TAUS in 2009 as the first TAUS employee, where she became a jack of all trades, taking care of bookkeeping and accounting as well as creating and managing the website and customer services. For the past 5 years, she has worked as Events Director, chief content editor, and designer of publications. Anne-Maj has helped organize more than 35 LocWorld conferences, where she takes care of the program for the TAUS track and hosts and moderates these sessions.