Lisa Vasileva

Lisa is a Data Curator on the NLP Team at TAUS. Drawing on her background in linguistics and her experience in the translation industry, she helps TAUS optimize its data offering and create new data solutions.

Data for AI

Web Scraping for Parallel Corpora Creation

by Lisa Vasileva

01/10/2021

Web scraping is a common way to generate parallel data, drawing on the vast amount of multilingual content available on the web. Here is how to do it.