More data is good, but clean data is always better. Data that has been cleaned and processed correctly is what makes the difference. "Clean" can mean different things, ranging from removing data bias to ensuring better linguistic quality, or filtering the data for specific, customized training.
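As a rough illustration of the filtering end of that range, here is a minimal sketch of a text-cleaning pass. The function name `clean_lines` and the thresholds `min_chars` and `max_chars` are hypothetical choices made for this example, not part of any particular pipeline. It only normalizes whitespace, drops lines that are too short or too long, and removes exact duplicates; bias removal and linguistic quality checks would need considerably more than this.

```python
def clean_lines(lines, min_chars=3, max_chars=500):
    """Normalize whitespace, then drop out-of-range and duplicate lines.

    min_chars and max_chars are illustrative thresholds (assumptions),
    to be tuned for the data at hand.
    """
    seen = set()
    cleaned = []
    for line in lines:
        # Collapse runs of whitespace and trim both ends.
        text = " ".join(line.split())
        # Skip empty, too-short, or too-long lines.
        if not (min_chars <= len(text) <= max_chars):
            continue
        # Skip exact duplicates (compared after normalization).
        if text in seen:
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    sample = ["  Hello   world ", "Hello world", "", "ok", "A longer sentence."]
    print(clean_lines(sample))
```

The order of the surviving lines is preserved, which matters when the data is aligned with something else, such as a parallel file in another language. In that case the same keep/drop decision would have to be applied to both sides together.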