Training algorithms on copyrighted data is not illegal, according to the United States Supreme Court
According to a recent Supreme Court decision in the USA, training machine learning models on copyrighted data is permitted:
This decision carries significant weight in the current debate about data ownership and privacy, but it should not be interpreted too simplistically: a number of recent deep learning models can regenerate their source data, which is likely to temper these conclusions in many practical cases.