Training "AI" On Public Data Is Totally Fine And Not Stealing.

@[email protected] · 3 months ago

Zagorath · 3 months ago

They have indeed made a statement of fact. But to the best of my knowledge it’s not one that’s got any definite controlling precedent in law.

“You are still not permitted to, for example, repost it elsewhere without the copyright holder’s permission.”

That’s the thing. It’s not clear that an LLM does “repost it elsewhere”. As the OP said, the model itself is basically just a mathematical construct that can’t really be turned back into the original work. That’s possibly a sign that it’s not a derivative work but a transformative one, and a transformative use is much more likely to be given Fair Use protection. Though Fair Use is always a question mark, and you never really know if a use is Fair without going to court.

You could be right here. Or OP could. As far as I’m concerned anyone claiming to know either way is talking out of their arse.

@Eccitaze · edited 3 months ago

Just because something is transformative doesn’t mean that it’s fair use. There are three other factors: the nature of the work you copied, the amount of the copyrighted work taken for the use, and the effect on the market. There’s no way in hell I believe anyone can plausibly say with a straight face “I’m taking literally all of the creative works you’ve ever produced and using them to create a product designed to directly compete with you and put you out of business, and this qualifies as a fair use”. I would be shocked if any judge in any court heard that argument without laughing the poor lawyer making it out of the courtroom.