Training "AI" On Public Data Is Totally Fine And Not Stealing.

@[email protected] · 3 months ago

As someone who doesn’t hate AI, I hate a few things about how it’s happening:

If I want to make a book, and I want to use other books for reference, I need to obtain them legally. Purchase, rent, loan… Else I’m a pirate. Multimillion-dollar companies say that for them it’s fine as long as somebody posted it on the internet. Their version of annas-archive is suddenly legal and moral, while I’m harming the authors if I use it.
They are stuffing AI into everything, which generally means an internet connection and sending off unknown data.
It’s an annoying marketing gimmick. While it’s incredibly useful in some places, the insistence that it solves every problem makes it seem like a failure.

@[email protected] · 3 months ago

I think your issue lies more with copyright law than with where the LLM datasets come from, then. Which I completely understand, I hate copyright laws.

There are TV shows that I can’t stream, and the only legal way to watch them is to buy the box set for £90. Get fucked, I’m not paying that, I’ll just download it for free.