The AI-focused COPIED Act would make removing digital watermarks illegal (as well as training any kind of AI on copyrighted content)

@[email protected] · 4 months ago

I figure since big tech spent quite a bit of money building those datasets and since they were built before the law, they will be able to keep using them as long as they don’t add anything new but I can’t be certain.

This is a very weird assumption you are making, man. The quoted text you sent above pretty much says the opposite. It says everyone who wants to train their models with copyrighted data needs to get permission from the copyright holders. That is great for me, period. No one, neither a big company nor the open source community, gets to steal the work of people producing art, code, etc. I honestly don’t get why you assume all the data scraped before would be exempt. Again, very weird assumption.

As for ML algorithms having use, of course they have. Hell, pretty much every company I have worked with has used them for decades. But take a look at the examples you provided. None of them requires you or your company scraping a bunch of information from randoms on the internet. Especially not copyrighted art, literature, or code. And that’s the point here: you are acting like all of that stops with these laws, but that’s ridiculous.

@[email protected] · 4 months ago

The article is pro-corpo. I’m looking at the bill and it’s quite clear where it’s headed.

None of what I mentioned is possible without the LLM that’s at its heart. Just training an LLM is a million or two in compute power. We don’t get the next generation for free if laws like this tack on an extra 80 million. 6 million for Reddit, and that was when you could scrape it for free, and that’s just a drop in the bucket.