A.I.’s un-learning problem: Researchers say it’s virtually impossible to make an A.I. model ‘forget’ the things it learns from private user data

@[email protected] · 2 years ago

@[email protected] · 2 years ago

They know how it works. It’s a statistical model. Given a sequence of words, there’s a set of probabilities for what the next word will be. That’s the problem: an LLM doesn’t “know” anything. It’s not a collection of facts. It’s like a pachinko machine where each peg in the machine is a word. The prompt you give it determines where and how the ball gets dropped in, and all the pegs it hits on the way down correspond to the output. How those pegs get labeled is the learning process. Once that’s done there really isn’t any going back. You can’t unscramble that egg to pick out one piece of the training data.
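To make “a set of probabilities for what the next word will be” concrete, here is a toy sketch. It assumes a bigram count model over a made-up three-sentence corpus, which is far simpler than a real LLM (a real model conditions on the whole context with a neural network, not a lookup table). The corpus and the `next_word_probs` helper are purely illustrative.

```python
from collections import Counter, defaultdict

# Made-up toy corpus (illustrative only).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat ate the fish",
]

# Count how often each word follows each previous word.
counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1

def next_word_probs(prev):
    """Return the estimated probability of each word that followed `prev`."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}

probs = next_word_probs("the")
for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {p:.2f}")
```

Even in this tiny case the model holds aggregated counts rather than the sentences themselves, which is roughly the point being made about facts versus probabilities.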

@[email protected] · 2 years ago

While you are overall correct, there is still a sort of “black box” effect going on. While we understand the mechanics of how the network architecture works, the actual information encoded by training is, as you have said, not stored in a way that is easily accessible or editable by a human.

I am not sure if this is what OP meant by it, but it kinda fits and I wanted to add a bit of clarification. Relatedly, the easiest way to uncook (or unscramble) an egg is to feed it to a chicken, which amounts to basically retraining a model.

@[email protected] · 2 years ago

I don’t think you’re intending to be purposefully misleading, but I would recommend checking out this article ( https://www.understandingai.org/p/large-language-models-explained-with ) because the pachinko analogy is not really accurate. There are several layers of considerations that the model makes when analyzing context to derive meaning. How well these models do with analogies is, I think, a compelling case for the model having, if not “knowledge” of something, at least a good enough analogue to knowledge to be useful.

Training a model on the way we use language is also training the model on how we think, or at least how we express our thoughts. There’s still a ton of gaps to work on before it’s an AGI, but LLMs are on to what’s looking more and more like the right path to getting there.

@[email protected] · 2 years ago

While it glosses over a lot of details, it’s not fundamentally wrong in any fashion. An LLM does not in any meaningful fashion “know” anything. Training an LLM is training it on what words are used in relation to each other in different contexts. It’s like training someone to sing a song in a foreign language they don’t know. They can repeat the sounds and may even recognize when certain words often occur in proximity to each other, but that’s a far cry from actually understanding those words.

An LLM is in no way, shape, or form anything even remotely like an AGI. I wouldn’t even classify an LLM as AI. LLMs are machine learning.

The entire point I was trying to make, though, is that an LLM does not store specific training data; what it stores is more like the hashed results of its training data. It’s a one-way transform: there is absolutely no way to start at the finished model and drive it backwards to derive its training input. You could probably show from its output that it’s highly likely some specific piece of data was used to train it, but even that isn’t absolutely certain. Nor can you point at any given piece of the model and say what specific part of the training data it corresponds to, or vice versa. Because of that, it’s impossible to pluck out some specific piece of data from the model. The only way to remove data from the model is to throw the model away and train a new model from the original training data with the specific data removed from it.
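A rough sketch of why one record can’t simply be plucked out: it assumes a tiny linear model trained by plain gradient descent on random made-up data, nothing like a real LLM in scale. The `train` function, the data, and the “private” record are all hypothetical. It trains once with the record and once without, then compares the weights.

```python
import random

def train(data, epochs=200, lr=0.05):
    """Fit y ~ w.x by per-example gradient descent; returns the weights."""
    dim = len(data[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            # Every weight is nudged by every example it sees.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [([rng.uniform(-1, 1) for _ in range(4)], rng.uniform(-1, 1))
        for _ in range(20)]
private = ([0.9, -0.7, 0.5, 0.3], 2.0)  # hypothetical "private" record

w_with = train(data + [private])
w_without = train(data)  # "removal" done the only sure way: retrain without it

diffs = [abs(a - b) for a, b in zip(w_with, w_without)]
print("per-weight difference:", [round(d, 4) for d in diffs])
```

In this sketch the single record shifts every weight a little, so there is no one slot to delete. The second call to `train` is the “throw it away and retrain” option described above.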

@[email protected] · 2 years ago

I really like that pachinko analogy. It gets the basic concept across without having to wade into technical descriptions.

@[email protected] · 2 years ago

“It’s a statistical model. Given a sequence of words, there’s a set of probabilities for what the next word will be.”

That is a gross oversimplification. LLMs operate on much more than just statistical probabilities. It’s true that they predict the next word based on probabilities learned from training datasets, but they also have layers of transformers to process the context provided from a prompt to eke out meaningful relationships between words and phrases.

For example: Imagine you give an LLM the prompt, “Dumbledore went to the store to get ice cream and passed his friend Sam along the way. At the store, he got chocolate ice cream.” Now, if you ask the model, “who got chocolate ice cream from the store?” it doesn’t just blindly rely on statistical likelihood. There’s no way you could argue that “Dumbledore” is a statistically likely word to follow the text “who got chocolate ice cream from the store?” Instead, it uses its understanding of the specific context to determine that “Dumbledore” is the one who got chocolate ice cream from the store.

So, it’s not just statistical probabilities; the models have an ability to comprehend context and generate meaningful responses based on that context.
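For a feel of how a transformer layer weighs context, here is a minimal sketch of scaled dot-product attention in plain Python. The two-dimensional vectors are hand-picked assumptions for the Dumbledore example (first dimension roughly “is a person”, second roughly “did the action”). A trained model learns its vectors from data and uses far more dimensions, so this only shows the mechanism, not what any real model contains.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output

tokens = ["Dumbledore", "Sam", "store", "chocolate"]
# Hand-made vectors (assumption): [is a person, did the action]
keys = [[1.0, 1.0], [1.0, -1.0], [-1.0, 0.0], [-1.0, 0.5]]
values = keys
query = [2.0, 3.0]  # stands in for "who got ...?": wants a person who acted

weights, _ = attention(query, keys, values)
best = tokens[weights.index(max(weights))]
for tok, w in zip(tokens, weights):
    print(f"{tok}: {w:.3f}")
```

The output is still a set of probabilities, but they are computed from the prompt’s own tokens, which is why a name like “Dumbledore” can win even though it is rare in general text.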

@[email protected] · 2 years ago

This is mostly true, except they do store information - it’s just not in a consistent, machine readable form.

You can analyze it with specialized tools, and an expert can gain some ability to understand what is stored in a specific link and manually modify it (in a very blunt way).

Scrambling an egg is a good analogy to a point - you can’t extract the training data. It’s essentially extremely high-ratio, lossy compression from an informational perspective.

You can’t get the egg back, but you can modify the model to change the information inside of it. It’s extremely complex, but it’s a very active field of study - with simpler models we’ve been able to separate data out from ability - the idea is to use something closer to a database that can be modified without doing brain surgery every time. It’s

You can’t guarantee destruction of information without complete understanding of the model, but we might be able to scramble personal details… Granted, it’s not something we can do now.