This reality just looks very different from the scenarios proposed by AI doomsayers: the current LLMs, far from being quiet, deadly things poised to take over the world, are loud, ineffectual, and prone to making up all kinds of nonsense on the spot. They are far more likely to yield defective products than too-effective and deadly ones.
Does it look so different though? What happens when the LLMs are good enough to propose practical code improvements that will result in more-useful LLMs? And when we start generating LLM-driven agents that actually train and deploy these more-useful LLMs? Are we taking on faith that we'll never get there?
This is what a lot of skeptics fail to appreciate, imo: yes, the current generation of models is not threatening, and on its own it probably never will be. But there will likely come a point where these models can be used to recursively self-improve, and that can potentially lead to very dangerous scenarios.
Currently, one of the problems AI developers are dealing with is that LLMs trained on data that is 'polluted' by AI-generated content get dumber at predicting the outputs we want from them.
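To make that concrete, here's a toy sketch of the effect (a cartoon, not how actual LLM training works): treat the "model" as nothing more than the empirical distribution of its training set, and let each new generation train only on samples produced by the previous one. Diversity is lost and never comes back:

```python
import numpy as np

# Cartoon of "model collapse": each generation's training set is just a
# sample of the previous generation's output. Distinct documents can only
# be lost, never regained, so the corpus steadily degenerates.
rng = np.random.default_rng(42)

corpus = np.arange(1000)  # generation 0: 1000 distinct "documents"
for generation in range(1, 11):
    # the new model is trained only on output sampled from the old one
    corpus = rng.choice(corpus, size=corpus.size, replace=True)
    print(f"generation {generation}: {np.unique(corpus).size} distinct documents left")
```

Real models are obviously far richer than a lookup table of their training data, but the qualitative point is roughly the same: rare content drops out first, and nothing in the loop reintroduces it.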
Even if there was a way to gain efficiency through self modification rather than losing it, it would likely hit a plateau of diminishing returns, like most things in the world do.
That paper[0] mentions two defects. The first is catastrophic forgetting, which occurs during continual learning; GPT-type models do not use that type of training, instead using supervised training on a curated corpus of text. The second is model collapse, which is essentially just bad data making its way into the training set; once again, a sufficiently curated corpus eliminates this problem.
Microsoft has published a few papers about using LLMs to train new LLM models. In one[1], the training data consisted essentially of recorded ChatGPT interactions (system prompt, user request, and model response), and it was used to train a smaller model that outperforms Vicuna-13B. The other paper[2] used a combination of curated web data and machine-generated textbooks and exercises (produced with GPT-3.5) to train a small model (1.3B) that outperforms several larger models on programming tasks.
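For anyone curious what that looks like mechanically, here's a minimal sketch, not the actual recipe from either paper: the model name (GPT-2) and the record format below are stand-ins, and the only point is that a small causal LM can be fine-tuned with ordinary supervised next-token training on (system prompt, user request, model response) records collected from a stronger model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Imitation-learning sketch: supervised fine-tuning of a small "student"
# model on records produced by a stronger model. GPT-2 and the record
# below are placeholders, not what the Microsoft papers used.
records = [
    {"system": "You are a helpful assistant.",
     "user": "Explain bubble sort in one sentence.",
     "response": "Bubble sort repeatedly swaps adjacent out-of-order "
                 "elements until the list is sorted."},
]

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for r in records:
    text = (f"{r['system']}\n\nUser: {r['user']}\n\n"
            f"Assistant: {r['response']}{tokenizer.eos_token}")
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # labels == input_ids -> plain causal-LM cross-entropy loss
    out = model(**batch, labels=batch["input_ids"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Details differ across papers (e.g. whether prompt tokens count toward the loss), but the objective really is just supervised training on machine-generated text.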
The point here is that the mere presence of machine-generated content in a dataset is not an immediate detriment to the model trained on it. It all hinges on the quality of the data, regardless of whether it is of human or machine origin. Think about it: if the information is correct, then it makes no difference whether it came from a human or a machine. Machine-generated data is not secretly cursed or other such nonsense. It is either correct or incorrect, nothing more.
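In other words, curation keys on whether the content checks out, not on who produced it. A throwaway illustration (everything here is invented for the example): code snippets enter a training corpus only if their tests pass, and the snippet's origin is never consulted.

```python
import os
import subprocess
import sys
import tempfile

# Curate by correctness, not provenance: a snippet makes it into the
# corpus only if its test passes, whether a human or a model wrote it.
candidates = [
    {"origin": "human",
     "code": "def add(a, b):\n    return a + b",
     "test": "assert add(2, 3) == 5"},
    {"origin": "machine",
     "code": "def mul(a, b):\n    return a * b",
     "test": "assert mul(2, 3) == 6"},
    {"origin": "machine",
     "code": "def sub(a, b):\n    return b - a",  # wrong, gets filtered out
     "test": "assert sub(5, 2) == 3"},
]

def passes_test(snippet):
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(snippet["code"] + "\n" + snippet["test"] + "\n")
        path = f.name
    try:
        result = subprocess.run([sys.executable, path], capture_output=True)
        return result.returncode == 0
    finally:
        os.unlink(path)

corpus = [c for c in candidates if passes_test(c)]  # origin never looked at
print([c["origin"] for c in corpus])                # -> ['human', 'machine']
```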
>Even if there was a way to gain efficiency through self modification rather than losing it, it would likely hit a plateau of diminishing returns, like most things in the world do.
And maybe you're fine with betting your life on that, but the people who are more hesitant about this are not crazy.
The fact that AI alignment is not yet a solved problem, and that according to AI safety researchers we don't have a clear path to solving it, is not an assumption. Further, I don't think that worrying about it in the context of a possible superintelligent general AI is poor reasoning.
AI safety researchers are prone to making wild and spurious claims about what a superintelligent AI would be capable of, often relying on scenarios from science fiction or very out-there fields of physics to get people to pay heed to their ideas. If I had a dime for every time 'possible human extinction' was brought up...