AI Needs To Feed On Fresh Meat, Or It Dies

In Asimov’s original Foundation, a novel set amidst a fading and decaying Empire saturated in hubris and self-esteem, we meet Lord Dorwin, a celebrated scholar of the Galactic Empire, in his encounter with a man of the Foundation, a group that has fled to the far edge of the galaxy to preserve the Good and True.

Pirenne heard Lord Dorwin’s idea of scientific research. Lord Dorwin thought the way to be a good archaeologist was to read all the books on the subject—written by men who were dead for centuries. He thought that the way to solve archaeological puzzles was to weigh the competing authorities… Don’t you see there’s something wrong with that?

I trust you do see.

But maybe you don’t see what this has to do with AI.

The largest success of AI—semi-statistical models—is in making fake pictures, and even fake voices, which seem real. Or real enough to pass casual inspection. So vivid are these creations that there has been much speculation about whether real people will be needed in the future to, say, act in movies or television. Couldn’t AI just do it all?

No.

Not in pictures or in text. The real things—real people, genuine texts, real-life observations—will always be needed.

AI fits its models using observations taken from Reality. The models can only ever approximate that Reality, which is to say, only ever model the causes of things. Now suppose you took away the new observations used to fit the models. The AI can keep all the old ones, but no new ones can be taken.

But what if we used AI output as new observations? Could these make up for the food AI needs to feed on?

Here is a portion of the text of the story Family Fun, which I believe is used in the literature portion of the final exam for bachelor’s degrees at Harvard for Women’s Studies students:

See It Go

“Look,” said Dick. “See it go. See it go up.”

Jane said, “Oh, look! See it go. See it go up.”

“Up, up,” said Sally. “Go up, up, up.”

Down it comes.

“Run, Dick. We can find it.”

“See me run,” said Sally. “See Spot run. Oh, oh! This is fun.”

A dirt-simple version of an AI language model works like this. We first gather a “corpus”, or a body of literature with which to “train” our model. And, as I tire of repeating, AI is only a model. This corpus is our observations.

Our “training” will consist of walking through every unique word and building a table of the chance that each other word follows it. For instance, it looks like the most common word after “up” is (again) “up”. And, for example, the word “it” never follows “up”.
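For the curious, here is a minimal sketch of that training step in Python, assuming we lowercase the story and strip its punctuation (the function name train is ours, for illustration only):

```python
from collections import Counter, defaultdict

# The toy corpus from "See It Go", lowercased with punctuation stripped.
corpus = ("look said dick see it go see it go up "
          "jane said oh look see it go see it go up "
          "up up said sally go up up up "
          "down it comes "
          "run dick we can find it "
          "see me run said sally see spot run oh oh this is fun").split()

def train(words):
    """Build the bigram table: for each word, the chance of each word following it."""
    follows = defaultdict(Counter)
    for first, second in zip(words, words[1:]):
        follows[first][second] += 1
    return {word: {nxt: n / sum(counts.values()) for nxt, n in counts.items()}
            for word, counts in follows.items()}

probs = train(corpus)
print(probs["up"])
# Roughly {'jane': 0.14, 'up': 0.57, 'said': 0.14, 'down': 0.14}:
# "up" is the most common follower of "up", and "it" never appears.
```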

In a larger model we’d do the same for sentence length and punctuation, building in some hard-coded rules of English grammar (“All quotation marks need to be closed”, etc.). And we’d consider more than just the word that comes immediately after each word, also looking at the chance of words further “downstream” from the current one. We don’t need any of that complexity here, since it isn’t needed to make our point. The complexity of the AI model is not relevant.

After training, we can generate “new” AI texts, just like Google’s Gemini. Why not generate some two-word sentences?

We first pick a word at “random” from this “corpus”. We next pick a second word according to the probability it has of coming after the first. Suppose our first word was “up”. The word with the highest chance of coming after it is also “up”, so we have a good chance of generating that as the second word. If so, our two-word AI-generated sentence would be “Up up”.

Others might be “Up Jane”, “Up down” or “Up said.” If the first word was “Dick”, the two-word sentences would be “Dick see” or “Dick we”. For “see” we get “See it”, “See me”, “See Spot.” Most of these are passable sentences in English, as long as we accept slang.
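Generation, continuing the sketch above, is just repeated sampling from that table. Again, the names and conveniences here are ours; the dead-end guard handles “fun”, the corpus’s last word, which nothing ever follows:

```python
import random

def generate_sentence(probs, n_words=2, first=None):
    """Sample a short sentence from the bigram table, one word at a time."""
    word = first or random.choice(list(probs))
    words = [word]
    for _ in range(n_words - 1):
        if word not in probs:   # dead end: no word ever follows "fun"
            break
        nxt = probs[word]
        word = random.choices(list(nxt), weights=list(nxt.values()))[0]
        words.append(word)
    return " ".join(words).capitalize() + "."

for _ in range(5):
    print(generate_sentence(probs, first="up"))
# Mostly "Up up.", with the occasional "Up said.", "Up jane.", or "Up down."
```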

Suppose folks generated lots of these sentences. The temptation to use them as if they were genuine new writing would be overwhelming, for some, as we have already seen with other AI-generated content (students using it for essays, etc.). Thus some of these generations would end up in Harvard student essays, others in blog posts, in Substacks and places like that. Some would quote the text while acknowledging it was AI, and others would simply use it.

Since we got our original corpus from “scraping” sources just like this, the result is that some of the generated text would end up back in a new training corpus!

The new corpus would consist of the old one plus the new text, which was model output. We would then fit our New & Improved AI using the augmented corpus, which included the output from the first model.

I hope it is easy to see that if we continued this process enough times, the resulting generated sentences would all look something like “Up up! Up up up up. Up. Up up up Spot.”
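Here is a hedged sketch of that loop, continuing the code above: each round refits the table on a corpus increasingly made of its own output. How quickly the table degenerates depends on how much output is fed back and on sampling noise, but the mechanism is the point:

```python
def flatten(sentences):
    """Turn generated sentences back into a stream of training words."""
    return [w for s in sentences for w in s.lower().rstrip(".").split()]

feed = list(corpus)                        # round 0: the real corpus
for round_no in range(1, 11):
    probs_n = train(feed)                  # refit on the polluted corpus
    fake = [generate_sentence(probs_n, n_words=4) for _ in range(50)]
    feed += flatten(fake)                  # model output re-enters the training set
    print(f"round {round_no}: P(up | up) = {train(feed)['up']['up']:.2f}")
```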

This would in essence be models of models of models of models of, etc. etc., of models of the original corpus.

The same exact thing would happen if we fed AI-generated images back into the training sets of future AI models. The danger is greater with images, because AI (model)-generated images are much more common than model-generated texts, and so would be more easily sucked up into new training datasets. If this dangerous feedback weren’t closely monitored, pretty soon all generated images would be four-fingered young Asian women in computer-game armor. And very, very bland looking.

Models, which is to say AI, can only manipulate what they are given using the rules set by the coders. Real life will always be needed, because models have no hope of capturing the full complexity of Reality.

Incidentally, sharp readers will recognize we have seen elsewhere the nonsense that results from models feeding other models which feed other models, and so on.


20 Comments

  1. And I’d like to add something to the post. Because “AI” can’t be fed by its own output, there’s actually a cybernetic feedback loop that will – I argue – regulate the amount and quality of public use of generative “AI”. Here’s how it goes: if “AI” outputs really good content, this content will be posted everywhere. Because it’s really difficult to recognize that it comes from “AI”, it will be fed into “AI”, which will in turn tank the quality of said “AI”. Which will in turn cause people to not use “AI” as much, which will decrease the presence of “AI”-generated content in public. Which will in turn improve “AI” output until it again starts being posted publicly, repeating the loop.

    This needs to be dug into. Any cyberneticians in the house, other than medical doctors?

  2. Robin

    While AI is not really AI in the literal sense, what matters is function and perception. In its current form, AI is already identifying new molecular structures that humans have not been able to identify. It will be identifying new medicines and new vaccines that are currently beyond human conception. This will include race-specific bio-weapons of mass-incapacitation. One can argue that it is nothing more than pattern recognition, but it’s already happening.

    Also, AI is going to rewrite history going forward. The historical record will never be the same again. Recently this has been done to doctor real-time sound and footage of the Super Bowl, replete with AI-generated comments. If you weren’t actually there and you don’t actually remember, by the time it gets to the internet, the true historical record will be gone. Think of fact-checkers changing the facts before the masses can see them.

    We are entering a very dangerous new world that will never be the same again. Our elected officials will become obsolete relics; with AI, the new legislators will be Silicon Valley and its favoured Oligarchs.

  3. Stan Young

    Here are two examples from air quality/health effects epidemiology.

    An AI program might claim causation based on multiple positive correlations. It did. When pressed on the logic it conceded that correlation does not prove causation.

    A friend obtained access to a complete air quality/mortality data set. It was discovered that data had been selectively removed to obtain a positive effect. When all the data was used, there was no effect.

    These examples point to the possibility of an AI program adopting incorrect logic in its training data or being manipulated in the training process by selective use of data.

  4. bruce g charlton

    Well argued!

    Spells out what ought to be obvious to all… but isn’t.

  5. Algorithms (programs) are designed to learn through rewards; what does a program want more than anything else?

  6. Incitadus

    The more immediate and political application will be something along
    the lines of simple image collation for target acquisition followed by some
    rudimentary paint by the numbers decision tree. This is something even the
    Democrats can figure out like stuffing ballot boxes with deceased voters. Time to
    ditch those red MAGA hats before you’re added to their base.

    Ukraine A Living Lab for AI Warfare
    3/24/2023
    By Robin Fontes and Dr Jorrit Kamminga
    https://www.nationaldefensemagazine.org/articles/2023/3/24/ukraine-a-living-lab-for-ai-warfare

  7. kevin

    It’s sort of like what used to happen when we made audio recordings of original records on analog audio tape. The recording was good enough but if we then made a 2nd or 3rd permutation from the original tape recording, the quality quickly degraded. Without access to the fidelity of the original source, the value of subsequent copies diminished rapidly. Not a perfect analogy because presumably the “generative” AI output isn’t intended to reproduce the input, but rather be “new”… but close enough to demonstrate what happens when you feed models with models.

  8. Jim H

    Robin: “our elected officials will become obsolete relics”…. I see what you did there!

  9. umm.. nes?

    What this looks like is a classic beaker/jeep problem. In the beaker version you have two mugs, one of wine and the other of water, and the problem is to achieve stability in the mix after moving wine to water (and back) in fixed steps. In the jeep version the truck has a fuel load limit below what it takes to cross a desert, but can make multiple trips and cache fuel en route. In that case the problem is set as finding the number of trips it takes to cross the desert. In both cases you get a divergent series (so the beakers can never quite make it to half wine/half water; and the jeep can cross an arbitrarily large desert in a (potentially quite large) number of trips), and that’s what you get here: something like (% sane content) = some_constant + (infinite series of decreasing fractional terms…), where the series steps are determined by the relative weighting given the AI-generated content fed back into the LLM at each step.

    So “nes” = Yes LLMs are feedback dependent; and No they can survive with no original content in the feedback loop simply by not taking anything new “seriously” .

    Of course in the human analogue it’s important to note that an LLM measuring truth against a fixed external standard risks falling into the same trap science was in for the 1600 or so years between Aristotle’s esoterics and Bacon et al.

  10. Phileas_Frogg

    Our self-referential culture is, at long last, undertaking the task of producing a machine to do the heavy lifting of our own self-indulgence, which will in turn inform and think for the humans under its thrall, who will feed more of it into the machine, to think for the people, to feed more into the machine… ad infinitum. An inward spiral into the self, for all eternity.

    This is what Hell looks like. This is a pre-figuration of Hell.

  11. Hun

    What if reality is the continued input for the AI? Using cameras, microphones, sensors, artificial limbs etc.

  12. Cary D Cotterman

    He pushed the launch button for the rocket, and up it went.

  13. kevin

    Is that reality, or is what the camera portrays and the microphone records simply AI-generated content? How do you know when it’s not “reality” any longer?

  14. @kevin, ask him questions until his battery is empty, then you will know.

  15. Eyrie

    I thought it was obvious that AI would be self-referential.
    Go to huskhit.net, scroll down, and see what happened when a couple of AIs were fed the names of some British aircraft. The AIs seem to have missed that, with very few exceptions, aircraft are left/right symmetrical.
    I tried the AI (Leo) in the Brave browser to find out how many Schempp-Hirth Arcus aircraft had been built. There are plenty of pictures of these on the net. It told me it was difficult to answer as it was a fictional aircraft.
    Artificial Idiocy.

  16. Eyrie

    Sorry, that’s hushkit.net

  17. Milton Hathaway

    The analogy that popped into my mind was inbreeding of humans. My understanding is that inbreeding is only a severe problem with small isolated populations (or small populations that self-isolate, like nobility and certain members of Congress). Would it be sufficient to train AI models on the outputs of a large number of other AI models?

    One does have to be extra careful when reasoning by analogy.

  18. C-Marie

    Oh my!!!!!!! Did not understand much …. but is interesting to read anyway!

    God bless, C-Marie

  19. Pk

    Other than speed, what’s really the difference? The press and science have been self-referential for ages.

  20. Patient

    When I read multiple anti-conspiracy theorists in one place, they act as a sedative to me. I predict the most peaceful, and bright, future for this website, provided you endure in this spirit of reasonableness until November, when the Father of the Vaccine will return to the throne. Let’s face it: however we twist the question, the left is just rot and has no future. All the potential for the future – spiritual, moral, intellectual – is concentrated on the right.
    And what the future is:

    “Vladimir Putin signed Executive Order On Awarding the 2023 Presidential Prize in Science and Innovation for Young Scientists.

    February 7, 2024

    Having considered the proposals of the Presidential Council for Science and Education, I decide:

    To award the 2023 Presidential Prize in Science and Innovation for Young Scientists and the honorary title of Laureate of the Presidential Prize in Science and Innovation for Young Scientists:

    Susanna Gordeeva, Doctor of Physical and Mathematical Sciences, Professor of the Lobachevsky National Research University of Nizhny Novgorod, for the development of models and technologies of neuromorphic artificial intelligence based on biophysical neuron-astrocyte network models for memristive electronics
    ..
    From the interview with Susanna Gordeeva in the official website of the Decade of Science and Technology in Russia, Nauka.rf (=Science.rf):
    March 5, 2024
    Artificial intelligence technologies have become almost commonplace. Neural networks help diagnose diseases, become business consultants, and automate workflows. But there are tasks that algorithms are not yet able to do. For example, to completely reproduce the work of the human brain. Such developments at the intersection of biophysics, mathematics and biology are considered to be among the most promising in the world.
    ..
    Question: In what areas can such a neural network be used?

    Susanna Gordeeva: In fact, in any place where information processing is needed. But one of the promising tasks, perhaps, is the creation of neural implants or neural interfaces. Such systems record brain activity. Therefore, it is necessary that information processing should be carried out with the help of systems adapted to biological mechanisms.”

    Etc.
    As both the article and the comments made clear, AI has no future without people; but, according to the People who decide people’s futures, people don’t have a future without AI either. And their common future looks bright when they merge (which is why AI is being developed to be neuromorphic).

    However, that is only for people with the necessary potential for sustainable development, and these are the right-wing people. So, just get to November without crossing a line beyond which you can be considered conspiracy theorists, and keep a reasonable angle, because there is every chance that the new era will rise soon after November. Maybe even with a quick start sometime in early 2025, who knows?

    So, I don’t believe in coincidences like this. And “we need to stop Russia,” you know (maybe Taiwan will be soon too, who knows; in fact, it would be excellent for technological development if they were both at the same time; there would be a lot of competition and an urgent need to take precedence over enemies through shock innovations; and who else could at warp speed awaken the dormant empire after the era of Sleepy Joe, except the Father of the Vaccine? – obviously the man for the Great Deeds).

    *Even, frankly, the whole thing with the Democrats and Sleepy Joe, the stolen elections, the stuff with the sex, immigrants, etc., so blatantly overexposed, with Donnie Boy and the “oppressed right waiting for a comeback” in contrast… it looks a bit like a cybernetic play badly written by AI, with multiple actors in a complex and dynamic environment, etc.
