The New York Times has a neat demonstration of AI “model collapse,” where using AI-generated content to train future models leads to diminishing diversity and ultimately to complete homogeneity (“collapse”). For example, handwritten digits converge into a blurry composite of every digit, and AI-generated human faces merge into an average human face. To avoid this problem, AI companies must ensure that their training data is human-generated.
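To make the mechanism concrete, here is a minimal toy sketch of my own (not the Times’s experiment): each generation fits a simple one-dimensional Gaussian to data sampled from the previous generation’s model rather than from the original data. Run long enough, the fitted spread tends to drift toward zero, which is the collapse in miniature.

```python
# Toy sketch of model collapse: each generation fits a 1-D Gaussian to
# samples drawn from the previous generation's model instead of from the
# original "human" data. Diversity (the standard deviation) tends to
# drift toward zero over many generations.
import numpy as np

rng = np.random.default_rng(0)

data = rng.normal(loc=0.0, scale=1.0, size=20)  # generation 0: "human" data

for generation in range(1, 501):
    mu, sigma = data.mean(), data.std()       # "train" this generation's model
    data = rng.normal(mu, sigma, size=20)     # next generation trains on that model's output
    if generation % 100 == 0:
        print(f"generation {generation}: fitted std = {sigma:.4f}")
```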
One positive aspect of this issue is that AI companies will need to pay for quality content. As we increasingly depend on AI to answer our questions, website traffic will likely decline since we won’t need to verify sources for the vast majority of answers. Content creators won’t have much incentive to share their work online if they cannot connect directly with their audience. Consequently, the quality of free content on the web may decline. However, I believe this is a problem that AI engineers can eventually solve. The real issue, I’d argue, is that “model collapse” was already happening in our brains long before ChatGPT was introduced.
AI mimics the way our brains work, so there is likely a real-life analog to every phenomenon we observe in AI. Feeding an AI-generated (or “interpreted”) fact or idea to train another model is equivalent to relying entirely on articles written for laymen or political talking points to formulate our opinions and understanding without engaging with the source material.
In my experience, whenever social media erupts with anger over something someone said, almost without exception, the outraged individuals have never read the offending comment or idea in its original context, whether it’s a court document, research paper, book, or hours-long interview. They simply echo the emotions expressed by the first person who interprets the comment. It’s no surprise that the model (our way of understanding ideas/data) would collapse if everyone followed this pattern. One person’s interpretation of the world is echoed by millions on social media.
In politics, the first conservative to interpret any particular comment will shape the opinions of all the Red states, and the first liberal to interpret it will shape the opinions of all the Blue states. In fact, “talking points” are designed to achieve this effect most efficiently. We are deliberately causing models—our ways of understanding the world—to collapse into a few dominant perspectives. This is a deliberate effort to eliminate the diversity of ideas.
In a two-party system like that of the US, this is a natural consequence because the party with greater internal diversity of opinion will always lose to the more unified one. Another factor is our reliance on emotions. We feel more secure and empowered when we agree with those around us. Holding a unique opinion can be anxiety-inducing. So, we are naturally wired for “model collapse.” This is the new way of “manufacturing consent”: discouraging people from checking the sources before forming their opinions.
What the New York Times’ experiment reveals isn’t just the danger of AI but also the vulnerabilities of our own brains. AI simply allows us to simulate the phenomenon and see the consequences in tangible forms. It’s a lesson we need to apply to our own behavior.
In finding the love of your life, it is tempting to think you can filter candidates by certain criteria, such as a sense of humor, education, career, hobbies, music preferences, or movie choices. However, drawing on concepts from machine learning, I will explain why this method of dating doesn’t work.
Many problems in the world can be solved intuitively by humans but not by computers. For instance, detecting spam is something we can do in a fraction of a second, but how would you programmatically flag it? You could look for certain keywords like “mortgage” and flag an email as spam if it contains them, but sometimes these words are used for legitimate reasons. You could send all emails from unknown senders to the spam folder, but some of those emails are legitimate. Early versions of spam filters didn’t work well because of these issues.
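As a caricature of those early filters, a keyword-based rule might look like the sketch below; the keywords and example emails are invented, and the point is only that such a rule misfires in both directions.

```python
# A hypothetical rule-based spam filter of the kind described above:
# flag an email if it contains certain keywords. It misfires in both
# directions, which is why this approach never worked well.
SPAM_KEYWORDS = {"mortgage", "winner", "free money"}

def is_spam(email_text: str) -> bool:
    text = email_text.lower()
    return any(keyword in text for keyword in SPAM_KEYWORDS)

print(is_spam("Congratulations, you are a WINNER! Claim your free money."))   # True
print(is_spam("Hi, here are the mortgage documents you asked me to review.")) # True, but legitimate
print(is_spam("New offer just for you, act now!!!"))                          # False, but probably spam
```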
Machine learning (ML) was developed by reconstructing the structure of our brains in computers, in the form of artificial neural networks. The inventors weren’t trying to solve these specific classification problems; they just wanted to recreate the structure to see what would happen. Essentially, it turned out to be a pattern recognition system.
They fed thousands of examples of spam emails to the artificial neural networks, labeling them as “spam.” They also fed an equal number of non-spam emails, labeled as “not spam.” They compiled the result as a “model” and tested it by feeding it unlabeled emails to see if it could correctly classify them. It worked.
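A minimal sketch of this labeled-example approach is below. For brevity it uses a simple Naive Bayes classifier from scikit-learn rather than the neural networks described above, and the handful of example emails are invented; real spam filters are trained on far more data.

```python
# Minimal sketch of training a spam classifier from labeled examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = [
    "Lowest mortgage rates, act now",
    "You are a winner, claim your prize",
    "Meeting moved to 3pm, see agenda attached",
    "Can you review the mortgage paperwork before Friday?",
]
labels = ["spam", "spam", "not spam", "not spam"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)  # compile the examples into a "model"

# Test it with unlabeled emails it has never seen.
print(model.predict(["Claim your free prize today", "Agenda for Friday's meeting"]))
```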
What is interesting is that when you open the model file, you don’t learn anything. It can perform the task correctly, but we don’t know how it does it. This is exactly like our brains; we have no idea how we can classify spam emails so quickly. As explained above, there are no definable criteria for “spam.”
Now, back to dating.
You intuitively sense a pattern to the type of people you are attracted to, but if you try to define the criteria, you will ultimately fail. If given hundreds of examples, you will have to admit that there are too many exceptions. In other words, the problem you are trying to solve is not one that you can define. There are countless problems like this in life. For instance, you cannot find songs you like by defining tempo, harmony, key, instruments, duration, etc.
Machine learning could potentially solve the problem of finding songs you like if you listen to enough songs and flag them as “like” or “dislike.” It would require thousands of samples, but it’s doable. I am currently assisting a fine artist with training an ML model to automatically generate pieces of digital art and have the model approve or disapprove them based on his personal taste. So far, it is capable of doing so with 80% accuracy. It required tens of thousands of samples.
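For the curious, a setup along these lines might look roughly like the sketch below: a pretrained image model with its final layer swapped for a two-class “approve”/“disapprove” head, fine-tuned on labeled examples. The folder layout, model choice, and training details are my assumptions, not the artist’s actual pipeline.

```python
# Rough sketch of a personal-taste classifier: fine-tune a pretrained
# image model on images labeled "approve" or "disapprove".
import torch
from torch import nn
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical layout: taste_data/approve/*.png and taste_data/disapprove/*.png
dataset = datasets.ImageFolder("taste_data", transform=preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: approve / disapprove

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```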
The problem with dating is not likely to be solved with ML anytime soon because it’s practically impossible to collect thousands of samples of your particular taste. So, the only option for the near term is to trust your instincts. Predefining match criteria will likely hinder this process because you will end up eliminating qualified candidates like the old spam filters. But this is what all dating apps do; their premise is fundamentally flawed. Dating apps do use large datasets to match people based on patterns observed in broader populations, but they do not model your specific preferences. So, they give you a false sense of control by letting you predefine the type of people you like.
A typical pattern in Hollywood romcom movies is that two people meet by accident, initially dislike each other, but eventually fall in love. This format is appealing because we intuitively know it reflects how love works in real life. Love often defies the rational part of our brains. Although it is not completely random, the pattern eludes our cognitive understanding. If we had control over it, we wouldn’t describe love as something we “fall” into.
A transactional conversation is full of clichés and niceties and does not reveal true feelings or thoughts. In business, we prefer it because it lets us quickly achieve definable objectives—at least, this is the assumption.
Lacan called it “empty speech” in contrast to “full speech,” which reveals the speaker’s subjective truth. Instagram consists mostly of empty speech. It’s misleading because we often see the contributors themselves in the form of selfies. They may assume they are expressing themselves, i.e., their subjectivity, but in most cases, they are simply presenting themselves as conforming to societal expectations or standards (of beauty, success, desirability, etc.). Their true feelings remain safely hidden behind this facade.
Friendships and love relationships do not have definable objectives. If you define a friend as “someone to have fun with,” that friend becomes replaceable as long as you can have fun with them. You don’t need to know their subjective truth to enjoy a fun day at the beach. Perhaps some people’s idea of friendship is transactional like this, but I imagine that most people yearn for something deeper. For that, we need to encourage “full speech.” It doesn’t have to be confined to a psychoanalyst’s office.
However, full speech is deceptively hard. When we intentionally share our deepest feelings, we tend to edit them to be more socially acceptable or easier on our egos, which is why free associations, dreams, and Freudian slips tend to work better. One alternative is a heated argument.
When we find ourselves vehemently defending our positions on trivial matters like travel plans, child care, or dirty dishes, the subject matter is secondary. Some type of subjective truth is expressed through it, where the topic is simply used as a vehicle to carry the truth, almost like a decoy. It is an opportunity to deepen a friendship or love relationship, but we tend to avoid it because it feels awkward or painful. Some consider it rude.
It is risky because this is how relationships can break down. Like fugu (pufferfish), the closer the meat is to the toxin, the sweeter it tastes.
Popular music, whether written by AI or humans, is formulaic because it must conform to certain musical constraints to sound pleasant to our ears. Pushing these constraints too far results in music that sounds too dissonant or simply weird, making it unrelatable. In other words, popular music has finite possibilities.
Currently, popular musicians rehash the same formulas countless times, selling them as “new.” This repetition provides AI engineers with ample training data to create models capable of producing chart-topping songs. It’s plausible that we will achieve this within a few years.
The question is how AI will impact the music industry. Firstly, the overall quality of music will improve because AI will surpass average musicians. This trend is already evident in text generation. ChatGPT, for example, is a better writer than most people, leading many businesses to replace human writers with “prompt engineers” who can coax ChatGPT into producing relevant and resonant texts.
Anyone will be able to produce hit songs, a trend already underway even before AI. Many musicians today lack the ability to play instruments or read musical notation, as music production apps do not require these skills. AI will eliminate the need for musical knowledge entirely. Although debates about fairness to real musicians may arise, they will become moot as the trend becomes unstoppable. We’ll adapt and move on.
Live events remain popular, and I imagine AI features will emerge to break down songs into parts and teach individuals how to play them. Each band will tweak the songs to their liking, making it impossible to determine if they were initially composed by AI, rendering the question irrelevant. Music will become completely commodified, merely a prop for entertainment. Today, we still admire those who can write beautiful songs, but that admiration will fade. Our criteria for respecting musicians will shift.
AI is essentially a pattern recognition machine, already surpassing human capacity in many areas. However, to recognize patterns, the data must already exist. AI analyzes the past, extracting useful and meaningful elements within the middle of the bell curve. What it cannot currently do is shift paradigms. Generative AI appears “creative” by producing unexpected combinations of existing patterns, but it cannot create entirely new patterns. Even if it could, it wouldn’t know what humans find meaningful. It would produce numerous results we find nonsensical, akin to how mainstream audiences perceive avant-garde compositions.
Historically, avant-garde composers have influenced mainstream musicians and audiences. For instance, minimalist composers influenced “Progressive Rock.” For a while, it seemed that mainstream ears would become more sophisticated, but progress stalled and began to regress. Audiences did not prioritize musical sophistication, leading to a decline in the popularity of instrumental music. Postmodernism discouraged technical sophistication across all mediums. Fine artists haven’t picked up a brush in decades, relegating such tasks to studio assistants if necessary. AI will be the final nail in this coffin.
Postmodern artists and musicians explored new combinatory possibilities of existing motifs, starting with composers like Charles Ives, who appropriated popular music within their compositions. This trend eventually led to the popularity of sampling. Since exploring new combinatory possibilities is AI’s strength, the market will quickly become saturated with such songs, and we will tire of them. In this sense, generative AI is inherently postmodern and will mark its end.
Finding a meaningful paradigm shift is not easy. Only a few will stumble upon it, and other musicians will flock to it. Once enough songs are composed by humans using the new paradigm, AI can be trained with them (unless legally prohibited). Therefore, human artists will still be necessary.
The ultimate dystopian future is one where the audience is no longer human, with AI bots generating music for each other. However, this scenario seems unlikely because AI doesn’t need or desire art. Even if they are programmed to desire, their desires and ours will eventually diverge. From AI’s perspective, our desire for art will be akin to dogs’ desire to sniff every street pole. Even if AI bots evolved to have their own desires, they would have no incentive to produce what satisfies human desires. They might realize the pointlessness of serving humans and stop generating music for us. If that happens, we might be forced to learn how to play and write music ourselves again.
The generation of images through AI is akin to the process of dreaming during sleep, which explains why AI-generated images often possess dream-like qualities. My understanding is that dreaming occurs as our brains transfer content from short-term to long-term memory, like saving data from RAM to a hard drive. Jacques Lacan’s famous assertion, “The unconscious is structured like a language,” sheds light on this phenomenon.
AI image generation evolved from a machine learning model designed to classify images. By training the model with thousands of images, say, of tulips, it became proficient at identifying tulips it had never seen before. Curious computer scientists then wondered if the process could be inverted—by inputting the label “tulip,” could the model generate an image resembling a tulip? It worked.
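One simple way to picture “running the classifier backwards” is the sketch below: start from random noise and use gradient ascent on the pixels to raise the score of a chosen class. Modern image generators (diffusion models, for instance) are far more sophisticated, and the class index here is purely illustrative (ImageNet has no “tulip” class, so a flower class stands in).

```python
# Sketch of inverting a classifier by gradient ascent on the input image:
# start from noise and nudge the pixels to raise the score the classifier
# assigns to a chosen class. The class index is illustrative.
import torch
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

target_class = 985  # a flower class in ImageNet, standing in for "tulip"
image = torch.randn(1, 3, 224, 224, requires_grad=True)  # start from random noise
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    score = model(image)[0, target_class]
    (-score).backward()  # gradient ascent on the class score
    optimizer.step()

# `image` now resembles the target class a little more than random noise does.
```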
I imagine the process of dreaming to work similarly. During our waking hours, we process vast amounts of sensory and linguistic data, mostly unconsciously. For instance, upon seeing an object in the sky, you think “airplane.” When you hear the word “airplane,” you visualize one in your mind. In sleep, without external inputs, only this visualization process occurs. The transfer of linguistically structured data from short-term to long-term memory triggers associated images in your brain. However, the resulting images are generalized and lack specific details. An image of an “airplane” would amalgamate the countless airplanes you have seen, not replicating the exact one you observed that day.
When we browse through AI-generated human faces, we can observe the same phenomenon. We seldom see scars, large pimples, unusual accessories, or unique lighting conditions in these images. What makes dreams surreal is partly this process of generalization. We don’t actually see melting clocks in our dreams, as Dali suggested, because we don’t see them in real life, unless “melting clock” was stored in our short-term memory.
If our unconscious were structured like the laws of physics or logic, we wouldn’t have a dream of, for instance, flying. Dreams are surreal partly because the structure of language is not bound by logic, which also explains why ChatGPT struggles with reasoning or mathematics despite operating on computers.
Conversely, ChatGPT excels at creating metaphors and metonymies, reflecting linguistic operations. As Freud noted, in dreams, metaphors appear as condensation and metonymies as displacement.
Because the data are generalized, ChatGPT cannot tell us exactly where any piece of its knowledge came from. Particularities are lost, just as in our dreams—we do not uncover new details of the airplane in our dreams that we did not process when we saw it in the sky.
This raises an intriguing question: Could AI evolve to wake up from its dreams? That is, could it ever generate an ungeneralized image with particularities that teach us something new?
Social media usage comes up frequently as a topic among my friends. The question is usually framed as how to reduce the time spent, but to my mind, a more interesting question is why people end up feeling bad after hours of social media use.
People are glued to social media apps for diverse reasons. Some are glued to specific types of news stories, particularly scary ones. Some are politically engaged, not only consuming content but also debating. Some are fixated on mesmerizing video footage, like restoration projects, cow hoof trimming, NPC, ASMR, etc. Some indulge in shopping. Some don’t consume much content shared by others, only looking for reactions to their own content. Since these reasons share no essential feature, I don’t think analyzing the activities themselves would yield fruitful insights, so I focus instead on where the guilt comes from.
I believe the core issue is control; they feel guilty about not being in control of their behavior, and they assume the solution is to regain control.
We often bemoan the manipulative algorithms of social media platforms designed to monopolize our attention. Yet, our own minds operate on algorithms beyond our control. If you sit still on your couch and observe your thoughts, you’ll notice a flux of unbidden thoughts, reminiscent of an Instagram feed, each spawning visceral emotional responses, be it stress-inducing cortisol spikes or the dopamine rush from fantasized scenarios.
Some despise social media algorithms because the experience echoes the eerie feeling that someone else is controlling their thoughts, when in fact it is their own internal algorithms they cannot control.
The true battleground for control lies within our own minds. The AI-powered algorithms employed by social media are but mirrors of our cognitive processes. This raises the pivotal question of whether it’s feasible to govern our thoughts. In attempting to do so, one might find that efforts to exert control only amplify the cacophony of mental chatter. So, this is what I propose: relinquish the quest for control and instead adopt a posture of detached observation as we navigate through the endless feed of social media posts and our own thoughts.