In today's era of rapidly advancing AI, we are often astonished by its creativity. But where does that creativity actually come from? This article explores the question from a new angle, arguing that AI creativity stems not from a "perfect" knowledge system but from "imperfect" design constraints.
We are living in an AI-powered "Renaissance 2.0". From the Ghibli-style image craze sparked by ChatGPT to the viral short videos made with Veo 3, AI seems to possess creativity that rivals or even surpasses our own.
We have long assumed that this ability awakens through larger datasets, more sophisticated algorithms, and an ever more faithful imitation of the human world. We firmly believed that the road to greater creativity was to "feed" a "god of innovation" with endless inspiration drawn from massive amounts of data.
But what if the truth is the opposite? What if AI's creativity stems not from its "omniscience" but from its "half-knowledge"? What if the very "technical flaws" we have been trying to fix are the secret engine that ignites its creative spark?
Recently, a striking study covered by Quanta Magazine turned my understanding upside down. Research by Stanford University's Mason Kamb and Surya Ganguli shows that AI creativity is not some inscrutable "emergent intelligence" but an unexpected yet inevitable consequence of "imperfect" design.
Lifting the veil on creativity: the so-called "flow of inspiration" is a beautiful misunderstanding
For a long time, when we saw AI generate an image of "an astronaut riding a horse in a Baroque palace", we tended to assume the AI "understood" astronauts, horses, and the Baroque style, and combined them creatively the way a human artist would. It is a comforting anthropomorphic fantasy, but the research shows it is a beautiful misunderstanding.
The truth is that generative AI, as represented by diffusion models, does not derive its creativity from any "higher-order understanding" of concepts, but from two seemingly "flawed" low-level constraints in its architecture. This "ignorance" of the big picture frees AI from a classic shackle of human thinking: "functional fixedness".
Functional fixedness is a cognitive bias that humans have but AI does not: once people learn the fixed role and function of an object, they habitually assume it exists only for that purpose, overlooking its other possible uses.
In the famous "candle problem" experiment of psychologist Karl Duncker, people struggle to think of turning a box full of thumbtacks into a candle holder, because our complete knowledge of the "box" (it is a container) limits our imagination. AI has no such "curse of knowledge": it does not "understand" the box, it only "sees" local properties such as a surface that can support an object, and this very "ignorance" enables its creativity.
AI is not an omniscient painter; it is more like a highly skilled mosaic artist who cannot see the whole mural, yet with a limited supply of colored tiles and a strict set of assembly rules can create stunning new patterns.
The “golden shackles” of AI creativity: two fundamental principles
So what are these two "golden shackles" that, once placed on AI, make it dance all the more beautifully?
The first shackle: Locality. When AI models process information, they do not "see the big picture" the way we do; they can only attend to a very small patch of the image at a time. They are like an observer viewing the world through a keyhole, with an extremely limited field of vision.
The AI does not know what a complete cat looks like, but it knows local features such as "the texture of cat fur", "the sharp outline of a cat's ear", and "the glint in a cat's eye". This "visual field defect" means that when generating an image it cannot simply copy an entire cat from memory; it must reassemble the countless "local fragments" it has learned.
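The "keyhole" idea can be made concrete with a toy sketch (my own illustration, not code from the study): a model with locality never receives the whole image at once, only a stream of small overlapping patches.

```python
import numpy as np

def extract_patches(image, patch_size=3):
    """Slide a patch_size x patch_size window over the image and collect
    every local view. Each patch is all the 'keyhole' observer ever sees."""
    h, w = image.shape
    patches = []
    for i in range(h - patch_size + 1):
        for j in range(w - patch_size + 1):
            patches.append(image[i:i + patch_size, j:j + patch_size])
    return np.stack(patches)

image = np.arange(36).reshape(6, 6)   # stand-in for a 6x6 grayscale image
patches = extract_patches(image)
print(patches.shape)                  # (16, 3, 3): 16 local 3x3 views
```

No single patch contains the whole image, yet together the patches cover it completely; the model's "knowledge" lives entirely at this fragment level.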
The second shackle: Translational Equivariance. It sounds technical, but the principle is intuitive: it is an iron law that guarantees "structural consistency". Simply put, if the model learns the texture of a "brick wall" in one local patch, it applies exactly the same rule and structure when it needs to draw a brick wall elsewhere in the image. This keeps the AI-generated world from collapsing into chaos. The idea resonates with the "world models" championed by AI pioneer and Turing Award winner Yann LeCun: both aim to have AI learn the predictable, generalizable underlying laws of the world rather than memorize endless surface appearances. It is this adherence to underlying rules that makes AI's "collage" look real and credible.
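Translational equivariance can be demonstrated in a few lines (again a toy sketch of my own, not the study's code). A circular convolution applies one small local rule at every position, so shifting the input and then applying the rule gives the same result as applying the rule and then shifting the output.

```python
import numpy as np

def circular_conv(image, kernel):
    """Apply `kernel` at every pixel with wrap-around boundaries:
    one local rule, reused identically everywhere."""
    out = np.zeros(image.shape)
    for di in range(kernel.shape[0]):
        for dj in range(kernel.shape[1]):
            out += kernel[di, dj] * np.roll(image, (-di, -dj), axis=(0, 1))
    return out

rng = np.random.default_rng(42)
image = rng.random((8, 8))
kernel = rng.random((3, 3))

# Shift the input one pixel, then apply the rule ...
shift_then_rule = circular_conv(np.roll(image, 1, axis=1), kernel)
# ... versus apply the rule, then shift the output.
rule_then_shift = np.roll(circular_conv(image, kernel), 1, axis=1)
print(np.allclose(shift_then_rule, rule_then_shift))  # True
```

The two orders of operations agree exactly: a brick wall drawn here obeys the same rule as a brick wall drawn there.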
When "locality" shatters the world into fragments with infinite possibilities for recombination, and "translational equivariance" acts like an invisible thread stitching those fragments together in a harmonious, coherent, logical way, the miracle of creativity is born.
Tellingly, the "Equivariant Local Score (ELS) machine" developed by AI researchers, a simplified mathematical model that optimizes for only these two core principles, can closely reproduce the output of complex diffusion models. This proves once again that restraint, not freedom, is the true source of AI creativity.
Turning “limitations” into “catalysts”: 3 ways to systematically improve AI innovation
Once we understand the fundamental principles of AI creativity, the direction for improving AI's innovative capability becomes clear. Instead of blindly scaling up models and piling on data, we can act like skilled engineers, using "designed constraints" to actively guide and stimulate AI's creative potential. This philosophy of "embracing limitations" has long been common in the history of human innovation. Steve Jobs lived by "Simplicity is the ultimate sophistication", and his extreme constraint of keeping only a single Home button on the iPhone was precisely what made it a revolutionary product for a generation.
Similarly, in the world of AI, we can turn limitations into catalysts for innovation by:
Method 1: Design "imperfect" architectures. In the future, the focus of AI model design may no longer be simply "bigger and stronger" but the strategic construction of architectures with specific "creative flaws". We could design models whose "locality" operates at different scales along different dimensions, or introduce richer "equivariance" rules (such as rotation or scaling), much like giving LEGO players bricks of different shapes and functions so they can build more imaginative pieces.
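As a hypothetical illustration of "richer equivariance rules" (my own sketch, not a method from the study): if the local rule is itself symmetric under 90-degree rotation, the whole operation becomes rotation-equivariant as well, rotating the input and applying the rule equals applying the rule and rotating the output.

```python
import numpy as np

def centered_circular_conv(image, kernel):
    """Apply an odd-sized kernel at every pixel, centered, with wrap-around."""
    k = kernel.shape[0] // 2
    out = np.zeros(image.shape)
    for di in range(-k, k + 1):
        for dj in range(-k, k + 1):
            out += kernel[di + k, dj + k] * np.roll(image, (-di, -dj), axis=(0, 1))
    return out

rng = np.random.default_rng(7)
image = rng.random((8, 8))
# A cross-shaped kernel: unchanged by a 90-degree rotation.
kernel = np.array([[0., 1., 0.],
                   [1., 4., 1.],
                   [0., 1., 0.]])

rotate_then_rule = centered_circular_conv(np.rot90(image), kernel)
rule_then_rotate = np.rot90(centered_circular_conv(image, kernel))
print(np.allclose(rotate_then_rule, rule_then_rotate))  # True
```

The extra symmetry is "baked in" through the kernel's shape rather than learned from data, one concrete way a designer can choose which constraints a model must obey.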
Method 2: Manage "information gaps" in the data. If we want AI to draw more creative chairs, perhaps we should not just show it thousands of photos of chairs. We could try an "information-poor" training approach: show the model the local textures of countless objects (wood, metal, fabric) and countless structures (four-legged, single-legged, suspended), but never a complete "chair". This would force the model to explore and combine within its "local knowledge base" as never before, and thereby "invent" chair designs we have never seen.
Method 3: Elevate prompt engineering into the "art of constraints". When we give the AI a prompt like "a butterfly made of crystal, perched on a lava flow", we are asking the model to pull off an unprecedented "creative jailbreak" under strict constraints (crystal texture + butterfly structure + lava environment). This recalls legendary musician Brian Eno's famous "Oblique Strategies" cards: when he gets stuck, he draws a card bearing an instruction such as "only one note" or "repeat an action", using artificial limitation to break mental ruts and spark new ideas.
This also gives prompts a deeper meaning: a good prompt is, in essence, a cleverly imposed "creative constraint".
What problems do we face when embracing “imperfection”?
Research into AI creativity invites us to reconsider our obsession with having AI "perfectly" reproduce the human brain, and instead to make good use of AI's "imperfection". The key to innovation in our hands is no longer endless data and compute, but the ability to design "intelligent constraints".
This also raises two deeper questions:
- Since constraints are the engine of creativity, is there an "optimal" scale of constraint? Too many constraints stifle creativity; too few lead to chaos. Where is the "golden ratio" that sparks the greatest innovation?
- If AI's creativity arises from a "cognitive paradigm" completely different from ours, was the pursuit of an artificial general intelligence (AGI) that "thinks like a human" misguided from the start?
Perhaps these will be the key research directions in the field of AI in the future.