Understanding the AI Micro-Universe: Episode One

Hello everyone, and welcome to the very first episode in our AI series. This will be a continuing set of discussions where I share how I personally perceive the “AI micro-universe.”

My hope is that, as I explain my framework, you may find parts that resonate with you. If so, this could become our shared foundation for discussion going forward. On the other hand, if you see things differently, that’s equally valuable—I encourage you to share your perspectives, so we can refine and enrich this understanding together.

In this first episode, I want to lay out three initial judgments that shape my thinking:

  1. AI is still underestimated.
    Despite the hype, I believe today’s AI remains vastly underestimated. Its true potential has not yet been realized, and its impact on society has only just begun. The real question is: at what point in this “micro-universe” will that potential begin to unfold?

  2. Once AI potential unlocks, productivity will be redefined.
    When the window of opportunity fully opens, AI may drive a complete re-engineering of industries. If nearly every sector can be reimagined through AI, how exactly will this technology integrate into our daily lives and workflows? That is where the real transformation lies.

  3. AI’s impact on the physical economy may surpass past digital revolutions.
    Unlike the internet wave, which primarily disrupted digital industries, I believe AI will affect the real economy even more deeply. Because most of us are tied, in some way, to physical industries, the ripple effects will be widespread. The challenge—and opportunity—is to become people who use AI as a tool to solve problems, rather than people who are eventually replaced by it.


The Structure of the AI Micro-Universe

At the foundation of this micro-universe, I see three core components. Cooking makes a useful analogy for all three:

  • Data is the raw material: the ingredients.

  • Models are the recipes or cuisine styles.

  • Training methods are the chef’s technique.

Data – The Ingredients

No matter where you’re from—China, the U.S., Europe, or India—we all eat from the same broad categories: meat, vegetables, grains, dairy. Similarly, most AI systems draw on broadly available sources of data.

  1. Public data (like buying groceries at a supermarket). Most large language models are trained on open internet data—anyone can access these “ingredients.”

  2. Specialized data (like foraging wild vegetables). Collecting sensor data for autonomous driving takes effort and skill, and it carries risks.

  3. Proprietary data (like growing food in your own backyard). This is highly specific to a company’s operations, and potentially the most valuable.

But in every case, data can’t be eaten raw. It must be cleaned, labeled, filtered, and normalized before use. Just as in cooking, the quality of preparation determines whether the dish will be nourishing—or dangerous.
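
To make this “prep work” concrete, here is a minimal Python sketch of a text-cleaning step. The specific rules here (lowercasing, stripping tags, dropping URLs, filtering short records) are illustrative assumptions on my part; real pipelines are far more elaborate.

```python
import re

def clean_text(raw: str) -> str:
    """Illustrative 'prep' steps: normalize, strip markup, drop noise."""
    text = raw.lower()                        # normalize case
    text = re.sub(r"<[^>]+>", " ", text)      # strip leftover HTML tags
    text = re.sub(r"http\S+", " ", text)      # drop URLs
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return text

samples = ["<p>Fresh   data!</p> Visit http://example.com", "   MIXED Case text "]
cleaned = [clean_text(s) for s in samples]
# Filter out records too short to be useful training examples.
cleaned = [c for c in cleaned if len(c.split()) >= 2]
print(cleaned)  # ['fresh data! visit', 'mixed case text']
```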

Poor data governance can actually backfire: feeding messy data into AI produces misleading results. Conversely, unique, high-quality datasets can become a company’s strongest moat.

Models – The Recipes

Models are like cuisines—different approaches that serve different tastes and needs.

All modern AI models are branches of the same neural network family tree. For example:

  • Early models like RNNs (recurrent neural networks) and CNNs (convolutional neural networks) solved specific problems well: sequences and images, respectively.

  • The Transformer (introduced in 2017 in the paper “Attention Is All You Need”) marked a breakthrough, vastly increasing generalization and reasoning ability. Its key innovation is self-attention, sketched in code below.

Think of Transformers as the “iPhone 4 moment” of neural networks: a milestone that redefined what was possible.
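
To give a flavor of what the Transformer actually computes, here is a minimal NumPy sketch of scaled dot-product attention, its core operation. The toy sizes and random inputs are illustrative, not a real model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Every position looks at every position (including itself)."""
    d_k = Q.shape[-1]
    # Similarity of each query with each key, scaled to keep values stable.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of the values

# Toy example: a sequence of 3 tokens, each embedded in 4 dimensions.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V = x
print(out.shape)  # (3, 4): same shape, but each token now mixes in context
```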

But even within the same cuisine, chefs differ. Dishes vary depending on who prepares them. Likewise, even if two companies train on similar data, their models can turn out very different.

The current trend is toward modular, hybrid architectures: combining general-purpose models with specialized sub-models. This is like a modern restaurant that offers multiple cuisines under one roof, bringing out the strengths of each.

Training – The Chef’s Skill

Training is the most demanding stage of the process. It is:

  • Resource-intensive (massive compute and energy).

  • Talent-intensive (AI scientists, engineers, and researchers).

  • Labor-intensive (constant tuning, error analysis, diagnostics).

Training a large model is like firing ceramics in a kiln: once sealed inside, you can’t interfere until it’s done. If the result fails, you must retrace every step and try again.

This cycle is grueling but essential. The better the chef’s skill, the better the dish—the more effective the model.
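
For readers who have never watched a model “fire in the kiln,” here is a toy training loop: the same predict, measure, adjust cycle, shrunk to fitting a straight line with gradient descent. This is a didactic sketch, nothing like frontier-scale training.

```python
import numpy as np

# Toy example: fit y = 2x + 1 from noisy data. The loop below is the
# basic cycle behind all training: predict, measure error, adjust, repeat.
rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, size=100)
y = 2 * x + 1 + rng.normal(scale=0.1, size=100)  # noisy targets

w, b, lr = 0.0, 0.0, 0.1
for step in range(500):
    pred = w * x + b
    loss = np.mean((pred - y) ** 2)       # mean squared error
    grad_w = np.mean(2 * (pred - y) * x)  # d(loss)/dw
    grad_b = np.mean(2 * (pred - y))      # d(loss)/db
    w -= lr * grad_w                      # the "tuning" step
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")  # close to the true 2 and 1
```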


The Breakthrough of Vectorization

Among all methods, one stands out as a turning point: vectorization.

Vectorization is the mechanism that gives machines a “bridge” to perceive the human world.

Humans live with abstract, often subjective concepts—like “cute” or “trustworthy”—that are nearly impossible to define precisely. Machines, however, understand only numbers.

Vectorization encodes our world into a numerical “dictionary” that computers can process. Each object, trait, or relationship is represented as a vector (in machine-learning terms, an embedding) in a high-dimensional space.

  • For humans, “Wang Ziru” may mean a name, a height, and a personality trait.

  • For a machine, “Wang Ziru” becomes a series of numbers, such as [66, 78, 384].

These vectors capture both attributes and relationships. Similar concepts cluster closer together in vector space; unrelated ones drift farther apart.
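
Here is a minimal sketch of that “closer together, farther apart” idea, using made-up 3-dimensional vectors. Real embeddings run to hundreds or thousands of dimensions, but the measurement, cosine similarity, works the same way.

```python
import numpy as np

def cosine_similarity(a, b):
    """Near 1.0 = pointing the same way (related); near 0 = unrelated."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Made-up 3-dimensional vectors purely for illustration.
embeddings = {
    "cat":   np.array([0.9, 0.8, 0.1]),
    "dog":   np.array([0.8, 0.9, 0.2]),
    "stock": np.array([0.1, 0.2, 0.9]),
}

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))    # ~0.99: close in vector space
print(cosine_similarity(embeddings["cat"], embeddings["stock"]))  # ~0.30: drifting apart
```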

The finer the data is split—down to characters, pixels, or sound frames—the richer and more accurate the vector space becomes. Machines, through statistical learning, build their own internal map of our world.
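
As a tiny illustration of splitting finer or coarser, compare word-level and character-level tokenization of the same sentence. Real systems typically use subword tokenizers (for example, byte-pair encoding) as a middle ground between these two extremes.

```python
sentence = "AI models map the world into numbers."

word_tokens = sentence.split()  # coarse: 7 word tokens
char_tokens = list(sentence)    # fine: 37 character tokens
print(word_tokens)
print(char_tokens[:10])
```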

This is how AI systems, without ever truly “understanding” like humans, can nevertheless approximate meaning and operate effectively within our reality.


Closing Thoughts

In today’s episode, we established the foundation of the AI micro-universe:

  • Data as the ingredients

  • Models as the cuisines

  • Training as the chef’s skill

  • And most importantly, vectorization as the bridge connecting human cognition with machine computation.

In future episodes, we’ll build on this framework and explore how these mechanisms will transform industries, economies, and ultimately our daily lives.

Stay tuned—this is only the beginning.