Demystifying the technical insides of Cursor, Perplexity, and Lovable: why they all chose the “anti-framework” route

While the industry is still chasing an endless stream of AI frameworks, star products such as Cursor, Perplexity, and Lovable are doing the opposite: building agents “Lego-style” out of the lowest-level primitives. This article digs into the technical logic behind this “anti-framework” route: with models and paradigms iterating this fast, excessive abstraction has become a shackle on innovation and scale, while returning to simple, composable AI primitives delivers better performance and flexibility at scale and gives developers real control over their own future.

Have you ever wondered why Cursor, v0, Perplexity, Lovable, Bolt, and other top AI agent products serving millions of users share one striking trait? None of them is built on an AI framework. While the entire industry frantically chases the latest frameworks, hoping to simplify development through layer upon layer of abstraction, these genuinely successful products have taken the opposite path: they are built directly on AI primitives, solving the hardest problems in the plainest way. This is no coincidence; it points to a fundamental truth we may have been overlooking: in an era of rapidly changing AI, framework abstraction may be the biggest obstacle to innovation and scale.

Recently, I dug into a talk by Ahmad Awais, founder and CEO of Langbase. Ahmad is no ordinary engineer; his technical résumé is the stuff of legend: code contributor to NASA’s Mars helicopter mission, member of Google’s Developer Advisory Board, former vice president of developer tools, and core contributor to well-known open-source projects such as WordPress, Next.js, Node.js, and React. His open-source projects see roughly 40-50 million downloads a year, and he created the popular Shades of Purple code theme.

More importantly, he has been building with LLM technology since 2020: just a month after GPT-3’s release, Greg Brockman gave him direct access, and he was building code-generation tools a year before GitHub Copilot appeared. That depth gives his insights into AI agent construction particular weight. In this talk he not only live-built AI agents in eight different architectures from primitives, but also advanced a provocative claim: the best AI agents should be assembled from primitives like building blocks, not bound to a framework’s abstraction layers.

The Framework Trap: Why Abstraction Became a Productivity Killer

Over more than a decade in technology, Ahmad has watched countless technical cycles come and go, but the state of the AI space has made him rethink what frameworks are for. As he put it in the talk: frameworks don’t really add value; they’re bloated, slow to iterate, and full of abstractions nobody actually needs. That may sound harsh at first, but the more I thought about it, the more it hit the core pain point of AI development today.

In traditional software development, frameworks earn their keep because the tech stack is relatively stable and business patterns are relatively fixed; an abstraction layer can absorb repetitive work and raise development efficiency. None of that holds in AI. Every few weeks new large models ship, new agent architectures appear, and new capability boundaries fall. As Ahmad said, the field changes so quickly that any predefined abstraction can become obsolete almost overnight. Worse, deep dependence on a framework locks you into its particular abstraction layer. When a major breakthrough lands, you cannot adapt quickly: you must either wait for the framework’s authors to update it (usually slowly) or undertake a painful migration to another framework.

I particularly like the classic Amazon S3 example Ahmad cites. S3 can underpin the entire cloud-computing ecosystem not because it offers a sophisticated object-storage framework, but because it exposes two extremely simple primitives: upload data and download data. Those two operations support countless complex scenarios, from plain file storage to full data-lake architectures. The core of this design philosophy is to provide underlying capabilities strong enough that developers can compose them freely to fit their specific needs, rather than trying to predict every possible use case and ship an abstraction for each.
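
To make that concrete, here is a minimal sketch of those two primitives using the AWS SDK for JavaScript v3; the region, bucket, and key are placeholders:

```ts
import { S3Client, PutObjectCommand, GetObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" }); // placeholder region

// Primitive 1: upload data.
await s3.send(new PutObjectCommand({
  Bucket: "my-bucket",          // placeholder bucket
  Key: "notes/hello.txt",
  Body: "Hello, primitives!",
}));

// Primitive 2: download data.
const res = await s3.send(new GetObjectCommand({
  Bucket: "my-bucket",
  Key: "notes/hello.txt",
}));
console.log(await res.Body?.transformToString());
```

Everything from static-site hosting to data-lake ingestion is ultimately some composition of these two calls with other services.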

From a production-practice standpoint, I find the biggest problem with most current AI frameworks is that they attempt an inherently unsolvable task: providing a stable abstraction over a rapidly changing field. It is like drawing a precise topographic map of an erupting volcano; the effort is doomed from the start. A second problem is over-generalization: to cover every conceivable use case, frameworks pile on configuration options and abstraction layers until they become complex and inefficient. In a domain like AI, where performance and responsiveness are at a premium, that bloat is unacceptable.

Another sharp insight from Ahmad: most engineers are rapidly becoming AI engineers. Front-end, back-end, DevOps, and ML engineers are all learning to fold AI capabilities into their products. That shift forces a rethink of how development tools are designed. These engineers already have deep programming experience and mature mental models; what they need is not yet another set of framework concepts, but primitives they can drive with the languages and patterns they already know. As the talk demonstrated, agents built on primitives are essentially ordinary JavaScript/TypeScript code that any experienced developer can read and modify immediately.

The Power of AI Primitives: Building Intelligence Like Lego

After digging into Ahmad’s methodology, I began to re-examine the essential architecture of AI agents. However complex an agent is, its core decomposes into a handful of building blocks, much as every compound in chemistry is built from the elements of the periodic table. In his talk, Ahmad identified the key AI primitives: Memory (a long-term memory system with vector-storage capability), Thread (conversational context and state management), Tools (the ability to call external tools), Parser (multi-format data parsing), Chunker (document segmentation and preprocessing), Router (intelligent routing decisions), and Evaluator (quality evaluation and feedback).
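
To make the catalogue tangible, here is a rough TypeScript sketch of what these primitives might look like as interfaces. This is my own illustration of the concepts, not Langbase’s actual API surface:

```ts
// Hypothetical interfaces for the primitives listed above; an
// illustration of the concepts, not Langbase's actual API.
interface Memory {
  store(id: string, text: string): Promise<void>;            // embed + persist
  retrieve(query: string, topK?: number): Promise<string[]>; // vector search
}

interface Thread {
  append(role: "user" | "assistant", content: string): void;
  history(): { role: string; content: string }[];
}

interface Tool {
  name: string;
  description: string;
  run(args: Record<string, unknown>): Promise<string>;
}

interface Parser {
  parse(file: Uint8Array, mimeType: string): Promise<string>; // PDF/DOCX/HTML to text
}

interface Chunker {
  chunk(text: string, maxTokens?: number): string[];
}

interface Router {
  route(input: string): Promise<string>; // name of the agent to dispatch to
}

interface Evaluator {
  evaluate(output: string, criteria: string): Promise<{ pass: boolean; feedback: string }>;
}
```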

The beauty of these primitives lies in their composability and specialization. Each one solves a single problem and solves it extremely well, and complex features emerge from flexible combinations: the Unix philosophy of “do one thing and do it well,” reborn in the age of AI. The Memory unit never worries about parsing PDFs, the Parser unit never thinks about storing vectors; every unit has crisp responsibility boundaries, which keeps the whole system efficient and easy to reason about.

In the live demo, Ahmad typed “chat with PDF” and his CHAI system automatically identified which primitives were needed and generated the complete implementation. The process was impressive: the system determined that it needed a Memory unit to store the PDF content (with vector storage), a Parser unit to handle the PDF format, a Chunker unit to segment long documents, and an LLM unit to understand questions and generate answers. Best of all, the generated code was entirely framework-free: pure JavaScript, clear, readable, and easy to modify.

The result impressed me. Ahmad uploaded several PDFs, including his bio, talk information, and Langbase’s API documentation. The Parser converted the PDFs to text, the Chunker split the long documents into suitably sized segments, and the Memory unit vectorized and stored them, all fully automatically. When he asked “who is the founder, and what were the topics of his last three talks,” the agent found the relevant information across multiple documents and answered accurately. That cross-document integration is exactly the power of composing primitives.
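
Reduced to code, the demo’s flow is a short pipeline. The sketch below assumes the hypothetical interfaces above, plus an `llm(model, prompt)` helper standing in for any chat-completion call:

```ts
// Hypothetical completion helper; stands in for any chat-completion API.
declare function llm(model: string, prompt: string): Promise<string>;

// "Chat with PDF", assembled from the primitives sketched earlier.
async function chatWithPdf(
  pdf: Uint8Array,
  question: string,
  deps: { parser: Parser; chunker: Chunker; memory: Memory },
): Promise<string> {
  const text = await deps.parser.parse(pdf, "application/pdf"); // PDF to text
  for (const [i, piece] of deps.chunker.chunk(text).entries()) {
    await deps.memory.store(`chunk-${i}`, piece);               // embed + store
  }
  const context = await deps.memory.retrieve(question, 5);      // top-5 chunks
  return llm("any-chat-model",
    `Answer from this context only:\n${context.join("\n---\n")}\n\nQ: ${question}`);
}
```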

I particularly appreciate the transparency and controllability of this architecture. Unlike a black-box framework, a primitive-based system lets you see exactly what happens at each step and how data flows, which makes problems easy to locate and debug. Ahmad even hit a small bug during the demo and fixed it on the spot in “vibe code” mode, with no need to dig through framework source code or wait for an upstream release. That kind of immediate feedback and direct control is crucial in production.

From the perspective of architectural evolution, the primitive approach also matches how software systems naturally grow. Complex systems tend to evolve from simple components rather than being designed complex from day one. Primitives provide exactly that evolutionary path: start with the simplest combination and add units as real needs appear, with every step controllable and verifiable. This incremental approach reduces risk and keeps the system easier to maintain and scale.

Eight Architecture Patterns: Primitive Composition in Production

In his talk, Ahmad walked through eight AI agent architecture patterns, each a different composition of primitives. These patterns not only showcase how powerfully primitives combine; more importantly, they cover the vast majority of agent requirements in today’s production environments. Analyzed together, the eight patterns constitute a complete design language for AI agents, spanning everything from simple Q&A to complex reasoning.

The first is the Augmented LLM architecture, the most basic and most common pattern. It combines an LLM with primitives such as Tools, Thread, and Memory to form an agent that can call external tools, maintain conversational state, and access long-term memory. Here the LLM is no longer an isolated text generator but an agent that perceives its environment, invokes tools, and learns from memory. Ahmad stresses that the key is that every primitive is independent and replaceable; you choose the most suitable implementation for your specific needs.
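
A minimal sketch of the pattern, reusing the hypothetical interfaces and `llm` helper from above (tool dispatch is elided to keep it short):

```ts
// Augmented LLM: one model call wired to Thread, Memory, and Tools.
async function augmentedLLM(
  input: string,
  thread: Thread,
  memory: Memory,
  tools: Tool[],
): Promise<string> {
  thread.append("user", input);
  const recalled = await memory.retrieve(input, 3); // long-term memory lookup
  const catalog = tools.map(t => `${t.name}: ${t.description}`).join("\n");
  const answer = await llm("any-chat-model",
    `History: ${JSON.stringify(thread.history())}\n` +
    `Memories: ${recalled.join(" | ")}\n` +
    `Tools you may request:\n${catalog}\n` +
    `User: ${input}`);
  thread.append("assistant", answer);
  return answer;
}
```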

The second is the Prompt Chaining & Composition architecture, which handles complex multi-step tasks by wiring specialized agents in series. Ahmad demonstrated a marketing-content example with a summarization agent, a feature-extraction agent, and a copywriting agent. Each agent has a clear responsibility and runs in a fixed order, the output of one becoming the input of the next. The elegance is that each agent can use whichever model best fits its task: summarizing with Gemini (strong at understanding and generalizing), reasoning with Claude (stronger at logical analysis), and programming with GPT-4 (excellent code generation).
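
In code, the chain is just sequential awaits, each step feeding the next. A sketch with the same hypothetical `llm` helper; the model names follow the talk’s pairing and are purely illustrative:

```ts
// Prompt chaining: summarize -> extract features -> write copy,
// each step on the model best suited to it.
async function marketingCopy(productDoc: string): Promise<string> {
  const summary  = await llm("gemini", `Summarize this product doc:\n${productDoc}`);
  const features = await llm("claude", `Extract the key selling points:\n${summary}`);
  return llm("gpt-4", `Write marketing copy highlighting:\n${features}`);
}
```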

The third is the Agent Router architecture, personally the pattern I find most interesting. A routing agent analyzes user input and decides which specialized execution agent to invoke. Ahmad built a system with three specialists: a summarization agent (Gemini), a reasoning agent (DeepSeek Llama 70B), and a coding agent (Claude Sonnet). When a user asked “why are days shorter in winter,” the router correctly recognized it as a scientific-reasoning question and sent the task to the reasoning agent. The value of this architecture is that it automatically picks the optimal processing path for each task, and every specialist can be optimized and upgraded independently.
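
A sketch of the routing step, again with the hypothetical `llm` helper; the classifier prompt and model identifiers are illustrative:

```ts
// Agent router: classify the request, then dispatch to a specialist.
const specialists: Record<string, (q: string) => Promise<string>> = {
  summarize: q => llm("gemini", `Summarize: ${q}`),
  reason:    q => llm("deepseek-llama-70b", `Reason step by step: ${q}`),
  code:      q => llm("claude-sonnet", `Write code for: ${q}`),
};

async function routeAndRun(question: string): Promise<string> {
  const choice = await llm("router-model",
    `Reply with exactly one word: summarize, reason, or code.\nTask: ${question}`);
  const specialist = specialists[choice.trim()] ?? specialists.reason; // fallback
  return specialist(question);
}
```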

The fourth is the Parallel Agents architecture, which uses the concurrency features of modern languages to run several agents on different aspects of the same input at once; in JavaScript this is a one-liner with Promise.all. For a piece of customer feedback, say, you can run a sentiment-analysis agent, a key-information-extraction agent, and an issue-classification agent simultaneously, then merge the results into one comprehensive report. Parallelism not only cuts latency dramatically but also yields more complete insight by examining the problem from several angles at the same time.
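
The whole pattern fits in a few lines. A sketch of the customer-feedback example with the same hypothetical helper and placeholder model names:

```ts
// Parallel agents: three independent analyses fanned out with Promise.all.
async function analyzeFeedback(feedback: string) {
  const [sentiment, keyPoints, category] = await Promise.all([
    llm("model-a", `Sentiment (positive/neutral/negative): ${feedback}`),
    llm("model-b", `List the key points in: ${feedback}`),
    llm("model-c", `Classify the issue type of: ${feedback}`),
  ]);
  return { sentiment, keyPoints, category }; // merged report
}
```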

The fifth is the Orchestrator-Worker architecture, to my mind the most innovative of the eight, because it mimics how human teams collaborate. An orchestrator agent analyzes a complex task and decomposes it into subtasks, assigns them to multiple workers running in parallel, and a synthesis agent merges all the results into the final output. Ahmad’s blog-writing example is particularly neat: the orchestrator breaks “write a blog about the benefits of remote work” into five subtasks (an introduction, a productivity section, a work-life-balance section, an environmental-impact section, and a conclusion), five worker agents write their parts in parallel, and the synthesis agent stitches everything into one coherent article. The power of this architecture is that it can handle tasks of almost arbitrary complexity, as long as they decompose cleanly.
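
A sketch of the three phases (plan, parallel workers, synthesis), with illustrative model names:

```ts
// Orchestrator-worker: decompose, draft in parallel, synthesize.
async function writeBlog(topic: string): Promise<string> {
  // 1. The orchestrator plans the subtasks, one per line.
  const plan = await llm("orchestrator-model",
    `Break "write a blog about ${topic}" into section subtasks, one per line.`);
  const subtasks = plan.split("\n").filter(Boolean);

  // 2. Workers draft every section concurrently.
  const sections = await Promise.all(
    subtasks.map(t => llm("worker-model", `Write this blog section: ${t}`)),
  );

  // 3. A synthesis agent merges the drafts into one coherent article.
  return llm("synthesizer-model",
    `Merge these sections into a coherent post:\n\n${sections.join("\n\n")}`);
}
```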

The sixth is the Evaluator-Optimizer architecture, which raises output quality through continuous evaluation, a bit like a simplified RLHF (reinforcement learning from human feedback) loop. A generator agent produces the initial content, an evaluator agent (usually the strongest available LLM) judges the result, and if unsatisfied it returns concrete improvement suggestions that the generator applies. Ahmad’s eco-friendly water bottle example was compelling: the first product description missed the target audience of “environmentally conscious millennials,” the evaluator returned very specific feedback on which features to emphasize and what tone to use, and the second draft matched the audience’s needs and preferences far better.
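
The loop itself is simple. A sketch with a bounded number of rounds; the PASS convention is my own simplification:

```ts
// Evaluator-optimizer: generate, critique, regenerate until the
// evaluator is satisfied or a retry limit is reached.
async function generateWithFeedback(brief: string, maxRounds = 3): Promise<string> {
  let draft = await llm("generator-model", brief);
  for (let round = 0; round < maxRounds; round++) {
    const verdict = await llm("evaluator-model",
      `Answer PASS if this draft satisfies the brief, else list concrete fixes.\n` +
      `Brief: ${brief}\nDraft: ${draft}`);
    if (verdict.trim().startsWith("PASS")) break;
    draft = await llm("generator-model",
      `Rewrite the draft applying this feedback:\n${verdict}\n\nDraft:\n${draft}`);
  }
  return draft;
}
```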

The seventh is the Tool Calling architecture, which lets agents integrate seamlessly with external systems to extend their capabilities. The eighth is the Memory architecture, the document-Q&A pattern we saw at the start of the talk.
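
A sketch of the tool-calling round trip, using the hypothetical `Tool` interface from earlier. Real SDKs expose structured function-calling APIs; plain JSON is used here only to keep the idea visible:

```ts
// Tool calling: the model picks a tool and arguments, the runtime
// executes it, and the result is fed back for a final answer.
async function toolCallingAgent(question: string, tools: Tool[]): Promise<string> {
  const catalog = tools.map(t => `${t.name}: ${t.description}`).join("\n");
  const decision = await llm("any-chat-model",
    `Tools:\n${catalog}\nReturn JSON {"tool": "...", "args": {...}} for: ${question}`);
  try {
    const { tool, args } = JSON.parse(decision);
    const chosen = tools.find(t => t.name === tool);
    if (chosen) {
      const result = await chosen.run(args);
      return llm("any-chat-model",
        `Question: ${question}\nTool result: ${result}\nFinal answer:`);
    }
  } catch {
    // Model declined to pick a tool or returned malformed JSON.
  }
  return llm("any-chat-model", question); // answer directly, no tool
}
```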

What impressed me most was that Ahmad built several non-trivial real-world applications live with CHAI. He built a Perplexity-style deep research agent whose whole pipeline (analyze the query, run web searches with the Exa search tool, synthesize results, generate the answer) was identified automatically, all in pure JavaScript with no framework dependencies. He also showed a receipt checker that, on discovering there was no off-the-shelf OCR primitive, found and integrated Mistral’s OCR API, a nice demonstration of the system’s extensibility. And there was an image-analysis agent that reads people’s expressions and emotions: Ahmad uploaded a photo of himself, and the agent accurately described his “slightly raised eyebrows and a slightly skeptical, curious expression.”
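
For a feel of the research agent’s shape, here is a compressed sketch; `webSearch` is a hypothetical stand-in for a search tool such as the Exa integration used in the demo:

```ts
// Deep research, compressed: plan queries, search, synthesize.
declare function webSearch(query: string): Promise<string[]>; // stand-in for Exa

async function deepResearch(topic: string): Promise<string> {
  const plan = await llm("any-chat-model",
    `List three web search queries for researching: ${topic}`);
  const queries = plan.split("\n").filter(Boolean).slice(0, 3);
  const hits = (await Promise.all(queries.map(webSearch))).flat();
  return llm("any-chat-model",
    `Write a sourced research summary of "${topic}" from:\n${hits.join("\n")}`);
}
```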

Why Primitives Are the Inevitable Choice for the Future

Having absorbed Ahmad’s philosophy and practice, I started to consider, from a wider vantage point, what this primitive-based approach means for the AI industry as a whole. We seem to be living through a shift comparable to the move from assembly language to high-level languages, except that this time, interestingly, the direction is reversed: we are retreating from over-abstraction toward expressions that are closer to the essentials and more controllable.

Seen as a technology trend, the primitive approach follows the natural evolutionary law of software systems. In the history of computing, the technologies that endure tend to obey the “simple but powerful” design principle. Unix pipes and tools, the HTTP protocol, and the SQL language have survived and kept evolving because each provides simple yet powerful primitives that developers can combine flexibly to solve complex problems. AI primitives continue that tradition, offering a stable foundation for a rapidly changing AI landscape.

From a developer-experience perspective, primitives sharply reduce the cognitive burden of AI development. Ahmad made an important point in his talk: as most engineers transition into AI engineers, what they need is not another elaborate conceptual system but the ability to build AI applications with familiar tools and mental models. Primitives deliver exactly that. A developer with JavaScript experience can immediately understand a primitive-based agent, because it is essentially ordinary JavaScript with a few special API calls. That continuity matters enormously for the spread and adoption of the technology.

On performance and scalability, primitives also have clear advantages. Each primitive can be optimized independently: the Memory unit can concentrate on vector-search performance, the Parser unit on throughput across document formats. That specialized division of labor lets the whole system reach a higher performance ceiling. Meanwhile, the stateless design of primitives means they naturally support horizontal scaling and fit serverless architectures. Ahmad emphasized this in the talk: agents built on Langbase primitives scale automatically from a handful of requests to millions.

On business agility, primitives let companies respond faster to market changes and user needs. When a new LLM ships or a new AI capability emerges, you can integrate it without refactoring the whole system. If a better document-parsing technology appears, for example, you simply swap the Parser primitive and nothing else is touched. This modularity dramatically lowers the risk and cost of technology iteration.

From an ecosystem perspective, primitives promote genuine standardization. Unlike frameworks, which try to standardize the abstraction layer (and usually fragment instead), primitives standardize capability interfaces. That lets primitives from different vendors substitute for and compose with one another, producing an open, competitive, collaborative ecosystem. The Model Context Protocol (MCP) Ahmad mentioned is one example of this standardization effort.

I especially agree with Ahmad’s judgment about the road ahead: as LLMs get smarter inside agent workflows, applications tied to a specific framework’s abstractions will struggle to absorb new capabilities quickly, while systems built on primitives, being flexible at the base, can integrate new AI capabilities and working modes far more easily. It is like building an extensible instruction-set architecture for the AI era: upper-layer applications can keep evolving without being constrained by churn in the underlying technology.

Longer term, I expect a layered AI development ecosystem: at the bottom, high-performance, high-reliability primitive services (Langbase, the OpenAI API, the Anthropic API, and so on); in the middle, libraries of composable patterns and best practices for particular industries and scenarios; on top, end-user-facing products. Such a hierarchy preserves enough flexibility to absorb technological change while offering enough abstraction to simplify everyday development.

Ultimately, I believe the primitive approach represents a new technological philosophy: do not try to predict and control the future; build infrastructure flexible and robust enough to accommodate whatever the future brings. In an era of AI and uncertainty, that may be the paradigm we need most. As Ahmad said at the close of his talk: instead of building agents on a bloated framework, assemble your own “agent stack” from primitives, like building blocks. That is not just a technology choice; it is strategic thinking about the future.
