Stop staring at the dialog box! The AI design “two-dimensional matrix” used by top products

Amid the rapid development of AI technology, many products try to introduce AI features, but often simply bolt on a dialog box or a generator, without any systematic design thinking. This article introduces a new AI product design framework: the “AI Product Design 2D Matrix”. With this matrix, you can systematically deconstruct successful AI features along two key dimensions (AI core capabilities and interaction patterns) and identify opportunities for innovation.

Facing the AI wave, we all feel a “gold rush” mix of excitement and anxiety. But once you calm down, you will find that many products introduce AI the way one might bolt a jet engine onto a horse-drawn carriage – it looks cool, but the experience is jarringly out of place.

We have seen too many articles on AI tools and are familiar with buzzwords such as “copilot” and “generator”. But today, I want to take you beyond these surface forms, one step deeper, to build an underlying framework for thinking systematically about AI product design.

I call this framework the “AI Product Design 2D Matrix”.

It helps us clearly deconstruct any successful AI function and systematically identify innovation opportunities in our products.

You ready? Let’s start building this powerful thinking model.

01 Step 1: Understand the two dimensions

For any AI function, we can analyze it from two dimensions:

Y-axis (vertical axis): AI Core Capabilities (What) – What exactly is the AI doing?

This is about the technical nature of AI and the core value it can provide.

X-axis (horizontal axis): Interaction mode (How) – How does AI interact with users?

This is about user experience, how AI capabilities are “packaged” and presented to users.

Successful AI products find the perfect intersection between these two dimensions.

Y-Axis In-Depth Analysis: AI Core Capabilities (What) – What exactly is AI doing?

Behind the dazzling array of AI applications, what AI can really do for us can be summarized into four basic and powerful core capabilities. They are like four “magic weapons” in the AI toolbox, determining what kind of value AI can create.

1) Understand & Synthesize

Definition in one sentence: Let AI be your “super brain” that digests massive amounts of information and distills the gold from it.

Imagine you’re dealing with a huge mountain of information – countless articles, reports, emails, meeting minutes, user comments. With manpower alone, we may only be able to wander at the foot of the mountain.

The ability to “understand and synthesize” gives AI a pair of “X-ray eyes” and a pair of “skillful hands”.

“Understanding” is the input: AI can read and comprehend this information the way a human does. It recognizes language, understands semantics, grasps context, and even senses emotion. It is no longer mere keyword matching; the material has truly been “read”.

“Synthesis” is the output: after understanding, AI can efficiently reorganize and distill the information according to your requirements.

It’s like a top consultant who can spot the core risks at a glance in a lengthy financial report, or like a senior editor who can turn cluttered interview recordings into a logical, clearly structured manuscript.

This ability mainly solves the problems of [information overload] and [cognitive efficiency].

Classic application scenarios:

  • Content Summary: Condense a 10,000-word long essay into a 300-word summary. (e.g., Kimi Smart Assistant, Quark Summary)
  • Meeting minutes: Automatically distills a 1-hour meeting recording into “core points”, “to-do items”, and “key decisions”. (e.g., Feishu Minutes, DingTalk Flash Notes)
  • Sentiment Analysis: Automatically analyzes thousands of user reviews to tell you about the product’s positive and negative sentiment distribution. (e.g., various public opinion monitoring systems)
  • Intelligent translation: Not only literal translation, but also understanding culture and context, providing more authentic translations. (e.g., DeepL, Tencent Translator)
  • Information classification and marking: Automatically label emails, news, and customer work orders as “urgent”, “complaint”, “consultation”, etc.
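The last scenario – classification and labeling – can be sketched in a few lines. This is a deliberately naive keyword-based stand-in (a real product would use an LLM or a trained classifier); the label names and keyword lists below are invented purely for illustration:

```python
# Naive sketch of "information classification and marking": route incoming
# customer tickets to labels such as "urgent", "complaint", or "consultation".
# Real systems would call a model; keyword matching only shows the task shape.

KEYWORD_LABELS = {
    "urgent":       ["immediately", "asap", "down", "outage"],
    "complaint":    ["refund", "disappointed", "broken", "terrible"],
    "consultation": ["how do i", "what is", "can you explain"],
}

def label_ticket(text: str) -> str:
    """Return the first matching label, or 'other' if nothing matches."""
    lowered = text.lower()
    for label, keywords in KEYWORD_LABELS.items():
        if any(kw in lowered for kw in keywords):
            return label
    return "other"
```

For example, `label_ticket("The service is down, fix it immediately!")` returns `"urgent"` – the same input-to-label shape an AI classifier would produce, just with far less intelligence.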

2) Generate & Create

Definition in one sentence: Let AI be your “muse” and bring your imagination to life.

If “understanding and synthesis” is dealing with the known, then “generation and creation” is exploring the unknown. This capability has allowed AI to jump from an information processor to a content creator.

It breaks down barriers to creativity, making work that would otherwise require specialized skills and a significant time investment within reach. It’s like a collection of artists, writers, programmers, and designers with infinite skills.

The input is a “seed”: an idea of yours, a one-sentence description, a sketch, or even just a style reference can serve as the “seed” of creation.

The output is a “work”: AI takes root from this seed and finally “grows” a complete work – an article, a painting, a piece of code, a song.

This ability mainly solves the problems of [creative bottleneck] and [productivity liberation].

Classic application scenarios:

  • Copywriting: Generate Xiaohongshu posts, marketing emails, advertising slogans, and even novel chapters. (e.g., Doubao, Wenxin Yiyan)
  • Image Generation: Create photo-quality, illustration-quality, and even surreal visuals based on text descriptions. (e.g., Midjourney, Wenxin Yige)
  • Code Generation: Assists programmers in writing, debugging, and even interpreting code. (e.g., GitHub Copilot, Tongyi Lingcode)
  • UI design: Enter a sentence to directly generate an interface design draft for an app or web page. (e.g., Instant AI, MasterGo AI)
  • Music/Video Creation: Generate background music or video clips that match specific moods and styles. (e.g., Suno AI, Runway)

3) Predict & Recommend

Definition in one sentence: Let AI be your “crystal ball”, gaining insight into the future from the past and pointing the way through the fog.

Human decision-making often relies on experience and intuition, but these can be biased. AI’s “prediction and recommendation” capabilities are based on massive data and complex algorithms to make probabilistic inferences. It may not be 100% accurate, but it can greatly improve the winning rate of our decisions.

It’s like a tireless data analyst, spotting hidden patterns and trends in seemingly unrelated data points.

The input is “history”: every click, every purchase, every pause you make becomes data for the AI to learn from.

The output is a “probability”: the AI tells you “what is most likely to happen next” or “what you are most likely to like”.

This capability mainly solves the problems of [decision uncertainty] and [personalized experience].

Classic application scenarios:

  • E-commerce Recommendation: The “Guess Your Likes” feature recommends products based on your shopping history. (e.g., Taobao, JD.com)
  • Content recommendations: Tailor your feed to your viewing/reading preferences. (e.g., Douyin, Toutiao, Spotify)
  • Smart navigation: Predict road conditions 15 minutes ahead and plan the fastest route for you. (e.g., Amap, Baidu Maps)
  • Financial risk control: Determine whether a transaction is fraudulent in real time.
  • Sales forecasting: Forecasting product sales in the next quarter, helping businesses manage inventory.

4) Execute & Automate

Definition in one sentence: Let AI be your “universal butler”, freeing you from repetitive and tedious tasks.

Our daily work and life are filled with repetitive, mechanical, but unavoidable tasks: sorting documents, scheduling meetings, filling out reimbursement forms, answering routine emails… These tasks consume a great deal of our energy and time.

“Execution and automation” capabilities allow AI to become a reliable digital employee to complete these tasks accurately and efficiently.

The input is a “rule” or an “instruction”: You can preset a set of rules (e.g., “If an invoice email arrives, automatically save it to the ‘Reimbursement’ folder”) or simply issue an instruction (“Schedule a meeting with Zhang San for 2 p.m. tomorrow”).

The output is “action” and “result”: AI performs a series of actions for you across different applications and platforms, and finally delivers the result to you.

This ability mainly solves the problems of [repetitive labor] and [process efficiency].
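The “preset rules” side of this can be sketched as a tiny rule engine: each rule pairs a condition with an action, and the system applies the rules to every incoming item. The `Email` type, the `move_to` action, and the rule itself are all hypothetical, written to mirror the invoice example above:

```python
# Minimal sketch of rule-driven automation: conditions decide, actions act.
# All names here are illustrative, not a real email API.
from dataclasses import dataclass

@dataclass
class Email:
    subject: str
    folder: str = "Inbox"   # every email starts in the inbox

def move_to(folder: str):
    """Build an action that files an email into the given folder."""
    def action(email: Email) -> None:
        email.folder = folder
    return action

# Rule: "If an invoice email arrives, save it to the 'Reimbursement' folder."
RULES = [
    (lambda e: "invoice" in e.subject.lower(), move_to("Reimbursement")),
]

def process(email: Email) -> Email:
    """Run every matching rule's action, then return the email."""
    for condition, action in RULES:
        if condition(email):
            action(email)
    return email
```

The user’s control lives entirely in the `RULES` list – the butler’s “background settings” – while execution itself requires zero interaction.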

Classic application scenarios:

  • Smart home: Control whole-home appliances with a single sentence, enabling scene modes such as “home” and “sleep”. (e.g., Tmall Genie, Xiao Ai)
  • Workflow automation: Automatically sync newly received customer emails to the CRM system and create tasks. (e.g., Zapier, domestic “Jijian Cloud”)
  • Intelligent customer service: Automatically answer users’ common questions, and even complete orders, refunds, and other operations. (e.g., intelligent customer service of major e-commerce and banks)
  • Schedule management: Automatically scan your emails to discover and add meeting invitations to your calendar.

X-Axis In-Depth Analysis: Interaction Patterns (How) – How does AI interact with you?

If AI’s core capability is the “horsepower” of the engine, then the interactive mode is the “cockpit” of the car – it determines whether you are driving an F1 car, a comfortable limousine, or a fully autonomous future car.

A good interaction mode can make powerful AI capabilities approachable and just right.

At present, the most mainstream and successful interaction modes can be summarized into the following four types.

1) Co-pilot

One-sentence definition: AI is the sharp-eyed, quick-handed assistant at your side, acting only when you need it.

Imagine that you are driving intently (performing your core task, such as writing, programming, or designing) while the AI in “copilot” mode sits quietly beside you, studying the map. It won’t bother you, but when you drift slightly off route, or a better option appears ahead, it whispers a reminder: “Maybe we could go this way?”

Its core philosophy is “augment, don’t replace”. It respects the user’s lead and keeps control firmly in the user’s hands.

  • Trigger mode: Usually contextual, user-initiated. For example, select a piece of text, hover over an element, or press a shortcut/enter a specific symbol (e.g., /) at a specific location.
  • Interface presentation: Extremely lightweight and minimally intrusive. It could be just a small floating icon, a line of gray suggested text, or a drop-down menu. It never interrupts your flow with a huge pop-up.
  • User Control: Always provide clear “accept,” “reject,” or “modify” options. AI is only the suggester, and the user is the decision-maker.
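The control contract in these bullets – the AI only proposes, the user disposes – can be sketched as follows. `suggest` is a stand-in for a real completion model, and the canned completion is invented for illustration:

```python
# Sketch of the copilot contract: a suggestion only lands in the document
# when the user explicitly accepts it (e.g. by pressing Tab).

def suggest(prefix: str) -> str:
    """Hypothetical completion model: returns 'ghost text' for a prefix."""
    canned = {"def add(a, b):": " return a + b"}   # invented example
    return canned.get(prefix, "")

def apply_suggestion(prefix: str, user_accepts: bool) -> str:
    """The AI is the suggester; the user is the decision-maker."""
    ghost = suggest(prefix)
    return prefix + ghost if (user_accepts and ghost) else prefix
```

Rejecting (or ignoring) the suggestion leaves the user’s text untouched – the essential property that makes copilot mode feel non-intrusive.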

This mode is best suited for scenarios where it is deeply embedded in existing workflows, aiming to improve the efficiency of professionals.

Classic Scene:

  • When writing code in the IDE, GitHub Copilot completes the code with gray text, which can be accepted by pressing Tab.
  • When writing a document in Notion, select a passage of text and an AI menu pops up to help you summarize or polish it.
  • When processing data in Excel, use natural language to command it to “help me mark the negative numbers in this column” and it will do it for you immediately.

2) Automated butler (Butler)

Definition in one sentence: AI is your “invisible”, attentive housekeeper; before you even notice, it has already taken care of everything for you.

A top English butler will never show off how much work he has done in front of his master. He is always out of sight and keeps the estate in order. All you enjoy is a perfect, comfortable, and productive environment.

So it is with AI in the “automated butler” mode. Its highest form is to work silently, like rain nourishing the earth without a sound.

  • Trigger method: Almost fully automatic, based on preset rules or continuous background learning. The user provides almost no input and performs zero operations.
  • Interface presentation: Usually “invisible” – the user does not perceive the interaction process, only the final result. Its presence is surfaced only when necessary, through notifications or reports.
  • User control: Control lives in the “background settings”. Users can turn automations on or off, or adjust the rules, but do not intervene at the moment a task executes.
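The “control lives in the background settings” idea can be sketched as a settings-gated scheduler: automations run on their own schedule, but only the ones the user has left enabled ever fire. The setting names below are invented for illustration:

```python
# Sketch of butler-mode control: the user never touches execution, only the
# on/off switches in a settings panel. Names are illustrative.

SETTINGS = {
    "auto_weekly_playlist": True,    # e.g. a Discover Weekly-style feature
    "auto_photo_tagging":   False,   # user has opted out of this one
}

def run_automations(settings: dict) -> list:
    """Return the names of the background automations that run this cycle."""
    return [name for name, enabled in settings.items() if enabled]
```

Flipping a switch in `SETTINGS` is the user’s entire interaction surface; everything else happens invisibly.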

This mode is best suited for [high-frequency, repetitive, and well-defined rules] tasks, aiming to completely liberate users.

Classic Scene:

  • Spotify’s Discover Weekly playlist automatically appears in your library every Monday, without any action on your part.
  • Smart home “home mode”: when you open the door, the lights, air conditioning, and music are automatically ready for you.
  • Google Photos: it automatically recognizes faces, places, and things in the background, and you simply enjoy the convenience of search.

3) Generative Creator

Definition in one sentence: AI is your “Ma Liang’s magic brush” (the magic paintbrush of Chinese folklore): you give it a theme, and it paints the complete picture for you.

In this mode, AI is no longer an assistant, but the main force of creation. You play the role of a director or art director who is responsible for coming up with ideas, giving direction, and evaluating and adjusting the results.

At its core is a creative loop of [instruction → generation → iteration].

  • Trigger method: The user initiates a task through a clear input box (Prompt Box) and enters detailed instructions.
  • Interface presentation: The center of the interface is usually the input and output area. It clearly presents the generated results (usually multiple options) and provides convenient iteration tools (such as “regenerate”, “create variations from this image”, “enhance to HD”, etc.).
  • User Control: Control is reflected in the accuracy of the prompt and the filtering and iteration of the generated results. The user’s ability lies in how to “harness” AI with precision in language.
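The [instruction → generation → iteration] loop can be sketched like this, with `generate` standing in for a real image or text model that returns several candidates per prompt:

```python
# Sketch of the creator loop: prompt in, multiple candidates out, then the
# user refines the prompt and regenerates. `generate` is a placeholder for
# a real model call; its outputs here are just labeled strings.

def generate(prompt: str, n: int = 4) -> list:
    """Hypothetical model call: return n candidate outputs for the prompt."""
    return [f"{prompt} (candidate {i + 1})" for i in range(n)]

def iterate(prompt: str, refinement: str) -> list:
    """One turn of the loop: tighten the prompt and regenerate."""
    return generate(f"{prompt}, {refinement}")
```

The user’s skill shows up entirely in the prompt string: each refinement (“cyberpunk style”, “warmer light”) is another pass through the same loop.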

This model is best suited for the “zero-to-one open creation” scenario, aiming to spark inspiration and large-scale content production.

Classic Scene:

  • Midjourney / Wenxin Yige: You type “a cat in a spacesuit playing guitar on the moon, cyberpunk style”, and it generates the image for you.
  • Suno AI: You input lyrics and a musical style, and it composes and sings for you.
  • Instant AI: You type “design a meditation app landing page with dark mode”, and it generates a UI design for you.

4) Conversational kernel

One sentence definition: AI is your “omniscient” partner, and you explore and solve problems together through questions and answers.

This is the closest pattern to the natural way humans communicate. At the heart of the whole product is a conversational interface. You don’t need to learn complex menus and buttons, just ask your questions and needs in natural language as if you were chatting with a friend.

Its core is [natural language understanding] and [multi-turn contextual memory].

  • Trigger method: Simple and direct, that is, typing or speaking in an input box.
  • Interface presentation: Chat flow similar to WeChat or iMessage. To enhance the experience, answers are streamed (typewriter effects) and embedded with rich UI elements (e.g., code blocks, tables, images, buttons) instead of just plain text.
  • User control: Control lies in the “art of asking questions”. Through continued questioning and clarification, the user guides the AI toward more accurate, deeper answers.
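The “multi-turn contextual memory” at the heart of this mode can be sketched as a running history that is re-sent on every turn. The `Conversation` class and the placeholder model below are illustrative, not any real chat API:

```python
# Sketch of multi-turn context: every turn is appended to a history, and
# the FULL history is what the model sees on the next turn. The default
# model is a placeholder that just reports how much context it received.

class Conversation:
    def __init__(self):
        self.history = []                      # list of (role, text) turns

    def ask(self, question: str, model=None) -> str:
        self.history.append(("user", question))
        model = model or (lambda h: f"[answer drawing on {len(h)} turns]")
        answer = model(self.history)           # whole history, not just the question
        self.history.append(("assistant", answer))
        return answer
```

Because each `ask` sends the whole history, a follow-up like “give an example of that” can be resolved against earlier turns – the property that makes conversation feel continuous rather than stateless.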

This mode is best suited for tasks such as open-ended exploration, information query, and multi-step coordination.

Classic Scene:

  • ChatGPT / Wenxin Yiyan for knowledge Q&A, brainstorming, and copywriting.
  • Perplexity AI conducts in-depth research and exploration with cited sources.
  • Alipay intelligent customer service helps you check bills and do business through conversation.

02 Step 2: Build an AI design matrix

Building an AI Design Matrix: Cases and Insights

Now, here comes the best part. We combine the two dimensions to build a 4×4 matrix. You will find that those top AI products can find their precise positioning in this matrix.

In-depth case analysis: Use matrix thinking to understand AI products

This matrix is not just a classification table; it is more like a dissection table. Let’s pick up the scalpel and, through comparative analysis, see how the same AI capability (Y-axis) takes on completely different product forms under different interaction modes (X-axis).

Analysis 1: The quartet of “understanding and synthesis” ability

This is the most widely used capability of AI, and there are classic cases in almost every grid.

1) Copilot x Understanding Synthesis: Notion

Select a passage of text to summarize or translate it with one click.

2) Butler x Understanding Synthesis: Quark App

When you’re just browsing for information, you don’t want to actively click any buttons. This is where the “butler”-mode Quark AI summary shows its value: before you even open a search result, it proactively presents you with a digest, serving you and improving your efficiency imperceptibly.

3) Creator x Understanding Synthesis: Kimi Smart Assistant

When you have a clear, ambitious task – like digesting a hundred pages of professional material and applying it to a project – you choose Kimi in “creator” mode. You “feed” it the documents, give a clear prompt, and it “creates” a new, structured summary to help you learn quickly. It is a goal-oriented, highly engaged interactive process.

4) Conversational x Understanding Synthesis: Baidu Wenku AI

When you want not only to “know” the content but to “understand” it, the “conversational” Baidu Wenku AI makes that possible. You can upload a paper and then ask multiple rounds of questions about one of its points, as if consulting a teacher. It’s an exploratory, drill-down interaction that turns static knowledge into a dynamic conversation.

Analysis 2: The diverse gameplay of the “generation and creation” ability

1) Professional Creation: WPS AI (Copilot x Generative Creation)

In professional office scenarios, creation is often continuous. WPS AI’s interaction is accordingly restrained: you are writing a document, you type /ai, and it can continue your text, polish it, or generate a PPT from an outline. It’s a “super plugin” inside your writing flow, blending in seamlessly, enhancing rather than interrupting.

2) Mass Entertainment: Wenxin Yige (Creator x Generative Creation)

For painting – a creation from zero to one – the “creator” mode is the better fit. Wenxin Yige provides a canvas where you paint with language, refining the result by adjusting the prompt like an alchemist tending the furnace. This is an open, high-freedom creative playground.

3) Daily Communication: Doubao (Conversational x Generative Creation)

When the creative need leans toward everyday communication – say, writing a Xiaohongshu post – the “conversational” Doubao feels especially approachable. You can simply tell it: “Help me write a store-visit note, in a playful style.” This anthropomorphic, casual interaction lowers the psychological threshold for creating.

Analysis 3: “Prediction and recommendation” – working silently in the background

The cases for this capability perfectly illustrate the nuanced difference between “butler” and “co-pilot”.

1) The ultimate butler: Taobao/Douyin (butler x prediction recommendation)

You open the app, and the recommendation system – the “butler” – has already laid everything out for you. It makes predictions from massive data, so you need do nothing but enjoy an immersive, personalized content experience. This is a fully AI-led model designed to maximize user engagement.

2) Timely co-pilot: Amap (co-pilot x prediction recommendation)

While you are driving, the AI predicts congestion ahead, and Amap pops up a small prompt: “Found a faster route, saving 10 minutes – switch?” It offers recommendations at key junctures, but the final decision stays in your hands. The goal is to help users make better decisions.

Analysis 4: “Execution and automation” capabilities, from virtual to reality

1) Hands-free butler: Tmall Genie (butler x execution automation)

“I’m going home.” With that single sentence, the lights, curtains, and music all switch on automatically. This is the typical “butler” mode: it executes a full preset sequence of actions in the background to give you a seamless smart-living experience.

2) Co-pilot with precise efficiency: Jianying (co-pilot x execution automation)

When editing a video, you click “Smart Subtitles”, and the AI automatically completes a series of tedious operations for you: recognizing speech, generating subtitles, and aligning them to the timeline. It does not replace your creative decisions; it takes over the most mechanical, time-consuming part of the process.

3) Creator for developers: Coze (Creator x Execution Automation)

Going further, Coze lets ordinary people “create” AI that can execute tasks. You describe in natural language what you want the AI bot to do (e.g., “Build me a bot that checks the weather and sends it to a Feishu group”), and Coze automatically orchestrates the workflow and API calls for you. This is meta-automation: “automation that creates automation tools”.

Through this matrix, we can clearly see:

Diagonal trend: the “understanding” capability pairs naturally with the “copilot” and “butler” modes, while the “creation” capability aligns with the “creator” and “conversational” modes.

Innovation blank areas: blank cells in the matrix, or cells with few cases, are often potential innovation opportunities. For example, is “generation and creation” possible in the “automated butler” mode? (Perhaps in the future, AI could automatically draft meeting minutes for you based on your calendar and email.)

03 Action Guide for Product Designers

This two-dimensional matrix is not only an analysis tool, but also an innovative map. When you plan AI features for your products, you can use it like this:

Step 1: Locate Core Competencies (Y-Axis)

Think about it: in your product’s scenario, what problem do users most need AI to solve? Is it information overload (needing understanding and synthesis), or a creative bottleneck (needing generation and creation)?

Step 2: Explore the Interaction Mode (X-axis)

Once the core capability is determined, walk through the four interaction modes and consider which one best fits your users’ habits and your product’s tone.

  • “Can I seamlessly integrate this capability into my existing processes using the ‘copilot’ mode?”
  • “Can this function be made into an ‘automated butler’ to completely free users’ hands?”

Step 3: Find the best intersection

Find the cell in the matrix that creates the most value with the best user experience – that is where your AI feature should land.

Forget about those scattered AI features. From now on, use the systematic thinking of the “two-dimensional matrix” to build a truly elegant, effective, and deeply rooted AI product experience.
