This article examines the core decisions behind back-end technology selection when building an enterprise-level low-code platform from scratch: an in-depth comparison of the mainstream technology stacks, an architecture blueprint that supports high concurrency and high availability, and key strategies for database selection and optimization. The goal is to give low-code platform product managers and architects a detailed, practical guide to back-end construction.
1. Back-end technology stack selection
The choice of technology stack is the cornerstone of the back-end system and directly affects the platform's underlying performance, development efficiency, maintainability, and future evolution. Enterprise-level low-code platforms usually carry complex business logic, highly concurrent user access, and massive data-processing requirements, so their demands on the technology stack are particularly stringent. Below we analyze the three currently mainstream ecosystems in depth: Java/Spring Boot, Python/Django, and Node.js/Express.js.
1.1 Java and Spring Boot
Java has been around for more than 20 years, and its "Write Once, Run Anywhere" (WORA) philosophy has long been deeply rooted in enterprise application development. Its core advantages show up on several levels:
- Strict type system and object-oriented paradigm: a strongly typed language catches a large class of potential errors at compile time, significantly improving code quality and maintainability in large projects. Object-oriented features (encapsulation, inheritance, polymorphism) make the modeling and organization of complex business logic clearer and more modular, which is essential for a low-code platform back end that requires a high degree of abstraction and flexible extensibility.
- Proven performance optimization mechanisms: the Java Virtual Machine's just-in-time (JIT) compiler is key to its high performance. The JIT analyzes hotspot code at runtime and dynamically compiles it into native machine code, greatly improving execution efficiency. Efficient memory management with mature garbage collectors (e.g., G1, ZGC) lets Java applications maintain robust performance under high concurrency and large-scale data. A well-tuned Java application delivers the throughput and stability that enterprise scenarios demand.
- Spring Boot revolutionizes the development paradigm: built on the huge Spring ecosystem, Spring Boot replaces the heavyweight configuration of traditional Java EE applications with "convention over configuration" and a powerful auto-configuration mechanism. Developers only need to add the corresponding Starter dependencies, and Spring Boot automatically configures most of the infrastructure (data sources, Web MVC, security, etc.) based on the classpath, letting developers focus on core business logic and achieving a qualitative leap in development efficiency (see the minimal sketch after this list).
- Unparalleled ecosystem and community support: Java arguably has the largest and most active open-source community and commercial support network in the world. From core libraries to middleware of every kind (message queues RabbitMQ/Kafka, cache Redis, distributed coordination ZooKeeper/Nacos) to the Spring Cloud microservices suite (service discovery Eureka/Consul/Nacos, configuration center Config, gateways Zuul/Spring Cloud Gateway, circuit breakers Hystrix/Sentinel, tracing Sleuth/Zipkin), almost every enterprise-level development problem has a mature, stable, production-proven solution. The vast body of documentation, books, tutorials, Q&A communities such as Stack Overflow, and the pool of experienced developers form an invaluable resource base, clearing the way for long-term maintenance and technical upgrades.
- A natural partner for microservices architecture: Spring Boot's seamless integration with Spring Cloud keeps building and operating complex distributed microservice systems relatively manageable. Its modular design, clear interface definitions, and rich service-governance capabilities fit the needs of a large low-code platform that must be split by functional module, deployed independently, and scaled independently.
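As a minimal illustration of the "convention over configuration" idea above, the sketch below assumes only the spring-boot-starter-web dependency on the classpath, which lets Spring Boot auto-configure the embedded server and MVC stack; the class and endpoint names are purely illustrative, not part of any real platform.

```java
// Minimal Spring Boot service: with spring-boot-starter-web on the classpath,
// the embedded web server, JSON conversion, and MVC wiring are auto-configured.
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication            // enables component scanning and auto-configuration
public class FormServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(FormServiceApplication.class, args);
    }
}

@RestController
class FormDefinitionController {

    // Hypothetical endpoint returning a form definition by id; a real platform
    // would delegate to a service layer and a repository instead.
    @GetMapping("/api/forms/{id}")
    public String getFormDefinition(@PathVariable("id") long id) {
        return "{\"formId\": " + id + ", \"fields\": []}";
    }
}
```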
However, there is always another side to the coin:
- Relative cost in development efficiency: while strong typing and rigorous OO design bring maintainability, they also mean more "ceremonial" code (getters/setters, interface definitions, dependency-injection configuration, etc.), so early-stage development can feel less brisk than with dynamic languages.
- Startup time and memory usage: the JVM must load a large number of core libraries and the application's own classes, so startup is relatively slow, which can be a pain point in scenarios that demand rapid iteration and frequent deployment (e.g., serverless environments) and may call for optimizations such as GraalVM Native Image. The JVM also carries a higher baseline memory overhead than lighter-weight runtimes.
- Learning curve: becoming proficient in the Java and Spring ecosystems, especially deeply understanding JVM internals, the concurrency model, and Spring's IoC/AOP, requires a considerable learning investment.
1.2 Python and Django
With its simplicity and elegance, clear and easy-to-read syntax, and the flexibility brought by dynamic typing, Python has attracted a large number of developers, especially in fields such as data science, machine learning, and automation scripting. The Django framework is a benchmark for Python web development with its “Batteries Included” philosophy.
- Ultimate development efficiency: Python's syntax is concise, and Django ships with a powerful ORM (which greatly simplifies database operations), elegant URL routing, a robust user authentication system, an automatically generated admin back end, and more. Developers can build a fully functional back-end prototype and CRUD interfaces with minimal code, making it ideal for scenarios where requirements change rapidly and ideas need to be validated quickly.
- Powerful ORM with built-in features:Django ORM has a high degree of abstraction and can operate databases in a Pythonic way, greatly reducing the need for handwritten SQL. The built-in admin site is an almost zero-cost perk for the development of back-office management functions on low-code platforms. Its form processing, caching, internationalization and other modules are also well-designed, significantly improving development speed.
- A bridge between data science and AI:Python’s dominance in data science (NumPy, Pandas), machine learning (Scikit-learn, TensorFlow, PyTorch), and more is indisputable. If the target scenarios of low-code platforms involve data analysis, predictive model integration, AI-assisted development, etc., choosing Python as the implementation language for backends or partial services can seamlessly leverage these powerful libraries and reduce integration complexity.
- Active and Diverse Community:The Python community is also very large and active, with a wealth of third-party libraries (PyPI). Although it is slightly inferior to the Java ecosystem in terms of overall maturity and unity of enterprise-level middleware, it is rich in its areas of expertise (Web, data, AI).
The main challenges are:
- Performance bottlenecks: as an interpreted language, Python's raw execution speed, particularly for CPU-intensive work, is significantly lower than that of compiled or JIT-optimized languages like Java. The Global Interpreter Lock (GIL) limits parallel CPU-bound execution on multi-core machines, a fundamental weakness for highly concurrent computational workloads. I/O bottlenecks can be mitigated with asynchronous frameworks (Django Channels on ASGI, FastAPI) or C extensions, but the root cause of CPU bottlenecks is hard to eliminate.
- The double-edged sword of dynamic typing: dynamic typing brings flexibility at the expense of compile-time type safety. In large projects, without strict coding standards and adequate test coverage (e.g., type hints + mypy), maintenance costs and the likelihood of runtime type errors increase.
- A less unified microservices and high-concurrency ecosystem: solutions exist for asynchronous and distributed applications, such as Celery (distributed task queue), Dramatiq, and asyncio-based frameworks (FastAPI, Sanic), but for the full suite an enterprise-grade distributed system needs (microservice governance, service discovery, configuration center, distributed tracing), the Python ecosystem still lags Spring Cloud in maturity, cohesion, and framework integration. Building a very large, highly concurrent, distributed Python service stack therefore tends to face greater challenges and requires more in-house investment.
- Deployment and Packaging:Managing and deploying dependencies in a Python environment (e.g., virtual environment, Docker image size) can sometimes be a bit more cumbersome than a Java application packaged as a Fat Jar/War.
1.3 Node.js and Express.js
Node.js is built on Chrome's V8 JavaScript engine and uses an event-driven, non-blocking I/O model, making it excellent at highly concurrent, I/O-intensive work such as network requests, file operations, and database access. Express.js is the most popular web framework on top of it, minimal and flexible.
- Superior I/O performance and high concurrency processing:The core strength of Node.js is its Event Loop and non-blocking I/O model (implemented by the underlying libuv library). It can efficiently handle thousands of concurrent connections with a single thread (assisted by Worker Threads), making it particularly suitable for scenarios that need to handle a large number of real-time, high-frequency I/O operations, such as API gateways, real-time communication services, data stream processing, etc. In low-code platforms, Node.js has a clear advantage when handling I/O-intensive operations such as bulk form submissions, file uploads and downloads, and third-party API calls.
- A unified full-stack JavaScript experience: using JavaScript (or TypeScript) on both the front end and the back end maximizes code sharing (e.g., data-model validation, utility functions), unifies the technology stack, and reduces team learning costs and context-switching overhead. This is attractive to full-stack teams that want efficient collaboration.
- Lightweight, fast, and a rich npm ecosystem: Node.js has a lightweight runtime and fast application startup. npm is the world's largest open-source package repository, providing modules and tools for every aspect of development. Express.js itself is very lightweight and flexible, easy to extend through its middleware mechanism, and mature frameworks built on it, such as NestJS, offer more structured, enterprise-oriented solutions.
- An active community and a modern asynchronous programming model: the community is extremely active and innovative, and Promise and async/await syntax greatly improve the readability and maintainability of asynchronous code, making it easier to write efficient non-blocking code.
Its limitations cannot be ignored either:
- Disadvantages of CPU-intensive tasks:The event loop model blocks the entire event loop when encountering CPU-intensive computations such as complex business logic processing, image processing, and large-scale data encryption and decryption, causing response delays to spike for all requests. While computation can be transferred to independent threads through Worker Threads, it adds complexity and communication costs.
- Callback Hell and Asynchronous Complexity:Although async/await mitigates the problem, deeply nested asynchronous operations and error handling are still more error-prone and harder to debug than linear synchronous code. Developers need to have a deep understanding of asynchronous programming.
- Single point of failure risk and process management:A single Node.js process crash causes all connections to be lost. A robust process manager (e.g., PM2, Forever) is required to ensure high application availability and automatically restart crashed processes.
- Enterprise middleware and microservices governance: although Node.js has made progress in microservices (NestJS microservice modules, Seneca, Moleculer), service discovery (Consul, etcd clients), and API gateways (Express Gateway, the Kong Node plugin), its enterprise-grade service-governance suite is still less complete, less mature, and less deeply integrated with the framework than Java's Spring Cloud; the ecosystem remains relatively fragmented and requires more integration work.
1.4 Selection decision
After the above in-depth analysis, a comprehensive trade-off is made for the core requirements of enterprise-level low-code platforms – high performance, high stability, high scalability, complex business logic support capabilities, and long-term maintainability:
- Java + Spring Boot is often the preferred option:It provides the most comprehensive and reliable guarantee in terms of performance (especially CPU-intensive), stability, mature microservices ecosystem (Spring Cloud), strong enterprise-grade middleware support, large developer base, and massive production practice validation. Its strongly typed system and OO characteristics, while increasing the initial amount of code, are critical for the long-term maintenance and evolution of large, complex systems. The JVM’s mature optimization technology (JIT, GC) can effectively support the high concurrency and big data processing requirements. For low-code platforms that need to handle complex business processes, integrate multiple enterprise systems, and require extremely high stability and scalability, the comprehensive strength of the Java ecosystem is the best match.
- Python + Django can be used as a complement or sub-preferred, for specific scenarios:If the core positioning of low-code platforms is rapid prototyping, internal tool generation, or deep integration of data analysis/machine learning capabilities, and does not require extremely high concurrency or complex CPU calculations, the development efficiency advantages of Python + Django will be very prominent. It can be used as an implementation language for specific services in the platform (such as AI model services, data analysis backends), or as an overall technology stack in core scenarios with low performance requirements.
- Node.js + Express.js (or NestJS) for specific modules or full-stack unified scenarios:Node.js is an excellent choice for components in a platform that need to handle extremely high I/O concurrency (e.g., API gateways, file services, real-time collaboration engines) or when teams are strongly looking for full-stack JavaScript/TypeScript unification. Its lightweight, fast, and event-driven model delivers maximum performance in these scenarios. It is also a strong contender for building small and medium-sized low-code applications that are API-centric, focusing on front-end interaction and rapid iteration.
Core conclusion: for an enterprise-level low-code platform with strict requirements to support core business, the Java + Spring Boot stack is, in most cases, the most robust and sustainable choice thanks to its comprehensive strengths. Python and Node.js are better suited to specific advantage scenarios or as implementation technologies for particular services within the platform.
2. Back-end architecture design
A robust technology stack still needs careful architectural design to realize its full potential and produce a back-end system that can handle enterprise-level challenges. Microservices architecture is the mainstream paradigm for building complex, scalable applications.
2.1 Microservice architecture
The core idea of microservices is to decompose a large monolith into a set of small, loosely coupled, independently deployed services based on business capabilities or domain boundaries. Each service is built around a specific business function (e.g., user service, form design service, process engine service, rule engine service, data storage service, permission service, notification service), with its own independent process, database (following the Database per Service pattern), and business logic.
Significant advantages:
- Technical Heterogeneity:Different services can choose the most suitable technology stack according to their needs (for example, Node.js is used as an API gateway, Java is used as a core business service, and Python is used as an AI service).
- Independent development and deployment: a team can own the entire life cycle of one or a few services; development, testing, and deployment do not interfere with each other, greatly improving development efficiency and iteration speed.
- Elastic Scaling:You can scale horizontally according to the actual load of each service (such as adding more instances to a form submission service with high concurrency), which makes resource utilization more efficient and costs more controllable.
- Fault tolerance improvement:Failures of individual services (such as OOM crashes) are isolated within their boundaries, and mechanisms such as circuit breakers, degradations, etc. can prevent failure spreading and causing the entire platform to be paralyzed.
- Easy to understand and maintain:Each service codebase is relatively small and business-focused, reducing cognitive complexity and maintenance difficulty.
Challenges and Responses:Microservices also bring inherent complexity to distributed systems: inter-service communication (network latency, failures), data consistency (cross-service transactions), distributed tracing, increased test complexity, etc. Supporting infrastructure (service discovery, configuration center, API gateway, link tracing) and good DevOps practices are needed to manage these complexities. The division of service granularity (when to split and how fine) also requires rich experience and continuous evolutionary adjustments.
2.2 API Gateway
In microservices architecture, the API gateway plays a crucial role, serving as a single entry point and unified façade for all external clients (web, apps, third-party systems) to access backend services.
Core Responsibilities:
- Routing and request forwarding: route client requests precisely to the corresponding back-end microservice instance; for example, forward requests for /api/user/ to the user-service cluster (see the routing sketch after the selection notes below).
- Load balancing: integrate load-balancing strategies such as round robin, least connections, and IP hash to distribute requests across multiple healthy instances of the same service, increasing throughput and availability.
- Authentication and authorization: centralize authentication (e.g., JWT validation, OAuth 2.0) and permission checks so that only legitimate, authorized requests reach downstream services, avoiding duplicated security logic in every microservice.
- Rate limiting and circuit breaking: implement rate limiting to protect back-end services from bursts of traffic, and circuit breaking so that when a downstream service keeps failing or responding too slowly, the gateway fails fast and returns a preset (degraded) response, avoiding resource exhaustion and cascading failures.
- Request/response transformation: perform any necessary aggregation, filtering, and format conversion (such as XML<->JSON) of request parameters or responses to suit client needs.
- Logging and monitoring: centrally record access and audit logs and integrate with monitoring systems to provide observability at the system's entry point.
- Static responses/edge caching: for responses that change infrequently, cache them at the gateway layer and return them directly to reduce back-end load.
Technical selection: mature open-source options include Nginx (with Lua extensions, i.e., OpenResty), Kong (built on Nginx/OpenResty, with rich plugins and APIs), Spring Cloud Gateway (native to the Java ecosystem, deeply integrated with Spring Cloud), and Zuul (from Netflix, older). The choice should weigh performance, feature richness, extensibility (plugin mechanism), fit with the existing technology stack, and operational complexity.
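If Spring Cloud Gateway is chosen, the routing responsibility described above can be expressed as a Java route definition. The sketch below is a minimal, assumed configuration that forwards /api/user/** to a "user-service" registered in the service registry (the lb:// scheme triggers client-side load balancing); service names and paths are illustrative.

```java
import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GatewayRoutes {

    // Routes /api/user/** to the user-service cluster discovered in the registry.
    // The lb:// scheme delegates instance selection to the gateway's load balancer.
    @Bean
    public RouteLocator routes(RouteLocatorBuilder builder) {
        return builder.routes()
                .route("user-service-route", r -> r
                        .path("/api/user/**")
                        .uri("lb://user-service"))
                .route("form-service-route", r -> r
                        .path("/api/form/**")
                        .uri("lb://form-service"))
                .build();
    }
}
```

Cross-cutting concerns such as JWT validation and rate limiting would typically be added as gateway filters on top of these routes.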
2.3 Service Registration and Discovery
In a microservices environment, service instances are frequently started, stopped, and migrated (such as Kubernetes pod scheduling), and their network locations (IP:Ports) are dynamically changing. Service registration and discovery mechanisms are the critical infrastructure that keeps this dynamic system functioning properly.
Working principle:
- Service Registration:When a microservice instance is up and ready to receive requests, it proactively registers its network location information (service name, IP, port, health status, metadata, etc.) to the service registry.
- Service Discovery:When one service (service consumer) needs to call another service (service provider), it does not hardcode the other party’s address, but queries the service registry for a list of all currently available and healthy instances of the target service name.
- Client-side load balancing: the service consumer (or its client library) picks an instance from the fetched list according to a load-balancing policy such as round robin, random, or response-time weighting.
- Health checks: the registry continuously health-checks registered instances (e.g., HTTP/TCP probes). Instances that fail or stop responding are marked unhealthy or removed from the registry, ensuring consumers do not call failed instances.
Core Components:Commonly used service registries include:
- Netflix Eureka:AP system (high availability, partition tolerance) with a simple design and excellent integration with Spring Cloud.
- HashiCorp Consul:CP system (strong consistency), powerful features, built-in service discovery, health check, KV storage, multi-data center support, support DNS and HTTP interfaces.
- Alibaba Nacos: comprehensive functionality, supporting both service discovery (switchable AP/CP modes) and configuration management; very active in the Chinese ecosystem and integrates well with Spring Cloud / Dubbo.
- Apache ZooKeeper: a CP system and the standard for early distributed coordination; powerful but relatively heavyweight, with configuration management among its strengths.
Value:It realizes the decoupling of service consumers and service providers, makes the dynamic scaling and replacement of service instances transparent to the caller, and greatly improves the elasticity and maintainability of the system.
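With Spring Cloud, the consumer-side lookup described above can be done through the generic DiscoveryClient abstraction regardless of whether Eureka, Consul, or Nacos sits behind it. The sketch below (service name is an assumption for illustration) fetches healthy instances from the registry and picks one with a naive round-robin policy, which is essentially what client-side load balancers automate.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.stereotype.Component;

@Component
public class ProcessEngineClient {

    private final DiscoveryClient discoveryClient;
    private final AtomicInteger counter = new AtomicInteger();

    public ProcessEngineClient(DiscoveryClient discoveryClient) {
        this.discoveryClient = discoveryClient;
    }

    // Resolve a healthy instance of "process-engine-service" from the registry
    // and pick one with a simple round-robin policy.
    public URI resolveInstance() {
        List<ServiceInstance> instances = discoveryClient.getInstances("process-engine-service");
        if (instances.isEmpty()) {
            throw new IllegalStateException("No healthy instance of process-engine-service");
        }
        int index = Math.floorMod(counter.getAndIncrement(), instances.size());
        return instances.get(index).getUri();
    }
}
```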
2.4 Load balancing
Load balancing is the core means of improving performance, availability, and resource utilization in distributed systems. It does this by intelligently distributing workloads across multiple backend service instances.
Level:
- Global load balancing:Typically implemented at the DNS level, it directs user traffic to different geographies or data centers.
- Application-layer load balancing: works at OSI Layer 7 (HTTP/HTTPS) and understands application protocols. API gateways typically integrate an L7 load balancer, enabling smarter routing (e.g., canary releases, A/B testing) based on request content (URL path, headers, cookies).
- Transport layer load balancing:Works on OSI Layer 4 (TCP/UDP) and forwards based on IP addresses and ports. Higher performance but no awareness of app content. Such as LVS (Linux Virtual Server), F5 BIG-IP (hardware).
Algorithms:
- Round robin: distribute requests in turn; simple and fair.
- Weighted round robin: assign weights according to each server's capacity, so more capable servers receive more requests (see the sketch after this list).
- Least connections: send new requests to the server with the fewest current connections, which better reflects actual server load.
- Least response time: send requests to the server with the lowest average response time (requires monitoring support).
- IP hash: hash the client IP so the same client is pinned to one server, preserving session affinity.
- Random: pick a server at random.
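To make weighted round robin concrete, here is a plain-Java sketch of the "smooth" weighted round-robin used by Nginx-style balancers: on each pick every server's current weight grows by its configured weight, the largest wins, and the winner's current weight is reduced by the total. Server names and weights are made up for illustration.

```java
import java.util.List;

public class SmoothWeightedRoundRobin {

    static final class Server {
        final String name;
        final int weight;       // configured (static) weight
        int currentWeight = 0;  // dynamic weight adjusted on every pick

        Server(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    private final List<Server> servers;
    private final int totalWeight;

    public SmoothWeightedRoundRobin(List<Server> servers) {
        this.servers = servers;
        this.totalWeight = servers.stream().mapToInt(s -> s.weight).sum();
    }

    // Higher-weight servers are chosen proportionally more often, and picks are
    // spread out over time instead of bursting on a single server.
    public synchronized Server next() {
        Server best = null;
        for (Server s : servers) {
            s.currentWeight += s.weight;
            if (best == null || s.currentWeight > best.currentWeight) {
                best = s;
            }
        }
        best.currentWeight -= totalWeight;
        return best;
    }

    public static void main(String[] args) {
        SmoothWeightedRoundRobin lb = new SmoothWeightedRoundRobin(List.of(
                new Server("app-1", 5), new Server("app-2", 1), new Server("app-3", 1)));
        for (int i = 0; i < 7; i++) {
            System.out.println(lb.next().name); // app-1 appears 5 times in every 7 picks
        }
    }
}
```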
Implementation:
- Centralized load balancers: standalone Nginx, HAProxy, F5 BIG-IP, etc. All traffic passes through the load balancer first, which forwards it to back-end instances. Simple to deploy and the most common pattern.
- Client-side load balancing: the load-balancing logic is embedded in the service consumer's client library (such as Ribbon for Java). The client fetches the instance list from the registry and selects the target itself. This removes a network hop (no central proxy bottleneck) but adds client complexity and requires per-language support.
Role in low-code platforms:Ensure that user requests are evenly distributed (or on-demand) to healthy service instances, avoid single point overload, maximize resource utilization, and improve the overall throughput and responsiveness of the platform. It is the basic component for achieving high availability.
2.5 High availability, stability, and security guarantees
Enterprise low-code platforms must strive for extreme availability (e.g., 99.99%), stability, and security. This requires a complete set of engineering practices and technical support.
High availability :
- Multi-replica deployment and redundancy: deploy at least two instances of every key service (databases, registries, gateways, core business services), distributed across different physical machines, racks, or even availability zones.
- Failover: when an instance fails, the load balancer or service-discovery mechanism automatically switches traffic to healthy instances so users never notice the interruption. Databases usually rely on primary-replica replication with VIP (virtual IP) switchover or read/write-splitting middleware to achieve failover.
- Health checks: continuously monitor service instance status (HTTP/TCP health endpoints, process status) to detect and isolate faulty nodes promptly.
- Graceful startup and shutdown: a service receives traffic only after it has fully started and registered successfully. Before shutting down, it deregisters from the load balancer/registry and finishes in-flight requests before exiting, so no requests are lost.
Stability:
Capacity planning and auto scaling: estimate capacity from business volume and performance metrics (CPU, memory, request latency, queue length) and use auto scaling (e.g., Kubernetes HPA) to scale out during traffic peaks and scale in during troughs, balancing stability and cost.
Circuit breaking, degradation, and rate limiting:
- Circuit breaking: when the failure rate or latency of downstream calls exceeds a threshold, the circuit opens and an error or degraded response is returned immediately, preventing cascading failures; after a cool-down period the breaker enters a half-open state and sends probe requests to test recovery.
- Degradation: when a non-core service is unavailable or underperforming, provide lossy but usable basic functionality (e.g., return cached data, simplify a workflow, switch off minor features).
- Rate limiting: limit the request rate at the entry point (API gateway) or inside the service to keep burst traffic from overwhelming the system. Common algorithms include token bucket, leaky bucket, and fixed/sliding window counters (a token-bucket sketch follows this list).
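The token bucket mentioned above can be sketched in a few lines of plain Java: tokens accumulate at a fixed rate up to the bucket capacity, and each request either consumes one token or is rejected. Production gateways usually delegate this to Redis-backed or built-in filters, so treat this as an illustration of the algorithm rather than a drop-in component.

```java
public class TokenBucketLimiter {

    private final long capacity;        // maximum burst size (tokens the bucket can hold)
    private final double refillPerNano; // refill rate expressed as tokens per nanosecond
    private double tokens;
    private long lastRefillNanos;

    public TokenBucketLimiter(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    // Returns true if the request may proceed, false if it should be rejected (e.g., HTTP 429).
    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```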
Asynchronous processing and message queues: make time-consuming operations (sending emails, generating reports, calling slow external APIs) asynchronous and decouple them through message queues (RabbitMQ, Kafka, RocketMQ). Producers respond to requests quickly while consumers process in the background, improving response time and throughput and smoothing out traffic peaks.
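As a sketch of that decoupling, the snippet below uses the standard Kafka Java client to publish a "form submitted" event that a background consumer (for example a report generator or mail sender) can process later; the topic name, key, and payload are assumptions for illustration.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class FormSubmittedPublisher {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The request handler only publishes the event and returns immediately;
            // consumers generate reports / send notifications asynchronously.
            String payload = "{\"formId\": 42, \"tenantId\": \"t-001\", \"action\": \"SUBMITTED\"}";
            producer.send(new ProducerRecord<>("form-submitted-events", "t-001", payload));
        }
    }
}
```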
Full Link Tracing:Use tools like Jaeger, Zipkin, SkyWalking, and more to track all services that a request flows through a distributed system, visualize call links, latency, and dependencies to quickly locate performance bottlenecks and failure points.
Monitoring, alerting, and observability:
- Metrics monitoring: use Prometheus to collect and store metrics for services, middleware, hosts, and containers (CPU, memory, disk, network, JVM GC, HTTP request rate/latency/error rate, database connection pools, cache hit rate), and visualize them with Grafana.
- Log aggregation: use stacks such as ELK (Elasticsearch, Logstash, Kibana) or Loki + Grafana to centrally collect, store, index, and query the logs of all services, making troubleshooting and auditing easier.
- Alerting: define alert rules on metrics and logs (e.g., error rate > 1%, CPU > 90% for 5 minutes, a service instance down) and notify operations staff promptly via email, SMS, DingTalk, WeChat Work, PagerDuty, and similar channels. Alerts must be clear and actionable to avoid alert fatigue.
Security:
- Transmission Security:Enforce the use of HTTPS (TLS/SSL) to encrypt all network communications, preventing data from being eavesdropped or tampered with in transit.
- Identity Authentication:Strictly verify user identity. Common solutions include OAuth 2.0 / OpenID Connect (OIDC), JWT (JSON Web Tokens), and SAML 2.0. Integrate enterprise AD/LDAP for single sign-on (SSO).
- Authorization:Fine-grained control of user access to resources (e.g., whether a user can view/edit a form). Common models are RBAC (role-based access control) and ABAC (attribute-based access control). Ensure the principle of least privilege.
- Input validation and output encoding: strictly validate and sanitize all user input (against XSS, SQL injection, command injection, etc.), and encode content rendered to pages to prevent XSS attacks.
- Security Dependency Management:Regularly scan the project dependency library (e.g., OWASP Dependency-Check, Snyk) for known vulnerabilities and upgrade them in a timely manner.
- Vulnerability Scanning and Penetration Testing:Proactively identify and fix security vulnerabilities by regularly using automated tools (e.g., OWASP ZAP, Nessus) and hiring a team of professionals to conduct security scans and penetration testing.
- Audit log:Detailed records of the operator, time, content, and results of key operations (e.g., login, sensitive data access, configuration modification) to meet compliance requirements and facilitate post-event traceability.
3. Database selection and design
The core value of low-code platforms is to build applications quickly, and the core of applications is data. Choosing the right database and designing a good data model are the foundation for ensuring platform performance, stability, and scalability.
3.1 In-depth comparison between relational databases and non-relational databases
Relational databases (e.g., MySQL, PostgreSQL, SQL Server, Oracle):
Core features: based on the relational model, data is stored in structured two-dimensional tables, with rows representing records and columns representing attributes. Tables are associated through foreign keys. Relational databases strictly adhere to ACID (atomicity, consistency, isolation, durability) transaction semantics and use Structured Query Language (SQL) for data manipulation.
Advantage:
- Structured data and strong consistency: predefined schemas keep data well-formed, while foreign-key constraints and ACID transactions guarantee strong consistency and integrity, making relational databases ideal for storing core business entities and their associations (e.g., user-role-permission, order-product).
- Powerful query capabilities:The SQL language is powerful and standardized, supporting complex joins (JOIN), aggregation (GROUP BY), subqueries, transaction control, and other operations.
- Mature ecosystems and tools:It has the longest history, the widest user base, and the richest management tools, monitoring solutions, backup and recovery mechanisms, and ORM framework support.
Disadvantages:
- Scalability Challenges:There is an upper limit to vertical scaling (upgrading stand-alone hardware), while horizontal scaling (sharding databases and tables) is technically complex, which may affect SQL compatibility and transactions.
- Inflexible schema changes: modifying table structure (adding columns, changing types) is costly on large tables and may require downtime or online-DDL tools such as pt-online-schema-change for MySQL.
- Inefficient processing of unstructured/semi-structured data:While feasible for storing documents such as JSON, querying and indexing are often not as efficient as native document databases.
Representative Players:
- MySQL:The most popular open-source RDBMS with excellent performance, ease of use, and large community, is the preferred choice of Internet companies. The InnoDB engine provides good transaction support and concurrency performance.
- PostgreSQL:A powerful open-source RDBMS known for being highly compliant with SQL standards, supporting rich data types (e.g., JSONB, GIS, arrays), powerful scalability (e.g., plug-ins), and excellent optimization capabilities for complex queries. The advantages are significant in scenarios requiring advanced features, complex analytics, or geographic information processing. The transaction and concurrency control model (MVCC) is also very mature.
Non-relational database (NoSQL):
Core features: NoSQL databases optimize for specific data models and access patterns, often sacrificing some ACID properties (especially strong consistency) in exchange for better scalability, performance, and flexibility. They are schema-free or schema-flexible, and either do not use SQL or use SQL-like dialects.
Main types and representatives:
- Document Database:For example, MongoDB and Couchbase. Data is stored in a JSON-like document (BSON in MongoDB). Documents are self-contained data units that can nest arrays and subdocuments. Flexible patterns for storing frequently changing or inconsistent data (e.g., user configuration, CMS content, product catalogs). Powerful query language and indexing support.
- Key-value database:For example, Redis, Memcached, DynamoDB. The simplest model accesses the Value through a unique key. Value can be a simple string, a complex structure (e.g. Redis’s Hash, List, Set, Sorted Set). Extreme performance (especially memory-based such as Redis) with ultra-low latency. It is commonly used for caching, session storage, leaderboards, distributed locks, and message queues (Redis Streams).
- Wide-column databases: for example, Cassandra and HBase. Data is stored in cells addressed by row key, column family, column qualifier, and timestamp. Suitable for massive data volumes (especially time-series data), high write throughput, and queries by row-key range. Extremely scalable.
- Graph Database:For example, Neo4j, Amazon Neptune. Data is stored in nodes, relationships, and properties. Expertise in handling highly correlated data and conducting deep relationship traversal queries (e.g., social networks, recommendation engines, fraud detection).
Advantage:
- Flexible schema: easily adapts to changing requirements and stores semi-structured and unstructured data.
- Extreme scalability:It is typically designed to scale horizontally (sharding) and can handle large amounts of data and high concurrent access.
- High performance for specific scenarios:For example, document reads and writes in document databases, ultra-low latency reads and writes in key-value databases, high-throughput writes in wide-column databases, and relational queries in graph databases.
Disadvantages:
- Weaker transactions and consistency: transactions are typically limited to single-document or single-key operations; cross-record and cross-shard transactions are weak or complex (e.g., MongoDB 4.0+ supports multi-document ACID transactions but with a performance cost), and eventual-consistency models are common.
- Relatively limited query capabilities: compared with SQL's versatility and power, NoSQL query languages are generally optimized for their own data model and are weaker at complex joins and cross-entity queries (graph databases excepted).
- Learning Curve and Tool Ecology:Different types of NoSQL vary greatly and require specialized learning. Management tools and monitoring schemes are generally less mature than RDBMS.
3.2 Database selection scheme
For a full-featured, enterprise-grade low-code platform, a single database type often struggles to meet all needs. It’s wise to adopt a hybrid persistence strategy and choose the most appropriate database technology based on the nature of your data, access patterns, and consistency requirements.
Core Business Data:User accounts, organizations, permission configurations, form definitions, process definitions, process instance statuses, core business entities and their relationships, etc., all have extremely high requirements for data consistency, integrity, and transactions. Prefer a relational database (MySQL or PostgreSQL). Leverage its ACID transactions, powerful JOIN queries, and foreign key constraints to guarantee the accuracy and relevance of your core data.
Unstructured/Semi-Structured Data:
- Files/images/videos uploaded by users:Typically stored in object storage (e.g., Amazon S3, MinIO), the database only stores its metadata (file name, path, size, type, uploader, etc.). Object storage provides high-reliability, low-cost massive storage.
- Rich text/JSON dynamic data for form submissions:If the structure is very flexible or the data volume of a single form is large, consider using a document database store such as MongoDB. PostgreSQL’s JSONB type is also a good compromise, allowing for efficient storage and querying of JSON documents in relational databases.
- System logs / operation audit logs: high-volume, write-intensive data with relatively simple query patterns (time ranges, keywords). Suitable for write-optimized stores such as Elasticsearch (which adds powerful full-text search and aggregation) or wide-column databases such as Cassandra. Logs can also be written to Kafka first and then consumed into these stores.
Caching layer: to significantly improve read performance and take load off the primary database, a cache must be introduced. Redis, with its rich data structures and extremely high performance, is the usual choice (a cache-aside sketch follows the list below). It is commonly used to cache:
- Data that is frequently accessed and changes infrequently (such as configuration information, permission information).
- Database query results.
- Session Information (Session Store).
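A common way to apply Redis here is the cache-aside pattern: read from the cache first, fall back to the primary database on a miss, and write the result back with a TTL. The sketch below uses Spring's StringRedisTemplate; the PermissionRepository interface, key scheme, and TTL are assumptions purely for illustration.

```java
import java.time.Duration;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

// Hypothetical repository standing in for the real DAO layer.
interface PermissionRepository {
    String findPermissionsJsonByUserId(long userId);
}

@Service
public class PermissionCacheService {

    private static final Duration TTL = Duration.ofMinutes(10);

    private final StringRedisTemplate redis;
    private final PermissionRepository repository;

    public PermissionCacheService(StringRedisTemplate redis, PermissionRepository repository) {
        this.redis = redis;
        this.repository = repository;
    }

    // Cache-aside read: Redis first, database on miss, then populate the cache with a TTL.
    public String loadPermissionsJson(long userId) {
        String key = "perm:user:" + userId;
        String cached = redis.opsForValue().get(key);
        if (cached != null) {
            return cached;
        }
        String fromDb = repository.findPermissionsJsonByUserId(userId);
        redis.opsForValue().set(key, fromDb, TTL);
        return fromDb;
    }

    // Invalidate on write so stale permissions are not served after a change.
    public void evict(long userId) {
        redis.delete("perm:user:" + userId);
    }
}
```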
Scenario-Specific Optimization:
- Real-time collaboration / message notifications: Redis Pub/Sub, or the more robust Redis Streams / Kafka.
- High-performance counting / leaderboards: Redis Sorted Sets.
- Complex relationship analysis (e.g., recommendations): a graph database (Neo4j).
Example of a typical low-code platform data storage portfolio:
- Primary Storage:PostgreSQL (stores users, roles, permissions, form/process definitions, core business entities)
- Dynamic Form Data Storage:PostgreSQL JSONB fields or MongoDB
- File Storage:Object storage (S3/MinIO) + PostgreSQL storage metadata
- Cache:Redis
- Logs/Audits:Elasticsearch (+ Logstash + Kibana) or Loki + Grafana
- Optional: Message Queue:Kafka / RabbitMQ (for asynchronous tasks, event-driven)
3.3 Database table structure design
Schema design is the core part of back-end development, requiring trade-offs between normalization, performance, scalability, and business requirements.
Normalization: the main goal is to eliminate data redundancy and update anomalies (insert, delete, and update anomalies) by decomposing data into multiple related tables linked by foreign keys (1:1, 1:N, M:N). The benefits are high data consistency and storage savings; the drawback is that queries often need to JOIN several tables, which can hurt performance, especially on large data sets.
Denormalization: deliberately introduce redundant data to reduce JOINs and speed up queries. For example, store the product name and unit price redundantly in the order line-items table (even though they already exist in the product table) so that querying order details does not require joining the product table. Read performance improves significantly, but redundancy increases, updates become more complex (multiple places must change together), and data inconsistency becomes a risk. Carefully weigh the read/write ratio and the business's tolerance for inconsistency.
Design Principles and Practices:
- Choose primary keys deliberately: pick an appropriate primary key for each table (a natural key, or a surrogate key such as an auto-increment ID or UUID).
- Use foreign keys judiciously: make inter-table relationships explicit and maintain referential integrity at the database level or in the application layer (ORM).
- Field Type Selection:Choose the most precise type (e.g. INT, BIGINT, VARCHAR(n), DECIMAL, DATETIME/TIMESTAMP, BOOLEAN). Avoid overusing TEXT/BLOB to store large fields, consider splitting tables or external storage.
- Handling Many-to-Many Relationships:Use the Junction Table.
- Consider extensibility: reserve extension fields (such as an ext_data JSON field) or use the Entity-Attribute-Value (EAV) model (with caution, as it complicates queries) to accommodate future field growth. Designing a sound metadata table structure to support the platform's dynamic modeling capability is one of the core challenges of low-code platform design.
- Documentation:Use database modeling tools (e.g., MySQL Workbench, pgModeler) to design and generate ER diagrams that clearly show table structures and relationships.
3.4 Index optimization and sharding (database and table splitting) strategies
Index Optimization:Indexes are a magic wand that speeds up database queries, but they are also a double-edged sword.
Function:Indexes are like a table of contents in a book, allowing the database engine to quickly locate specific data rows and avoid full table scans.
Type:Commonly used B+ tree indexes (MySQL/PostgreSQL default). There are also hash indexes (fast exact matching), full-text indexes (text search), spatial indexes (GIS), composite indexes (multi-column combinations), etc.
Creation strategy:
- High-frequency query conditions: index columns that appear frequently in WHERE, ORDER BY, GROUP BY, and JOIN ON clauses, e.g., users.username, orders.user_id, orders.create_time.
- High selectivity: indexes work best on highly selective columns (unique IDs, phone numbers). Indexing low-selectivity columns (gender, status flags) is of little value and may be ignored by the optimizer.
- Covering indexes: if an index contains every column a query needs (SELECT columns plus WHERE columns), the engine never has to go back to the table for data rows, which is optimal.
- Composite indexes: combine multiple columns into one index. Column order matters: put equality-filtered columns first and range-filtered columns last, and follow the leftmost-prefix matching principle (see the sketch after this list).
- Avoid overuse: indexes slow down INSERT, UPDATE, and DELETE (they must be maintained) and consume extra disk space. Regularly analyze the slow-query log (slow_query_log), use EXPLAIN to inspect execution plans, and create only the indexes that are genuinely necessary. Take advantage of the database's own advisory views, such as MySQL's sys.schema_index_statistics.
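To illustrate the leftmost-prefix rule, the JDBC sketch below creates an assumed composite index on (tenant_id, status, create_time) and comments on which query shapes can and cannot use it; the connection URL, table, and column names are illustrative, not taken from any real schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CompositeIndexDemo {

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/lowcode", "app", "secret");
             Statement stmt = conn.createStatement()) {

            // Equality columns first, the range column (create_time) last.
            stmt.execute("CREATE INDEX idx_form_tenant_status_time "
                    + "ON form_instance (tenant_id, status, create_time)");

            // Uses the index: leftmost prefix (tenant_id, status) as equality + range on create_time.
            //   SELECT * FROM form_instance
            //    WHERE tenant_id = 42 AND status = 'SUBMITTED' AND create_time >= '2024-01-01';

            // Uses only the (tenant_id) prefix of the index.
            //   SELECT * FROM form_instance WHERE tenant_id = 42;

            // Cannot use this index: the leftmost column tenant_id is missing from the predicate.
            //   SELECT * FROM form_instance WHERE status = 'SUBMITTED';
        }
    }
}
```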
Sharding (splitting databases and tables): when a single database or table reaches its capacity limits (data volume or concurrency), sharding is needed to spread the load.
Vertical split:
- Vertical database split: move different tables into different physical databases by business module, e.g., separate user, form, and process databases. This reduces pressure on any single database and allows independent management per business domain.
- Vertical table split: split a wide table by column (hot/cold separation) into a table of frequently accessed columns (hot data) and a table of rarely accessed columns (cold data), reducing the I/O per query.
Horizontal split:
This is the most common strategy for handling large data volumes: distribute rows of the same logical table across multiple tables in multiple databases according to a shard key and a sharding rule.
Sharding Strategy:
- Range sharding: divide data by ranges of the shard key (e.g., order_id 1-10 million goes to shard 1, 10-20 million to shard 2). Pros: range queries (such as orders within a time period) are efficient. Cons: data can easily become unevenly distributed (hot shards), and a poorly chosen shard key causes severe skew.
- Hash sharding: hash the shard key and map the result to a shard by modulus or range (e.g., hash(user_id) % 1024 mapped to a specific shard). Pros: data is distributed fairly evenly. Cons: range queries are inefficient (every shard must be queried), and scaling out requires large data migrations (rehashing).
- Consistent hashing: an improved hashing scheme in which adding or removing shard nodes only requires migrating a small amount of affected data, greatly reducing the cost of scaling. It is widely used in distributed systems such as caches and NoSQL stores (see the sketch after this list).
- Geographic sharding:Route data to the nearest data center shards based on user geolocation information (e.g., IP, GPS) to optimize access latency and compliance with data residency regulations.
- Business logic sharding:Sharding according to specific business rules. For example, in SaaS multi-tenancy low-code platforms, the most natural shard key is the tenant_id (tenant ID), where each tenant’s (or group of tenants’) data is stored independently in a shard (or database). This naturally isolates tenant data and makes it easy to scale by tenant.
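A minimal sketch of the consistent-hashing idea above: shard nodes are mapped onto a hash ring (with virtual nodes to even out the distribution), and each key is routed to the first node clockwise from its hash, so adding or removing a shard only remaps the keys between it and its neighbour. The hash function and node names below are illustrative choices.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHashRing {

    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashRing(List<String> shards, int virtualNodes) throws Exception {
        this.virtualNodes = virtualNodes;
        for (String shard : shards) {
            addShard(shard);
        }
    }

    public void addShard(String shard) throws Exception {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(shard + "#" + i), shard); // place virtual nodes around the ring
        }
    }

    // Route a key (e.g. a tenant_id or user_id) to the first shard clockwise from its hash.
    public String shardFor(String key) throws Exception {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private long hash(String value) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(value.getBytes(StandardCharsets.UTF_8));
        long h = 0;
        for (int i = 0; i < 8; i++) {          // fold the first 8 digest bytes into a ring position
            h = (h << 8) | (digest[i] & 0xFF);
        }
        return h;
    }

    public static void main(String[] args) throws Exception {
        ConsistentHashRing ring = new ConsistentHashRing(List.of("db-0", "db-1", "db-2"), 128);
        System.out.println(ring.shardFor("tenant:1001"));
        System.out.println(ring.shardFor("tenant:1002"));
    }
}
```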
Challenges and Responses Brought by Database and Table Splitting:
- Distributed transactions: updates that span shards need distributed transactions to stay consistent. Approaches include eventual consistency plus compensation (e.g., the Saga pattern), middleware that supports distributed transactions (e.g., Seata), and designing so that related data lives in the same shard to avoid cross-shard transactions in the first place.
- Cross-shard queries: operations that touch many shards (global sorting, multi-dimensional aggregation) become complex and inefficient. Approaches: use middleware with distributed query support (such as ShardingSphere's Federation execution engine), pull small result sets into application memory for aggregation, design the schema to avoid such queries, or offload them to a separate OLAP database (such as ClickHouse).
- Globally unique ID generation: single-node auto-increment IDs do not work in a distributed environment. Distributed ID schemes are needed: Snowflake (see the sketch after this list), Redis increments, database number segments, UUIDs (long and unordered), ZooKeeper sequences, and so on.
- Data migration and expansion:When adding shards, you need to migrate data smoothly without impacting your online business. Tools such as: ShardingSphere-Scaling, database vendor tools (MySQL Shell UTIL), and self-developed migration tools.
- Operational complexity surges:A powerful database management platform (DMP) and O&M team are required to manage the deployment, monitoring, backup, and recovery of many sharded instances.
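The Snowflake scheme mentioned above packs a timestamp, a worker/machine id, and a per-millisecond sequence into one 64-bit long, giving roughly time-ordered, collision-free ids without central coordination. The sketch below follows the common 41/10/12 bit split; the custom epoch and worker-id assignment are assumptions.

```java
public class SnowflakeIdGenerator {

    private static final long CUSTOM_EPOCH = 1700000000000L; // assumed fixed epoch in milliseconds
    private static final long WORKER_ID_BITS = 10L;          // up to 1024 workers
    private static final long SEQUENCE_BITS = 12L;           // 4096 ids per worker per millisecond
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    public SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId >= (1L << WORKER_ID_BITS)) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.workerId = workerId;
    }

    // Layout: | 41 bits timestamp | 10 bits worker id | 12 bits sequence |
    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards; refusing to generate id");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {   // sequence exhausted in this millisecond: wait for the next one
                while ((now = System.currentTimeMillis()) <= lastTimestamp) { /* busy wait */ }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - CUSTOM_EPOCH) << (WORKER_ID_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```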
Middleware selection: it is strongly recommended to use mature open-source sharding middleware to hide the underlying complexity:
- Apache ShardingSphere (formerly Sharding-JDBC):The first choice for the Java ecosystem. It is positioned as a distributed database ecosystem, providing data sharding, read/write separation, distributed transactions, data encryption, and elastic scaling. It can be embedded directly into the application as a JDBC driver (non-intrusive to the code) or deployed independently in proxy mode (transparent to the application). Powerful functions, active community, and perfect documentation.
- MyCat:Early popular proxy-based database middleware. It is rich in functions (sharding, read/write separation, HA), relatively complex configuration, and community activity has decreased compared to ShardingSphere, but there are still many production applications.
- Vitess (for MySQL): a CNCF graduated project originally developed at YouTube. Aimed mainly at very large MySQL clusters, with powerful features (sharding, connection pooling, query rewriting, online DDL); deployment and operations are relatively complex, and Kubernetes integration is good.
Read/write separation: an important optimization used before horizontal sharding or in combination with it.
Principle: the primary handles writes (INSERT, UPDATE, DELETE) and propagates changes in near real time to one or more replicas through a replication mechanism (e.g., MySQL binlog replication, PostgreSQL streaming replication); reads (SELECT) are served by the replicas.
Advantages: significantly increases read throughput and relieves the primary. It also improves availability: when the primary fails, a replica can be promoted quickly (with tools such as MHA or Patroni).
Implement:
- Application-layer implementation: at the ORM or DAO layer, choose the primary or replica data source based on the SQL type (read vs. write); see the routing sketch at the end of this subsection. Replication lag can cause read-your-own-writes inconsistency (data just inserted cannot be read back immediately), which is mitigated by policies such as "force reads to the primary right after a write" or "wait for replication based on GTID/log position".
- Middleware implementation:SQL is automatically parsed and routed by database middleware (e.g., ShardingSphere, MyCat, ProxySQL, MaxScale). Be transparent about the application, but be mindful of the high availability of the middleware itself.
Application on low-code platforms:A large number of read operations such as the background management interface, report query, and user viewing submitted forms can be routed to the slave database. Write operations such as form submission, process flow, and configuration modification go to the main database.
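One way to implement the application-layer routing described above in a Spring/Java stack is AbstractRoutingDataSource: a thread-local flag decides whether the current operation binds to the primary (writes, or read-after-write paths) or a replica. This is a simplified sketch; the data-source keys and the selection policy are assumptions.

```java
import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

// Routes each JDBC operation to the "primary" or "replica" DataSource registered
// via setTargetDataSources(), based on a thread-local routing key.
public class ReadWriteRoutingDataSource extends AbstractRoutingDataSource {

    private static final ThreadLocal<String> ROUTE = ThreadLocal.withInitial(() -> "primary");

    public static void markReadOnly() { ROUTE.set("replica"); }   // e.g. before report/list queries
    public static void markWrite()    { ROUTE.set("primary"); }   // writes and read-after-write paths
    public static void clear()        { ROUTE.remove(); }

    @Override
    protected Object determineCurrentLookupKey() {
        return ROUTE.get();
    }
}
```

In practice the flag is usually set by an AOP aspect keyed off read-only transactions, and replication lag still has to be handled as noted above.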
3.5 Database O&M and Optimization
After the database is designed and deployed, continuous operation and maintenance monitoring and optimization are key to ensuring its long-term stable and efficient operation.
Backup and Restore:This is the last line of defense for data security, and it’s essential to have strict policies in place and regularly verify recovery processes.
Backup Type:
- Physical backup: copy the database's physical files directly (e.g., MySQL's ibdata/ibd files, PostgreSQL's PGDATA directory). Backup and recovery are fast, but naive copies require stopping the database or locking tables; Percona XtraBackup (MySQL) and pg_basebackup (PostgreSQL) support online hot backups.
- Logical backup: export the database's logical structure and data with tools such as mysqldump or pg_dump. Slower to back up and restore, but more flexible (individual objects can be selected) and the format is portable.
- Incremental backup and PITR (point-in-time recovery): combining full backups with continuous write-ahead-log archiving (MySQL binlog, PostgreSQL WAL) allows the database to be restored to any point in time, which is invaluable for recovering from mistakes such as accidental deletes or bad updates.
Backup strategy: follow the 3-2-1 principle (3 copies, 2 different media, 1 off-site). Take regular (e.g., daily) full backups plus more frequent (e.g., hourly) incremental/binlog backups. Store backup files encrypted and run recovery drills regularly.
Performance monitoring and tuning:
Monitoring metrics:Closely monitor database core metrics:
- Resource level:CPU usage, memory usage (Buffer Pool Hit Rate is critical), disk I/O (read and write throughput, latency, queue depth), network traffic.
- Connections and Sessions:Number of connections, number of active sessions, long transactions, lock waiting.
- Query performance:Number and details of slow queries, QPS (Queries Per Second), TPS (Transactions Per Second), and average response time.
- Replication status: replication lag.
Monitoring tools: Prometheus + Grafana (with exporters such as mysqld_exporter and postgres_exporter), built-in database instrumentation (MySQL Performance Schema and sys schema; PostgreSQL pg_stat_* views), and commercial database monitoring tools (e.g., Percona Monitoring and Management, Datadog).
Tuning means:
- SQL optimization: this is the most effective optimization lever! Continuously analyze the slow-query log (slow_query_log) and use EXPLAIN / EXPLAIN ANALYZE to inspect execution plans. Typical directions: avoid full table scans, use indexes properly (create them and avoid invalidating them), optimize join order, reduce nested subqueries, avoid SELECT *, batch operations, and use parameterized queries to prevent injection.
- Configuration tuning: adjust database parameters according to server hardware and workload (memory allocation: innodb_buffer_pool_size, shared_buffers; connection limits: max_connections; logging settings). Do not blindly copy "tuning templates"; understand what each parameter means and adjust based on monitoring data.
- Architecture Optimization:For example, the aforementioned database partitioning, read/write separation, and introduction of cache.
High availability deployment:Core databases must be deployed with a high-availability solution to avoid single points of failure.
Primary-replica replication + VIP/proxy failover: the basic approach; use Keepalived + VIP or middleware (such as ProxySQL, HAProxy) to switch traffic automatically to a replica (after promoting it) when the primary fails.
Strongly consistent clusters based on Paxos/Raft: provide higher availability and automatic failover. Examples:
- MySQL:Percona XtraDB Cluster (PXC), MariaDB Galera Cluster, MySQL Group Replication (MGR, official plan).
- PostgreSQL:Patroni + etcd/ZooKeeper/Consul, PostgreSQL built-in stream replication + synchronous submission + automatic failover (tools required).
- MongoDB:Replica Set (built-in, automatic failover).
- Redis: Redis Sentinel, Redis Cluster.
4. Summary
Technology stack selection is the cornerstone:With an in-depth understanding of the core advantages and applicable boundaries of the three mainstream ecosystems of Java/Spring Boot, Python/Django, and Node.js, combined with the enterprise-level positioning of low-code platforms (high performance, high stability, complex logic, and long-term maintenance), the comprehensive strength of Java + Spring Boot makes it the most robust default choice. Python’s strengths in AI/Data convergence, Node.js in I/O-intensive, and high-concurrency API scenarios make it a strong candidate for specific modules or supplemental stacks.
Architecture design and patterns: a microservices architecture offers the best paradigm for managing complexity and resilience, but it requires robust infrastructure (API gateway, service registration and discovery, configuration center) and a mature DevOps culture to keep its complexity in check. The API gateway is essential as the unified entry point and security barrier; service registration and discovery is the glue that holds the dynamic microservices world together; load balancing is the key means of spreading load and ensuring availability. High availability, stability, and security must be designed in throughout: multi-replica deployment, circuit breaking, degradation and rate limiting, full-link tracing, fine-grained monitoring and alerting, and strict transport security, authentication, authorization, and vulnerability management together build a solid line of defense.
Database design is fundamental: data is the heart of the application. Adopt a polyglot persistence strategy, choosing precisely by data characteristics and access patterns (relational databases to protect core transactions; NoSQL, caches, and object storage for flexibility and performance). A well-designed table structure (balancing normalization and denormalization) is the foundation of efficient access, and index optimization is the daily work of query performance. Facing massive data, sharding and read/write separation are essential scaling tools, provided their distributed challenges are clearly understood and mature middleware is used to address them. Backup and recovery, performance monitoring, and high-availability deployment are the ongoing guarantees across the database lifecycle.