Architecture
Bosca uses a small, thoughtfully designed set of components. Fewer moving parts mean fewer surprises, easier operations, and predictable scaling.
You can start simple: many core functions run in a single server. As your needs grow, you can split responsibilities and scale components independently.
At a glance, this approach helps you:
- Keep operations straightforward for small teams
- Add capabilities gradually with modular growth
- Balance performance, cost, and reliability
Component Organization
Bosca's components are grouped into key functional areas to maintain clarity and ensure effective modularization:
- Object Storage
- Structured Storage
- Search
- Caching
- Workflows
- AI/ML
- Analytics
- General Operations
These functional areas keep the system modular, so each part can be designed and scaled on its own. The following sections describe each component and explain how the components work together.
Ingress
Component Type: General Operations
Bosca is agnostic about the ingress method you choose for deployment, but we recommend nginx as a starting point.
- Kubernetes Deployment: We use nginx ingress because we have run it at scale and found it suitable.
- Docker Compose Deployments: All services are routed through nginx, which handles SSL termination and load balancing.
Analytics
Component Type: Analytics, AI/ML, Workflows
The Analytics Collector is a Ktor service that captures first-party events and persists them for downstream analysis. It is optional, but recommended when you want full control over data retention and personalization workflows.
We still recommend using third-party analytics alongside Bosca for validation and redundancy.
Collecting your own first-party data lets you validate third-party numbers, enables advanced system capabilities, and provides a safety net in case new privacy laws unexpectedly change how you can use third-party systems through platforms like the App Store or Play Store.
The collector writes data through an Iceberg catalog and object storage configuration, which makes it suitable for batch processing and long-term retention.
If you don't want to use Bosca's analytics system, you can bypass this component.
Bosca Server
Component Type: General Operations
The Bosca Server serves as the backbone of the Bosca platform, offering GraphQL interfaces to manage and interact with your content. It handles critical functions, including workflow state transitions, authentication, permissions, profiles, collections, metadata, supplementary content, documents, guides, and more.
Other Servers
- Analytics Collector: event ingestion and storage (:backend:servers:analytics-collector)
- Messages Server: messaging utilities and email workflows (:backend:servers:messages-server)
Job Runners
Component Type: General Operations, Workflows
Bosca job runners process background work such as indexing, transition validation, and content processing. Runners are part of the same server binary and can be enabled or isolated by configuration, allowing you to separate API traffic from background processing when needed.
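As a rough illustration of how one binary can be split by configuration, the sketch below parses a role list and decides whether a process serves the API, runs jobs, or both. The `BOSCA_ROLES` variable and the role names are hypothetical, chosen only for this example; they are not Bosca's actual settings.

```python
from dataclasses import dataclass

# Hypothetical role names; Bosca's real configuration keys may differ.
KNOWN_ROLES = {"api", "runner"}


@dataclass(frozen=True)
class ProcessRoles:
    serve_api: bool
    run_jobs: bool


def parse_roles(env: dict) -> ProcessRoles:
    """Decide what one process should do from a hypothetical BOSCA_ROLES value.

    When the variable is unset, both roles are enabled, which corresponds
    to the simple single-server deployment.
    """
    raw = env.get("BOSCA_ROLES", "api,runner")
    roles = {part.strip().lower() for part in raw.split(",") if part.strip()}
    unknown = roles - KNOWN_ROLES
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    return ProcessRoles(serve_api="api" in roles, run_jobs="runner" in roles)


# Single-server deployment: one process does both.
combined = parse_roles({})
# Split deployment: dedicated API processes and dedicated runner processes.
api_only = parse_roles({"BOSCA_ROLES": "api"})
runner_only = parse_roles({"BOSCA_ROLES": "runner"})
```

With this kind of split, API processes and runner processes can be scaled to different replica counts even though they ship as the same artifact.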
PostgreSQL
Component Type: General Operations, Structured Storage
We use PostgreSQL for many aspects of the Bosca system, from structured storage to JSONB storage. Most major cloud providers offer managed PostgreSQL services, which allow for low-overhead backups and scaling (for example, through read replicas). Several PostgreSQL-compatible databases, such as CockroachDB and YugabyteDB, enable other scaling approaches. We typically use CloudNativePG to manage our PostgreSQL deployments.
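To make the read-replica idea concrete, here is a minimal sketch of a router that sends read-only work to replicas in round-robin order and everything else to the primary. It is not Bosca's implementation; the class name and the connection strings are hypothetical, and no database connection is opened.

```python
import itertools
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConnectionRouter:
    """Pick a connection string per request (illustrative only)."""

    primary: str
    replicas: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Cycle through replicas; stay on the primary if there are none.
        self._cycle = itertools.cycle(self.replicas) if self.replicas else None

    def dsn_for(self, read_only: bool) -> str:
        """Reads may go to a replica; writes always go to the primary."""
        if read_only and self._cycle is not None:
            return next(self._cycle)
        return self.primary


# Hypothetical host names.
router = ConnectionRouter(
    primary="postgresql://pg-primary/bosca",
    replicas=["postgresql://pg-replica-0/bosca", "postgresql://pg-replica-1/bosca"],
)
first_read = router.dsn_for(read_only=True)
second_read = router.dsn_for(read_only=True)
write = router.dsn_for(read_only=False)
```

Replicas can lag slightly behind the primary, so a request that must read its own writes would typically be routed with `read_only=False`.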
Meilisearch
Component Type: General Operations, Search
Meilisearch is our preferred search index. Because it is written in Rust, it is fast and has a modest memory footprint, and it offers many advanced features. Meilisearch makes certain functional trade-offs to achieve this, but we have found them acceptable in most cases. Its vector store makes capabilities like semantic search easy to integrate and manage.
While Meilisearch doesn't have native clustering, there are easy ways to achieve eventually consistent read replicas via Bosca Workflows. Combined with Kubernetes load balancing, this is a practical way to scale search efficiently.
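The replica approach can be pictured as a job that applies the same document batch to each search instance in turn. The sketch below uses in-memory stand-ins rather than the Meilisearch client, and the names (`SearchReplica`, `fan_out`) are hypothetical; it only illustrates why reads are eventually consistent while a batch is still being fanned out.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class SearchReplica:
    """In-memory stand-in for one search instance (not the Meilisearch API)."""

    name: str
    documents: Dict[str, dict] = field(default_factory=dict)
    applied_version: int = 0

    def apply(self, version: int, batch: List[dict]) -> None:
        for doc in batch:
            self.documents[doc["id"]] = doc
        self.applied_version = version


def fan_out(replicas: List[SearchReplica], version: int, batch: List[dict]) -> Iterator[str]:
    """Apply one batch to each replica in turn, yielding after every step."""
    for replica in replicas:
        replica.apply(version, batch)
        yield replica.name


replicas = [SearchReplica("search-0"), SearchReplica("search-1")]
batch = [{"id": "doc-1", "title": "Hello"}]

steps = fan_out(replicas, version=1, batch=batch)
next(steps)  # search-0 is updated; search-1 has not seen the batch yet
stale_during_fan_out = "doc-1" not in replicas[1].documents
list(steps)  # finish applying the batch to the remaining replicas
converged = all(r.applied_version == 1 for r in replicas)
```

A load balancer in front of the replicas may briefly return different results depending on which instance answers, and the replicas agree again once the job finishes.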
Redis
Component Type: General Operations, Caching
We use Dragonfly, a Redis-compatible server, though you can choose any Redis-compatible server.
We chose Dragonfly because it has a Kubernetes operator that makes it easy to spin up a primary and a failover node. In addition, its multithreaded design delivers the performance we expect.
We primarily use this component for caching, workflow job queues, and pub/sub. Most cloud providers offer managed Redis services that can satisfy all of these use cases.
We are moving toward using NATS instead of Redis, but Redis is more proven than NATS at this time.
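For the caching use case, a common pattern is cache-aside: check the cache first, fall back to the source of truth on a miss, and store the result with an expiry. The sketch below uses an in-memory stand-in instead of a real Redis client, and `TtlCache` and `cached_lookup` are hypothetical names; it is meant to show the pattern, not Bosca's code.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory stand-in for a Redis-compatible cache with key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)


def cached_lookup(
    cache: TtlCache,
    key: str,
    load: Callable[[str], str],
    ttl_seconds: float = 60.0,
) -> str:
    """Cache-aside: return the cached value, or load it and cache it."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.set(key, value, ttl_seconds)
    return value
```

The injectable `clock` exists only so that expiry can be exercised without sleeping; with a real Redis-compatible server, expiry is handled by the server.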
Object Storage (S3 or Cloud Storage)
Component Type: General Operations, Object Storage
Bosca uses S3-compatible object storage for assets and analytics data. It also supports Cloud Storage and standard file systems.
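Supporting several storage backends usually comes down to coding against a small interface. The sketch below shows one possible shape, with a file-system backend only; the `ObjectStore` protocol, `FileSystemStore`, and `store_asset` are hypothetical names for illustration and are not Bosca's API. An S3 or Cloud Storage backend would implement the same two methods.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class ObjectStore(Protocol):
    """Minimal interface a storage backend would need to satisfy."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class FileSystemStore:
    """Stores objects as files under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = root.resolve()

    def _path(self, key: str) -> Path:
        path = (self._root / key).resolve()
        # Reject keys that would escape the root (requires Python 3.9+).
        if not path.is_relative_to(self._root):
            raise ValueError(f"key escapes storage root: {key!r}")
        return path

    def put(self, key: str, data: bytes) -> None:
        path = self._path(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return self._path(key).read_bytes()


def store_asset(store: ObjectStore, key: str, data: bytes) -> int:
    """Write an asset through any backend and return the stored size."""
    store.put(key, data)
    return len(store.get(key))


with tempfile.TemporaryDirectory() as tmp:
    stored_size = store_asset(FileSystemStore(Path(tmp)), "images/logo.png", b"\x89PNG")
```

Because `store_asset` depends only on the protocol, swapping the backend does not change calling code.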
NATS
Component Type: General Operations, Messaging
Bosca uses NATS for lightweight messaging and optional job queue backends.
Text Extractor
Component Type: General Operations, Content Processing
The text extractor is a standalone service used for extracting text from uploaded documents. It runs as a separate container in local development.
Image Processor
Component Type: General Operations
Publishing images often requires creating multiple size and format variants. The image processor handles tasks such as resizing, format conversion, optimization, and more.
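As an illustration of what "multiple size and format variants" can mean in practice, the sketch below plans the output dimensions for a source image while preserving aspect ratio and never upscaling. It only computes a plan and performs no image processing; the variant names, widths, and formats are assumptions made for this example, not Bosca defaults.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Variant:
    name: str
    width: int
    height: int
    fmt: str


def plan_variants(
    src_width: int,
    src_height: int,
    max_widths: Dict[str, int],
    formats: Sequence[str] = ("webp", "jpeg"),
) -> List[Variant]:
    """Return one variant per (size, format) pair, keeping the aspect ratio."""
    variants = []
    for name, max_width in max_widths.items():
        width = min(max_width, src_width)  # never upscale the source
        height = max(1, round(src_height * width / src_width))
        for fmt in formats:
            variants.append(Variant(name, width, height, fmt))
    return variants


# A 4000x3000 source with two hypothetical size classes.
plan = plan_variants(4000, 3000, {"thumb": 320, "large": 1600})
```

Here `plan` contains four entries: 320x240 and 1600x1200, each in both formats.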
OpenTelemetry
Component Type: General Operations, Telemetry
Bosca includes OpenTelemetry instrumentation and can export traces to any OpenTelemetry compatible backend.