Meshly Data Stack · v1.0

Your complete data platform.
One pod. One command. Self-hosted.

MDS bundles ingestion, lakehouse storage, SQL federation, orchestration, AI agents and SSO into one stack. Spin it up on a laptop, deploy to your own server, or scale it on OpenShift and Kubernetes. Same stack at every stage. Yours to run.

See the architecture
Containers: 32 standard / 37 enterprise
Footprint: 32 GB+ RAM · 16+ cores
Targets: Podman · Docker · OpenShift · K8s 1.19+
License: MIT · Source available
~/Meshly-Stack · bash
Progress
59%
PostgreSQL
MinIO
OpenBao
Keycloak
Nessie
OpenSearch
Kafka
Schema Registry
Kafka Connect
OAuth2 Proxy
Trino
Airflow
OpenMetadata
Superset
Vault Agent · Superset
OPA
Prometheus
Fluent Bit
Fluent Bit Mapper
Grafana
Vault Agent · Grafana
Valkey
Qdrant
Faust Streams
MDS MCP Server
MDS Dashboard
MDS-BI
MDS Data API
LangFlow
LangGraph
LangGraph Worker 1
LangGraph Worker 2
Full log: /tmp/meshly-install.log
[Press Ctrl+C to exit install]
For decision makers

Built for the long term.

One contract. One stack. One platform you will still be running in five years, with the same predictable bill.

→ Pricing

One flat fee. Any volume.

Single annual support price. Same fee whether you ingest a gigabyte or a petabyte. No per-row, per-query, or per-seat surprises. Budgetable forever.

→ Ownership

Yours forever.

MIT licensed. No vendor lock-in. No data egress fees. No phone-home telemetry. Walk away with everything you built if you ever need to.

→ Consolidation

One stack. One contract.

Replaces the warehouse, BI, ETL, observability, identity, vector search, and secrets vendors you stitched together. One bill, one vendor relationship, one procurement cycle instead of twelve.

→ Compliance

Compliance from day one.

SOC 2-aligned controls. GDPR by architecture, not by checkbox. EU-resident or air-gapped on your own infrastructure. Every secret access is audit-logged via OpenBao.

For customers

Your stack, your portal.

Self-serve access to your Meshly Data Stack: license, team, contracts, contacts, support, release updates. All in one place.

→ For customers

Your stack, your team, your contracts. One sign-in.

From license to launch, everything related to your Meshly Data Stack lives in one place. See your team, your contracts, your contacts. No ticket to file just to know what is yours.

Sign in →
01 · The Stack

Best-of-breed open source, pre-wired.

Thirty-two containers in the Standard tier, one shared network namespace, randomly generated credentials stored in OpenBao. Every component is production open source: Trino, Kafka, Iceberg, Airflow, Superset, Keycloak. All chosen and tuned to work together.

Storage
PostgreSQL
Shared metadata database backing Airflow, Superset, OpenMetadata, Grafana, LangFlow and LangGraph.
Storage
MinIO
S3-compatible object storage hosting the Iceberg lakehouse and intermediate ETL outputs.
Catalog
Nessie
Git-like catalog with branching and time travel for Iceberg tables, backed by PostgreSQL.
Streaming
Apache Kafka
Single-node KRaft broker. Three listener addresses cover laptop, in-pod and external clients.
Streaming
Schema Registry
Avro and JSON schema management for Kafka topics, enforcing data contracts across producers.
CDC
Kafka Connect
Debezium PostgreSQL CDC with pluggable Kafka Connect sinks: Postgres replicas, Iceberg lakehouse, OpenSearch, S3 and more.
Auth-proxy
OAuth2 Proxy
Keycloak-fronted auth sidecar for the Kafka Connect REST API, enforcing SSO on every call.
Query
Trino
Distributed SQL federation across MinIO, PostgreSQL, Iceberg and Kafka, with OPA policy enforcement.
Search
OpenSearch
Distributed search and analytics engine indexing logs and powering OpenMetadata full-text.
Logging
Fluent Bit
Tails every container stdout via a read-only Podman log bind-mount and ships records to OpenSearch.
Sidecar
Fluent Bit Mapper
Polls the Podman API every 30s and rewrites log records so OpenSearch indices use service names, not container IDs.
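The mapper's rewrite step can be sketched in a few lines. This is an illustrative reconstruction, not the shipped code: the record field names (`container_id`, `service`) and the ID-matching rule are assumptions; only the goal, indexing by service name instead of container ID, comes from the description above.

```python
# Hypothetical sketch: given a container-ID -> service-name map (refreshed
# from the Podman API every 30s), rewrite each log record so downstream
# OpenSearch indices key on the service name, not the container ID.

def rewrite_record(record: dict, id_to_service: dict) -> dict:
    """Attach a service name to a log record; field names are illustrative."""
    cid = record.get("container_id", "")
    # Podman reports both full and short (12-char) IDs; match on the prefix.
    for full_id, service in id_to_service.items():
        if cid and (full_id.startswith(cid) or cid.startswith(full_id[:12])):
            return {**record, "service": service}
    return {**record, "service": "unknown"}

id_map = {"9f86d081884c7d659a2feaa0c55ad015": "mds-postgres"}
rec = rewrite_record({"container_id": "9f86d081884c", "log": "ready"}, id_map)
print(rec["service"])  # mds-postgres
```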
Orchestration
Apache Airflow
DAG-based pipeline orchestration with native OpenBao secrets backend and Keycloak SSO.
Catalog
OpenMetadata
Data catalog, lineage, governance and quality tests on OpenSearch with Keycloak login.
BI
Apache Superset
Self-service dashboards backed by Trino federation, with row-level security and Keycloak SSO.
Observability
Grafana
Time-series dashboards and alerts on Prometheus, with PostgreSQL backend and Keycloak OAuth.
Observability
Prometheus
Metrics collection scraping every container in the pod with retention and multi-target support.
Identity
Keycloak
OIDC identity provider hosting the mds realm, federating SSO across every stack UI.
Authz
Open Policy Agent
Dynamic RBAC policies for Trino, Kafka Connect and other resources, reloaded without restart.
Secrets
OpenBao
Vault-compatible secrets store with TLS, auto-unseal and full audit logging.
Sidecar
Vault Agent · Superset
OpenBao agent rendering secret templates into Superset configuration at startup.
Sidecar
Vault Agent · Grafana
OpenBao agent rendering secret templates for Grafana environment and dashboard provisioning.
Cache
Valkey
Redis-compatible in-memory cache for sessions, pub/sub, Faust state and LangGraph job queues.
Streaming
Faust Streams
Python stateful stream processing for real-time aggregations and transformations on Kafka topics.
Vectors
Qdrant
Vector similarity search for embeddings and RAG, with REST and gRPC APIs.
AI · Agents
MDS MCP Server
Stack exposed as MCP tools (340+ across 20 categories) for Meshly Build and Claude Code agents.
AI · Agents
LangFlow
Visual builder for LangChain workflows with drag-and-drop nodes for prototyping AI flows.
AI · Agents
LangGraph
Customer agentic runtime: FastAPI REST and SSE on Postgres-checkpointed state, Keycloak-validated identity. Ships with 2 worker replicas pulling from Valkey Streams for concurrent runs.
Management
MDS Dashboard
Container monitoring, logs, credential retrieval and quick restarts behind Keycloak SSO.
AI · BI
MDS-BI
AI-operated dashboard builder in React and MUI, with a build, preview and approve flow.
AI · API
MDS Data API
AI-managed REST endpoints on top of Trino and PostgreSQL, with auth, caching and CORS.
Marketplace

Domain modules, installed in one click.

Beyond the 32 containers in your stack, browse and install domain modules from the Meshly catalog. Tier-locked, signature-verified, one click from your portal into your stack.

Ingestion · Enterprise
Apache NiFi
Visual data flow programming for complex ingestion pipelines.
Streaming · Enterprise
Apache Flink
Stateful stream processing at scale, with JobManager and TaskManager nodes.
Compute · Enterprise
Apache Spark
Distributed batch and Structured Streaming, with master, worker and history server.
Domain · Enterprise
HL7 / FHIR Bridge
Healthcare interoperability adapter, HL7v2 and FHIR R4.
Domain · Enterprise
GTFS-RT Feed
Real-time public-transit position and alert ingest.
Domain · Enterprise
GDPR Compliance Pack
Retention policies, right-to-erasure tooling, audit reports.

+ FinOps Suite · ITxPT Ingestion · OPC-UA Bridge · Fleet Analytics · Milvus · LangFlow · MDS Scraper. New modules added regularly.

Browse the marketplace →
02 · Architecture

A layered platform, not a pile of containers.

Every layer talks to the ones beside it through pre-wired connections. Credentials flow from OpenBao at startup; SSO flows from Keycloak across every UI; queries flow through Trino; events flow through Kafka into the lakehouse.

00 · Security: SSO, RBAC & Secrets
Keycloak SSO · OAuth2 Proxy · OPA Policies · OpenBao Secrets · Vault Agents · Tenant federation · Per-service tokens · Dynamic credentials
01 · Ingestion: Bring data in
Apache NiFi · Kafka Connect + Debezium · REST / API connectors · IoT bridges · External CDC sources
02 · Messaging: Real-time event bus
Apache Kafka · Schema Registry · Avro / JSON · 3 listener contexts
03 · Processing: Stream + batch
Apache Flink · Apache Spark · Faust Streams · MLlib · CEP
04 · Storage: Lakehouse + relational + vectors
MinIO (S3) · PostgreSQL · Apache Iceberg · Nessie versioning · Qdrant vectors · Valkey cache
05 · Query: One SQL surface
Trino federation · Iceberg catalog · Postgres catalog · Kafka catalog
06 · Orchestration & Catalog: Pipelines, lineage, quality
Apache Airflow · OpenMetadata · Data quality tests · Lineage
07 · Analytics & AI: Dashboards and agents
Apache Superset · Grafana · Meshly Build (ops MCP) · MDS-BI · MDS Data API · LangFlow · LangGraph (customer agents)
08 · Observability: Metrics, logs, health
Prometheus · Grafana · Fluent Bit · Fluent Bit Mapper · OpenSearch logs · Health checks
09 · Management: Operate and self-serve
MDS Dashboard · Customer portal · Per-service restart · Audit log
03 · Tiers

Pick a footprint. Not a checkout.

Configuration tier picks which services run. Hardware sizing is a separate axis: the same code runs at any size, from a laptop to a production rack. Price is a flat annual support fee, not a usage tax.

Custom
Bring your own footprint
Containers: Pick
  • Pick individual components
  • Keycloak + OPA always required
  • Tight resource budgets
  • Specialised deployments
  • No stream / batch by default
Enterprise
All components · Stream + batch
Containers: 37
  • Everything in Standard
  • Apache NiFi visual ingestion
  • Apache Flink stream processing
  • Apache Spark batch + ML
  • Spark History server
  • Complex event processing (CEP)
  • Faust auto-disabled if Flink is on
Hardware sizing

Separate axis from configuration tier. Pick the row that matches what you will actually do with the stack.

Sizing | What it really means | RAM | CPUs | Disk
Boot floor | Stack starts, demos open, no real queries | 12 GB | 6 | 40 GB
Light dev | One user, ad-hoc Trino queries, light DAGs | 16 GB | 8 | 100 GB SSD
Production | Concurrent users, steady ingest, retained data | 64 GB+ | 24+ | 500 GB+ NVMe
04 · Dashboard

One pane of glass for every container.

The MDS Dashboard ships with the stack. Live container metrics, structured logs, Postgres schema and query inspection, an OpenBao secrets browser, Data API consumer key lifecycle, module installs and quick restarts, all behind Keycloak SSO.

localhost:8090 / overview
CPU Usage
30%
Across 32 containers
RAM Usage
15%
11.38 / 22.91 GB
Disk Usage
8%
36.88 / 450.42 GB
Network I/O
↑1.38
↓918
Across 32 containers
Running: 32
Stopped: 0
Total: 32
Paused: 0
mds-postgres
up 4d 12h
CPU: 2.5%
RAM: 0.10 GB
:5432
mds-trino
up 4d 12h
CPU: 13.1%
RAM: 1.00 GB
:8180
mds-kafka
up 4d 12h
CPU: 0.8%
RAM: 0.93 GB
:9092 · :9094
mds-keycloak
up 4d 12h
CPU: 0.3%
RAM: 0.70 GB
:8445
05 · Connect

Four surfaces, one stack.

From code on the host, inside the pod, or in an external container, to operators and agents driving the stack through the Meshly CLI from any workstation. Every surface is documented; every path stays the same as your deployment grows.

connect.py
# 1 · Host machine · Trino over localhost
from trino.dbapi import connect
conn = connect(
  host="localhost", port=8180,
  user="dev", catalog="meshly_coffee",
)
06 · Operate

Production-grade by default.

MDS is shaped by the things you only learn from running platforms in production: secrets rotation, blue-green deploys, structured logging, backup retention.

→ Credentials

Zero hardcoded passwords.

Every service password is generated at install and sealed in OpenBao. Per-service tokens grant least-privilege reads. Rotate any credential with one script.
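The install-time generation step might look like the following sketch. This is not the actual installer: the service list is an illustrative subset and the entropy size is an assumption; the source only states that passwords are generated at install and sealed in OpenBao rather than written to config files.

```python
# Illustrative sketch of install-time credential generation (assumptions:
# service list, 32 bytes of entropy per secret). Generated values would be
# written to OpenBao, never to disk or config files.
import secrets

SERVICES = ["postgres", "minio", "keycloak", "grafana"]  # illustrative subset

def generate_credentials(services):
    """One URL-safe random password per service, from a CSPRNG."""
    return {svc: secrets.token_urlsafe(32) for svc in services}

creds = generate_credentials(SERVICES)
assert len(set(creds.values())) == len(SERVICES)  # all distinct
```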

→ SSO

One Keycloak realm, every UI.

Superset, Grafana, Airflow, OpenMetadata, NiFi, MDS-BI. All federate through the mds realm. Bring your own Entra ID, Google Workspace or LDAP.

→ Backups

Versioned, testable backups.

PostgreSQL, Keycloak, Grafana, MinIO buckets, OpenBao Raft snapshot, Trino catalogs and OpenSearch dumped into ~/.meshly-data-stack/backups/ with retention. Restore in dry-run mode before committing.
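The retention side of this can be sketched as a keep-newest-N prune. This is a hedged illustration, not the shipped backup tool: the `.snap` filename pattern and the per-directory layout are assumptions; only the backup directory and the existence of a retention policy come from the text above.

```python
# Hedged sketch of retention pruning over a backups directory (file naming
# and layout are assumptions): keep the newest `keep` snapshots, delete the
# rest, and return what was removed so the caller can log it.
from pathlib import Path

def prune_backups(backup_dir: Path, keep: int = 7) -> list:
    """Remove snapshot files beyond the newest `keep`, newest judged by mtime."""
    snapshots = sorted(backup_dir.glob("*.snap"),
                       key=lambda p: p.stat().st_mtime, reverse=True)
    stale = snapshots[keep:]  # everything older than the newest `keep`
    for path in stale:
        path.unlink()
    return stale
```

A dry-run variant would return `stale` without unlinking, matching the restore-in-dry-run-first workflow described above.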

→ Deployment

Laptop, server, OpenShift.

Same install on macOS, Linux, WSL2, plus OpenShift and Kubernetes 1.19+. --mode server --domain yourdomain.com generates Nginx + SSL config and per-service subdomains. K8s manifests ship default-deny NetworkPolicies and Secret-driven credentials. Same stack at every stage, no migration project.

→ Observability

Pino logs. Prometheus metrics.

Every container’s stdout streams to OpenSearch automatically: Fluent Bit tails Podman log storage, parses each line, and writes to logs-<service>-YYYY.MM.DD indices. Zero per-app config. Prometheus scrapes every container, Grafana ships with default dashboards, health checks across every container with auto-recovery via MDS Sentinel.
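The daily index naming described above is mechanical enough to show directly. A minimal sketch, assuming the date comes from the record's timestamp (the pattern `logs-<service>-YYYY.MM.DD` is from the text; everything else here is illustrative):

```python
# Sketch of the daily OpenSearch index naming: logs-<service>-YYYY.MM.DD.
# Assumption: the day is taken from the log record's timestamp.
from datetime import date

def index_for(service: str, day: date) -> str:
    """Build the daily index name a record for `service` would land in."""
    return f"logs-{service}-{day:%Y.%m.%d}"

print(index_for("trino", date(2025, 1, 15)))  # logs-trino-2025.01.15
```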

→ AI-native

Agents are first class.

Two AI surfaces. The ops surface (MDS MCP Server, 340+ tools across 20 categories) lets Meshly Build and Claude Code introspect, query, and manage the stack. The customer surface (LangGraph) hosts your own agentic flows on Postgres-checkpointed state with Keycloak-validated per-user identity. Both behind audit trails.

→ Lifecycle

Components install and uninstall like packages.

Per-component install, uninstall and purge with explicit flags on the installer. The component registry rehydrates on stack updates so customisations survive upgrades, no manual reinstall. Component runtime config lives in OpenBao, managed via MCP tools, and every write is audited.

CLIAir-gapped by default

Run the stack from a workstation that never touches the internet.

One CLI talks to MDS and Meshly Build. SSO through your Keycloak realm. Works fully offline against a local stack, or over a VPN to a customer site. A real differentiator for regulated, sovereignty-conscious deployments.

See the Meshly CLI →

Let's plan your stack.

MDS is delivered through Meshly consultation, sized and configured for your environment. Source available, never self-serve.

Read the documentation