How Rawbbit Works

Rawbbit is a managed game analytics pipeline built on open-source infrastructure. It collects your game events, stores the raw history in infrastructure you own, makes it fast to query with ClickHouse or BigQuery, models it into useful tables, and serves it through dashboards. The whole stack is deployed and maintained for you, so your studio does not need to hire data engineers.

Under the hood, producers send event batches over HTTP with a per-project API key, the collector validates and enriches them, NATS JetStream buffers the write path with at-least-once delivery, and a raw writer lands partitioned Parquet files in object storage.
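As a rough illustration of the producer side, the sketch below builds (but does not send) one batched HTTP request. The endpoint URL, the X-API-Key header name, and the event fields are assumptions made for this example, not Rawbbit's documented contract.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Hypothetical values: the real endpoint, header name, and event schema
# come from your Rawbbit deployment, not from this sketch.
COLLECTOR_URL = "https://collector.example.com/v1/events"
API_KEY_HEADER = "X-API-Key"


def build_batch_request(api_key: str, events: list) -> urllib.request.Request:
    """Build a POST request carrying one JSON batch of events."""
    body = json.dumps({"events": events}).encode("utf-8")
    return urllib.request.Request(
        COLLECTOR_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
    )


events = [
    {
        "event_id": "e-1",
        "name": "level_complete",
        "ts": datetime(2024, 1, 1, tzinfo=timezone.utc).isoformat(),
    }
]
request = build_batch_request("project-key-123", events)
# Sending would look like: urllib.request.urlopen(request, timeout=5)
```

Batching several events per request keeps the number of HTTP calls low, which matters on mobile and console clients.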

From that raw layer, the data is served through ClickHouse or BigQuery, modeled into useful tables, and shown in Metabase. The raw Parquet layer stays the system-of-record boundary, so query engines, models, dashboards, and AI-assisted workflows can evolve without changing the ingestion contract.

Producer → Collector API → NATS JetStream → Raw Writer → Parquet → ClickHouse / BigQuery → Models → Dashboards

Components

Collector API

HTTP ingestion service. Accepts and validates event batches (authenticated with a per-project API key), enriches the accepted events, and publishes them into the stream.
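The validate-then-enrich step can be pictured with a minimal sketch like the one below. The required fields and the enrichment attributes (project_id, received_at) are hypothetical; the collector's actual schema and enrichment rules are not described here.

```python
from datetime import datetime, timezone

# Assumed minimal schema, for illustration only.
REQUIRED_FIELDS = ("event_id", "name", "ts")


def validate_and_enrich(batch, project_id, received_at):
    """Split a batch into accepted (enriched) events and rejected ones."""
    accepted, rejected = [], []
    for event in batch:
        missing = [field for field in REQUIRED_FIELDS if field not in event]
        if missing:
            rejected.append({"event": event, "error": f"missing fields: {missing}"})
            continue
        # Enrichment adds server-side context without mutating the input event.
        accepted.append(
            {**event, "project_id": project_id, "received_at": received_at.isoformat()}
        )
    return accepted, rejected


batch = [
    {"event_id": "e-1", "name": "session_start", "ts": "2024-01-01T00:00:00+00:00"},
    {"name": "broken"},
]
accepted, rejected = validate_and_enrich(
    batch, "demo", datetime(2024, 1, 1, 0, 0, 5, tzinfo=timezone.utc)
)
```

Only the accepted list would be published to the stream; how rejected events are reported back to the producer is left open in this sketch.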

NATS JetStream

Message broker between the collector and the writer. It separates request handling from storage writes, provides buffering, and gives the write path at-least-once delivery.
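At-least-once delivery means a message can arrive more than once, for example when an acknowledgement is lost and the broker redelivers. Consumers therefore usually need to tolerate duplicates. The sketch below shows one common approach, deduplicating on an event ID. It assumes each event carries a unique event_id, and it is not a description of how Rawbbit itself handles duplicates.

```python
def deduplicate(events, seen_ids):
    """Return events whose event_id has not been seen; records new IDs in seen_ids."""
    fresh = []
    for event in events:
        event_id = event["event_id"]
        if event_id in seen_ids:
            continue  # redelivered message, already processed
        seen_ids.add(event_id)
        fresh.append(event)
    return fresh


seen = set()
first = deduplicate([{"event_id": "a"}, {"event_id": "b"}], seen)
# "b" arrives again because its acknowledgement was lost.
redelivered = deduplicate([{"event_id": "b"}, {"event_id": "c"}], seen)
```

In a real pipeline the set of seen IDs would need to be bounded or persisted; an in-memory set is only enough to show the idea.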

Raw Writer

JetStream consumer that writes partitioned Parquet files to object storage. The raw Parquet layer is the durable system-of-record for downstream analytics work.
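To make "partitioned" concrete, here is a small sketch that derives a Hive-style object key from an event timestamp. The layout (a raw/ prefix, partitioned by project_id and UTC date) is an assumption for illustration; the writer's actual partition scheme may differ.

```python
from datetime import datetime, timedelta, timezone


def partition_path(project_id: str, event_ts: datetime, file_id: str) -> str:
    """Build a Hive-style object key, partitioned by project and UTC day."""
    # Normalise to UTC so events land in the same partition regardless of
    # the time zone the producer reported.
    day = event_ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"raw/project_id={project_id}/dt={day}/part-{file_id}.parquet"


# 23:30 at UTC-2 is already the next day in UTC.
local_ts = datetime(2024, 3, 5, 23, 30, tzinfo=timezone(timedelta(hours=-2)))
path = partition_path("demo", local_ts, "0001")
```

Date-based partitions let downstream loaders read only the days they need instead of scanning the whole raw layer.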

Object Storage

Google Cloud Storage or an S3-compatible backend such as SeaweedFS. Keeps the raw layer portable, exportable, and reusable.

ClickHouse (OLAP)

The primary open-source query layer. A scheduled cron job loads raw Parquet into ClickHouse and models it into an analytics.events One Big Table (OBT) for fast game analytics.
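One way such a load could be expressed is with ClickHouse's s3 table function, which can read Parquet directly from S3-compatible storage. The sketch below only builds the SQL text for one day's partition. The bucket URL and the dt= partition layout are hypothetical, and whether Rawbbit's loader works this way is not stated here.

```python
def build_load_statement(bucket_url: str, day: str) -> str:
    """Build an INSERT ... SELECT that reads one day's Parquet files."""
    # Refuse quotes rather than trying to escape them in this sketch.
    if "'" in bucket_url or "'" in day:
        raise ValueError("quotes are not allowed in bucket_url or day")
    source = f"{bucket_url.rstrip('/')}/dt={day}/*.parquet"
    return (
        "INSERT INTO analytics.events "
        f"SELECT * FROM s3('{source}', 'Parquet')"
    )


statement = build_load_statement("https://storage.example.com/raw-events/", "2024-03-06")
```

A scheduled job would run a statement like this once per new partition. SELECT * assumes the Parquet columns already match the target table, which a real loader would not take for granted.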

BigQuery External Tables

Alternative query path for teams on Google Cloud. A BigQuery external table queries the raw Parquet in place, so the raw layer stays portable and the serving layer can be rebuilt from it.

SQLMesh

Downstream modeling layer for the BigQuery path. Reads from the external table over raw Parquet and shapes it into analytical tables, scheduled via Cloud Scheduler.

Metabase

Open-source BI for dashboards on top of the modeled data, so studios can explore player behavior without living inside the database.

MCP Server

Exposes the analytics layer through a standard interface so AI agents can query your data in the same environment your team controls.

Agents layer (Opencode, OpenClaw)

AI agents connect through the MCP server and answer questions in plain language by writing and running SQL against ClickHouse for you.

Deployment

Services run as Docker containers. Images are built with Docker Compose, pushed to Artifact Registry, and deployed onto a VM. Under the managed model, Rawbbit deploys and maintains this for you.

Why this shape

The collector accepts and validates event batches.
NATS JetStream separates request handling from storage writes with at-least-once delivery.
The raw writer lands durable Parquet files in object storage.
Raw Parquet is the system-of-record boundary; ClickHouse or BigQuery sits downstream and can be rebuilt from it.
Downstream modeling and dashboards can evolve without changing the ingestion contract.

What’s working today

Ingestion path (collector + NATS + raw-writer), with at-least-once delivery
Raw Parquet landing
Storage backend selection (GCS or S3-compatible)
ClickHouse query layer (analytics.events OBT, cron loader)
BigQuery external-table path + SQLMesh modeling (Cloud Scheduler)
Metabase dashboards
MCP server + AI agents (Opencode, OpenClaw)

The current release focuses on reliable ingestion, durable raw storage, ClickHouse or BigQuery serving layers, and dashboards. The JavaScript SDK is available today; native Unity and Unreal SDKs are still in progress.

Ask your data in plain language

Because your modeled data lives in a real database you control, Rawbbit can expose it through an MCP server. AI agents like Opencode and OpenClaw connect to that server and answer questions by writing and running SQL against ClickHouse for you. Someone can ask "what's D7 retention for players who finished the tutorial, by platform?" and get an answer without hand-writing the query.
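To show what an agent has to compute for a question like that, here is a worked sketch of D7 retention by platform over a few in-memory rows. It assumes one common definition (a player counts as retained if they were active exactly 7 days after their install date) and that the player list is already filtered to those who finished the tutorial. The data and field layout are invented for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def d7_retention_by_platform(players, activity):
    """players: {player_id: (platform, install_date)}; activity: set of (player_id, date)."""
    cohort = defaultdict(int)
    retained = defaultdict(int)
    for player_id, (platform, installed) in players.items():
        cohort[platform] += 1
        # Retained means active on the 7th day after install.
        if (player_id, installed + timedelta(days=7)) in activity:
            retained[platform] += 1
    return {platform: retained[platform] / cohort[platform] for platform in cohort}


# Players who finished the tutorial (filtered upstream in this sketch).
players = {
    "p1": ("ios", date(2024, 1, 1)),
    "p2": ("ios", date(2024, 1, 1)),
    "p3": ("android", date(2024, 1, 2)),
}
activity = {
    ("p1", date(2024, 1, 8)),  # came back on day 7
    ("p2", date(2024, 1, 5)),  # came back on day 4 only
    ("p3", date(2024, 1, 9)),  # came back on day 7
}
retention = d7_retention_by_platform(players, activity)
```

Against the real warehouse the same logic would be a SQL query over the modeled events table, written by the agent rather than by hand.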

This only works because the stack is open, queryable, and owned.

Open source, managed for you

Rawbbit is released under the Apache License 2.0: all source code, documentation, deploy scaffolding, and the modeling projects are public on GitHub. Rawbbit is managed for you by default: book a setup and it is deployed and maintained in infrastructure you own. Technical teams can self-host the same open architecture for free, without vendor lock-in.
