blog · Jul 5, 2026

self-hosted analytics in one binary, no clickhouse, no postgres

Arjun Varma · maker of smolanalytics

if you want self-hosted analytics without running ClickHouse or Postgres, smolanalytics is one open-source (MIT) Go binary that stores events itself: no external database, no cluster, no ops project. you start the Docker image or drop a single static binary on a small VPS, and the dashboard, the API, and the MCP server are all up. events cost about 7 bytes each at rest, because the hot log seals into compressed, CRC'd columnar segments on local disk (or S3/R2), so a small box holds billions of events on flat memory. that is the whole footprint: one process, one data file. the tradeoff is deliberate and written down: it will never add feature flags, session replay, A/B testing, surveys, data warehouses, or multi-node clustering. exactly one writer, one binary, forever. if you need a distributed event platform, this is the wrong tool; if you want analytics you can actually run yourself and forget about, this is the shape.

why is most self-hosted analytics so heavy?

because analytics tools grew into event platforms, and event platforms assume a database tier.

Umami is clean open-source web analytics, but the app always needs a database running beside it, and since v3 dropped MySQL, self-hosting means standing up Postgres. that is a second service to provision, back up, patch, and keep alive. the smolanalytics vs Umami page walks through this in detail.

PostHog is the far end of the same axis. self-hosting it means Kafka, ClickHouse, Redis and Postgres, a real cluster. PostHog has publicly said most teams lack the resources to run it reliably, and that a full disk can take an instance down for hours or days. that is not a knock on PostHog; it is what happens when one product tries to be nine.

the pattern is: the analytics is the easy part, and the storage tier is the thing that turns "self-host" into a weekend that becomes a month.

how does one binary store analytics without a database?

the trick is that analytics events are a narrow, append-only workload, and you do not need a general-purpose database for that.

smolanalytics keeps a durable append-log for the hot, recent events on a single box. as events age, the log seals into an immutable columnar segment (compressed, CRC'd, about 7 bytes per event at rest) that lives on local disk or object storage (S3/R2). RAM stays flat no matter how much history piles up, because memory is bounded by the seal size, not by total events. the store interface has three backends behind it: in-memory, the durable append-log, and the columnar segment tier. none of them is a separate database process you run.
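
to make the seal step concrete, here is a minimal Go sketch of the general technique, not smolanalytics' actual on-disk format. it splits a batch of events into columns, delta-encodes timestamps as varints, compresses the result with DEFLATE, and appends a CRC32 so corruption is caught on read. the Event shape, the seal and unseal names, and the column layout are all assumptions made up for illustration.

```go
package main

import (
	"bytes"
	"compress/flate"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
	"io"
)

// Event is a hypothetical, minimal event shape (illustration only).
type Event struct {
	UnixSec int64  // event time in seconds
	NameID  uint32 // index into a dictionary of event names
}

// seal packs a batch into one immutable segment:
// columns -> delta/varint encoding -> DEFLATE -> trailing CRC32.
func seal(events []Event) ([]byte, error) {
	var cols bytes.Buffer
	tmp := make([]byte, binary.MaxVarintLen64)
	cols.Write(tmp[:binary.PutUvarint(tmp, uint64(len(events)))]) // header: row count

	var prev int64
	for _, e := range events { // column 1: timestamp deltas
		cols.Write(tmp[:binary.PutVarint(tmp, e.UnixSec-prev)])
		prev = e.UnixSec
	}
	for _, e := range events { // column 2: event-name IDs
		cols.Write(tmp[:binary.PutUvarint(tmp, uint64(e.NameID))])
	}

	var out bytes.Buffer
	zw, err := flate.NewWriter(&out, flate.BestCompression)
	if err != nil {
		return nil, err
	}
	if _, err := zw.Write(cols.Bytes()); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	sum := crc32.ChecksumIEEE(out.Bytes()) // checksum over the compressed payload
	return binary.LittleEndian.AppendUint32(out.Bytes(), sum), nil
}

// unseal verifies the CRC, decompresses, and rebuilds the rows.
func unseal(seg []byte) ([]Event, error) {
	if len(seg) < 4 {
		return nil, errors.New("segment too short")
	}
	payload, tail := seg[:len(seg)-4], seg[len(seg)-4:]
	if crc32.ChecksumIEEE(payload) != binary.LittleEndian.Uint32(tail) {
		return nil, errors.New("crc mismatch: segment is corrupt")
	}
	raw, err := io.ReadAll(flate.NewReader(bytes.NewReader(payload)))
	if err != nil {
		return nil, err
	}
	r := bytes.NewReader(raw)
	count, err := binary.ReadUvarint(r)
	if err != nil {
		return nil, err
	}
	events := make([]Event, count)
	var prev int64
	for i := range events { // undo the delta encoding
		d, err := binary.ReadVarint(r)
		if err != nil {
			return nil, err
		}
		prev += d
		events[i].UnixSec = prev
	}
	for i := range events {
		id, err := binary.ReadUvarint(r)
		if err != nil {
			return nil, err
		}
		events[i].NameID = uint32(id)
	}
	return events, nil
}

func main() {
	events := make([]Event, 100_000) // synthetic batch, far more regular than real traffic
	for i := range events {
		events[i] = Event{UnixSec: 1_780_000_000 + int64(i/3), NameID: uint32(i % 4)}
	}
	seg, err := seal(events)
	if err != nil {
		panic(err)
	}
	back, err := unseal(seg)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d events -> %d bytes, round-trip rows: %d\n", len(events), len(seg), len(back))
}
```

the size this prints depends entirely on the synthetic data, so do not read it as a benchmark of the real format. the part that carries over is the shape: columns of similar values compress far better than rows, and a sealed segment is a plain byte blob that can sit on disk or in an object store.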

so the operational reality is: one binary, one data file. point it at a directory, and that directory is your entire analytics database. back it up by copying the file. there is no second process to babysit, no schema migration to run, no cluster to keep in quorum.

docker run -p 8080:8080 -v "$PWD/data:/data" \
  -e SMOLANALYTICS_DB=/data/smolanalytics.data \
  ghcr.io/arjun0606/smolanalytics
# dashboard + API + MCP at http://localhost:8080

what does "~7 bytes per event" mean in practice?

it means a small VPS is genuinely enough, not marketing-enough.

at roughly 7 bytes per event at rest, storage is not the constraint you plan around; a cheap box holds a very large amount of history without you thinking about disk. to make it concrete: if your products generate, say, a few million events a month, a year of that is on the order of tens to low-hundreds of megabytes sealed on disk (3 million events a month times 12 months times 7 bytes is about 250 MB), and it sits next to the process, not in a database you provision separately. treat those numbers as illustrative. the point is the order of magnitude, and that the box you would rent for a hobby project is already oversized for this.
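
the back-of-envelope arithmetic is small enough to write down. this Go sketch just multiplies it out for a few monthly volumes; the bytesAtRest helper and the volumes are made up for illustration, and 7 bytes per event is the rough figure from above, not a guarantee for your data.

```go
package main

import "fmt"

// bytesAtRest estimates sealed storage: events per month x months x bytes per event.
// bytesPerEvent is an assumed average (the post's rough ~7 B), not a measured constant.
func bytesAtRest(eventsPerMonth, months, bytesPerEvent int64) int64 {
	return eventsPerMonth * months * bytesPerEvent
}

func main() {
	// Hypothetical monthly volumes; swap in your own numbers.
	for _, perMonth := range []int64{1_000_000, 3_000_000, 10_000_000} {
		b := bytesAtRest(perMonth, 12, 7)
		fmt.Printf("%d events/month -> %.0f MB per year\n", perMonth, float64(b)/1e6)
	}
}
```

that works out to 84 MB, 252 MB, and 840 MB per year respectively, which is why disk size rarely drives the choice of box here.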

because the format is documented with compatibility guarantees and export is one call to CSV or JSONL, "your data" is not trapped in a proprietary store. you can read it, move it, or leave.

what will one-binary analytics never do? (the honest never-list)

this is the tradeoff, stated plainly, so you can rule it out fast if it is wrong for you. the never-list is in the README, not buried:

no session replay. other tools do that well; it is a different data shape.
no feature flags. that is a config system, not analytics.
no A/B testing. same reason.
no surveys.
no data warehouses. this is not a warehouse; it answers questions about your events.
no multi-node, clustering, or HA. exactly one writer per instance, one binary.

bundling those is precisely how you end up needing a Kafka topic and a pricing calculator to self-host. keeping them out is what lets this stay one file on your box. if any single item on that list is a hard requirement for you, pick a heavier tool. that is the correct call, and no amount of "but it's one binary" changes it.

when should you self-host analytics this way?

self-host this when you want to own the data outright, run analytics on a box you already have, and never think about a storage tier again. it is a strong fit for an indie running a portfolio of products (one instance, unlimited sites, a morning brief across all of them) or a team that wants product analytics without an ops burden. to compare it against what you run now, see the smolanalytics vs Plausible and smolanalytics vs Umami pages.

do not self-host this if you need any never-list feature, or if you need a distributed platform ingesting a firehose across many nodes; that is not what one writer per instance is for.

self-hosting is free forever, MIT, with no feature gates. the binary you run in dev is the whole product, not a demo of a paid cloud. if you would rather not run a server at all, the cloud is a 14-day full trial then $9/month, same product, managed.

the code, the storage design, and the never-list are all at github.com/Arjun0606/smolanalytics. docker run and it is up in 30 seconds, or try the cloud and skip the server entirely.
