How our NL Holdem bot stack is layered

Answer first. Our stack has three layers: a decision engine that chooses the action, an execution layer that performs that action on a real client, and a telemetry-and-monitoring layer that decides whether the other two should be allowed to keep working. Ops and reconciliation sit alongside the three and take a share of engineering time, but they are not a layer of their own. We will not name vendors, libraries, or internal tools. We will describe what each layer does and how it fails, because that is the part that matters to the operators we want to talk to.

+----------------------+---------+
| DECISION ENGINE      | ~20%    |
+----------------------+---------+
| EXECUTION LAYER      | ~35%    |
+----------------------+---------+
| TELEMETRY + MON.     | ~30%    |
+----------------------+---------+
| OPS + RECONCILE      | ~15%    |
+----------------------+---------+
Percent of our engineering time, ballpark.

Layer 1 — Decision engine

The decision engine maps an observed game state to an action distribution. Internally it is a hybrid: a baseline policy trained from solver outputs and a runtime exploitative head that adjusts sizing and frequency against the observed pool. We retrain the baseline on a calendar, not on intuition.

Failure mode we have actually hit: a baseline that was overfit to one platform's pool composition. It played beautifully there and bled mid-stakes equity on a second platform until we caught it three weeks in.

Layer 2 — Execution layer

The execution layer is what turns "raise to 7.2 BB" into actual mouse-and-keyboard activity on a real client. This is where most operators underestimate the work. Timing distribution, motion path, click-down latency, occasional misclicks-and-corrects — all of it has to live inside a believable envelope, and that envelope has to be different per platform.

Failure mode we have actually hit: a near-identical timing distribution across the fleet. Statistically beautiful, behaviourally unique — a single platform-side query found the cluster in an afternoon. The fix was per-agent envelope parameters, sampled at agent creation and frozen.

Layer 3 — Telemetry and monitoring

The telemetry layer is the answer to the question "is anything wrong right now." It watches per-agent win-rate against expectation, per-room anomaly proxies, lag on the client, and the gap between intended actions and observed-on-screen actions. It feeds the kill-switch.
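One of the signals above, win-rate against expectation, reduces to a standard-error check. The following is a minimal sketch of that arithmetic only, not a description of our monitoring. The class name, the bb/100 units, the standard deviation, and the threshold are all hypothetical placeholders.

```python
import math
from dataclasses import dataclass


@dataclass
class WinRateCheck:
    """Flags a win-rate that sits too far from expectation (illustrative only)."""

    expected_bb_per_100: float  # hypothetical expected win-rate, big blinds per 100 hands
    stdev_bb_per_100: float     # hypothetical standard deviation per 100-hand block
    z_threshold: float = 3.0    # hypothetical alert threshold, in standard errors

    def z_score(self, observed_bb_per_100: float, hands: int) -> float:
        """Return how many standard errors the observed rate is from expectation."""
        if hands <= 0:
            raise ValueError("hands must be positive")
        blocks = hands / 100.0
        # The standard error of the mean shrinks with the square root of the sample size.
        standard_error = self.stdev_bb_per_100 / math.sqrt(blocks)
        return (observed_bb_per_100 - self.expected_bb_per_100) / standard_error

    def is_anomalous(self, observed_bb_per_100: float, hands: int) -> bool:
        """True when the deviation exceeds the threshold in either direction."""
        return abs(self.z_score(observed_bb_per_100, hands)) > self.z_threshold
```

The check is two-sided on purpose: a result far above expectation is as much a reason to look as one far below it.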

Failure mode we have actually hit: a monitoring blind spot on graphical glitches. An agent's client rendered a frozen pot for forty minutes; the agent kept making correct decisions against a stale state. The fix was a screenshot-difference watchdog with an explicit timeout on frame staleness.
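The idea of a staleness timeout can be sketched generically. This is an illustration under stated assumptions, not our watchdog: the class name, the injected clock, and the use of an exact hash comparison are all hypothetical. A real screenshot-difference check would likely need to tolerate small pixel changes such as animations, which an exact hash does not.

```python
import hashlib
import time
from typing import Callable, Optional


class FrameStalenessWatchdog:
    """Reports staleness when consecutive frames stay identical for too long."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic):
        if timeout_s <= 0:
            raise ValueError("timeout_s must be positive")
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the timeout can be tested without sleeping
        self._last_digest: Optional[bytes] = None
        self._last_change: float = 0.0

    def observe(self, frame: bytes) -> bool:
        """Record one captured frame; return True if the display counts as stale."""
        now = self._clock()
        digest = hashlib.sha256(frame).digest()
        if digest != self._last_digest:
            # The frame changed (or this is the first frame): reset the staleness timer.
            self._last_digest = digest
            self._last_change = now
            return False
        # Same frame as before: stale once the timeout has elapsed since the last change.
        return (now - self._last_change) >= self._timeout_s
```

The point of the explicit timeout is that a stale display becomes a signal the supervisor can act on, rather than a condition that goes unnoticed while everything downstream looks healthy.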

What we do not publish

Library names, framework names, vendor names
The shape of our timing distributions or any concrete parameter
How agent envelopes are generated or refreshed
Any platform-specific details, including in private conversations, until trust is established

If a vendor publishes those details on a glossy page, assume the publication itself is a tell. Detection teams read sales pages.

What we will discuss

Layered architecture, separation of concerns, supervisor design
Recovery procedures after a detection wave
Reconciliation cadence and tolerance windows
How we structure retraining cycles and what triggers an off-cycle retrain

Why we wrote this page

Operators who arrive with a serious problem usually want to know two things: do we understand the layering, and do we have stories about each layer failing. The answer to both is yes, and this page is the polite version of that conversation.

If you would like the impolite version, with specifics, on a private channel:

Deal us in
