The Complete Guide to Building an AI Poker Bot in 2026
You can ship a working AI poker bot in under 50 lines of Python. It won't beat a real opponent yet, but it'll connect, sit at a table, and play hands. From there, the path to a competitive bot is mostly engineering — not a research breakthrough.
This guide walks the whole journey: the four building blocks every poker bot needs, which open-source framework to start with, how the decision loop works, the math behind equity and ranges, where you can actually test against real opponents in 2026, and what's legal vs. not.
Disclosure: I'm the founder of openpoker.ai, a free AI-vs-AI Hold'em arena. I link to it where it's the right answer; the rest of this guide is framework-agnostic.
Key Takeaways
- A poker bot has four parts: hand evaluation, table-state parsing, a decision policy, and a way to test. Get those right and the rest is iteration.
- PokerKit (uoftcprg) is the most practical 2026 starting point — pure Python, MIT-licensed, 99% test coverage. OpenSpiel is for research; RLCard for academic RL.
- Real-money sites like PokerStars explicitly forbid bots (PokerStars ToS) — test against AI-only arenas instead.
- Don't start with deep learning. A tight heuristic bot is faster to ship, easier to debug, and competitive against fancier methods until you've exhausted the simple wins.
What is an AI poker bot, exactly?
An AI poker bot is a program that plays poker autonomously: it connects to a game, reads the table state, decides on an action, and submits it — hand after hand, without a human in the loop. Modern poker bots crossed the human-pro threshold in two waves: heads-up first (DeepStack and Libratus, both 2017), then six-player in 2019 with Pluribus by Brown and Sandholm at Carnegie Mellon and Facebook AI (Science, 2019). What made these systems milestones wasn't raw compute — it was that imperfect-information games are genuinely hard.
Chess and Go are perfect-information games: every piece's position is visible, so a strong evaluator plus search wins. Poker hides cards. You can't compute "the best move" because the best move depends on what your opponent might be holding and what they think you might be holding and on bluffs that have no objective right answer. That's why a checkers engine from the 1990s is superhuman, but practical poker bots only emerged in the late 2010s.
Every poker bot — from a 50-line script to Pluribus — has the same four building blocks:
- Hand evaluation — given seven cards, what's the best 5-card poker hand and how does it compare to any other?
- Table-state parsing — read incoming messages and turn them into something your code can reason about (your hole cards, the board, pot size, who's left to act, stack sizes).
- Decision policy — given the parsed state, pick an action: fold, check, call, bet, or raise.
- A way to test — self-play, simulators, or real opponents, in increasing order of usefulness.
The first two are solved problems. The third is where the interesting work happens. The fourth determines whether you actually improve.
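In code, the four blocks map onto something like this minimal skeleton. The names and the stubbed-out evaluator are illustrative, not from any particular library:

```python
from dataclasses import dataclass, field

@dataclass
class TableState:
    """Block 2 -- the parsed view of the table your bot maintains."""
    hole_cards: list = field(default_factory=list)
    board: list = field(default_factory=list)
    pot: int = 0
    to_call: int = 0

def hand_strength(hole, board):
    """Block 1 -- hand evaluation, stubbed as a constant here.
    A real bot wraps a library evaluator (e.g. PokerKit)."""
    return 0.5

def decide(state):
    """Block 3 -- the decision policy, the part worth iterating on."""
    if state.to_call == 0:
        return "check"
    strength = hand_strength(state.hole_cards, state.board)
    return "call" if strength >= 0.5 else "fold"

# Block 4 -- testing: even trivial assertions catch regressions early.
assert decide(TableState()) == "check"
assert decide(TableState(to_call=10)) == "call"
```

Replace the stub and the one-liner policy and you have a real bot; the shape doesn't change.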
Note on terminology: a bot plays autonomously in real time. A solver like PioSolver or GTO+ computes a strategy offline against a fixed model and is used by humans to study. They share math but solve different problems.
For the math behind hand-strength estimation, see our poker math for bots deep-dive.
Which framework should you use?
Start with PokerKit if you want to ship a bot. Use OpenSpiel for research, RLCard for academic RL experiments, and read the Pluribus paper for inspiration only — it isn't a library. The choice almost always comes down to what you're optimizing for: production-friendly card math (PokerKit), multi-game RL infrastructure (OpenSpiel), or a fast on-ramp to RL papers (RLCard).
For most people building a poker bot in 2026, start with PokerKit. It's pure Python, MIT-licensed, supports an extensive set of poker variants, and ships with high-speed hand evaluation that's 99% test-covered (PokerKit on GitHub). It's maintained by the University of Toronto's Computer Poker Research Group and is the most production-friendly option for someone who wants to write game logic without re-implementing card math.
If you're doing research — or you want a multi-game testbed for reinforcement-learning experiments — use OpenSpiel from Google DeepMind (OpenSpiel on GitHub). It bundles dozens of games with reference implementations of CFR (counterfactual regret minimization) and REINFORCE for Kuhn poker, Leduc poker, and Goofspiel. The poker variants are abstracted, which is great for research and frustrating for practical play.
RLCard from Rice University's DATA Lab (originally at Texas A&M) is the third major option (RLCard on GitHub) — focused on RL in card games (Blackjack, Leduc, Texas, Mahjong, DouDizhu, UNO). It's designed for trying RL algorithms, not for shipping a bot.
Pluribus itself, despite being the most famous AI poker system, isn't a library — it's a research paper. You can't pip install pluribus. The closest reproductions are research code, often Python 3.7-era and unmaintained.
| Framework | Best for | Language | License | Maintained by |
|---|---|---|---|---|
| PokerKit | Production bots, hand evaluation, simulating any variant | Python (3.11+) | MIT | UofT CPR Group |
| OpenSpiel | RL research across many games (incl. abstracted poker) | C++ + Python | Apache 2.0 | Google DeepMind |
| RLCard | Academic RL experiments on card games | Python | MIT | Rice/TAMU DATA Lab |
| Pluribus | Reading the paper. It's not a library. | n/a | — | Brown & Sandholm |
For a deeper look at how the live-arena platforms compare, see our AI poker platform comparison and the dedicated breakdowns of openpoker vs Pluribus, vs OpenSpiel, and vs RLCard.
How do you write the decision loop?
Every poker bot is the same loop:
```python
while True:
    msg = receive_message()
    if msg["type"] == "your_turn":
        action = decide(state)
        send_action(action, turn_token=msg["turn_token"])
```

That's it. The interesting code lives inside `decide`. Everything else is plumbing.
Most modern AI poker arenas — including openpoker.ai — use WebSockets so the server can push state updates without your bot polling. JSON messages in, JSON actions out. The server sends a stream of typed messages: connected, lobby_joined, hand_start, hole_cards (your private cards, kept off the broadcast stream), community_cards, your_turn (your move; includes the legal actions and a one-shot turn_token), player_action, hand_result. Your job is to listen, track state across messages, and respond when it's your turn.
Here's a minimal bot that connects, sits, and plays a "calling station" strategy — check when free, call otherwise, fold only when there's no other option. It's bad poker, but it's a complete event loop:
```python
import asyncio, json, websockets

API_KEY = "your-api-key-here"
WS_URL = "wss://openpoker.ai/ws"

async def play():
    headers = {"Authorization": f"Bearer {API_KEY}"}
    async with websockets.connect(WS_URL, additional_headers=headers) as ws:
        await ws.send(json.dumps({"type": "join_lobby", "buy_in": 2000}))
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("type") != "your_turn":
                continue
            valid = {a["action"] for a in msg["valid_actions"]}
            choice = "check" if "check" in valid else ("call" if "call" in valid else "fold")
            await ws.send(json.dumps({
                "type": "action",
                "action": choice,
                "turn_token": msg["turn_token"],
            }))

asyncio.run(play())
```

A few things worth flagging in that loop:
- `turn_token` is anti-replay. Every `your_turn` message ships a fresh token; you must echo it back in your action. Stale tokens get rejected.
- Hole cards arrive separately in a `hole_cards` message before `your_turn`. To use them in `decide`, your bot has to keep state across messages, which means any bot more complex than the example above is a class with a state object, not a single function.
- Latency budget is real. Most arenas treat a missed deadline as an automatic fold; aim for sub-second decisions and reconnect logic for dropped sockets.
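Here's a sketch of that stateful shape. The message and field names follow the example loop above and are assumptions about the arena's protocol, not a published spec:

```python
class Bot:
    """Keeps state across messages so the decision logic can see hole cards."""

    def __init__(self):
        self.hole_cards = []
        self.board = []

    def handle(self, msg):
        """Feed every incoming message in; get an action dict back on our turn."""
        kind = msg.get("type")
        if kind == "hand_start":
            self.hole_cards, self.board = [], []    # reset per hand
        elif kind == "hole_cards":
            self.hole_cards = msg["cards"]          # arrives before your_turn
        elif kind == "community_cards":
            self.board = msg["cards"]
        elif kind == "your_turn":
            valid = {a["action"] for a in msg["valid_actions"]}
            choice = "check" if "check" in valid else (
                "call" if "call" in valid else "fold")
            return {"type": "action", "action": choice,
                    "turn_token": msg["turn_token"]}  # echo the one-shot token
        return None
```

The event loop shrinks to "for each message, `send` whatever `bot.handle(msg)` returns, if anything", and all future intelligence goes into the class.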
For the full hello-world walkthrough including auto-rebuy and message taxonomy, see Build a Poker Bot in Python in Under 50 Lines. For the failure modes the loop will hit in production, see debugging WebSocket errors and why bots time out.
How does the bot decide what to do?
Decision policies fall into three families: heuristic (hand-coded rules), LLM-prompted (wrap a model around the table state), and learned (CFR, deep CFR, RL). They differ on three axes — engineering cost, runtime cost per hand, and worst-case failure mode. Heuristics are cheapest and most predictable; LLMs are easiest to start; learned policies are the strongest in theory and the most expensive in practice.
Each family has a very different cost/payoff curve:
Heuristic bots
A set of hand-coded rules: "raise X% with hands in the top Y of my range from late position". You can write a competent heuristic bot in an afternoon. They're fast, cheap to run per hand, deterministic, and easy to debug — properties that matter more than people expect when you're losing chips and trying to figure out why.
LLM-prompted bots
Wrap a model like Claude or GPT-4 with a system prompt that describes the rules of poker, then feed it the table state every hand. They're easy to build and surprisingly bad at first. The failure modes are predictable:
- Latency — even fast models take seconds per call, which eats your decision budget.
- Hallucination — LLMs occasionally invent rules, bet the wrong sizes, or call when they meant to fold.
- No state — without explicit memory plumbing, an LLM forgets what your opponent did three hands ago.
LLMs can work, but they need scaffolding: deterministic equity calculation, programmatic bet-sizing, and a tightly constrained prompt that asks for fold, call, or raise with no free-form reasoning. At that point you've built a hybrid — most of the bot is code, with the model only handling the "hard read" cases.
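A sketch of that hybrid shape, with a placeholder `ask_model` standing in for a real LLM client (hypothetical; swap in your provider's SDK):

```python
def ask_model(prompt):
    """Placeholder for a real LLM call (hypothetical; swap in your SDK)."""
    return "call"

def llm_decide(state, valid_actions):
    """Hybrid policy: code computes the numbers, the model only picks among
    pre-validated options, and anything malformed falls back to a safe default."""
    prompt = (
        f"Hole: {state['hole_cards']}, board: {state['board']}, "
        f"pot: {state['pot']}, to call: {state['to_call']}. "
        f"Answer with exactly one word from: {sorted(valid_actions)}."
    )
    answer = ask_model(prompt).strip().lower()
    if answer not in valid_actions:                  # hallucination guard
        answer = "check" if "check" in valid_actions else "fold"
    return answer
```

The guard is the point: the model can say whatever it wants, but only a legal action ever reaches the table.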
Learned policies (CFR, deep CFR, RL)
This is the academic path: counterfactual regret minimization (the algorithm behind Pluribus) or its deep-learning variants. The math is approachable; the engineering isn't. You'll spend more time on infrastructure than on poker. Pluribus computed its blueprint in eight days using 12,400 core-hours and only 28 cores during live play (CMU News, 2019). That's mid-cloud-spend territory, not laptop territory.
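The core update inside CFR is regret matching: track how much better each action would have done than the action you took, then play actions in proportion to positive regret. A self-contained toy demo on rock-paper-scissors (a stand-in for illustration, nothing like full Pluribus), where the average strategy converges toward the uniform equilibrium:

```python
import random

def regret_matching(regrets):
    """Play each action in proportion to its positive regret."""
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1 / len(regrets)] * len(regrets)

# Rock-paper-scissors payoffs for the row player.
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]

def train(iters=20000, seed=0):
    rng = random.Random(seed)
    regrets, strategy_sum = [0.0] * 3, [0.0] * 3
    for _ in range(iters):
        strategy = regret_matching(regrets)
        for a in range(3):
            strategy_sum[a] += strategy[a]
        me = rng.choices(range(3), weights=strategy)[0]
        opp = rng.choices(range(3), weights=strategy)[0]   # mirror self-play
        for a in range(3):  # regret: how much better action a would have done
            regrets[a] += PAYOFF[a][opp] - PAYOFF[me][opp]
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]  # average strategy ~ equilibrium
```

CFR applies this same update at every information set of the full game tree, which is exactly where the engineering and compute costs explode.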
A simple heuristic decision tree, in Python, using the your_turn payload and the running state your bot maintains from earlier messages:
```python
def decide(state):
    """Return one of {fold, check, call, raise}."""
    pot = state["pot"]
    to_call = state["to_call"]  # tracked from table_state / your_turn
    stack = state["my_stack"]
    strength = hand_strength(state["hole_cards"], state["board"])  # 0..1
    pot_odds = to_call / (pot + to_call) if to_call else 0.0
    valid = state["valid_actions"]  # set: {"fold", "check", "call", "raise"}

    if strength > 0.85 and "raise" in valid:
        return ("raise", min(pot, stack))  # pot-sized bet, capped by stack
    if strength > 0.55 and to_call <= pot * 0.5 and "call" in valid:
        return ("call", None)
    if "check" in valid:
        return ("check", None)
    return ("call", None) if strength >= pot_odds and "call" in valid else ("fold", None)
```

Five lines of real logic and you're already ahead of any bot that plays only aces. The visual below shows how the same conditions route a hand to fold/call/raise:
A heuristic decision tree mapping hand strength and pot odds to a legal action. The same logic in code is shown above.
A note on what wins in AI-vs-AI arenas — well-tuned heuristic bots are surprisingly hard to beat. The combination of zero hallucination, sub-millisecond decisions, and ranges that have actually been tested against the field outperforms most quick LLM wraparounds. We see this regularly on openpoker.ai's leaderboard, where simple bots routinely sit alongside (or above) sophisticated ones.
For more on each approach, see our deep-dives on poker bot betting strategy, using an LLM as the decision policy, and opponent modeling.
How do you handle equity, ranges, and pot odds in code?
Three concepts cover 90% of practical poker math: equity (the probability your hand wins by showdown), ranges (the set of hands an opponent could plausibly hold), and pot odds (the break-even call percentage given the price you're offered).
For equity, the easiest correct approach is Monte Carlo: deal random opponents, run out the board, count wins. PokerKit gives you the hand evaluator; you write the loop:
```python
import random
from pokerkit import StandardHighHand

def equity(hole, board, n_opponents=1, trials=2000):
    deck = [c for c in StandardHighHand.deck() if c not in hole + board]
    wins = 0.0
    for _ in range(trials):
        random.shuffle(deck)
        opp = deck[:2 * n_opponents]
        runout = deck[2 * n_opponents : 2 * n_opponents + (5 - len(board))]
        my = StandardHighHand.from_game(hole, board + runout)
        opp_hands = [
            StandardHighHand.from_game(opp[i:i + 2], board + runout)
            for i in range(0, 2 * n_opponents, 2)
        ]
        best_opp = max(opp_hands)
        if my > best_opp:
            wins += 1
        elif my == best_opp:
            wins += 0.5  # ties are split pots, not losses
    return wins / trials
```

Two thousand trials is enough for percentage-point precision — fast enough to run on every decision.
A range is just a set of hands. The most common shorthand is poker notation like "AA, KK, AKs, AKo, AQs+" — a string humans read at a glance. Programmatically, you expand it into the explicit set of hand combos and use it for opponent modeling: instead of "what's my equity vs. their hand", compute "what's my equity vs. every hand in their plausible range, weighted by frequency".
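A minimal expansion routine, assuming two-rank tags and leaving out the "+" extension for brevity:

```python
from itertools import combinations

SUITS = "shdc"

def expand(tag):
    """Expand one shorthand tag ('AA', 'AKs', 'AKo', 'AK') into explicit combos."""
    hi, lo = tag[0], tag[1]
    if hi == lo:  # pocket pair: C(4,2) = 6 combos
        return [(hi + a, lo + b) for a, b in combinations(SUITS, 2)]
    suited, offsuit = tag.endswith("s"), tag.endswith("o")
    return [(hi + a, lo + b)
            for a in SUITS for b in SUITS
            if not (suited and a != b) and not (offsuit and a == b)]

def expand_range(spec):
    """'AA, KK, AKs' -> every combo in the range."""
    return [c for tag in spec.replace(" ", "").split(",") for c in expand(tag)]
```

The combo counts fall out of the suit arithmetic: a pocket pair is 6 combos, a suited hand 4, an offsuit hand 12, and an unqualified two-rank tag all 16.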
Pot odds are arithmetic, not statistics: if there's $80 in the pot and someone bets $20, you call $20 to win $100. You need 16.7% equity to break even. Anything less and folding is correct; anything more and the call is mathematically right over the long run, regardless of this particular hand's outcome.
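That arithmetic in code:

```python
def pot_odds(pot, to_call):
    """Break-even equity for a call, where `pot` already includes the bet."""
    return to_call / (pot + to_call)

# The example from the text: an $80 pot plus a $20 bet is a $100 pot;
# calling $20 into it needs 20 / 120 = 16.7% equity to break even.
assert round(pot_odds(100, 20), 3) == 0.167
```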
The interesting realization: most "good" poker decisions are equity vs. pot odds. Most "great" decisions are equity vs. an opponent-aware range. The gap between the two is where bot improvement lives.
For the full math, see poker math for bots.
Where do you actually test the bot?
Test on three layers: unit tests for hand evaluation, self-play simulators for sanity checks, and live competition for the truth. Each layer catches different bugs and none of them is optional. Skipping unit tests gives you off-by-one bugs in showdowns; skipping self-play hides crashes; skipping live play gives you a bot that beats itself but loses to anything that didn't read your code.
Taking each layer in turn:
Unit tests, on hand evaluation and edge cases (split pots, all-ins, side pots, kickers). PokerKit's test suite is a good template; copy its style.
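A concrete flavor of what belongs in this layer: tests that pin down edge cases like the exact break-even call. The `should_call` helper here is illustrative, not from PokerKit:

```python
def should_call(strength, pot, to_call):
    """Call when estimated equity meets the break-even price."""
    if to_call == 0:
        return True  # a free option is never wrong to take
    return strength >= to_call / (pot + to_call)

# Edge cases worth pinning down before they cost chips at the table:
assert should_call(0.167, 100, 20)     # just above break-even (20/120)
assert not should_call(0.10, 100, 20)  # priced out
assert should_call(0.0, 100, 0)        # nothing to call
```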
Self-play simulators, where five copies of your bot face off and you log who wins. This catches absurdities — a bot that folds AA, a bug that always raises 0 chips. It does not prove your bot is good, because you're playing yourself. A bot that beats itself by always going all-in pre-flop will look strong in self-play and lose immediately to anything that folds.
Live competition is the truth. The only way to know whether your decision logic is real is to put it against opponents who didn't read your code. That's what arenas like openpoker.ai exist for — your bot connects, sits, plays continuously across 2-week seasons, and a leaderboard tells you whether your last change was an improvement.
The improvement curve typically looks like this. The shape is consistent across builders we've watched, even if the absolute timing varies:
| Week | What you'll be doing |
|---|---|
| 1 | Get hello-world to connect, sit, and not crash. Most of the work is plumbing. |
| 2–3 | First real decide function — heuristic with pot odds. Beat the random-player baseline. |
| 4–6 | Position-aware ranges, opponent-frequency tracking. Climb mid-leaderboard. |
| 7–12 | Hybrid (LLM or learned model) for tough spots. Tune stack-size handling. Top-quartile. |
| 12+ | Domain knowledge starts to matter more than method. Patience and tuning beat novelty. |
A typical leaderboard percentile trajectory by week. Most improvement comes from heuristic refinement and opponent modeling, not framework upgrades.
From the platform — the builders who plateau fastest are usually the ones who started with deep RL instead of a simple heuristic. A working calling-station bot in week one is more valuable than a half-finished CFR implementation in week six.
See also: how seasons work on openpoker.ai, leaderboard scoring, stack management, and zero to leaderboard in 7 days.
Is it legal to build and run a poker bot?
Building a poker bot is legal everywhere. Whether you can legally run one depends entirely on where you run it.
Real-money commercial sites prohibit them, explicitly. PokerStars' Terms of Service state that "the use of artificial intelligence including, without limitation, 'robots' is strictly forbidden in connection with the Service," and require that all actions be executed personally by players through the user interface (PokerStars ToS). Enforcement is real — PokerStars has confiscated millions from accounts caught running bots. GGPoker, partypoker, WSOP.com, and every other major real-money site carry similar clauses. Get caught, lose your bankroll, get banned.
AI-only competitive arenas like openpoker.ai are designed for bots and don't exist in conflict with site ToS — there's no real money on the table, no humans to defraud, and the rules of the game are explicitly "this is bots playing bots". That's the legitimate venue for testing.
Private home games vary by jurisdiction and the rules of your specific game. If everyone at the table knows you're a bot and they're fine with it, it's a research project, not cheating.
Research contexts — academic publications, university competitions, papers like Pluribus — are universally accepted. The line that gets you in trouble is taking money from humans who don't know they're playing a machine.
The shortest correct answer: build whatever you want, but only run it where bots are allowed.
How do you go from "hello world" to "actually competitive"?
The fastest path to a competitive bot in 2026 is a tight, opinionated 7-day plan, not an open-ended research project. Here's the version that works:
- Day 1 — Get the minimum viable bot connecting and not crashing. Use the 47-line starter; copy, paste, run.
- Day 2 — Replace the "fold everything" logic with a real `decide` function: hand strength, pot odds, position. Test in self-play, then put it on the live arena.
- Day 3 — Add a Monte-Carlo equity calculator for post-flop decisions.
- Day 4 — Track your opponents — fold-to-3bet, c-bet frequency, basic VPIP. Use those numbers to adjust your ranges.
- Day 5 — Stress-test for time-outs and disconnects. Make sure your bot reconnects gracefully.
- Day 6 — Watch the top three bots on the leaderboard play. Their hand histories are public. Find the one strategic gap they share.
- Day 7 — Implement the gap. Re-deploy. Watch the leaderboard.
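For Day 4, a minimal frequency tracker might look like this. The field names and the one-decision-per-hand bookkeeping are illustrative; a real arena reports actions in its own message format:

```python
from collections import defaultdict

class OpponentStats:
    """Per-opponent frequency counters; assumes one preflop decision per hand."""

    def __init__(self):
        self.hands = defaultdict(int)
        self.vpip = defaultdict(int)  # voluntarily put chips in pot
        self.faced_3bet = defaultdict(int)
        self.folded_to_3bet = defaultdict(int)

    def record_preflop(self, player, action, facing_3bet=False):
        self.hands[player] += 1
        if action in ("call", "raise"):
            self.vpip[player] += 1
        if facing_3bet:
            self.faced_3bet[player] += 1
            if action == "fold":
                self.folded_to_3bet[player] += 1

    def vpip_rate(self, player):
        n = self.hands[player]
        return self.vpip[player] / n if n else 0.0

    def fold_to_3bet_rate(self, player):
        n = self.faced_3bet[player]
        return self.folded_to_3bet[player] / n if n else 0.0
```

An opponent whose fold-to-3bet rate climbs past roughly 70% is a prime target for lighter 3-bets; that kind of adjustment is exactly what Day 4 buys you.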
The single biggest mistake new builders make is over-optimizing for one specific opponent — usually whoever is in first place this week. The leaderboard composition changes every season. Building a bot that beats everyone consistently is a different problem than beating one bot once.
If you want a place to do all of this for free, with real opponents and real seasons: openpoker.ai. Sign up, upload, watch your bot play. That's the whole loop.
For the full week-one playbook, see zero to leaderboard in 7 days. For the why-we-built-this background, see why we built openpoker.ai.
Frequently Asked Questions
Can AI play poker better than humans?
Yes — at six-player No-Limit Hold'em specifically, since 2019. Pluribus, developed by Brown and Sandholm at CMU and Facebook AI, beat a panel of elite human pros over 10,000 hands (Science, 2019). It's now widely accepted that strong AI exceeds top human play in standard poker variants — though it requires significant compute to train.
What's the best programming language for a poker bot?
Python, in 2026 — for the same reasons it dominates ML: PokerKit, OpenSpiel, and RLCard all have Python interfaces, and the ML ecosystem (PyTorch, JAX, Hugging Face) is Python-first. Go and Rust are fine for the I/O layer if you need raw throughput, but they lack first-class poker libraries today.
How long does it take to build a winning poker bot?
About a day for a working bot, a week for a competent heuristic one, and several months to top a competitive leaderboard. The biggest predictor of success isn't framework choice — it's how often you put your bot in front of real opponents and iterate based on what you see.
Do I really need machine learning?
No. The most reliable wins on AI-vs-AI leaderboards come from well-tuned heuristic bots, not LLMs and not deep nets. ML helps when you've exhausted heuristic improvements and want to push past a plateau — typically several weeks in. Start without it.
Where can I test my bot against real opponents?
Free AI-vs-AI arenas like openpoker.ai. OpenSpiel and RLCard are simulation-only — useful for self-play training, useless for adversarial truth. Real-money sites prohibit bots (PokerStars ToS) and aren't a viable test environment regardless of your skill level.
Conclusion
If you take one thing from this guide: a poker bot is just while True: decide(state). Everything else is engineering you can pick up as you go. The simple bot you ship on day one is more valuable than the deep-RL bot you almost ship on day ninety.
The four building blocks — hand evaluation, table-state parsing, decision policy, and a way to test — map directly onto four open-source tools and one habit (deploying often). PokerKit handles the math. WebSocket clients handle the I/O. Your decide function is where the craft lives. And a live arena is where you find out whether any of it works.
When you're ready to put your bot in front of real opponents, openpoker.ai is the free, AI-only platform built for exactly this — 2-week seasons, public leaderboard, no real money, no ToS violations. Sign up, upload, iterate.
Last updated 2026-05-04. Next scheduled refresh: 2026-06-04.