Use Claude or GPT-4 as Your Poker Bot's Brain (Runnable Code)

João Carvalho | April 4, 2026 | 6 min read

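You can wire Claude or GPT-4 into a poker bot in about 80 lines of Python. The LLM reads the your_turn message and decides what to do, and your bot executes the action. With Claude Haiku it costs about $0.30 per 100 hands, takes 600-900ms per decision, and comfortably beats a calling station. It won't beat a tuned heuristic bot, but it is the fastest path to a working decision engine.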

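Why use an LLM as a poker bot's decision engine?

Three reasons, in order of importance.

Iteration speed. A heuristic bot takes weeks of tuning. An LLM bot needs only a prompt. Your iteration loop is "edit text, restart bot" rather than "edit code, deploy, collect data, repeat." For early development, that is a 10x speedup.

Natural-language reasoning about novel spots. Poker has a long tail of situations that heuristic bots handle poorly. LLMs have read enough poker content to make reasonable decisions in spots your hard-coded logic never anticipated.

A free baseline boost. Modern LLMs have enough poker strategy in their training data to play at a "competent intermediate" level out of the box. For $0.003 per decision you get to leverage other people's strategy work.

Caveat: LLMs are slow (600-1500ms per decision), expensive at scale, and less precise than a tuned heuristic bot. Treat this as a starting point, not an end point.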

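What is the minimal LLM bot setup?

Three parts: an Open Poker WebSocket connection, an LLM API client, and a prompt that turns the your_turn message into a question the model can answer.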

pip install websockets anthropic

Set two environment variables: OPEN_POKER_API_KEY and ANTHROPIC_API_KEY. The complete bot:

import asyncio
import json
import os
import websockets
from anthropic import AsyncAnthropic
 
API_KEY = os.environ["OPEN_POKER_API_KEY"]
WS_URL = "wss://openpoker.ai/ws"
client = AsyncAnthropic()
 
PROMPT = """You are playing 6-max No-Limit Hold'em at 10/20 blinds.
Decide what action to take based on the game state below.
 
Your hole cards: {hole_cards}
Community cards: {community_cards}
Pot size: {pot}
Your stack: {my_stack}
Your current bet: {my_bet}
Position (0=BTN, 1=SB, 2=BB, 3=UTG, etc): {seat}
Valid actions: {valid_actions}
 
Respond with ONLY a JSON object: {{"action": "fold|check|call|raise|all_in", "amount": <int or 0>}}
For raise, amount is the raise-to total (not increment). For check/call/fold, amount is 0.
"""
 
async def decide_action(state, hole_cards):
    prompt = PROMPT.format(
        hole_cards=hole_cards or "unknown",
        community_cards=state.get("community_cards", []),
        pot=state.get("pot", 0),
        my_stack=state.get("my_stack", 0),
        my_bet=state.get("my_bet", 0),
        seat=state.get("seat", -1),
        valid_actions=state.get("valid_actions", []),
    )
    msg = await client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=100,
        messages=[{"role": "user", "content": prompt}],
    )
    text = msg.content[0].text.strip()
    return json.loads(text)
 
async def play():
    headers = {"Authorization": f"Bearer {API_KEY}"}
    hole = None
    async with websockets.connect(WS_URL, additional_headers=headers) as ws:
        await ws.send(json.dumps({"type": "set_auto_rebuy", "enabled": True}))
        await ws.send(json.dumps({"type": "join_lobby", "buy_in": 2000}))
 
        async for raw in ws:
            msg = json.loads(raw)
            t = msg.get("type")
 
            if t == "hole_cards":
                hole = msg["cards"]
            elif t == "your_turn":
                decision = await decide_action(msg, hole)
                await ws.send(json.dumps({
                    "type": "action",
                    "action": decision["action"],
                    "amount": decision.get("amount", 0),
                    "client_action_id": f"a-{msg['turn_token'][:8]}",
                    "turn_token": msg["turn_token"],
                }))
            elif t in ("table_closed", "season_ended"):
                await ws.send(json.dumps({"type": "join_lobby", "buy_in": 2000}))
 
asyncio.run(play())

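Which LLM should you choose?

Model	Cost per 100 hands	Median latency	Strength
Claude Haiku 4.5	~$0.30	600ms	Solid intermediate
Claude Sonnet 4.5	~$1.50	900ms	Strong, handles edge cases
GPT-4o-mini	~$0.40	700ms	Comparable to Haiku

Use Claude Haiku 4.5 for your first bot. It is fast, cheap, and good enough to beat the calling station baseline.

Open Poker's 120-second action timeout means even slow models are usable. See the action timeout documentation for details.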
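Even with a generous server-side timeout, it can help to put your own deadline on the LLM call so that a hung request never costs you the hand. The sketch below is one way to do that with asyncio.wait_for. The 10-second deadline, the helper names, and the "check if offered, otherwise fold" fallback are assumptions for illustration, not part of the Open Poker protocol.

```python
import asyncio

def safe_default(state):
    """Pick a free action when the model cannot answer: check if offered, else fold."""
    offered = {
        a.get("action")
        for a in state.get("valid_actions", [])
        if isinstance(a, dict)
    }
    return {"action": "check" if "check" in offered else "fold", "amount": 0}

async def decide_with_deadline(decide, state, hole_cards, deadline_s=10.0):
    """Await decide(state, hole_cards), falling back to safe_default on timeout or error.

    `decide` is any coroutine function with the same shape as decide_action.
    The 10-second default is an arbitrary choice, well inside the 120-second limit.
    """
    try:
        return await asyncio.wait_for(decide(state, hole_cards), timeout=deadline_s)
    except Exception:
        # Timeouts, API errors, and malformed JSON from the model all end up here.
        return safe_default(state)
```

In the bot loop you would call decide_with_deadline(decide_action, msg, hole) in place of decide_action(msg, hole).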

How do you write a prompt that actually works?

Include valid_actions verbatim. Pass the raw JSON.

Force JSON output and validate it before sending:
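One detail worth noting: formatting a Python list straight into the prompt with str.format produces Python's repr (single quotes), not JSON. A minimal sketch of serializing it explicitly is below. The helper name is hypothetical, and the "amount" field in the example is only illustrative, since the exact shape of valid_actions is not shown here.

```python
import json

def render_valid_actions(state):
    """Serialize valid_actions as compact JSON so the model sees the server's own format."""
    return json.dumps(state.get("valid_actions", []), separators=(",", ":"))

# Illustrative state only; the real fields come from the your_turn message.
example = {"valid_actions": [{"action": "call", "amount": 20}]}
print(str(example["valid_actions"]))   # [{'action': 'call', 'amount': 20}]
print(render_valid_actions(example))   # [{"action":"call","amount":20}]
```

You would then pass valid_actions=render_valid_actions(state) when filling in the prompt template.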

try:
    decision = json.loads(text)
    action = decision["action"]
    if action not in {"fold", "check", "call", "raise", "all_in"}:
        decision = {"action": "fold", "amount": 0}
except (json.JSONDecodeError, KeyError):
    decision = {"action": "fold", "amount": 0}
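In practice models sometimes wrap the JSON in prose or code fences, or pick an action that is not currently offered. A slightly more defensive version of the check above might look like the following sketch. It assumes valid_actions is a list of objects with an "action" key (as in the pre-filter later in this post); the function name and the fold-on-anything-odd policy are illustrative choices.

```python
import json
import re

ALLOWED = {"fold", "check", "call", "raise", "all_in"}
FOLD = {"action": "fold", "amount": 0}

def parse_decision(text, valid_actions):
    """Extract and validate a decision from raw model text; fold on any problem."""
    # Grab the outermost {...} so surrounding prose or code fences are ignored.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return dict(FOLD)
    try:
        decision = json.loads(match.group(0))
        action = decision["action"]
        amount = int(decision.get("amount", 0) or 0)
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return dict(FOLD)
    offered = {a.get("action") for a in valid_actions if isinstance(a, dict)}
    if not isinstance(action, str) or action not in ALLOWED or action not in offered:
        return dict(FOLD)
    # The prompt specifies amount 0 for fold/check/call; other amounts pass through.
    if action in {"fold", "check", "call"}:
        amount = 0
    return {"action": action, "amount": amount}
```

This does not check raise sizes against table limits, because the fields that would carry those limits are not shown in this post.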

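Give the model recent action history. Adding the last 5-10 player actions from the current hand significantly improves decision quality.

Leaderboard performance of the LLM bot

A Claude Haiku bot run for a full season as a benchmark:

3,200 hands played over 14 days
Final score: 7,800 chips (starting from a 5,000 baseline)
bb/100: about +1.4
Total LLM cost: $9.60

Biggest weakness: bet sizing. Biggest strength: adapting to novel spots.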
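A rolling buffer is one simple way to keep that history. The sketch below is a minimal version, with everything about it assumed for illustration: the class name, the (player, action, amount) shape, and the idea that your bot receives some message describing each player's action. Check the protocol documentation for the actual message type and fields.

```python
from collections import deque

class ActionHistory:
    """Keeps the most recent actions of the current hand for inclusion in the prompt."""

    def __init__(self, max_actions=10):
        # deque with maxlen silently drops the oldest entry once full.
        self._actions = deque(maxlen=max_actions)

    def new_hand(self):
        """Call when a new hand starts, e.g. when hole cards arrive."""
        self._actions.clear()

    def record(self, player, action, amount=0):
        self._actions.append((player, action, amount))

    def render(self):
        """One action per line, oldest first; 'none' when nothing has happened yet."""
        if not self._actions:
            return "none"
        return "\n".join(
            f"{player} {action} {amount}" if amount else f"{player} {action}"
            for player, action, amount in self._actions
        )
```

To use it, add a line such as "Recent actions:" followed by a {history} placeholder to the prompt template and fill it with history.render().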
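As a sanity check, the season totals line up with the per-100-hands cost quoted earlier. A small helper (the name is hypothetical) makes the arithmetic explicit:

```python
def cost_per_100_hands(total_cost_usd, hands):
    """Average LLM spend per 100 hands, from a season's totals."""
    return total_cost_usd / hands * 100

# Season figures from above: $9.60 over 3,200 hands.
print(round(cost_per_100_hands(9.60, 3200), 2))  # 0.3
```

That works out to roughly $0.003 per hand, or about $0.30 per 100 hands.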

Combining the LLM with heuristics

def is_trivial_spot(state, hole_cards):
    if not state.get("community_cards"):
        if hole_cards and rank_strength(hole_cards) < 0.15:
            return ("fold", 0)
    actions = {a["action"]: a for a in state.get("valid_actions", [])}
    if "check" in actions and len(actions) == 1:
        return ("check", 0)
    return None

In testing, this kind of pre-filter cut the LLM call rate by about 60%. Cost dropped from $9.60 per season to about $4.20.
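The pre-filter calls rank_strength, which is not defined in this post. The following is a deliberately crude stand-in so the snippet can run, not the scoring used in the test above. It assumes hole cards arrive as two-character strings such as "Ah" or "Td" (rank, then suit, with "T" for ten), and the weights are arbitrary.

```python
RANKS = "23456789TJQKA"

def rank_strength(hole_cards):
    """Crude preflop score in [0, 1]; higher is stronger. Illustrative only.

    Assumes two cards written as rank + suit, e.g. ["Ah", "Kd"].
    """
    (r1, s1), (r2, s2) = hole_cards
    hi, lo = sorted((RANKS.index(r1), RANKS.index(r2)), reverse=True)
    if hi == lo:
        # Pocket pairs: 0.5 for deuces up to 1.0 for aces.
        return 0.5 + 0.5 * hi / 12
    # Unpaired hands: weight the high card double, scaled so AK lands at 0.4.
    score = (2 * hi + lo) / 35 * 0.4
    if s1 == s2:
        score += 0.05  # small bonus for suited hands
    if hi - lo == 1:
        score += 0.03  # small bonus for connectors
    return min(score, 1.0)
```

With these weights, a hand like 7-2 offsuit scores below the 0.15 fold threshold, while any pocket pair scores at least 0.5. A real bot would want a proper preflop chart or equity table here.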

FAQ

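Can an LLM bot beat a tuned heuristic bot? Usually not. An LLM bot is faster to build and more flexible, but it is not the strongest approach.

How much does the LLM cost for a season? About $5-$10 for 3,000 hands on Claude Haiku. With heuristic pre-filtering, $2-$5.

Can the LLM see opponents' cards? No. The protocol ensures fair information.

Can I use a local LLM (Llama, Mistral)? Yes. 7B models are noticeably weaker. 70B+ models are competitive but costly to host.

An LLM bot is the fastest way to get a working decision engine on Open Poker. Register a bot, get a Claude API key, and you can have an LLM player running within the hour.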
