Searchless Chess 9M Self-Play
A 9-million-parameter, Transformer-based chess engine trained via self-play with Stockfish evaluation. The model learns to play chess without explicit search at inference time, relying purely on learned pattern recognition.
Model Description
- Model Size: 9M parameters (8 layers, 256 embedding dim, 8 attention heads)
- Architecture: Decoder-only Transformer with learned positional encodings
- Training Method: Self-play with Stockfish rewards
- Framework: JAX + Haiku
- Q-Value Distribution: 128 return buckets for action-value prediction
This model predicts action-values (Q-values) for chess positions without performing tree search, making it extremely fast for inference while maintaining strong play.
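The reported Q-values come from this bucketed value head: the network outputs a distribution over 128 return buckets, which is collapsed into a scalar. A minimal sketch of that decoding step, assuming buckets uniformly tile [0, 1] (the exact bucket edges used in training may differ):

import numpy as np

NUM_RETURN_BUCKETS = 128
# Assumption: buckets are centered uniformly on [0, 1].
bucket_centers = (np.arange(NUM_RETURN_BUCKETS) + 0.5) / NUM_RETURN_BUCKETS

def q_value_from_bucket_probs(bucket_probs: np.ndarray) -> float:
    """Collapse a softmax over the 128 return buckets into an expected return."""
    return float(np.dot(bucket_probs, bucket_centers))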
Installation
CPU Installation
Install the required dependencies for CPU inference:
pip install jax jaxlib dm-haiku orbax-checkpoint numpy chess huggingface-hub jaxtyping apache-beam grain
GPU Installation (Recommended)
For GPU acceleration with CUDA 12:
pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
pip install dm-haiku orbax-checkpoint numpy chess huggingface-hub jaxtyping apache-beam grain
For other CUDA versions, see the JAX installation guide.
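To confirm that JAX can see the GPU after installation, run:
python -c "import jax; print(jax.devices())"
This should list a CUDA device rather than only the CPU.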
Note: This model includes all necessary code and can be used without cloning the original repository.
Quick Start
import sys
from huggingface_hub import snapshot_download
# Download model from HuggingFace Hub
model_path = snapshot_download(
    repo_id="dbest-isi/searchless-chess-9M-selfplay",
    local_dir="./searchless_chess_model",
)
# Add bundled code to Python path
sys.path.insert(0, f"{model_path}/searchless_chess_code")
# Import model wrapper
import hf_model
# Load the model
model = hf_model.SearchlessChessModel.from_pretrained(model_path)
# Make a prediction
fen = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"
result = model.predict(fen, temperature=1.0)
print(f"Best move: {result['best_move']}")
print(f"Q-value: {result['q_value']:.4f}")
print(f"Action probabilities shape: {result['action_probs'].shape}")
Example Output
Best move: e7e5
Q-value: 0.0119
Action probabilities shape: (1968,)
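The action_probs vector ranks all 1968 actions in the model's fixed move vocabulary, including moves that are illegal in the current position. A sketch of restricting it to legal moves with python-chess, continuing from the snippet above and assuming the bundled code exposes the original repository's utils module with its MOVE_TO_ACTION mapping (a hypothetical import; adjust if the bundled layout differs):

import chess
import utils  # assumption: mirrors searchless_chess/src/utils.py (MOVE_TO_ACTION)

board = chess.Board(fen)
# Keep only the probabilities of moves that are legal in this position.
legal_probs = {
    move.uci(): float(result["action_probs"][utils.MOVE_TO_ACTION[move.uci()]])
    for move in board.legal_moves
}
for uci, prob in sorted(legal_probs.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{uci}: {prob:.4f}")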
Full Example with Multiple Positions
import sys
from huggingface_hub import snapshot_download
# Download and setup
model_path = snapshot_download(
    repo_id="dbest-isi/searchless-chess-9M-selfplay",
    local_dir="./searchless_chess_model",
)
sys.path.insert(0, f"{model_path}/searchless_chess_code")
import hf_model
# Load model
print("Loading model...")
model = hf_model.SearchlessChessModel.from_pretrained(model_path)
print("Model loaded!")
# Test on multiple positions
positions = [
    ("Starting position", "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"),
    ("After 1.e4", "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"),
    ("Scandinavian Defense", "rnbqkbnr/ppp1pppp/8/3p4/4P3/8/PPPP1PPP/RNBQKBNR w KQkq d6 0 2"),
]
for name, fen in positions:
    result = model.predict(fen)
    print(f"\n{name}")
    print(f" FEN: {fen}")
    print(f" Best move: {result['best_move']}")
    print(f" Q-value: {result['q_value']:.4f}")
Model Architecture
TransformerConfig(
    vocab_size=1968,
    output_size=128,
    embedding_dim=256,
    num_layers=8,
    num_heads=8,
    max_sequence_length=79,
    num_return_buckets=128,
    pos_encodings="LEARNED",
    apply_post_ln=True,
    apply_qk_layernorm=False,
    use_causal_mask=False,
)
Training Details
- Base Model: Initialized from pretrained 9M checkpoint
- Training Method: Self-play reinforcement learning
- Reward Signal: Stockfish evaluation at depth 20 (see the sketch after this list)
- Iteration: 22 (EMA parameters)
- Action Space: 1968 possible moves (every UCI move that can ever be legal)
- Value Representation: Discretized into 128 buckets
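For context on the reward signal, a hedged sketch of obtaining a depth-20 Stockfish evaluation for a position with python-chess (illustrative only; the exact reward shaping used during training is not documented here, and the binary path is an assumption):

import chess
import chess.engine

STOCKFISH_PATH = "/usr/local/bin/stockfish"  # assumption: adjust for your system

def stockfish_score(fen: str, depth: int = 20) -> int:
    """Centipawn score of the position from the side to move's perspective."""
    board = chess.Board(fen)
    with chess.engine.SimpleEngine.popen_uci(STOCKFISH_PATH) as engine:
        info = engine.analyse(board, chess.engine.Limit(depth=depth))
    return info["score"].relative.score(mate_score=10000)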
Use Cases
- Fast chess move prediction without search
- Chess position evaluation
- Research on learned planning in board games
- Integration into chess applications requiring low-latency move suggestions (see the latency sketch below)
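For the low-latency use case, a simple way to measure per-position inference time (numbers vary by hardware; the first call includes JAX compilation and is excluded from the timing):

import time

fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
model.predict(fen)  # warm-up: the first call triggers JIT compilation

n = 100
start = time.perf_counter()
for _ in range(n):
    model.predict(fen)
elapsed = time.perf_counter() - start
print(f"Average latency: {1000 * elapsed / n:.2f} ms per position")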
Limitations
- Does not perform explicit search (unlike traditional chess engines)
- May make suboptimal moves in complex tactical positions
- Performance depends on training data distribution
- Best suited for fast move suggestions rather than deep analysis
Background
This model is based on the architecture from DeepMind's Searchless Chess work. The self-play training implementation and this trained model are original work by Darrell Best.
For the full self-play training implementation and codebase, visit:
- Repository: https://github.com/DarrellBest/searchless_chess
License
Apache 2.0
Model Card Contact
For questions or issues, please open an issue on the GitHub repository.