A ground-up Star Realms engine with two modes: a tight headless environment for AI self-play, and a clean UI for humans. A hybrid learning framework combines human gameplay data, supervised bootstrapping, and reinforcement learning: it trains from scratch, learns what wins, and turns that knowledge into insights that make you a better player.
A closed feedback loop between gameplay, data collection, training, and evaluation. Every game played, human or AI, feeds the next iteration of the model.
High volume, natural strategies and mistakes. Broad coverage of diverse game states.
Smaller dataset, higher strategic quality. Powers initial supervised bootstrapping.
Unlimited scale. Discovers strategies beyond human play through policy iteration.
Five layers, each with a clear responsibility. The engine handles gameplay; training is fully external so the live system is never affected.
Fully faithful Star Realms base set implementation. Card effects, faction synergies, ally triggers, scrap mechanics: every rule encoded and covered by tests. Seeded RNG for deterministic replay and reproducible experiments.
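The deterministic-replay idea can be sketched as follows. This is a minimal illustration, not the engine's actual API: the `SeededGame` class and its fields are invented for the example. The key design point is that each game owns its own `random.Random(seed)` instance rather than touching the global RNG, so replaying a seed reproduces every shuffle and draw.

```python
import random

class SeededGame:
    """Minimal stand-in for the engine's game state; names are illustrative."""
    def __init__(self, seed: int):
        self.rng = random.Random(seed)   # RNG owned by this game, never global
        self.deck = list(range(10))      # toy deck of card IDs
        self.rng.shuffle(self.deck)      # seeded shuffle: reproducible order

    def draw(self) -> int:
        return self.deck.pop()

# Two games built from the same seed replay identically.
a, b = SeededGame(42), SeededGame(42)
replay = [a.draw() for _ in range(5)]
assert replay == [b.draw() for _ in range(5)]
```

Keeping the RNG on the game object also means parallel self-play workers cannot perturb each other's randomness.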
Zero UI, zero I/O: just state, actions, and outcomes. Gym-style step/reset interface so any framework plugs straight in. Discrete action space, serializable state, runs thousands of games per second.
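The Gym-style contract described above can be sketched like this. The class name, observation contents, and terminal condition here are placeholders, not the project's real environment; only the `reset()` / `step(action) -> (obs, reward, done, info)` shape is the point.

```python
class StarRealmsEnv:
    """Sketch of a Gym-style headless wrapper; internals are assumptions."""

    def __init__(self, seed: int = 0):
        self._seed = seed
        self.turn = 0

    def reset(self):
        """Start a new game and return the initial observation."""
        self.turn = 0
        return self._observe()

    def step(self, action: int):
        """Apply one discrete action; reward arrives only at terminal states."""
        self.turn += 1
        done = self.turn >= 3            # toy terminal condition for the sketch
        reward = 1.0 if done else 0.0    # sparse win/loss-style reward
        return self._observe(), reward, done, {}

    def _observe(self):
        return {"turn": self.turn}       # serializable observation

# Standard rollout loop: any RL framework drives the env the same way.
env = StarRealmsEnv()
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(action=0)
```

Because the loop never touches UI or I/O, thousands of these rollouts can run per second across worker processes.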
Before RL begins, the agent trains on human gameplay data, learning to predict strong moves and evaluate positions from real games. Eliminates the cold-start problem and gives RL a head start beyond random exploration.
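The supervised objective behind this bootstrapping step is typically a cross-entropy loss between the policy's action distribution and the human's chosen move. A pure-Python sketch (a real system would use a neural-network framework; the function below just shows the math):

```python
import math

def cross_entropy(policy_logits, human_action):
    """Behavior-cloning loss: penalize the policy for putting low
    probability on the action the human actually took."""
    m = max(policy_logits)                            # for numerical stability
    exps = [math.exp(z - m) for z in policy_logits]
    total = sum(exps)
    probs = [e / total for e in exps]                 # softmax over actions
    return -math.log(probs[human_action])

# Loss shrinks as the policy shifts mass toward the human's chosen action.
confident = cross_entropy([2.0, 0.0, 0.0], human_action=0)
uniform = cross_entropy([0.0, 0.0, 0.0], human_action=0)
assert confident < uniform
```

Minimizing this loss over the human dataset gives the policy a sensible prior before self-play ever starts.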
The agent improves through self-play against versioned snapshots of itself. Policy improvement and value estimation evolve from game outcomes alone. Discovers strategies unreachable by human data.
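One common way to structure self-play against versioned snapshots, shown here as a hedged sketch (the sampling scheme and names are assumptions, not the project's documented method): bias matches toward the newest policy while keeping older snapshots in rotation so the agent cannot overfit to beating only its latest self.

```python
import random

def pick_opponent(snapshots, latest_bias=0.5, rng=None):
    """Sample a policy snapshot to face in self-play: with probability
    `latest_bias` play the newest version, otherwise a uniformly random
    snapshot from the full history."""
    rng = rng or random.Random()
    if rng.random() < latest_bias:
        return snapshots[-1]
    return rng.choice(snapshots)

pool = ["policy_v1", "policy_v2", "policy_v3"]
opponent = pick_opponent(pool, rng=random.Random(0))
assert opponent in pool
```

After each training iteration, the current policy is appended to the pool, and game outcomes against sampled opponents drive the next policy-improvement step.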
Structured episode logs feed an external analysis pipeline: win rates by faction, card impact by turn, deck composition correlations. Policy and data are distilled into a playable UI and a data-backed play guide.
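A structured episode log might look like the record below. The field names and JSONL framing are illustrative assumptions, not the project's actual schema; the point is that each game serializes to one self-describing line that downstream analytics can stream.

```python
import json

# Hypothetical episode-log record; field names are invented for illustration.
episode = {
    "game_id": "g-0001",
    "seed": 42,
    "winner": "player_1",
    "turns": [
        {
            "turn": 1,
            "player": "player_0",
            "actions": ["play:Scout", "buy:Cutter"],
            "authority": {"player_0": 50, "player_1": 47},
        },
    ],
}

line = json.dumps(episode)        # one object per line (JSONL) for streaming
restored = json.loads(line)
assert restored == episode        # lossless round-trip
```

Keeping the seed in the record lets any logged game be replayed deterministically in the engine for debugging or re-analysis.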
Four distinct components with clean separation of concerns. The engine knows nothing about the UI. Training happens entirely outside the live system.
Full rules implementation with modular effect system, ally resolution, trade row management, and deck cycling. Dedicated test suite per feature area with extensive coverage.
Headless game wrapper with standardized observation space, discrete action interface, and serializable state. Designed for high-throughput simulation compatible with standard ML tooling.
Graphical interface for playing and observing. Real-time game log panel alongside the board. Watch the AI play and compare its decisions against your own in the same view.
External training and analysis system processing structured episode logs. Produces win rate analytics, card impact scores, and the reward signal design underpinning RL training.
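A representative analytics step over those episode logs, sketched with invented field names: aggregating win rate per faction from a stream of game records.

```python
from collections import Counter

def faction_win_rates(episodes):
    """Win rate per faction over a batch of episode records.
    Field names ('faction', 'won') are illustrative, not the real schema."""
    wins, games = Counter(), Counter()
    for ep in episodes:
        games[ep["faction"]] += 1
        if ep["won"]:
            wins[ep["faction"]] += 1
    return {f: wins[f] / games[f] for f in games}

eps = [
    {"faction": "Trade Federation", "won": True},
    {"faction": "Trade Federation", "won": False},
    {"faction": "Star Empire", "won": True},
]
rates = faction_win_rates(eps)
```

Card impact scores and reward-shaping experiments follow the same pattern: a pure function over logged episodes, run entirely outside the live system.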
The system becomes both a research platform and a practical tool for improving at the game.
A multi-paradigm agent that learned from scratch: supervised bootstrapping from human data, then refined through self-play RL. Strategy emerges from data, not hardcoded heuristics.
Faction win rates, card priority by turn, deck pacing: all grounded in simulation results. Every claim backed by a number, not conventional wisdom.
Query the trained policy mid-game. Given the current board state, what does the agent buy? Compare your line with the agent's and understand why they diverge.
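Mechanically, that mid-game query reduces to scoring the legal buys and returning the argmax. A sketch with a stubbed-in score table standing in for a trained network's output (function and card-scoring interface are assumptions):

```python
def best_buy(policy_scores, trade_row):
    """Pick the highest-scoring card among those actually in the trade row.
    `policy_scores` stands in for a trained policy's per-card output."""
    scored = {card: policy_scores.get(card, 0.0) for card in trade_row}
    return max(scored, key=scored.get)

# Stub scores; a real query would run the current board state through the net.
scores = {"Battle Blob": 0.9, "Cutter": 0.4, "Explorer": 0.1}
pick = best_buy(scores, ["Cutter", "Battle Blob"])
```

Showing the full score distribution, not just the argmax, is what makes the divergence between your line and the agent's interpretable.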
Watch strategy evolve over training iterations. See which cards the agent learned to value, when it discovered faction synergies, and how play style shifted over time.
A reusable environment for experimenting with ML approaches against a complex hidden-information game with discrete actions and long-horizon strategy.
Quantified analysis of what drives wins, not opinion. Authority trajectories, optimal scrap timing, faction purity vs. splashing: all evaluated at scale.