API Documentation
Docstrings for interface members can be accessed through Julia's built-in documentation system or in the list below.
Index
CounterfactualRegret.CFRSolver
CounterfactualRegret.CSCFRSolver
CounterfactualRegret.CallbackChain
CounterfactualRegret.ESCFRSolver
CounterfactualRegret.ExpectedValueBaseline
CounterfactualRegret.ExploitabilityCallback
CounterfactualRegret.Games.Kuhn
CounterfactualRegret.Games.MatrixGame
CounterfactualRegret.OSCFRSolver
CounterfactualRegret.Throttle
CounterfactualRegret.ZeroBaseline
CounterfactualRegret.actions
CounterfactualRegret.chance_action
CounterfactualRegret.chance_actions
CounterfactualRegret.evaluate
CounterfactualRegret.exploitability
CounterfactualRegret.histtype
CounterfactualRegret.infokey
CounterfactualRegret.infokeytype
CounterfactualRegret.initialhist
CounterfactualRegret.isterminal
CounterfactualRegret.next_hist
CounterfactualRegret.observation
CounterfactualRegret.player
CounterfactualRegret.players
CounterfactualRegret.strategy
CounterfactualRegret.train!
CounterfactualRegret.utility
CounterfactualRegret.vectorized_hist
CounterfactualRegret.vectorized_info
Game Functions
CounterfactualRegret.infokeytype — Function
infokeytype(g::Game)
Returns the information key type for game g.
CounterfactualRegret.histtype — Function
histtype(g::Game)
Returns the history type for game g.
CounterfactualRegret.initialhist — Function
initialhist(game::Game)
Returns the initial history with which to start the game.
CounterfactualRegret.isterminal — Function
isterminal(game::Game, h)
Returns a Boolean indicating whether the current history h is terminal, i.e. h ∈ Z.
CounterfactualRegret.utility — Function
utility(game::Game, i::Int, h)
Returns the utility of history h for player i.
CounterfactualRegret.player — Function
player(game::Game{H,K}, h::H)
Returns the integer id of the player whose turn it is at history h:
0 - Chance Player
1 - Player 1
2 - Player 2
If converting an IIE game to a Matrix Game, this method must also be implemented: player(game::Game{H,K}, k::K)
CounterfactualRegret.chance_action — Function
chance_action(game::Game, h)
Returns a randomly sampled action for the chance player at history h.
CounterfactualRegret.chance_actions — Function
chance_actions(game::Game, h)
Returns all actions available to the chance player at history h.
CounterfactualRegret.next_hist — Function
next_hist(game::Game, h, a)
Given a history h and an action a, returns the next history: h′ = next_hist(game, h, a)
CounterfactualRegret.infokey — Function
infokey(game::Game, h)
Returns a unique identifier for the information set that contains history h.
infokey(game, h1) == infokey(game, h2) ⟺ h1 and h2 belong to the same info set
(The key must be immutable because it is stored as a key in a dictionary.)
CounterfactualRegret.actions — Function
actions(game::Game, k)
Returns all actions available at the information state given by key k (see infokey).
CounterfactualRegret.players — Function
players(game)
Returns the number of players in the game (excluding the chance player).
CounterfactualRegret.observation — Function
observation(game, h, a, h′)
For tree building: the information given to the acting player in history h.
CounterfactualRegret.vectorized_info — Function
vectorized_info(game::Game{H,K}, key::K) where {H,K}
Converts an information state representation to a vector. The default behavior returns the information state unmodified.
CounterfactualRegret.vectorized_hist — Function
vectorized_hist(game::Game{H}, h::H) where H
Converts a history representation to a vector. The default behavior returns the history unmodified.
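The game functions above fit together as one interface. The following is a minimal sketch for a game with no chance player, so chance_action and chance_actions are not needed. It is written to run on its own: the local Game type is a stand-in for the package's abstract type, and MatchingPennies and playout are hypothetical names that are not part of the package. With the package installed, the same methods would presumably be added to the package's own functions instead (for example CounterfactualRegret.initialhist).

```julia
# Stand-in for the package's abstract type so the sketch runs without it.
# With the package installed, subtype CounterfactualRegret.Game instead.
abstract type Game{H,K} end

# Matching pennies: each player secretly picks 1 (heads) or 2 (tails).
# History = (P1's action, P2's action), with 0 meaning "not yet chosen".
# Info key = acting player's id, because neither player observes anything.
struct MatchingPennies <: Game{Tuple{Int,Int},Int} end

initialhist(::MatchingPennies) = (0, 0)
isterminal(::MatchingPennies, h) = h[1] != 0 && h[2] != 0
player(::MatchingPennies, h::Tuple{Int,Int}) = h[1] == 0 ? 1 : 2
players(::MatchingPennies) = 2
next_hist(::MatchingPennies, h, a) = h[1] == 0 ? (a, 0) : (h[1], a)
infokey(g::MatchingPennies, h) = player(g, h)
actions(::MatchingPennies, k) = 1:2

# Player 1 wins 1 when the coins match; player 2 gets the negation (zero-sum).
function utility(::MatchingPennies, i::Int, h)
    u1 = h[1] == h[2] ? 1.0 : -1.0
    return i == 1 ? u1 : -u1
end

# Hypothetical helper: play a fixed sequence of actions using only the interface.
function playout(game, choices)
    h = initialhist(game)
    for a in choices
        isterminal(game, h) && break
        @assert a in actions(game, infokey(game, h))
        h = next_hist(game, h, a)
    end
    return h
end

g = MatchingPennies()
h = playout(g, [1, 2])   # P1 plays heads, P2 plays tails
```

Both histories in which player 2 acts, (1, 0) and (2, 0), map to the same info key, which is what makes player 1's choice hidden from player 2.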
Solvers
CounterfactualRegret.train! — Function
train!(sol::AbstractCFRSolver, n; cb=()->(), show_progress=false)
Trains a CFR solver for n iterations, with an optional callback cb and an optional progress bar enabled by show_progress.
CounterfactualRegret.strategy — Function
strategy(solver, k)
Returns the current strategy of solver for information key k.
If sufficiently trained (see train!), this should be close to a Nash equilibrium strategy.
CounterfactualRegret.CFRSolver — Type
CFRSolver(game; method=Vanilla())
Instantiates a vanilla CFR solver for the given game.
CounterfactualRegret.CSCFRSolver — Type
CSCFRSolver(game; debug=false, method=Vanilla())
Instantiates a chance sampling CFR solver for the given game.
CounterfactualRegret.ESCFRSolver — Type
ESCFRSolver(game::Game; method::Symbol=:vanilla, alpha::Float64 = 1.0, beta::Float64 = 1.0, gamma::Float64 = 1.0, d::Int)
Instantiates an external sampling CFR solver for the given game.
Samples a single action for each player other than the acting player on each tree traversal, while expanding every action of the acting player. Time to complete a traversal is O(|𝒜ᵢ|ᵈ), where d is the depth of the game and |𝒜ᵢ| is the size of the action space for the acting player.
CounterfactualRegret.OSCFRSolver — Type
OSCFRSolver(game; method=Vanilla(), baseline=ZeroBaseline(), ϵ::Float64 = 0.6)
Instantiates an outcome sampling CFR solver for the given game.
Samples a single action for every player on each tree traversal. Time to complete a traversal is O(d), where d is the depth of the game.
ϵ - exploration parameter
Available baselines:
ZeroBaseline - Equivalent to no baseline
ExpectedValueBaseline
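To illustrate the role of the exploration parameter ϵ, here is a sketch of the sampling policy as outcome sampling is commonly described in the literature: the current strategy mixed with a uniform distribution. Whether OSCFRSolver uses exactly this expression is an assumption, and exploration_policy and sample_action are hypothetical names, not package functions.

```julia
# Mix the current strategy σ with the uniform distribution using weight ϵ.
# NOTE: this is the textbook outcome-sampling mixture; that the package uses
# this exact form is an assumption.
function exploration_policy(σ::AbstractVector{<:Real}, ϵ::Real)
    n = length(σ)
    return ϵ / n .+ (1 - ϵ) .* σ
end

# Draw one action index from probability vector p by inverse-CDF sampling.
function sample_action(p::AbstractVector{<:Real}, u::Real = rand())
    c = 0.0
    for (a, pa) in enumerate(p)
        c += pa
        u <= c && return a
    end
    return length(p)  # guard against floating-point round-off
end

σ = [1.0, 0.0, 0.0]              # a deterministic current strategy
p = exploration_policy(σ, 0.6)   # every action now has nonzero probability
a = sample_action(p)
```

With ϵ = 0.6 and three actions, each action receives at least 0.2 probability, so no branch of the tree is permanently ignored during sampling.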
Games
CounterfactualRegret.Games.MatrixGame — Type
Matrix game of arbitrary dimensionality
Defaults to 2-player zero-sum rock-paper-scissors
- NOTE: N>2 player general-sum games have ill-defined convergence properties for counterfactual regret solvers
CounterfactualRegret.Games.Kuhn — Type
Kuhn Poker
"Kuhn poker is an extremely simplified form of poker developed by Harold W. Kuhn as a simple model zero-sum two-player imperfect-information game, amenable to a complete game-theoretic analysis. In Kuhn poker, the deck includes only three playing cards, for example a King, Queen, and Jack. One card is dealt to each player, which may place bets similarly to a standard poker. If both players bet or both players pass, the player with the higher card wins, otherwise, the betting player wins."
- https://en.wikipedia.org/wiki/Kuhn_poker
Extras
CounterfactualRegret.ExploitabilityCallback — Type
ExploitabilityCallback(sol::AbstractCFRSolver, n=1; p=1)
sol: CFR solver whose exploitability is queried
n: Frequency with which to query exploitability, e.g. n=10 indicates checking exploitability every 10 CFR iterations
p: Player whose exploitability is being measured
Usage:
using CounterfactualRegret
const CFR = CounterfactualRegret
game = CFR.Games.Kuhn()
sol = CFRSolver(game)
train!(sol, 10_000, cb=ExploitabilityCallback(sol))
CounterfactualRegret.Throttle — Type
Wraps a function, causing it to trigger every n CFR iterations.
test_cb = Throttle(() -> println("test"), 100)
The above example will print "test" every 100 CFR iterations.
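A wrapper that fires every n iterations can be sketched as a callable struct with an internal counter. EveryN below is a hypothetical stand-in and not the package's Throttle type; in particular, whether Throttle fires on the first call or only after n calls is not stated above, so firing on calls n, 2n, 3n, ... is an assumption.

```julia
# Hypothetical stand-in named EveryN (not the package's Throttle type).
mutable struct EveryN{F}
    f::F        # wrapped zero-argument function
    n::Int      # fire every n calls
    count::Int  # calls seen so far
end
EveryN(f, n::Int) = EveryN(f, n, 0)

# Calling the wrapper counts one iteration and fires f on every n-th call.
function (cb::EveryN)()
    cb.count += 1
    if cb.count % cb.n == 0
        cb.f()
    end
    return nothing
end

fired = Ref(0)
cb = EveryN(() -> fired[] += 1, 100)
for _ in 1:1_000
    cb()            # stands in for one CFR iteration invoking the callback
end
```

Because the wrapper takes no arguments, it matches the zero-argument default cb=()->() shown in the train! signature.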
CounterfactualRegret.CallbackChain — Type
Chains together multiple callbacks.
Usage:
using CounterfactualRegret
const CFR = CounterfactualRegret
game = CFR.Games.Kuhn()
sol = CFRSolver(game)
exp_cb = ExploitabilityCallback(sol)
test_cb = Throttle(() -> println("test"), 100)
train!(sol, 10_000, cb=CFR.CallbackChain(exp_cb, test_cb))
CounterfactualRegret.exploitability — Function
exploitability(sol::AbstractCFRSolver, p::Int=1)
Calculates the exploitability of player p given the strategy specified by solver sol.
CounterfactualRegret.evaluate — Function
evaluate(solver::AbstractCFRSolver)
Evaluates the full game tree traversed by the CFR solver.
Returns a tuple of game values, one per player, under the strategies provided by the solver.
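For intuition about what a tuple of game values means, the sketch below computes it by hand for two-player zero-sum rock-paper-scissors under fixed mixed strategies. It does not call the package: game_values and the payoff matrix A are illustrative assumptions, and evaluate itself works on the solver's tree rather than on a matrix like this.

```julia
# Row player's payoffs for rock-paper-scissors (rows/cols: rock, paper, scissors).
const A = [ 0 -1  1;
            1  0 -1;
           -1  1  0]

# Expected utilities (u1, u2) when the row player mixes with x and the
# column player mixes with y. Zero-sum, so u2 is the negation of u1.
function game_values(A, x, y)
    u1 = sum(x[i] * A[i, j] * y[j] for i in axes(A, 1), j in axes(A, 2))
    return (u1, -u1)
end

uniform      = fill(1 / 3, 3)
always_rock  = [1.0, 0.0, 0.0]
always_paper = [0.0, 1.0, 0.0]

v_eq  = game_values(A, uniform, uniform)           # both play the equilibrium mix
v_dev = game_values(A, always_rock, always_paper)  # rock loses to paper
```

Under the uniform mix both values are zero, which is the value a well-trained solver should approach on this game.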
CounterfactualRegret.ExpectedValueBaseline — Type
Expected Value Baseline (Schmid 2018)
Uses aggregated counterfactual value estimates from previous runs as a baseline. The "learning rate", or exponential decay rate, for learning the baseline is given by parameter α.
The stored action values for some information key k are retrieved by calling (b::ExpectedValueBaseline{K})(k, l), where l is the length of the action space at the information state represented by k.
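The description above suggests an exponentially weighted running average per information key. The sketch below shows one way such a baseline could be stored and updated. EMABaseline and update! are hypothetical names, and both the update rule and the zero initialisation on first lookup are assumptions rather than the package's confirmed behaviour.

```julia
# Hypothetical stand-in (not the package's ExpectedValueBaseline).
struct EMABaseline{K}
    α::Float64                        # learning / decay rate
    values::Dict{K,Vector{Float64}}   # stored action values per info key
end
EMABaseline{K}(α::Float64) where {K} = EMABaseline{K}(α, Dict{K,Vector{Float64}}())

# Retrieve stored action values for key k, where l is the number of actions.
# Initialising to zeros on first visit is an assumption.
(b::EMABaseline{K})(k::K, l::Int) where {K} = get!(() -> zeros(l), b.values, k)

# Move the stored value of action a toward a new estimate v at rate α.
function update!(b::EMABaseline{K}, k::K, a::Int, v::Real, l::Int) where {K}
    vals = b(k, l)
    vals[a] = (1 - b.α) * vals[a] + b.α * v
    return vals
end

b = EMABaseline{String}(0.5)
update!(b, "K|bet", 1, 2.0, 2)   # 0.5 * 0.0 + 0.5 * 2.0 = 1.0
update!(b, "K|bet", 1, 2.0, 2)   # 0.5 * 1.0 + 0.5 * 2.0 = 1.5
```

A larger α tracks recent estimates more closely, while a smaller α averages over more past traversals.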
CounterfactualRegret.ZeroBaseline — Type
Default static baseline of 0, equivalent to not using a baseline.