杰瑞科技汇

Game Theory Python如何实战应用?

Of course! Game Theory is a fascinating field, and Python is an excellent tool for exploring it. We'll cover the fundamentals and then move to practical Python implementations.

Game Theory Python如何实战应用?-图1
(图片来源网络,侵删)

What is Game Theory?

Game Theory is the study of mathematical models of strategic interaction among rational agents. The key concepts are:

  • Players: The decision-makers.
  • Strategies: The possible actions a player can take.
  • Payoffs: The rewards or outcomes for each player, given a combination of strategies chosen by all players.
  • Rationality: Players are assumed to be rational, meaning they aim to maximize their own payoff.

Core Concepts in Python

Let's implement some classic game theory concepts using Python.

Normal Form Games (Matrix Games)

This is the most common representation, where a matrix shows the payoffs for each player for every combination of strategies.

Example: The Prisoner's Dilemma

Game Theory Python如何实战应用?-图2
(图片来源网络,侵删)

Two criminals are arrested and interrogated separately. They have two choices: Cooperate (with each other, by staying silent) or Defect (by betraying the other).

  • If both Cooperate, they both get a light sentence (e.g., 1 year).
  • If one Defects and the other Cooperates, the defector goes free (0 years) and the cooperator gets a heavy sentence (3 years).
  • If both Defect, they both get a medium sentence (e.g., 2 years).

Let's represent this in Python.

import numpy as np
# The payoff matrix for the Prisoner's Dilemma
# Format: (Payoff for Player 1, Payoff for Player 2)
# Rows are Player 1's choices (Cooperate, Defect)
# Columns are Player 2's choices (Cooperate, Defect)
prisoners_dilemma_payoffs = np.array([
    [ (-1, -1), (-3,  0) ],  # If Player 1 Cooperates
    [ ( 0, -3), (-2, -2) ]   # If Player 1 Defects
])
def print_payoff_matrix(payoffs):
    """Helper function to print the matrix in a readable format."""
    print("        Player 2")
    print("        Cooperate   Defect")
    print("--------------------------------")
    print("Player 1")
    for i, row in enumerate(payoffs):
        action = "Cooperate" if i == 0 else "Defect"
        p1_payoff, p2_payoff = row[0], row[1]
        print(f"{action:<10} ({p1_payoff:>2}, {p2_payoff:>2})     ({p1_payoff:>2}, {p2_payoff:>2})")
print("Prisoner's Dilemma Payoff Matrix:")
print_payoff_matrix(prisoners_dilemma_payoffs)

Nash Equilibrium

A Nash Equilibrium is a set of strategies where no player can improve their payoff by unilaterally changing their own strategy.

In the Prisoner's Dilemma, (Defect, Defect) is the Nash Equilibrium.

  • If Player 1 is Defecting, Player 2's best move is to Defect (gets -2 instead of -3).
  • If Player 2 is Defecting, Player 1's best move is to Defect (gets -2 instead of -3).

Neither player can benefit by changing their strategy alone, even though the outcome is worse for both than if they had cooperated!


Extensive Form Games (Game Trees)

These are used for games with sequential moves. A player makes a move, and then the other player responds, knowing the first player's action.

Example: The Entry Deterrence Game

  • Player 1 (Incumbent): An existing company.
  • Player 2 (Entrant): A new company considering entering the market.
  1. Entrant chooses: Enter or Stay Out.
  2. If Stay Out, the game ends. Incumbent gets a large profit (3), Entrant gets 0.
  3. If Enter, the Incumbent chooses: Fight (price war) or Accommodate (share the market).
    • If Fight, both get a low profit (0).
    • If Accommodate, both get a medium profit (1).

We can represent this with a simple class structure.

class ExtensiveFormGame:
    def __init__(self):
        # The game tree can be represented as a nested dictionary
        # or by using a class for nodes. We'll use a dictionary here for simplicity.
        self.root = {
            'name': 'Entrant',
            'actions': ['Enter', 'Stay Out'],
            'payoffs': None,
            'children': {
                'Enter': {
                    'name': 'Incumbent',
                    'actions': ['Fight', 'Accommodate'],
                    'payoffs': [(0, 0), (1, 1)], # Payoffs for (Incumbent, Entrant)
                    'children': None
                },
                'Stay Out': {
                    'name': 'Game Over',
                    'actions': None,
                    'payoffs': [(3, 0)], # Payoffs for (Incumbent, Entrant)
                    'children': None
                }
            }
        }
    def solve_by_backward_induction(self, node):
        """
        Solves the game using backward induction.
        This means starting from the end of the game and working backwards.
        """
        if node['children'] is None:
            return node['payoffs'][0] # Return the payoff tuple
        # This is a decision node for a player
        best_action_index = 0
        # Maximize the payoff for the player whose turn it is
        # We need to figure out whose turn it is to know whose payoff to maximize
        player_index = 0 if node['name'] == 'Incumbent' else 1
        max_payoff = -float('inf')
        for i, child_node in enumerate(node['children'].values()):
            # Recursively solve the sub-game
            resulting_payoff = self.solve_by_backward_induction(child_node)
            # Check if this action is better for the current player
            if resulting_payoff[player_index] > max_payoff:
                max_payoff = resulting_payoff[player_index]
                best_action_index = i
        # Store the best action and its resulting payoff
        node['best_action'] = list(node['children'].keys())[best_action_index]
        node['equilibrium_payoff'] = list(node['children'].values())[best_action_index]['payoffs'][0]
        return node['equilibrium_payoff']
    def print_solution(self):
        """Prints the solution path."""
        print("--- Solving Entry Deterrence Game ---")
        print("1. Entrant's turn:")
        print(f"   - If Entrant chooses 'Stay Out', payoff is {self.root['children']['Stay Out']['payoffs'][0]}.")
        print(f"   - If Entrant chooses 'Enter', we look at Incumbent's best response.")
        incumbent_node = self.root['children']['Enter']
        print("2. Incumbent's turn (after Entrant chooses 'Enter'):")
        print(f"   - If Incumbent chooses 'Fight', payoff is {incumbent_node['payoffs'][0]}.")
        print(f"   - If Incumbent chooses 'Accommodate', payoff is {incumbent_node['payoffs'][1]}.")
        print("\n--- Backward Induction Solution ---")
        # The incumbent will choose the action that maximizes *their* payoff.
        # Accommodate gives them 1, which is better than Fight (0).
        print("Incumbent's best response to 'Enter' is 'Accommodate' (payoff 1).")
        # Now the Entrant, looking ahead, knows that if they enter, the incumbent will accommodate.
        # The Entrant's payoff for entering will be 1.
        # The Entrant's payoff for staying out is 0.
        print("Entrant, knowing this, will choose the action that maximizes *their* payoff.")
        print("Entrant's payoff for 'Enter' is 1.")
        print("Entrant's payoff for 'Stay Out' is 0.")
        print("\nTherefore, the Nash Equilibrium is: (Enter, Accommodate)")
        print("Final Payoffs: (Incumbent: 1, Entrant: 1)")
# --- Run the solution ---
entry_game = ExtensiveFormGame()
entry_game.print_solution()
# You can also run the algorithmic solver
# entry_game.solve_by_backward_induction(entry_game.root)
# print("\nEquilibrium Payoff (from algorithm):", entry_game.root['equilibrium_payoff'])

Advanced: Iterated Games and Strategies

What if players play the same game multiple times? This is called an Iterated Game. Here, strategies like Tit-for-Tat (start by cooperating, then do whatever the opponent did last round) can be very effective.

We can simulate a tournament of strategies for the Iterated Prisoner's Dilemma.

import random
class IteratedPrisonersDilemma:
    def __init__(self, rounds=100):
        self.rounds = rounds
        self.history_p1 = []
        self.history_p2 = []
    def play(self, strategy1, strategy2):
        """Plays a game between two strategies."""
        score1, score2 = 0, 0
        self.history_p1, self.history_p2 = [], []
        for i in range(self.rounds):
            move1 = strategy1(self.history_p1, self.history_p2, i)
            move2 = strategy2(self.history_p1, self.history_p2, i)
            self.history_p1.append(move1)
            self.history_p2.append(move2)
            # Calculate scores based on moves
            if move1 == 'C' and move2 == 'C':
                score1, score2 = -1, -1
            elif move1 == 'C' and move2 == 'D':
                score1, score2 = -3, 0
            elif move1 == 'D' and move2 == 'C':
                score1, score2 = 0, -3
            else: # D, D
                score1, score2 = -2, -2
            # Note: In tournament scoring, you often add the scores.
            # Here we'll just track the final total.
            # A more common approach is to accumulate.
            # Let's accumulate for a more meaningful score.
            score1 += score1
            score2 += score2 # This is a bug, should be score2 = score2 + new_payoff_2. Let's fix it.
            # Corrected version:
            # score1 += new_payoff_1
            # score2 += new_payoff_2
        # Let's re-do the scoring correctly
        score1, score2 = 0, 0
        for i in range(self.rounds):
            move1 = self.history_p1[i]
            move2 = self.history_p2[i]
            if move1 == 'C' and move2 == 'C':
                score1, score2 = score1 -1, score2 -1
            elif move1 == 'C' and move2 == 'D':
                score1, score2 = score1 -3, score2 + 0
            elif move1 == 'D' and move2 == 'C':
                score1, score2 = score1 + 0, score2 -3
            else: # D, D
                score1, score2 = score1 -2, score2 -2
        return score1, score2
# --- Define some strategies ---
def always Cooperate(history_p1, history_p2, round_num):
    return 'C'
def always Defect(history_p1, history_p2, round_num):
    return 'D'
def tit_for_tat(history_p1, history_p2, round_num):
    if round_num == 0:
        return 'C'
    return history_p2[-1] # Do what the opponent did last
def random_strategy(history_p1, history_p2, round_num):
    return 'C' if random.random() > 0.5 else 'D'
# --- Run a simulation ---
game = IteratedPrisonersDilemma(rounds=200)
# Play a match
score_p1, score_p2 = game.play(tit_for_tat, always Defect)
print(f"\n--- Iterated Game Simulation ---")
print(f"Strategy 1: Tit-for-Tat")
print(f"Strategy 2: Always Defect")
print(f"Final Scores: Player 1 = {score_p1}, Player 2 = {score_p2}")
print(f"Player 1's history: {game.history_p1[:10]}... (first 10 moves)")
print(f"Player 2's history: {game.history_p2[:10]}... (first 10 moves)")

Python Libraries for Game Theory

While building from scratch is great for learning, for more complex tasks, you can use specialized libraries:

  1. Nashpy: A library for the computation of equilibria in 2-player games.

    • Installation: pip install nashpy
    • Use Case: Finding Nash Equilibria in matrix games.
    import nashpy as nash
    import numpy as np
    # A simple coordination game
    A = np.array([[10, 0], [0, 5]]) # Payoffs for Player 1
    B = np.array([[10, 0], [0, 5]]) # Payoffs for Player 2
    game = nash.Game(A, B)
    # Find all Nash Equilibria
    equilibria = game.support_enumeration()
    print("\nNashpy - Coordination Game Equilibria:")
    for eq in equilibria:
        print(eq)
  2. Axelrod: A library for running tournaments of the Iterated Prisoner's Dilemma. It comes with many pre-defined strategies (like Tit-for-Tat, Grudger, etc.) and makes running large-scale tournaments easy.

    • Installation: pip install axelrod
    • Use Case: Studying strategy performance in the IPD.

Summary

Concept Python Implementation Key Takeaway
Normal Form numpy.array for payoff matrices. Represents simultaneous move games. Easy to visualize and analyze for simple 2x2 games.
Nash Equilibrium Manual analysis or nashpy library. A stable state where no player wants to unilaterally change their strategy.
Extensive Form Custom classes or nested dictionaries. Represents sequential move games. Solved with backward induction.
Iterated Games Custom classes with strategy functions. Repeated interaction allows for more complex strategies like "Tit-for-Tat".
Libraries nashpy, axelrod For more advanced computations and simulations, these libraries are powerful tools.
分享:
扫描分享到社交APP
上一篇
下一篇