ml02 贝叶斯决策理论 — demo.py 代码详解

运行方式

bash

cd ml02_bayesian_decision/code
python demo.py

代码逐段详解

第1步：GaussianBayesClassifier 类 — 三种协方差假设

python

class GaussianBayesClassifier:
    def __init__(self, cov_type='full'):
        self.cov_type = cov_type  # 'isotropic', 'shared', 'full'

这个分类器假设各类别的数据服从多元高斯分布：

P (x | ω_{i}) = N (x | μ_{i}, Σ_{i})

三种 cov_type 对应不同的协方差假设：

isotropic： $Σ_{i} = σ^{2} I$ （各类别共享，各向同性）。这是最简单的假设，只需要估计 1 个参数（ $σ^{2}$ ）+ 各类别的均值。决策边界是各类别均值连线的垂直平分线——等价于最近均值分类器。
shared： $Σ_{i} = Σ$ （各类别共享同一个协方差矩阵）。需要估计 $d (d + 1) / 2$ 个协方差参数。决策边界是线性超平面——等价于 LDA。
full：每个类别有自己独立的 $Σ_{i}$ 。需要估计 $C \cdot d (d + 1) / 2$ 个协方差参数。决策边界是二次曲面——等价于 QDA。

参数越少（isotropic）→ 偏差越高，方差越低。参数越多（full）→ 偏差越低，方差越高。这正是经典的 Bias-Variance 权衡。

第2步：fit() — 参数估计

python

def fit(self, X, y):
    # priors: P(omega_j) = n_j / n  (MLE)
    self.priors_[idx] = len(X_c) / len(X)
    # means: mu_j = mean of class j  (MLE)
    self.means_[idx] = np.mean(X_c, axis=0)

这里使用了最大似然估计（Maximum Likelihood Estimation, MLE）：

先验概率的 MLE 就是各类别的频率： $\hat{P} (ω_{j}) = n_{j} / n$
均值的 MLE 就是样本均值： ${\hat{μ}}_{j} = \frac{1}{n_{j}} \sum_{x_{i} \in ω_{j}} x_{i}$
协方差矩阵的 MLE 取决于 cov_type 的选择

对于 full 协方差，代码中加了微小的正则化：

python

cov_c = np.cov(X_c.T, bias=False)
cov_c += 1e-6 * np.eye(n_features)  # 防止奇异

这个 $10^{- 6} I$ 的"抖动"确保协方差矩阵可逆，即使样本数 < 特征数。

第3步：_log_likelihood() — 对数似然计算

python

def _log_likelihood(self, X):
    for idx in range(n_classes):
        diff = X - mu
        if self.cov_type == 'isotropic':
            mahal = np.sum(diff ** 2, axis=1) / sigma2
        else:
            mahal = np.sum((diff @ cov_inv) * diff, axis=1)
        log_gauss = -0.5 * (d * np.log(2 * np.pi) + log_det + mahal)

对数高斯密度的公式：

\ln N (x | μ, Σ) = - \frac{1}{2} [d \ln (2 π) + \ln | Σ | + (x - μ)^{T} Σ^{- 1} (x - μ)]

其中第三项 $(x - μ)^{T} Σ^{- 1} (x - μ)$ 称为马氏距离的平方。它度量了 $x$ 到 $μ$ 的"协方差校正"距离。

向量化技巧：np.sum((diff @ cov_inv) * diff, axis=1) 批量计算了 $n$ 个样本的马氏距离平方。解释如下：

diff @ cov_inv 得到 $(n, d)$ 矩阵，每一行是 ${diff}_{i} Σ^{- 1}$
再逐元素乘以 diff，得到 ${diff}_{i} Σ^{- 1} \cdot {diff}_{i}$
sum(axis=1) 得到每行的标量结果

第4步：predict_proba() — 后验概率与 Log-Sum-Exp

python

def predict_proba(self, X):
    log_joint = log_lik + log_prior
    log_evidence = logsumexp(log_joint, axis=1, keepdims=True)
    log_posteriors = log_joint - log_evidence
    return np.exp(log_posteriors)

后验概率的计算需要先用贝叶斯公式归一化：

P (ω_{j} | x) = \frac{P (x | ω_{j}) P (ω_{j})}{\sum_{k} P (x | ω_{k}) P (ω_{k})}

如果直接计算，分子和分母都涉及 exp(log_joint)，当 log_joint 非常负时，exp 会下溢到 0。解决方法是在对数空间中做归一化：

log_joint = log_lik + log_prior：在对数空间中计算分子
log_evidence = logsumexp(log_joint)： $\ln \sum e^{log_joint}$ （scipy.special.logsumexp 内部做了数值稳定性处理——先减最大值再 exp）
log_posteriors = log_joint - log_evidence： $\ln P (ω_{j} | x) = \ln P (x, ω_{j}) - \ln P (x)$ （对数中的除法 = 减法）

最后才 np.exp() 得到实际的后验概率值。

第5步：最小风险决策

python

def minimum_risk_predict(posteriors, loss_matrix):
    for i in range(n_classes):
        for j in range(n_classes):
            risks[:, i] += loss_matrix[i, j] * posteriors[:, j]
    return np.argmin(risks, axis=1)

条件风险的计算公式为：

R (α_{i} | x) = \sum_{j} λ (α_{i} | ω_{j}) \cdot P (ω_{j} | x)

loss_matrix[i, j] 表示当真实类别为 $j$ 时预测类别为 $i$ 的代价。

演示中使用的不对称损失矩阵：

python

loss_asymmetric = np.array([[0, 5],   # 将 class 1 误判为 class 0: 代价 5
                             [1, 0]])  # 将 class 0 误判为 class 1: 代价 1

当 FN（漏诊）代价远大于 FP（误报）代价时，分类器会更倾向于将样本判为"有病的"那个类别。

第6步：ROC 曲线绘制

python

fpr, tpr, thresholds = roc_curve(y_test, y_score)
roc_auc = auc(fpr, tpr)

sklearn.metrics.roc_curve 通过变化决策阈值，返回不同 (FPR, TPR) 点。原理：

按模型输出的后验概率从高到低排序
逐个将阈值设为每个样本的得分，计算此时的 FPR 和 TPR
连接所有点得到 ROC 曲线

AUC 使用梯形积分计算：sklearn.metrics.auc(fpr, tpr)。AUC 的概率解释：随机抽取一个正样本和一个负样本，正样本的得分高于负样本的概率。

关键概念速查表

概念	数学形式	代码位置	关键说明
贝叶斯定理	$P (ω \| x) \propto P (x \| ω) P (ω)$	`predict_proba()`	先验 + 似然 → 后验
高斯密度	$N (x \| μ, Σ)$	`_log_likelihood()`	假设各类别为正态分布
马氏距离	$(x - μ)^{T} Σ^{- 1} (x - μ)$	`_log_likelihood()` 中 `mahal`	协方差校正的距离
各向同性协方差	$Σ = σ^{2} I$	`cov_type='isotropic'`	1 参数，决策面为直线
共享协方差	$Σ$ 各类别相同	`cov_type='shared'`	类似 LDA，线性决策面
独立协方差	$Σ_{i}$ 各类别不同	`cov_type='full'`	类似 QDA，二次决策面
Log-Sum-Exp	$\ln \sum e^{x_{i}} = m + \ln \sum e^{x_{i} - m}$	`predict_proba()`	数值稳定归一化
条件风险	$R (α_{i} \| x) = \sum_{j} λ_{i j} P (ω_{j} \| x)$	`minimum_risk_predict()`	不对称损失
ROC 曲线	FPR vs TPR（变阈值）	`plot_roc_auc_eer()`	分类器排序能力
AUC	$\int_{0}^{1} TPR (FPR) d FPR$	`auc(fpr, tpr)`	正样本得分 > 负样本得分的概率
EER	FPR = FNR 时的错误率	代码中 `eer_idx`	安全与便利平衡点

完整代码

# -*- coding: utf-8 -*-
"""
===============================================================================
ml02_bayesian_decision/code/demo.py — 贝叶斯决策理论
===============================================================================
本演示通过实现高斯贝叶斯分类器，全面展示：
  1. 贝叶斯定理：先验概率、似然函数、后验概率的关系
  2. 三种协方差矩阵假设下的判别函数与决策面
  3. 最小错误率决策 vs 最小风险决策
  4. ROC 曲线、AUC 和 EER 的计算与可视化
  5. 与 sklearn 的 GaussianNB / LDA / QDA 对比

通过本演示，你将理解：
  - 贝叶斯决策如何将分类问题转化为概率推理
  - 协方差矩阵假设如何影响决策边界形态
  - 不对称损失如何改变最优决策
  - ROC/AUC 对分类器性能的评估含义

作者：learn-ai 项目
日期：2025
===============================================================================
"""

import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
from scipy.special import logsumexp
from sklearn.datasets import make_classification, make_blobs
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
from sklearn.metrics import roc_curve, auc, roc_auc_score
matplotlib.rcParams['axes.unicode_minus'] = False

_SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
_IMAGES_DIR = os.path.join(_SCRIPT_DIR, '..', 'images')
os.makedirs(_IMAGES_DIR, exist_ok=True)

def _save_path(filename):
    return os.path.join(_IMAGES_DIR, filename)


# ============================================================================
# 第一部分：高斯贝叶斯分类器（从零实现）
# ============================================================================

class GaussianBayesClassifier:
    """
    高斯贝叶斯分类器：假设每个类别的类条件密度为多元高斯分布。

    支持三种协方差类型：
      - 'isotropic': Sigma_j = sigma^2 * I（各向同性，各类别共享）→ 最近均值分类器
      - 'shared': Sigma_j = Sigma（各类别共享同一个协方差矩阵）→ 类似 LDA
      - 'full': 各类别有自己独立的协方差矩阵 → 类似 QDA

    参数:
        cov_type: str, 'isotropic', 'shared', 或 'full'
    """

    def __init__(self, cov_type='full'):
        self.cov_type = cov_type
        self.priors_ = None
        self.means_ = None
        self.covs_ = None
        self.classes_ = None

    def fit(self, X, y):
        """
        参数估计:
          1. priors: P(omega_j) = n_j / n
          2. means: mu_j = (1/n_j) * sum(x_i in class j)
          3. covs: 根据 cov_type 计算 Sigma_j
        """
        X = np.asarray(X, dtype=np.float64)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        n_classes = len(self.classes_)
        n_features = X.shape[1]

        self.priors_ = np.zeros(n_classes)
        self.means_ = np.zeros((n_classes, n_features))

        # 收集每个类别的均值和样本
        class_data = {}
        for idx, c in enumerate(self.classes_):
            X_c = X[y == c]
            class_data[c] = X_c
            self.priors_[idx] = len(X_c) / len(X)
            self.means_[idx] = np.mean(X_c, axis=0)

        # 根据 cov_type 估计协方差
        if self.cov_type == 'isotropic':
            # 各类别共享各向同性协方差: sigma^2 * I
            total_var = 0.0
            total_n = 0
            for idx, c in enumerate(self.classes_):
                X_c = class_data[c]
                mean_c = self.means_[idx]
                total_var += np.sum((X_c - mean_c) ** 2)
                total_n += len(X_c)
            sigma2 = total_var / (total_n * n_features)
            self.covs_ = [sigma2 * np.eye(n_features) for _ in range(n_classes)]

        elif self.cov_type == 'shared':
            # 各类别共享同一个协方差矩阵 (pooled)
            pooled_cov = np.zeros((n_features, n_features))
            for idx, c in enumerate(self.classes_):
                X_c = class_data[c]
                mean_c = self.means_[idx]
                pooled_cov += (X_c - mean_c).T @ (X_c - mean_c)
            pooled_cov /= len(X)
            self.covs_ = [pooled_cov for _ in range(n_classes)]

        elif self.cov_type == 'full':
            # 各类别有自己独立的协方差矩阵
            self.covs_ = []
            for idx, c in enumerate(self.classes_):
                X_c = class_data[c]
                mean_c = self.means_[idx]
                # 使用无偏估计，加上微小正则化防止奇异
                cov_c = np.cov(X_c.T, bias=False)
                cov_c += 1e-6 * np.eye(n_features)
                self.covs_.append(cov_c)

        return self

    def _log_likelihood(self, X):
        """计算每个样本属于每个类别的 log P(x | omega_j)。"""
        n_samples = X.shape[0]
        n_classes = len(self.classes_)
        d = X.shape[1]
        log_lik = np.zeros((n_samples, n_classes))

        for idx in range(n_classes):
            mu = self.means_[idx]
            cov = self.covs_[idx]

            # 计算 cov 的逆和行列式的对数
            # 对于 isotropic 情况: cov = sigma^2 * I
            if self.cov_type == 'isotropic':
                sigma2 = cov[0, 0]
                # log N(x|mu, sigma^2 I) = -(d/2)log(2*pi) - (d/2)log(sigma^2)
                #                            - (1/(2*sigma^2)) * ||x-mu||^2
                log_det = d * np.log(sigma2)
                # 马氏距离 = ||x-mu||^2 / sigma^2
                diff = X - mu
                mahal = np.sum(diff ** 2, axis=1) / sigma2
            else:
                # 用伪逆和 slogdet 保证数值稳定性
                try:
                    cov_inv = np.linalg.pinv(cov)
                    sign, log_det = np.linalg.slogdet(cov)
                except Exception:
                    cov_inv = np.linalg.pinv(cov)
                    log_det = 0.0

                diff = X - mu
                # 马氏距离: (x-mu)^T Sigma^{-1} (x-mu)
                # 向量化: sum_j (diff @ cov_inv)_j * diff_j
                mahal = np.sum((diff @ cov_inv) * diff, axis=1)

            # log 高斯密度
            log_gauss = -0.5 * (d * np.log(2 * np.pi) + log_det + mahal)
            log_lik[:, idx] = log_gauss

        return log_lik

    def predict_proba(self, X):
        """返回后验概率 P(omega_j | x)。"""
        X = np.asarray(X, dtype=np.float64)
        log_lik = self._log_likelihood(X)
        log_prior = np.log(self.priors_)

        # log joint = log P(x|omega_j) + log P(omega_j)
        log_joint = log_lik + log_prior

        # log evidence = log sum_j exp(log_joint_j)
        # 使用 logsumexp 保证数值稳定性
        log_evidence = logsumexp(log_joint, axis=1, keepdims=True)

        # log posterior = log_joint - log_evidence
        log_posteriors = log_joint - log_evidence
        return np.exp(log_posteriors)

    def predict(self, X):
        """预测: 选择后验概率最大的类别。"""
        log_lik = self._log_likelihood(X)
        log_prior = np.log(self.priors_)
        log_joint = log_lik + log_prior
        best_idx = np.argmax(log_joint, axis=1)
        return self.classes_[best_idx]


# ============================================================================
# 第二部分：最小风险决策的演示
# ============================================================================

def minimum_risk_predict(posteriors, loss_matrix):
    """
    最小风险决策: 对每个样本，选择使条件风险最小的行动。

    R(alpha_i | x) = sum_j loss_matrix[i, j] * P(omega_j | x)
    alpha* = argmin_i R(alpha_i | x)

    参数:
        posteriors: (n_samples, n_classes), P(omega_j | x)
        loss_matrix: (n_classes, n_classes), loss_matrix[i, j] = cost of
                     predicting class i when true class is j
    返回:
        predictions: (n_samples,)
    """
    # 条件风险: (n_samples, n_classes) = posteriors @ loss_matrix.T
    # 简化版本: 每个样本的每个决策的风险
    n_samples, n_classes = posteriors.shape
    risks = np.zeros((n_samples, n_classes))

    for i in range(n_classes):
        for j in range(n_classes):
            risks[:, i] += loss_matrix[i, j] * posteriors[:, j]

    return np.argmin(risks, axis=1)


# ============================================================================
# 第三部分：可视化函数
# ============================================================================

def plot_decision_boundaries_cov_types():
    """
    对比三种协方差假设下的决策边界。
    """
    np.random.seed(42)
    # 生成三类数据，各类别有不同协方差结构
    n_samples = 150

    # 类别 0: 圆形分布
    X0 = np.random.randn(n_samples, 2) * 1.0 + np.array([-4, 0])
    # 类别 1: 拉长 + 旋转的椭圆
    rot = np.array([[1.5, 0.8], [0.8, 0.5]])
    X1 = np.random.randn(n_samples, 2) @ rot + np.array([2, 2])
    # 类别 2: 另一个方向的椭圆
    rot2 = np.array([[0.8, -0.8], [-0.8, 1.5]])
    X2 = np.random.randn(n_samples, 2) @ rot2 + np.array([0, -3])

    X = np.vstack([X0, X1, X2])
    y = np.hstack([np.zeros(n_samples), np.ones(n_samples), 2 * np.ones(n_samples)])

    cov_types = ['isotropic', 'shared', 'full']
    titles = [
        'Isotropic Covariance\n(Same spherical for all classes)',
        'Shared Covariance\n(Same ellipsoid for all classes)',
        'Full Covariance\n(Each class has own ellipsoid)'
    ]

    fig, axes = plt.subplots(1, 3, figsize=(18, 5.5))

    for ax, cov_type, title in zip(axes, cov_types, titles):
        clf = GaussianBayesClassifier(cov_type=cov_type)
        clf.fit(X, y)

        # 管网格预测
        x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
        y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
        xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
                            np.linspace(y_min, y_max, 200))
        grid = np.c_[xx.ravel(), yy.ravel()]
        Z = clf.predict(grid).reshape(xx.shape)

        ax.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.RdYlBu)
        ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', cmap=plt.cm.RdYlBu, s=30)
        # 绘制各类别的均值和协方差椭圆
        for idx in range(3):
            mu = clf.means_[idx]
            ax.scatter(mu[0], mu[1], c='lime', marker='X', s=200, edgecolors='black', linewidths=1.5)

        ax.set_xlabel('Feature 1', fontsize=11)
        ax.set_ylabel('Feature 2', fontsize=11)
        ax.set_title(title, fontsize=12, fontweight='bold')

    plt.tight_layout()
    fp = _save_path('cov_type_boundaries.png')
    fig.savefig(fp, dpi=150, bbox_inches='tight')
    plt.close(fig)
    print(f"[Done] covariance type boundaries saved to {fp}")


def plot_risk_decision_demo():
    """
    演示最小风险决策：非对称损失矩阵导致不同的决策边界。
    """
    np.random.seed(42)
    X, y = make_blobs(n_samples=300, centers=2, cluster_std=[2.0, 1.5],
                      center_box=(-5, 5), random_state=42)
    X = StandardScaler().fit_transform(X)

    # 训练高斯贝叶斯分类器
    clf = GaussianBayesClassifier(cov_type='full')
    clf.fit(X, y)
    posteriors = clf.predict_proba(X)

    # 损失矩阵设置（疾病诊断场景）
    # class 0 = 良性, class 1 = 恶性
    # loss[0,0]=0 (良性→良性), loss[0,1]=5 (良性→恶性, 漏诊代价高!)
    # loss[1,0]=1 (恶性→良性, 误报代价低), loss[1,1]=0 (恶性→恶性)
    loss_symmetric = np.array([[0, 1],
                               [1, 0]])  # 0-1 loss = 最小错误率

    loss_asymmetric = np.array([[0, 5],
                                [1, 0]])  # 漏诊代价 5 倍于误报

    pred_sym = minimum_risk_predict(posteriors, loss_symmetric)
    pred_asym = minimum_risk_predict(posteriors, loss_asymmetric)

    # 绘制
    fig, axes = plt.subplots(1, 2, figsize=(14, 5.5))

    titles = [
        'Minimum Error Rate Decision\n(Symmetric 0-1 Loss)',
        'Minimum Risk Decision\n(FN cost = 5x FP cost)'
    ]
    preds = [pred_sym, pred_asym]

    for ax, title, pred in zip(axes, titles, preds):
        ax.scatter(X[:, 0], X[:, 1], c=pred, cmap=plt.cm.RdYlBu, edgecolors='k', s=40, alpha=0.8)
        ax.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdYlBu, edgecolors='gray',
                  s=40, alpha=0.2, marker='s')
        ax.set_xlabel('Feature 1', fontsize=11)
        ax.set_ylabel('Feature 2', fontsize=11)
        ax.set_title(title, fontsize=12, fontweight='bold')

    plt.tight_layout()
    fp = _save_path('risk_decision_comparison.png')
    fig.savefig(fp, dpi=150, bbox_inches='tight')
    plt.close(fig)
    print(f"[Done] risk decision saved to {fp}")


def plot_roc_auc_eer():
    """
    绘制 ROC 曲线，计算 AUC 和 EER。
    """
    np.random.seed(42)
    X, y = make_classification(n_samples=500, n_features=20,
                                n_informative=8, n_redundant=2,
                                random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )

    # 训练多个模型
    models = {
        'Gaussian NB (Isotropic)': GaussianBayesClassifier(cov_type='isotropic'),
        'Gaussian NB (Full)': GaussianBayesClassifier(cov_type='full'),
        'sklearn GNB': GaussianNB(),
        'sklearn LDA': LinearDiscriminantAnalysis(),
        'sklearn QDA': QuadraticDiscriminantAnalysis(),
    }

    fig, ax = plt.subplots(figsize=(8, 7))

    for name, model in models.items():
        model.fit(X_train, y_train)

        # 获取预测概率（对于我们的自定义模型用 predict_proba）
        if hasattr(model, 'predict_proba'):
            y_score = model.predict_proba(X_test)[:, 1]
        elif hasattr(model, 'decision_function'):
            y_score = model.decision_function(X_test)
        else:
            continue

        # ROC 曲线
        fpr, tpr, thresholds = roc_curve(y_test, y_score)
        roc_auc = auc(fpr, tpr)

        # 绘制
        ax.plot(fpr, tpr, linewidth=2, label=f'{name} (AUC={roc_auc:.3f})')

        # EER: 找到 FPR ≈ FNR 的点
        fnr = 1 - tpr
        eer_idx = np.argmin(np.abs(fpr - fnr))
        eer = (fpr[eer_idx] + fnr[eer_idx]) / 2
        ax.scatter(fpr[eer_idx], tpr[eer_idx], s=80, zorder=5)

    # 对角线 (随机猜测)
    ax.plot([0, 1], [0, 1], 'k--', linewidth=1.5, alpha=0.5, label='Random (AUC=0.5)')
    ax.fill_between([0, 1], [0, 1], [1, 1], alpha=0.05, color='green')

    ax.set_xlabel('False Positive Rate (FPR)', fontsize=12)
    ax.set_ylabel('True Positive Rate (TPR / Recall)', fontsize=12)
    ax.set_title('ROC Curves — Multiple Bayesian Classifiers\n(Dots mark EER points)',
                 fontsize=13, fontweight='bold')
    ax.legend(loc='lower right', fontsize=9)
    ax.grid(True, alpha=0.3)
    ax.set_xlim([-0.02, 1.02])
    ax.set_ylim([-0.02, 1.02])

    plt.tight_layout()
    fp = _save_path('roc_auc_multi_model.png')
    fig.savefig(fp, dpi=150, bbox_inches='tight')
    plt.close(fig)
    print(f"[Done] ROC/AUC plot saved to {fp}")


def plot_posterior_3d():
    """
    一维特征的贝叶斯决策三维可视化（后验概率面）。
    """
    np.random.seed(42)
    # 两类高斯分布，一维特征
    n_samples = 100
    X0 = np.random.randn(n_samples) * 1.0 - 2.0  # class 0
    X1 = np.random.randn(n_samples) * 1.5 + 1.5  # class 1

    X = np.hstack([X0, X1]).reshape(-1, 1)
    y = np.hstack([np.zeros(n_samples), np.ones(n_samples)])

    clf = GaussianBayesClassifier(cov_type='full')
    clf.fit(X, y)

    # 计算后验概率
    x_line = np.linspace(-8, 8, 500).reshape(-1, 1)
    posteriors = clf.predict_proba(x_line)
    likelihood_x = clf._log_likelihood(x_line)
    likelihood = np.exp(likelihood_x)

    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # 左图: 似然函数和先验
    ax = axes[0]
    prior0 = clf.priors_[0]
    prior1 = clf.priors_[1]
    ax.plot(x_line, likelihood[:, 0] * prior0, 'b-', linewidth=2,
            label=f'P(x|Class 0) * P(Class 0) = likelihood*prior')
    ax.plot(x_line, likelihood[:, 1] * prior1, 'r-', linewidth=2,
            label=f'P(x|Class 1) * P(Class 1)')
    ax.fill_between(x_line.ravel(), 0, likelihood[:, 0] * prior0, alpha=0.3, color='blue')
    ax.fill_between(x_line.ravel(), 0, likelihood[:, 1] * prior1, alpha=0.3, color='red')
    # 决策边界: 两曲线交点
    ax.set_xlabel('Feature x', fontsize=12)
    ax.set_ylabel('Likelihood * Prior', fontsize=12)
    ax.set_title('Class-conditional Densities * Priors\n(Decision boundary at intersection)',
                 fontsize=12, fontweight='bold')
    ax.legend(fontsize=9)

    # 右图: 后验概率
    ax = axes[1]
    ax.plot(x_line, posteriors[:, 0], 'b-', linewidth=2, label='P(Class 0 | x)')
    ax.plot(x_line, posteriors[:, 1], 'r-', linewidth=2, label='P(Class 1 | x)')
    ax.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5)
    # 标注决策点: P(Class 0 | x) = P(Class 1 | x) = 0.5
    ax.fill_between(x_line.ravel(), 0, posteriors[:, 0], alpha=0.3, color='blue')
    ax.fill_between(x_line.ravel(), 0, posteriors[:, 1], alpha=0.3, color='red')
    ax.set_xlabel('Feature x', fontsize=12)
    ax.set_ylabel('Posterior Probability', fontsize=12)
    ax.set_title('Posterior Probabilities\n(P(Class 0|x) + P(Class 1|x) = 1 everywhere)',
                 fontsize=12, fontweight='bold')
    ax.legend(fontsize=9)

    plt.tight_layout()
    fp = _save_path('bayesian_posterior_demo.png')
    fig.savefig(fp, dpi=150, bbox_inches='tight')
    plt.close(fig)
    print(f"[Done] posterior probability demo saved to {fp}")


def plot_sklearn_comparison():
    """
    对比自定义实现与 sklearn 的 GaussianNB / LDA / QDA。
    """
    np.random.seed(42)
    X, y = make_classification(n_samples=300, n_features=2,
                                n_redundant=0, n_clusters_per_class=1,
                                random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # 自定义模型
    custom = GaussianBayesClassifier(cov_type='full')
    custom.fit(X_train, y_train)

    # sklearn 模型
    sk_gnb = GaussianNB()
    sk_gnb.fit(X_train, y_train)
    sk_qda = QuadraticDiscriminantAnalysis(reg_param=1e-6)
    sk_qda.fit(X_train, y_train)

    from sklearn.metrics import accuracy_score
    custom_acc = accuracy_score(y_test, custom.predict(X_test))
    gnb_acc = accuracy_score(y_test, sk_gnb.predict(X_test))
    qda_acc = accuracy_score(y_test, sk_qda.predict(X_test))

    print(f"\n[Comparison] Test Accuracy:")
    print(f"  Custom Gaussian Bayes (full): {custom_acc:.4f}")
    print(f"  sklearn GaussianNB:           {gnb_acc:.4f}")
    print(f"  sklearn QDA:                  {qda_acc:.4f}")


# ============================================================================
# 主函数
# ============================================================================

def main():
    print("=" * 70)
    print("ml02_bayesian_decision/demo.py — 贝叶斯决策理论")
    print("=" * 70)

    print("\n[1/5] 对比三种协方差假设的决策边界...")
    plot_decision_boundaries_cov_types()

    print("\n[2/5] 展示最小风险决策...")
    plot_risk_decision_demo()

    print("\n[3/5] 绘制 ROC 曲线 & 计算 AUC/EER...")
    plot_roc_auc_eer()

    print("\n[4/5] 一维后验概率可视化...")
    plot_posterior_3d()

    print("\n[5/5] 与 sklearn 对比验证...")
    plot_sklearn_comparison()

    print("\n" + "=" * 70)
    print("全部可视化完成! 图像保存在:", _IMAGES_DIR)
    print("=" * 70)


if __name__ == "__main__":
    main()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530

ml02 贝叶斯决策理论 — demo.py 代码详解 ​

运行方式 ​

代码逐段详解 ​

第1步：GaussianBayesClassifier 类 — 三种协方差假设 ​

第2步：fit() — 参数估计 ​

第3步：_log_likelihood() — 对数似然计算 ​

第4步：predict_proba() — 后验概率与 Log-Sum-Exp ​

第5步：最小风险决策 ​

第6步：ROC 曲线绘制 ​

关键概念速查表 ​

完整代码 ​

ml02 贝叶斯决策理论 — demo.py 代码详解

运行方式

代码逐段详解

第1步：GaussianBayesClassifier 类 — 三种协方差假设

第2步：fit() — 参数估计

第3步：_log_likelihood() — 对数似然计算

第4步：predict_proba() — 后验概率与 Log-Sum-Exp

第5步：最小风险决策

第6步：ROC 曲线绘制

关键概念速查表

完整代码