1. 证明:对偶函数的极大化=模型的极大似然估计

1.1. 模型的极大似然估计

条件概率分布P(Y|X)的对数似然函数表示为:

\begin{aligned}
L_{\tilde P}(P_w) = \log \prod_{x,y}P(y|x)^{\tilde P(x, y)} \\
= \sum_{x,y}\tilde P(x, y)\log P(y|x) && {1}
\end{aligned}

当条件概率分布P(Y|X)是最大熵模型时,即
P(yx)=exp(i=0n)wi+1fi(x,y)exp(1w+0)2 \begin{aligned} P(y|x) = \frac{exp(\sum_{i=0}^n)w_{i+1}f_i(x, y)}{exp(1-w+0)} && {2} \end{aligned}

将公式(2)代入公式(1)得:
LP~(Pw)=x,yP~(x,y)i=0nwi+1fi(x,y)xP~(x)logZw(x) L_{\tilde P}(P_w) = \sum_{x,y} \tilde P(x, y) \sum_{i=0}^n w_{i+1}f_i(x, y) - \sum_x \tilde P(x) \log Z_w(x)

1.2. 对偶函数的极大化

对偶函数定义如下:
Ψ(w)=L(Pw,w)=x,yP~(x)Pw(yx)logPw(yx)=x,yP~(x,y)i=0nwi+1fi(x,y)xP~(x)logZw(x) \begin{aligned} \Psi (w) = L(P_w, w) \\ = \sum_{x,y}\tilde P(x) P_w(y|x)\log P_w(y|x) \\ = \sum_{x,y} \tilde P(x, y) \sum_{i=0}^n w_{i+1}f_i(x, y) - \sum_x \tilde P(x) \log Z_w(x) \end{aligned}

1.3. 结论

LP~(Pw)=Ψ(w) L_{\tilde P}(P_w) = \Psi (w) 即:对偶函数的极大化=模型的极大似然估计

results matching ""

    No results matching ""