
Support Vector Machine(3)-SMO

The Sequential Minimal Optimization algorithm

Recall from the last section: $ \begin{align} & \min\limits_{\alpha} \quad \frac{1}{2} \sum\limits_{i=1}^{N} \sum\limits_{j=1}^{N} \alpha_i \alpha_j y^{(i)} y^{(j)} K(x^{(i)}, x^{(j)}) - \sum\limits_{i=1}^{N} \alpha_i \end{align} \tag{3.2}$

$ s.t. \quad \sum\limits_{i=1}^{N} \alpha_i y^{(i)} = 0, \quad 0 \le \alpha_i \le C \tag{3.4}$
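The dual objective (3.2) can be evaluated directly from $\alpha$, the labels, and a precomputed kernel matrix. A minimal numpy sketch (the function name `dual_objective` is mine):

```python
import numpy as np

def dual_objective(alpha, y, K):
    """Evaluate (3.2): 1/2 * sum_ij a_i a_j y_i y_j K_ij - sum_i a_i."""
    ay = alpha * y                 # elementwise alpha_i * y_i
    return 0.5 * ay @ K @ ay - alpha.sum()
```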

To solve the optimization problem above, we introduce the Sequential Minimal Optimization (SMO) method.


Solving the Two-Variable Quadratic Programming Subproblem

Among all of the $\alpha_i$, choose only $\alpha_1, \alpha_2$ as variables and treat every other $\alpha_i$ ($i \neq 1, 2$) as a constant. Let $\alpha_2^{i}$ denote the $i$th iterate of $\alpha_2$. First we define some quantities for convenience:

$$ L = \begin{cases} \max\lbrace 0, \alpha_2^i - \alpha_1^i \rbrace, & y^{(1)} \neq y^{(2)} \\ \max\lbrace 0, \alpha_2^i + \alpha_1^i - C \rbrace, & y^{(1)} = y^{(2)} \end{cases} $$

$$ H = \begin{cases} \min \lbrace C, \alpha_2^i - \alpha_1^i + C \rbrace, & y^{(1)} \neq y^{(2)} \\ \min \lbrace C, \alpha_2^i + \alpha_1^i \rbrace, & y^{(1)} = y^{(2)} \end{cases} $$
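The two cases above can be written as a small helper; a sketch assuming scalar inputs (the name `clip_bounds` is mine):

```python
def clip_bounds(a1, a2, y1, y2, C):
    """Box bounds [L, H] for alpha_2, preserving alpha_1*y1 + alpha_2*y2 = const."""
    if y1 != y2:                       # difference a2 - a1 is conserved
        L = max(0.0, a2 - a1)
        H = min(C, a2 - a1 + C)
    else:                              # sum a2 + a1 is conserved
        L = max(0.0, a1 + a2 - C)
        H = min(C, a1 + a2)
    return L, H
```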

$$E_i = g(x^{(i)}) - y^{(i)} = \left( \sum\limits_{j=1}^{N} \alpha_j y^{(j)} K(x^{(i)}, x^{(j)})+b \right)- y^{(i)} \tag{SMO.1}$$

Then $H$ and $L$ are used to clip $\alpha_2$. Writing $K_{pq} = K(x^{(p)}, x^{(q)})$ for short, the $(i+1)$th iterate is:

$$ \alpha_2^{i+1, unc} = \alpha_2^{i} + \frac { y^{(2)}(E_1 - E_2)}{K_{11}+K_{22}-2K_{12}} \tag{SMO.2}$$

$$ \alpha_2^{i+1} = \begin{cases} H, & \alpha_2^{i+1, unc} > H \\ \alpha_2^{i+1, unc}, & L \le \alpha_2^{i+1, unc} \le H \\ L, & \alpha_2^{i+1, unc} < L \end{cases} \tag{SMO.3} $$

$$ \alpha_1^{i+1} = \alpha_1^{i} + y^{(1)}y^{(2)}(\alpha_2^{i} - \alpha_2^{i+1}) \tag{SMO.4} $$
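Steps (SMO.2)-(SMO.4) combine into one pair update. A sketch assuming the kernel entries, errors $E_1, E_2$, and bounds $L, H$ are already computed (the function name and the skip-on-degenerate-$\eta$ behavior are my choices, a common practical heuristic):

```python
def smo_pair_update(a1, a2, y1, y2, E1, E2, K11, K22, K12, L, H):
    """Unconstrained step on alpha_2 (SMO.2), clip to [L, H] (SMO.3),
    then move alpha_1 to keep sum alpha_i y_i fixed (SMO.4)."""
    eta = K11 + K22 - 2.0 * K12        # curvature along the constraint line
    if eta <= 0:                       # degenerate direction: leave the pair unchanged
        return a1, a2
    a2_unc = a2 + y2 * (E1 - E2) / eta # (SMO.2)
    a2_new = min(H, max(L, a2_unc))    # (SMO.3)
    a1_new = a1 + y1 * y2 * (a2 - a2_new)  # (SMO.4)
    return a1_new, a2_new
```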


Choosing $\alpha_1$ and $\alpha_2$

The KKT conditions for each $\alpha_i$ are: $$ \begin{align} \alpha_i = 0 \quad & \Leftrightarrow \quad y^{(i)} g(x^{(i)}) \ge 1 \quad \text{(outside the margin)} \\ 0 < \alpha_i < C \quad & \Leftrightarrow \quad y^{(i)} g(x^{(i)}) = 1 \quad \text{(on the margin boundary)} \\ \alpha_i = C \quad & \Leftrightarrow \quad y^{(i)} g(x^{(i)}) \le 1 \quad \text{(inside the margin)} \end{align} \tag{SMO.5} $$
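In code, (SMO.5) becomes a violation test with a numerical tolerance; a sketch where `yg_i` stands for $y^{(i)} g(x^{(i)})$ and the tolerance value is my assumption:

```python
def violates_kkt(alpha_i, yg_i, C, tol=1e-3):
    """True if (alpha_i, y_i * g(x_i)) breaks (SMO.5) beyond tolerance tol."""
    if alpha_i < tol:                  # alpha_i == 0  =>  y*g >= 1
        return yg_i < 1 - tol
    if alpha_i > C - tol:              # alpha_i == C  =>  y*g <= 1
        return yg_i > 1 + tol
    return abs(yg_i - 1) > tol         # 0 < alpha_i < C  =>  y*g == 1
```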

  1. Select $\alpha_1$:
  • Iterate first over the points with $0 < \alpha_i < C$ (those on the margin boundary, which are the most likely support vectors), then over all remaining points, and pick one that violates the KKT conditions (SMO.5).
  2. Select $\alpha_2$:
  • Given $\alpha_1$, find the $(x^{(i)}, y^{(i)})$ that maximizes $\vert E_1 - E_2 \vert$. To save computation, cache all the $E_i$ values in advance.
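With the $E_i$ values cached in a vector, the second-variable heuristic is a single `argmax`; a sketch (the function name `select_second` is mine):

```python
import numpy as np

def select_second(E, i1):
    """Pick i2 maximizing |E_{i1} - E_{i2}| from a cached error vector E."""
    diffs = np.abs(E - E[i1])
    diffs[i1] = -np.inf                # never pick the same index as i1
    return int(np.argmax(diffs))
```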

Calculate new $b$ and $E_i$

After updating $\alpha_1$ and $\alpha_2$ in each iteration:

  1. Calculate $b$ $$ b_1^{i+1} = b^i -[E_1^i + y^{(1)}K_{11}(\alpha_1^{i+1} - \alpha_1^{i}) + y^{(2)}K_{21}(\alpha_2^{i+1} - \alpha_2^{i}) ] \\ b_2^{i+1} = b^i -[E_2^i + y^{(1)}K_{12}(\alpha_1^{i+1} - \alpha_1^{i}) + y^{(2)}K_{22}(\alpha_2^{i+1} - \alpha_2^{i}) ] \tag{SMO.6} $$

$$b^{i+1} = \begin{cases} b_{1}^{i+1} = b_{2}^{i+1}, & 0 < \alpha_1^{i+1}, \alpha_2^{i+1} < C \\ \frac {1}{2} (b_{1}^{i+1} + b_{2}^{i+1}), & \alpha_1^{i+1}, \alpha_2^{i+1} \in \lbrace 0, C \rbrace \end{cases} $$

  2. Calculate $E_i$ using $\alpha_1^{i+1}, \alpha_2^{i+1}, b^{i+1}$: $$ E_i = \sum\limits_{j \in S} \alpha_j y^{(j)} K(x^{(i)}, x^{(j)}) + b^{i+1} - y^{(i)} $$ where $S$ is the index set of the support vectors (the points with $\alpha_j > 0$).
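The threshold update above can be sketched as follows, where `da1`, `da2` are the increments $\alpha_1^{i+1} - \alpha_1^{i}$ and $\alpha_2^{i+1} - \alpha_2^{i}$ (the function name is mine, and the kernel is assumed symmetric so $K_{21} = K_{12}$):

```python
def update_b(b, E1, E2, y1, y2, da1, da2, K11, K12, K22, a1_new, a2_new, C):
    """New threshold per (SMO.6): b1 if alpha_1 is strictly inside (0, C),
    b2 if alpha_2 is, otherwise their average."""
    b1 = b - (E1 + y1 * K11 * da1 + y2 * K12 * da2)
    b2 = b - (E2 + y1 * K12 * da1 + y2 * K22 * da2)
    if 0 < a1_new < C:
        return b1
    if 0 < a2_new < C:
        return b2
    return 0.5 * (b1 + b2)
```

When both multipliers are strictly inside $(0, C)$, `b1` and `b2` coincide, so returning either is correct.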