Support Vector Machine

The SVM is a decision machine and so does not provide posterior probabilities.

Max Margin Classifier

We begin with the two-class classification problem, using linear models of the form

$$y(\mathbf{x}) = \mathbf{w}^T \phi(\mathbf{x}) + b$$

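where $\phi(\mathbf{x})$ denotes a fixed feature-space transformation and $b$ is the bias. The training data set comprises $N$ input vectors $\mathbf{x}_1, \ldots, \mathbf{x}_N$, with corresponding target values $t_1, \ldots, t_N$ where $t_n \in \{-1, 1\}$, and new data points $\mathbf{x}$ are classified according to the sign of $y(\mathbf{x})$.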
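As a minimal sketch of this classification rule, the snippet below evaluates a linear model and takes its sign. The identity feature map, the weight values, and the function names `phi`, `decision_function`, and `classify` are assumptions made for illustration only; the weights are not learned, and assigning a value of exactly zero to the class +1 is an arbitrary convention.

```python
def phi(x):
    """Feature map. The identity is an assumption made for this sketch."""
    return [float(v) for v in x]

def decision_function(w, b, x):
    """Linear model: dot product of w with phi(x), plus the bias b."""
    return sum(w_i * f_i for w_i, f_i in zip(w, phi(x))) + b

def classify(w, b, x):
    """Predicted label in {-1, +1}, taken from the sign of the model output.

    A value of exactly zero is assigned to +1 here by convention.
    """
    return 1 if decision_function(w, b, x) >= 0 else -1

# Illustrative (not learned) parameters.
w = [1.0, -1.0]
b = 0.5

print(classify(w, b, [2.0, 0.0]))  # model output is 2.5, so the label is +1
print(classify(w, b, [0.0, 2.0]))  # model output is -1.5, so the label is -1
```

Note that the output is only a hard label: nothing in this rule yields a posterior probability.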

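We assume for now that the training data set is linearly separable in feature space, so that there exists at least one choice of the parameters $\mathbf{w}$ and $b$ for which a function of the form $y(\mathbf{x}) = \mathbf{w}^T \phi(\mathbf{x}) + b$ satisfies $y(\mathbf{x}_n) > 0$ for points having $t_n = +1$ and $y(\mathbf{x}_n) < 0$ for points having $t_n = -1$, so that $t_n y(\mathbf{x}_n) > 0$ for every point in the training set.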
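The separability condition can be checked directly for a candidate parameter setting: every training point must give a strictly positive product of its target and the model output. The sketch below assumes an identity feature map and uses a small made-up data set; the helper names `model_output` and `separates` are hypothetical.

```python
def model_output(w, b, x):
    """Linear model output for input x, assuming an identity feature map."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

def separates(w, b, X, t):
    """True if target times model output is strictly positive for every point."""
    return all(t_n * model_output(w, b, x_n) > 0 for x_n, t_n in zip(X, t))

# Toy data (made up for illustration): two points per class.
X = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
t = [1, 1, -1, -1]

print(separates([1.0, -1.0], 0.0, X, t))  # True: all four products equal 2
print(separates([1.0, 1.0], 0.0, X, t))   # False: the -1 points get positive outputs
```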

Differences between SVM and other solutions

There may of course exist many such solutions (for example, those found by the perceptron or by logistic regression) that separate the classes exactly. Solutions find b
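To make the non-uniqueness concrete, the sketch below shows two different parameter settings that both separate the same toy data set exactly. It also reports each solution's smallest distance from a training point to the decision boundary, using the standard geometric margin (target times model output, divided by the norm of the weight vector), which is the quantity a max margin classifier is concerned with. The data, the two candidates, the identity feature map, and the helper names are all assumptions made for illustration.

```python
import math

def model_output(w, b, x):
    """Linear model output for input x, assuming an identity feature map."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

def separates(w, b, X, t):
    """True if every training point lies on the correct side of the boundary."""
    return all(t_n * model_output(w, b, x_n) > 0 for x_n, t_n in zip(X, t))

def margin(w, b, X, t):
    """Smallest signed distance from a training point to the boundary."""
    norm_w = math.sqrt(sum(w_i * w_i for w_i in w))
    return min(t_n * model_output(w, b, x_n) for x_n, t_n in zip(X, t)) / norm_w

# Toy data (made up for illustration): two points per class.
X = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
t = [1, 1, -1, -1]

# Two hand-picked candidates; neither is the output of a training algorithm.
candidates = {
    "A": ([1.0, -1.0], 0.0),
    "B": ([1.0, 0.0], -1.5),
}

for name, (w, b) in candidates.items():
    print(name, separates(w, b, X, t), round(margin(w, b, X, t), 4))
```

Both candidates classify every training point correctly, yet candidate A keeps the data farther from the boundary than candidate B (a margin of about 1.4142 versus 0.5).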

Model Formalization

Dual Representation

linear regression

svm

gradient computing

kernel tricks

Reference

  1. Pattern Recognition and Machine Learning, chapters 6 and 7, and Appendix E
  2. Gradient and Lagrange