The SVM is a decision machine and so does not provide posterior probabilities.
We begin with the two-class classification problem using linear models of the form
$$y(\mathbf{x}) = \mathbf{w}^{\mathrm{T}}\boldsymbol{\phi}(\mathbf{x}) + b$$
where $\boldsymbol{\phi}(\mathbf{x})$ denotes a fixed feature-space transformation and $b$ is the bias. The training data set comprises $N$ input vectors $\mathbf{x}_1, \ldots, \mathbf{x}_N$, with corresponding target values $t_1, \ldots, t_N$ where $t_n \in \{-1, 1\}$, and new data points $\mathbf{x}$ are classified according to the sign of $y(\mathbf{x})$.
For the moment we assume that the data set is linearly separable in feature space, so that there must exist at least one choice of the parameters $\mathbf{w}$ and $b$ such that a function of this form satisfies $y(\mathbf{x}_n) > 0$ for points having $t_n = +1$ and $y(\mathbf{x}_n) < 0$ for points having $t_n = -1$, so that $t_n y(\mathbf{x}_n) > 0$ for all points in the data set.
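As a minimal sketch of the two conditions above, the following Python snippet classifies points by the sign of $y(\mathbf{x})$ and checks the separability criterion $t_n y(\mathbf{x}_n) > 0$. The identity feature map, the toy data, and the particular choice of $\mathbf{w}$ and $b$ are all assumptions made for illustration, not part of the original text:

```python
import numpy as np

# Linear discriminant y(x) = w^T phi(x) + b, with phi taken to be the
# identity map purely for this illustration.
def discriminant(X, w, b):
    return X @ w + b

# Toy linearly separable data with targets t_n in {-1, +1} (assumed).
X = np.array([[ 2.0,  2.0],
              [ 3.0,  1.0],
              [-1.0, -2.0],
              [-2.0, -1.0]])
t = np.array([1, 1, -1, -1])

# One (of many possible) separating choices of parameters, hand-picked.
w = np.array([1.0, 1.0])
b = 0.0

scores = discriminant(X, w, b)            # y(x_n) for each training point
predictions = np.sign(scores)             # classify by the sign of y(x)
separable = bool(np.all(t * scores > 0))  # t_n y(x_n) > 0 for all n?

print(predictions)  # agrees with t for this choice of w and b
print(separable)
```

For this hand-picked $\mathbf{w}$ and $b$ every point satisfies $t_n y(\mathbf{x}_n) > 0$; any other separating hyperplane would pass the same check, which is exactly the sense in which many solutions can exist.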
There may of course exist many such solutions that separate the classes exactly, found, for example, by the perceptron algorithm or by logistic regression.