Commit 1f2b5098 authored by Felix Matthias Krumm's avatar Felix Matthias Krumm

Merge branch 'fix-remaining-svm-issues' into 'main'

Resolve "Explain non trivial formulars & maths"

Closes #108 and #105

See merge request !130
parents 82b0a1d2 3dcdf819
Pipeline #93984 passed
book/src/AI-Models/Support-Vector-Machine/SVM-Margin.png

191 KiB

@@ -32,20 +32,27 @@ $$\{(x_i,y_i) | i = 1,...,m; y_i \in \{-1,1\}\}$$
and builds a hyperplane which tries its best to separate the two classes.
The hyperplane is built from a normal vector \\(w\\) through the origin. Perpendicular to \\(w\\), at a set distance from the origin, lies the separating hyperplane; this distance is called the bias or offset. It is obtained by dividing the bias \\(b\\) by the norm of the normal vector.
$$ \frac{b}{||w||} $$
This gives us a unique hyperplane, determined by the normal vector and the bias, whose points \\(x\\) satisfy the following condition: the inner product of a point with the normal vector, plus the bias, is zero.
$$ \langle w,x \rangle + b = 0 $$
For every point that does not lie on the hyperplane this expression is either positive or negative, depending on which side of the hyperplane, and therefore which class, the point belongs to.
To find this hyperplane we assign the labels \\(y_i = \pm1\\) to our training data.
Therefore the following formal condition is given:
$$ y_i = sgn(\langle w,x_i \rangle + b) $$
The sign function maps the output of the hyperplane formula to \\(-1\\) or \\(+1\\), so the result is exactly the label of the class on that side of the hyperplane.
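As a small illustration with made-up numbers: take \\(w = (2, -1)\\), \\(b = 0.5\\) and the point \\(x = (1, 3)\\). Then
$$ \langle w,x \rangle + b = 2 \cdot 1 + (-1) \cdot 3 + 0.5 = -0.5, \quad sgn(-0.5) = -1, $$
so this point would be assigned to the class labelled \\(-1\\).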
Here is a visualized example to help you understand the different parameters of the hyperplane.
![Margin](./SVM-Margin.png)
Source: <https://commons.wikimedia.org/wiki/File:SVM_margin.png>
Furthermore, the hyperplane should be positioned so that the margin between the two classes is as large as possible. The training points closest to the hyperplane are used as support vectors and determine the best combination of \\(w\\) and \\(b\\). After training, the hyperplane can simply be used as a decision function.
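Written out as an optimisation problem (this is the standard hard-margin formulation, stated here for completeness), finding the best \\(w\\) and \\(b\\) means maximising the margin \\(\frac{2}{||w||}\\) between the two dashed hyperplanes, which is equivalent to
$$ \min_{w,b} \frac{1}{2} ||w||^2 \quad \text{subject to} \quad y_i(\langle w,x_i \rangle + b) \geq 1, \quad i = 1,...,m $$
The support vectors are exactly the training points for which this constraint holds with equality.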
%% Cell type:markdown id:07dc56ff-b066-46b0-9253-7654ac05bf88 tags:
# Support Vector Machines
## Imports
We use matplotlib's pyplot to display our data, the sklearn iris dataset for our plant data, SVC as the support vector machine implementation, StandardScaler for the normalisation and numpy for the numerical helpers.
%% Cell type:code id:c977c538-55f2-4330-a931-3237b7d28778 tags:
``` python
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
import numpy as np
```
%% Cell type:markdown id:bb64e863-3b40-4d94-bd12-e9ca47bc92be tags:
## Plot functions
Here the plot for the output is created. The training data is displayed in different colours depending on the class. In addition, the decision boundary and its margin are drawn into the plot. In the end, this boundary should separate the two classes.
%% Cell type:code id:496cfa1b-16d7-4010-8b04-f0d7c6740cdc tags:
``` python
def plot_svc_decision_boundary(linear_svm, xmin, xmax):
    w = linear_svm.coef_[0]
    b = linear_svm.intercept_[0]
    # At the decision boundary, w0*x0 + w1*x1 + b = 0
    # => x1 = -w0/w1 * x0 - b/w1
    x0 = np.linspace(xmin, xmax, 200)
    decision_boundary = -w[0]/w[1] * x0 - b/w[1]
    # The margin edges ("gutters") satisfy w0*x0 + w1*x1 + b = +-1,
    # so they are shifted by 1/w1 along the x1 axis
    margin = 1/w[1]
    gutter_up = decision_boundary + margin
    gutter_down = decision_boundary - margin
    # Highlight the support vectors
    support_vectors = linear_svm.support_vectors_
    plt.scatter(support_vectors[:, 0], support_vectors[:, 1], s=90, facecolors='#FFAAAA')
    plt.plot(x0, decision_boundary, "k-", linewidth=2)
    plt.plot(x0, gutter_up, "k--", linewidth=2)
    plt.plot(x0, gutter_down, "k--", linewidth=2)

def create_plot(linear_svm, training_data, training_target_data, new_irises, new_predicted_groups):
    plt.figure(figsize=(8,8))
    # Plot the training points with different colors per class
    plt.plot(training_data[:, 0][training_target_data==1], training_data[:, 1][training_target_data==1], "co")
    plt.plot(training_data[:, 0][training_target_data==0], training_data[:, 1][training_target_data==0], "mo")
    # Plot the predicted irises (optional)
    #plt.plot(new_irises[:, 0][new_predicted_groups==1], new_irises[:, 1][new_predicted_groups==1], "bD")
    #plt.plot(new_irises[:, 0][new_predicted_groups==0], new_irises[:, 1][new_predicted_groups==0], "rD")
    plot_svc_decision_boundary(linear_svm, -2, 2)
    plt.xlabel("Petal Length normalized", fontsize=12)
    plt.ylabel("Petal Width normalized", fontsize=12)
    plt.title("SVM", fontsize=16)
    plt.axis([-2, 2, -2, 2])
```
%% Cell type:markdown id:ef63ede0-358d-4a6d-9f1c-12358a73af14 tags:
## Support Vector Machine
Here the actual support vector machine from the sklearn library is used. This typically happens in five steps:
1. Data import. In this case we import the iris dataset and use only the petal length and width.
2. Data preparation. Here the data is normalised, scaled or filtered; in our case we only need the normalisation.
3. Data splitting. The data is divided into features and targets (data, target) and, in general, into training and evaluation sets.
4. Model training. The training data is used to train the support vector machine.
5. Evaluation. The evaluation data is used to measure the accuracy of the model. If the accuracy is not yet sufficient, we jump back to step 4.
In this case we use a linear support vector machine with C = 1. The parameter C controls how strongly misclassified training points are penalised: a large C narrows the margin around the separating hyperplane and lets the model fit the training data more tightly, while a small C widens the margin but accepts more misclassifications. Points of different classes that lie close together can therefore still produce errors. A small comparison of different C values is sketched after the code cell below.
%% Cell type:code id:15e90cc0-ecd0-455a-94e3-04fc632fee6c tags:
``` python
# Loading iris dataset [petal length, petal width]
iris_dataset = datasets.load_iris()
iris_training_data = iris_dataset["data"][:, (2, 3)]
# Scaling learning data
scaler = StandardScaler()
iris_training_data = scaler.fit_transform(iris_training_data)
# Getting the targets for the learning data, which say to which group every datapoint belongs
iris_training_target = iris_dataset["target"]
# Filtering for setosa and versicolor, because the dataset has 3 types and we only need two
setosa_or_versicolor = (iris_training_target == 0) | (iris_training_target == 1)
iris_training_data = iris_training_data[setosa_or_versicolor]
iris_training_target = iris_training_target[setosa_or_versicolor]
# Loading the Support Vector Machine with a linear kernel
linear_svm = SVC(kernel="linear", C=1)
# train model
linear_svm.fit(iris_training_data, iris_training_target)
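# Step 5 (evaluation) sketch: score() returns the mean accuracy; here it is checked
# on the training data itself, since no separate evaluation set was split off
print("Training accuracy:", linear_svm.score(iris_training_data, iris_training_target))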
# New iris data which the now trained model should classify
new_irises = [[1.7, 0.3],
              [0.7, 0.8],
              [2.9, 2.2],
              [1.5, 2.5],
              [0.6, 1.7],
              [0.5, 1.3],
              [1.4, 2.4]]
# Scale the new data with the scaler already fitted on the training data
new_irises = scaler.transform(new_irises)
# Predict groups 0 or 1
new_predicted_groups = linear_svm.predict(new_irises)
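# Show the predicted classes, since the plotting of the new irises is commented out in create_plot
print("Predicted classes:", new_predicted_groups)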
# Build output plot
create_plot(linear_svm, iris_training_data, iris_training_target, new_irises, new_predicted_groups)
```
%% Output
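To make the effect of C more tangible, here is a minimal sketch (it is not part of the original notebook and simply reuses the names defined in the cells above) that refits the same data with a very small and a fairly large C and draws both results with create_plot:
``` python
# Minimal sketch, reusing SVC, create_plot and the iris variables defined above:
# a small C gives a wide, tolerant margin, a large C a narrow, strict one.
for c_value in (0.01, 100):
    svm_variant = SVC(kernel="linear", C=c_value)
    svm_variant.fit(iris_training_data, iris_training_target)
    create_plot(svm_variant, iris_training_data, iris_training_target,
                new_irises, new_predicted_groups)
    plt.title(f"SVM with C = {c_value}", fontsize=16)
    plt.show()
```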
%% Cell type:markdown id:c2393a3f-ad79-4d54-863b-0e1934925934 tags: