# Autoencoder
    
The autoencoder is a versatile artificial neural network with multiple variations that can be 
utilized for [unsupervised learning](../Neural-Networks.md#unsupervised). Depending on the 
specific variant, it can be used for tasks ranging from anomaly detection and 
[feature extraction](https://en.wikipedia.org/wiki/Feature_extraction) to image denoising and 
even image generation.
    
This is achieved by compressing the input data into a smaller intermediate representation 
and then decompressing it again to produce the output.
    
## Structure
    
The basic structure of the autoencoder consists of three parts: the encoder, the decoder, and a 
bottleneck in the middle.
    
    
![Basic autoencoder schema](./Autoencoder_schema.png)

Source: <https://en.wikipedia.org/wiki/File:Autoencoder_schema.png>, retrieved 28.12.2021
    
### Encoder
    
The encoder compresses the input into the latent space representation in the middle. This is done 
by using one or more [convolutional and pooling layers](../Convolutional-Neural-Networks/Convolutional-Neural-Networks.md) 
in series, with each consecutive layer producing a smaller output than the previous one. The output 
of the last encoding layer is generally much smaller than the input, and since this encoding 
process is lossy, the original data cannot be perfectly reconstructed from this compressed 
representation.
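
The text does not prescribe a framework, but as a minimal sketch of such an encoder (assuming 
PyTorch, with illustrative layer sizes for 28×28 grayscale images), the series of convolutional 
and pooling layers could look like this:

```python
import torch
import torch.nn as nn

# Minimal encoder sketch (PyTorch assumed; all layer sizes are illustrative).
# Each pooling stage halves the spatial resolution, so every consecutive
# stage produces a smaller output than the previous one.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28  -> 16x28x28
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 16x28x28 -> 16x14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 16x14x14 -> 32x14x14
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 32x14x14 -> 32x7x7
)

x = torch.randn(1, 1, 28, 28)  # dummy batch with one grayscale image
z = encoder(x)
print(z.shape)                 # torch.Size([1, 32, 7, 7])
```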
    
    
### Bottleneck
    
The output of the last encoding layer is the smallest representation of the data inside the network 
and creates a bottleneck that restricts how much information can pass from the encoder to the 
decoder. This is used to restrict the information flow to only the parts that are important for a 
given use case. In the case of a denoising autoencoder, for example, the bottleneck should filter 
out the noise.

A smaller bottleneck lowers the risk of [overfitting](../../Glossary.md#overfitting), 
since it cannot hold enough information relative to the input size to effectively memorize 
specific inputs. However, the smaller the bottleneck, the greater the risk of losing important 
information.
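
Continuing the sketch from the encoder above (PyTorch again assumed, and a latent size of 8 chosen 
arbitrarily for illustration, not taken from the text), the bottleneck can be made explicit as a 
small fully connected layer:

```python
import torch.nn as nn

# The bottleneck squeezes the 32x7x7 encoder output (1568 values) into a
# small latent vector. The latent size of 8 is an arbitrary example choice:
# smaller values restrict the information flow more strongly.
bottleneck = nn.Sequential(
    nn.Flatten(),              # 32x7x7 -> 1568
    nn.Linear(32 * 7 * 7, 8),  # 1568   -> 8 latent values
)
```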
    
    
### Decoder
    
The last stage is the decoder, which takes the compressed representation and tries to decompress 
it. In the simplest case, the goal is just to reconstruct the original image from the compressed 
form as accurately as possible. Since the bottleneck restricts how much information can pass 
through, the reconstruction won't be perfect but only an approximation. In a more interesting 
example, the denoising autoencoder, the decoder should reconstruct the input image but remove the 
noise from it in the process.
    
    
This step is generally performed by a Deconvolutional Network. As the name suggests, this is quite 
similar to the [Convolutional Network](../Convolutional-Neural-Networks/Convolutional-Neural-Networks.md) 
used in the encoding step, but in reverse. Where the Convolutional Network takes a large amount of 
input data and reduces it to a much smaller representation in order to isolate certain bits of 
information, the Deconvolutional Network maps a small representation onto a much larger one. This 
enables the generation of data from a given set of isolated features, such as the compressed 
representation created by the encoder.
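
As a matching sketch (with PyTorch's `ConvTranspose2d` standing in for the deconvolutional layers, 
and sizes that mirror the illustrative encoder and bottleneck above), the decoder reverses the 
mapping from the latent vector back to the image:

```python
import torch.nn as nn

# Minimal deconvolutional decoder sketch: maps the small latent vector back
# onto the much larger image, the reverse of the encoder/bottleneck above.
decoder = nn.Sequential(
    nn.Linear(8, 32 * 7 * 7),                             # 8 -> 1568
    nn.Unflatten(1, (32, 7, 7)),                          # 1568 -> 32x7x7
    nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2),  # 32x7x7  -> 16x14x14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=2, stride=2),   # 16x14x14 -> 1x28x28
    nn.Sigmoid(),  # squash pixel values into [0, 1]
)
```

Chaining encoder, bottleneck, and decoder and training the whole network to minimize a 
reconstruction loss (for example the mean squared error between input and output, or between a 
clean target and the reconstruction of a noisy input in the denoising case) yields the complete 
autoencoder.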
    
## References

{{#include ../../References.md:AUTOENCODER}}

*Written by Daniel Müller*