# Autoencoder
    
The autoencoder is a versatile artificial neural network with multiple variations that can be 
utilized for [unsupervised learning](../Neural-Networks.md#unsupervised). Depending on the 
specific variant, it can be used for tasks ranging from anomaly detection and 
[feature extraction](https://en.wikipedia.org/wiki/Feature_extraction) to image denoising and 
even image generation.
    
This is achieved by compressing the input data into a smaller intermediate representation 
and then decompressing it again to produce the output.
    
## Structure
    
The basic structure of the autoencoder consists of three parts: the encoder, the decoder, and a 
bottleneck in the middle.
    
    
![Basic autoencoder schema](./Autoencoder_schema.png)

Source: <https://en.wikipedia.org/wiki/File:Autoencoder_schema.png>, retrieved 28.12.2021
    
### Encoder
    
The encoder compresses the input into the latent space representation in the middle. This is done 
by using one or more [convolutional and pooling layers](../Convolutional-Neural-Networks/Convolutional-Neural-Networks.md) 
in series, with each consecutive layer producing a smaller output than the previous one. The output 
of the last encoding layer is generally much smaller than the input, and since this encoding 
process is lossy, the original data cannot be perfectly reconstructed from this compressed 
representation.
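
The text does not prescribe a framework, but as a minimal sketch of such an encoder (assuming 
PyTorch, with illustrative layer sizes for 28×28 grayscale images), the series of convolutional 
and pooling layers could look like this:

```python
import torch
import torch.nn as nn

# Minimal encoder sketch (PyTorch assumed; all layer sizes are illustrative).
# Each pooling stage halves the spatial resolution, so every consecutive
# stage produces a smaller output than the previous one.
encoder = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28  -> 16x28x28
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 16x28x28 -> 16x14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 16x14x14 -> 32x14x14
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 32x14x14 -> 32x7x7
)

x = torch.randn(1, 1, 28, 28)  # dummy batch with one grayscale image
z = encoder(x)
print(z.shape)                 # torch.Size([1, 32, 7, 7])
```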
    
    
### Bottleneck
    
The output of the last encoding layer is the smallest representation of the data inside the network 
and creates a bottleneck that restricts how much information can pass from the encoder to the 
decoder. This is used to restrict the information flow to only the parts that are important for a 
given use case. In the case of a denoising autoencoder, for example, the bottleneck should filter 
out the noise.

A smaller bottleneck lowers the risk of [overfitting](../../Glossary.md#overfitting), 
since it cannot hold enough information relative to the input size to effectively memorize 
specific inputs. However, the smaller the bottleneck, the greater the risk of losing important 
information.
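
Continuing the sketch from the encoder above (PyTorch again assumed, and a latent size of 8 chosen 
arbitrarily for illustration, not taken from the text), the bottleneck can be made explicit as a 
small fully connected layer:

```python
import torch.nn as nn

# The bottleneck squeezes the 32x7x7 encoder output (1568 values) into a
# small latent vector. The latent size of 8 is an arbitrary example choice:
# smaller values restrict the information flow more strongly.
bottleneck = nn.Sequential(
    nn.Flatten(),              # 32x7x7 -> 1568
    nn.Linear(32 * 7 * 7, 8),  # 1568   -> 8 latent values
)
```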
    
    
### Decoder
    
The last stage is the decoder, which takes the compressed representation and tries to decompress 
it. In the simplest case, the goal is just to reconstruct the original image from the compressed 
form as accurately as possible. Since the bottleneck restricts how much information can pass 
through, the reconstruction won't be perfect but only an approximation. In a more interesting 
example, the denoising autoencoder, the decoder should reconstruct the input image but remove the 
noise from it in the process.
    
    
This step is generally performed by a Deconvolutional Network. As the name suggests, this is quite 
similar to the [Convolutional Network](../Convolutional-Neural-Networks/Convolutional-Neural-Networks.md) 
used in the encoding step, but in reverse. Where the Convolutional Network takes a large amount of 
input data and reduces it to a much smaller representation in order to isolate certain bits of 
information, the Deconvolutional Network maps a small representation onto a much larger one. This 
enables the generation of data from a given set of isolated features, such as the compressed 
representation created by the encoder.
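
As a matching sketch (with PyTorch's `ConvTranspose2d` standing in for the deconvolutional layers, 
and sizes that mirror the illustrative encoder and bottleneck above), the decoder reverses the 
mapping from the latent vector back to the image:

```python
import torch.nn as nn

# Minimal deconvolutional decoder sketch: maps the small latent vector back
# onto the much larger image, the reverse of the encoder/bottleneck above.
decoder = nn.Sequential(
    nn.Linear(8, 32 * 7 * 7),                             # 8 -> 1568
    nn.Unflatten(1, (32, 7, 7)),                          # 1568 -> 32x7x7
    nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2),  # 32x7x7  -> 16x14x14
    nn.ReLU(),
    nn.ConvTranspose2d(16, 1, kernel_size=2, stride=2),   # 16x14x14 -> 1x28x28
    nn.Sigmoid(),  # squash pixel values into [0, 1]
)
```

Chaining encoder, bottleneck, and decoder and training the whole network to minimize a 
reconstruction loss (for example the mean squared error between input and output, or between a 
clean target and the reconstruction of a noisy input in the denoising case) yields the complete 
autoencoder.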
    
## References

{{#include ../../References.md:AUTOENCODER}}

*Written by Daniel Müller*