Commit ce08461e authored by Daniel Müller

Merge branch 'kd_dataset_description' into 'main'

Described missing Datasets and fixed wrong number in exercise_deep_learning

See merge request !131
parents 1f2b5098 5c3a10c9
Pipeline #94240 passed with stage
in 24 seconds
# Context
Miles-per-gallon (mpg) performance of various cars. The data consists of the cars' technical specifications.
Origin: This dataset was taken from the StatLib library which is
maintained at Carnegie Mellon University. The dataset was
used in the 1983 American Statistical Association Exposition.
(c) Date: July 7, 1993
# Content
The dataset has 398 entries and 9 attributes.
This file contains the basic information (mpg, cylinders, displacement, horsepower, weight, acceleration, model year, origin, car name) about the cars. Be careful, there are 6 invalid values in the 'horsepower' column.
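One possible way to deal with the invalid `horsepower` entries is sketched below. It assumes the invalid values are stored as a non-numeric placeholder such as `?` (an assumption, since this README does not say how they are encoded) and uses a tiny made-up inline sample instead of the real file.

```python
import io

import pandas as pd

# Made-up sample rows for illustration only. The "?" placeholder is an
# assumption about how invalid horsepower values are encoded.
sample_csv = io.StringIO(
    "mpg,horsepower,car name\n"
    "18,130,hypothetical car a\n"
    "25,?,hypothetical car b\n"
    "15,165,hypothetical car c\n"
)
cars = pd.read_csv(sample_csv)

# Convert the column to numbers; anything non-numeric becomes NaN.
cars["horsepower"] = pd.to_numeric(cars["horsepower"], errors="coerce")

# Count the invalid entries and drop the affected rows.
n_invalid = int(cars["horsepower"].isna().sum())
clean = cars.dropna(subset=["horsepower"])
print(n_invalid, len(clean))
```

Dropping the rows is only one option; filling the gaps (for example with the column median) would keep all 398 entries.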
# Example Entries
|mpg|cylinders|displacement|horsepower|weight|acceleration|model year|origin|car name|
|----|----|----|----|----|----|----|----|----|
|18|8|307|130|3504|12|70|1|chevrolet chevelle malibu|
|15|8|350|165|3693|11.5|70|1|buick skylark 320|
|18|8|318|150|3436|11|70|1|plymouth satellite|
|16|8|304|150|3433|12|70|1|amc rebel sst|
|17|8|302|140|3449|10.5|70|1|ford torino|
# Credit
Dataset acquired from [Kaggle](https://www.kaggle.com/uciml/autompg-dataset)
https://www.kaggle.com/shubhampundir/autompg-dataset
\ No newline at end of file
# Context
This dataset holds example images (28x28 pixels) of handwritten letters.
# Content
The dataset has 14999 training and 3999 test images with labels.
# Credit
## Test Data
Dataset acquired from [Kaggle](https://www.kaggle.com/crawford/emnist?select=emnist-letters-test.csv)
## Train Data
Dataset acquired from [Kaggle](https://www.kaggle.com/crawford/emnist/version/3?select=emnist-letters-train.csv)
## Test Data
https://www.kaggle.com/crawford/emnist?select=emnist-letters-test.csv
## Train Data
https://www.kaggle.com/crawford/emnist/version/3?select=emnist-letters-train.csv
\ No newline at end of file
......@@ -41,7 +41,7 @@
"metadata": {},
"source": [
"## Load Test and Trainig Data\n",
-    "We reduced the training dataset to **15000** and the test dataset to **4000** entries. Otherwise the nodebook will fail because of RAM issues (especially for the **Raspberry PI 3**) \n",
+    "We reduced the training dataset to **14999** and the test dataset to **3999** entries. Otherwise the nodebook will fail because of RAM issues (especially for the **Raspberry PI 3**) \n",
"Unfortunately it will reduce the accuracy of the model"
]
},
......
%% Cell type:markdown id:97b380db-e5c4-47fc-9239-723ef2b96c89 tags:
# Deep Learning
Your task is to build a deep neural network with Dense layers to classify the letters from the EMNIST dataset.
%% Cell type:markdown id:22cfd362-a197-4d6b-97b4-7ebcd09eb091 tags:
## Imports
%% Cell type:code id:8604e125-06c3-4710-b812-029499e87d21 tags:
``` python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
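# Import Keras through TensorFlow: keras.utils.np_utils has been removed from recent Keras releases
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Lambda
from tensorflow.keras.utils import to_categorical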
from mpl_toolkits.axes_grid1 import ImageGrid
```
%% Cell type:markdown id:ddce98e9-e324-4a64-93ac-d76285673b27 tags:
## Load Test and Training Data
We reduced the training dataset to **14999** and the test dataset to **3999** entries. Otherwise the notebook will fail because of RAM issues (especially on the **Raspberry Pi 3**).
Unfortunately, this will reduce the accuracy of the model.
%% Cell type:code id:2f607d91-ca98-4204-9d09-d02504c327b4 tags:
``` python
train = pd.read_csv('../data/Letters/emnist-train.csv.gz')
test = pd.read_csv('../data/Letters/emnist-test.csv.gz') # load data into Test and Training Data
```
%% Cell type:markdown id:1505a3f0-3e03-4e2e-8bfe-d4df14990378 tags:
## Split into image and label
%% Cell type:code id:2c08d87c-fed0-4036-9460-daca9602a888 tags:
``` python
# code here
```
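A minimal sketch of one way to do the split, shown on a small stand-in DataFrame rather than the real `train` data. It assumes the label sits in the first column and the 784 pixel values follow; that layout is an assumption and should be checked against the actual CSV.

```python
import numpy as np
import pandas as pd

# Stand-in for `train`: 3 rows, 1 label column followed by 784 pixel columns.
# Assumption: the label is the first column of the CSV.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(3, 784))
demo = pd.DataFrame(np.column_stack([[1, 2, 3], pixels]))

# First column -> labels, remaining columns -> flattened images.
labels = demo.iloc[:, 0].to_numpy()
images = demo.iloc[:, 1:].to_numpy()
print(labels.shape, images.shape)
```

The same two `iloc` lines would apply to both `train` and `test` if the assumed column layout holds.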
%% Cell type:markdown id:819d75a2-8eca-4d3b-8792-e075dccac862 tags:
## Show the first 9 images
%% Cell type:code id:931516c6-bc6c-4245-8dfe-4a0e52a14c9e tags:
``` python
# code here
```
%% Cell type:markdown id:04a3c6bb-68b6-468c-8643-0c1d92b4b362 tags:
## Put image into a single vector
%% Cell type:code id:abb287f2-0f5c-4717-8986-af92a1db0e98 tags:
``` python
# code here
```
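Flattening can be sketched with a plain NumPy reshape. The example below uses a made-up batch of 28x28 arrays; the variable names are illustrative and not taken from the notebook.

```python
import numpy as np

# Stand-in batch: 5 "images" of 28x28 pixels (values are arbitrary).
batch = np.arange(5 * 28 * 28).reshape(5, 28, 28)

# Keep the first axis (one row per image) and merge the rest into 784 values.
flat = batch.reshape(len(batch), -1)
print(flat.shape)
```

Each row of `flat` is one image as a single vector, which is the input shape a Dense layer expects.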
%% Cell type:markdown id:f7673eb1-a8bd-41e0-9e21-c1a72d2cf971 tags:
## Function to normalize pixel values
%% Cell type:code id:a6fbc91d-9127-4c61-8f59-0c3ced262d5b tags:
``` python
def preprocess_image(image): # input for this method needs to be an image
    return image / np.float32(255.0) # divide each pixel by 255 and return the new image
```
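A short usage sketch for the normalization function, repeated here so the example runs on its own. The sample pixel values are made up.

```python
import numpy as np

def preprocess_image(image):
    # Scale pixel values from the 0..255 range down to 0..1.
    return image / np.float32(255.0)

# Three sample pixel values: black, mid-grey, white.
pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = preprocess_image(pixels)
print(scaled.min(), scaled.max())
```

After scaling, every value lies between 0 and 1, which usually helps the network train more stably.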
%% Cell type:markdown id:788d6b65-341a-4fd0-8778-49554e12a7f6 tags:
## How many classes do we have?
%% Cell type:code id:76d7a70f-35c0-4ccb-a834-52371b23fb7d tags:
``` python
# code here
```
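One way to count the classes is to count the distinct label values. The sketch below uses a small made-up label array, not the real data.

```python
import numpy as np

# Made-up labels for illustration only.
labels = np.array([1, 3, 3, 2, 1, 26])

# np.unique returns the sorted distinct values; its length is the class count.
classes = np.unique(labels)
num_classes = len(classes)
print(classes, num_classes)
```

On the reduced dataset it is worth confirming that every expected letter actually occurs, since a class missing from the sample would not be counted.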
%% Cell type:markdown id:e9ddaa85-9695-4368-9d2f-30526d195f93 tags:
## Categorize Dataset
%% Cell type:code id:8531ff4c-4039-4bf5-ae48-752ea954a1b1 tags:
``` python
# code here
# use to_categorical()
```
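The idea behind one-hot encoding can be sketched in plain NumPy. This is a stand-in that shows the same concept, not the Keras `to_categorical()` function the exercise asks for, and the labels are made up.

```python
import numpy as np

# Made-up integer labels in the range 0..num_classes-1.
labels = np.array([0, 2, 1, 2])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels picks one row per sample.
one_hot = np.eye(num_classes, dtype=np.float32)[labels]
print(one_hot)
```

If the real labels start at 1 rather than 0 (to be checked against the data), either subtract 1 first or allow for one extra, unused column.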
%% Cell type:markdown id:c96e71f0-ed34-47ad-a93e-0e1a18cfb567 tags:
## Model for the neural network
%% Cell type:code id:41142012-e981-4444-a77b-583724b6d254 tags:
``` python
# model_One = Sequential() # A Sequential model is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor.
# model_One.add(Lambda(preprocess_image)) # in this first Layer we normalize the image
# ...
```
%% Cell type:markdown id:d5ac1df1-5259-458e-beaa-796a6db973da tags:
## Plot Data
%% Cell type:code id:add7c2e8-e1ab-4acd-8b02-526090f8d7f7 tags:
``` python
# code here
```
%% Cell type:markdown id:340efb59-8119-4f1d-bdfc-5623abc3b7bf tags:
## Evaluate with test data
%% Cell type:code id:bd626a2b-dfc4-402b-8e7c-ca8aa40143cd tags:
``` python
# Use model_One.evaluate()
# print result
#print("Training accuracy: " + str(np.round(train_accuracy1 * 100, 2)) + "%") # print the train accuracy in percent
#print("Test accuracy: " + str(np.round(test_accuracy1 * 100, 2)) + "%") # print the test accuracy in percent
```
%% Cell type:code id:b0b53870-fa56-4c42-a9de-e406fb0bc07d tags:
``` python
```
......