Commit e5c27207 authored by Lennart Eichhorn

Merge branch 'master' of code.fbi.h-da.de:istlteich/psd-outbreak-modeling

parents 94935dee 67b8cff0
Pipeline #48950 passed
......@@ -28,6 +28,9 @@ for general information about COVID-19 spreading.
* 6.2 [Parameters](https://code.fbi.h-da.de/istlteich/psd-outbreak-modeling#62-parameters)
* 6.3 [Metrics](https://code.fbi.h-da.de/istlteich/psd-outbreak-modeling#63-metrics)
* 6.4 [Re-run the application](https://code.fbi.h-da.de/istlteich/psd-outbreak-modeling#64-re-run-the-application)
7. [Datasets](https://code.fbi.h-da.de/istlteich/psd-outbreak-modeling#7-dataset)
* 7.1 [SHL dataset](https://code.fbi.h-da.de/istlteich/psd-outbreak-modeling#71-The-SHL-dataset)
* 7.2 [geolife dataset](https://code.fbi.h-da.de/istlteich/psd-outbreak-modeling#72-The-geolife-dataset)
# 1 Overview
There are two simulators
......@@ -501,3 +504,79 @@ Metrics which are displayed and can be downloaded as a csv file.
## 6.4 Re-run the application
To get a new run with default parameters, reload the page. To keep the current parameter settings, press the re-run button at the end of the simulation.
The simulation also automatically restarts when any parameter gets changed.
# 7 Datasets
We originally planned to base the simulation on real data extracted from the
[Sussex-Huawei Locomotion](http://www.shl-dataset.org/) and the
[geolife](https://www.microsoft.com/en-us/research/publication/geolife-gps-trajectory-dataset-user-guide/) datasets,
but we ran into various problems when trying to apply a real-world dataset to an abstract simulation like ours.
Although the datasets are not used in the project, the scripts to download and process the data
are still included.
## 7.1 The SHL dataset
The SHL dataset is a versatile annotated dataset for multimodal locomotion analytics of mobile users.
It contains 750 hours of labeled locomotion data from three users. Each user had four
phones attached to them over a period of 7 months. Only the data for one phone
from one user and a preview of the other users are publicly available.
### Downloading
To download and extract the SHL dataset for user one, go to the `dataset` directory and run
`make download-shl-userone`.
This will create the directory `data/SHLDataset_User1Hips_v1` containing the download.
You need at least 120 GB of free space to download this dataset.
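
Because the download is large, it can help to check the available disk space before starting it. The following is a minimal sketch, not part of the project's scripts; the helper name `has_free_space` is hypothetical, and it assumes that "GB" means 10^9 bytes.

```python
import shutil


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free.

    Assumption: 1 GB is counted as 10**9 bytes.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    # 120 GB is the requirement stated above for the SHL user one download.
    if has_free_space(".", 120):
        print("Enough free space for the SHL download.")
    else:
        print("Less than 120 GB free; the SHL download would likely fail.")
```

Run it from the `dataset` directory so that `.` refers to the filesystem the data will be written to.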
### Processing
You can create a summary of the data for each day by running `make process-shl-userone`.
This will create a file called `merged.csv` in each of the directories containing the data
for a single day.
## 7.2 The geolife dataset
The geolife GPS trajectory dataset consists of movement data collected from 178 users
over a period of four years. The dataset contains 17,621 trajectories with a total
distance of 1,251,654 kilometers and a total duration of 48,203 hours. The data was
collected with various GPS trackers and phones at various sampling rates. Some of it is
labeled with the mode of transport / activity.
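
To give an idea of what the raw trajectory data looks like, here is a minimal sketch of parsing a single record. It assumes the PLT layout described in the dataset's user guide (latitude, longitude, an unused field, altitude in feet, a day count, date, time); the `TrackPoint` class and `parse_plt_line` function are hypothetical names and not part of the project's processing scripts, and the sample line is only an illustration of that layout.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TrackPoint:
    latitude: float
    longitude: float
    altitude_feet: float
    time: datetime


def parse_plt_line(line: str) -> TrackPoint:
    """Parse one data line of a PLT trajectory file (assumed 7-field layout)."""
    fields = line.strip().split(",")
    if len(fields) != 7:
        raise ValueError(f"expected 7 comma-separated fields, got {len(fields)}")
    lat, lon, _unused, altitude, _day_count, date_str, time_str = fields
    timestamp = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
    return TrackPoint(float(lat), float(lon), float(altitude), timestamp)


if __name__ == "__main__":
    # Illustrative line in the assumed layout.
    sample = "39.984702,116.318417,0,492,39744.1201851852,2008-10-23,02:53:04"
    print(parse_plt_line(sample))
```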
### Downloading
To download and extract the geolife dataset, go to the `dataset` directory and run
`make download-geolife`.
This will create the directory `data/SHLDataset_User1Hips_v1` containing the
extracted geolife dataset.
You need at least 3 GB of space to download this dataset.
### Processing
You can load the geolife dataset into a MariaDB/MySQL database by running `make process-geolife`.
This will start a MariaDB instance with Docker and create a geolife database containing the dataset.
### Using the dataset
After processing the dataset, you can start a MariaDB instance with phpMyAdmin to
view the dataset by running `docker-compose up -d` in the `dataset` directory.
You can access phpMyAdmin at `http://localhost:8081`.
### Database structure
The generated database contains three tables: `user`, `label` and `location`.
#### User table
The user table specifies a user id and whether their data is labeled or not
- id : The id of the user
- labeled : Whether the data for this user is labeled or not / has entries in the label table
#### Label table
The label table specifies which mode of transport a user used in a specific timeframe
- mode : The mode of transport used in this timeframe. Can be walk, bike, bus, car, subway, train, airplane, boat, run, motorcycle or taxi
- userid : The userid of the labeled user
- start : When the user started to use this mode of transport
- end : When the user stopped using this mode of transport
#### Location table
The location table contains the recorded locations for each user
- userid : The user this entry belongs to
- time : When this entry was recorded
- latitude : The latitude of the recorded location
- longitude : The longitude of the recorded location
- altitude : The altitude of the recorded location in feet
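
The three tables can be combined to find the transport mode for each recorded location. The sketch below illustrates this with a join between `location` and `label`. It uses an in-memory SQLite database as a stand-in for the MariaDB instance so that it runs without Docker. The column names follow the descriptions above, but the column types, the demo rows and the helper names (`build_demo_db`, `labeled_locations`) are assumptions made for illustration.

```python
import sqlite3

# Assumption: column types are guesses; only the column names come from the
# table descriptions. "end" is quoted because END is an SQL keyword
# (MariaDB would use backticks instead of double quotes).
SCHEMA = """
CREATE TABLE user (id INTEGER PRIMARY KEY, labeled INTEGER);
CREATE TABLE label (userid INTEGER, mode TEXT, start TEXT, "end" TEXT);
CREATE TABLE location (userid INTEGER, time TEXT,
                       latitude REAL, longitude REAL, altitude REAL);
"""

FEET_TO_METERS = 0.3048  # the altitude column is stored in feet

QUERY = """
SELECT label.mode, location.time, location.latitude, location.longitude,
       location.altitude * ? AS altitude_m
FROM location
JOIN label
  ON label.userid = location.userid
 AND location.time BETWEEN label.start AND label."end"
WHERE location.userid = ?
ORDER BY location.time
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO user VALUES (1, 1)")
    conn.execute(
        "INSERT INTO label VALUES (1, 'walk', "
        "'2008-10-23 02:00:00', '2008-10-23 03:00:00')"
    )
    conn.executemany(
        "INSERT INTO location VALUES (?, ?, ?, ?, ?)",
        [
            (1, "2008-10-23 02:53:04", 39.984702, 116.318417, 492.0),
            (1, "2008-10-23 04:00:00", 39.98, 116.31, 500.0),  # outside the label
        ],
    )
    return conn


def labeled_locations(conn: sqlite3.Connection, userid: int) -> list:
    """Return (mode, time, latitude, longitude, altitude_m) rows for one user."""
    return conn.execute(QUERY, (FEET_TO_METERS, userid)).fetchall()


if __name__ == "__main__":
    for row in labeled_locations(build_demo_db(), 1):
        print(row)
```

Only locations that fall inside a labeled timeframe are returned, so users without entries in the label table yield no rows.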
\ No newline at end of file