SBWL 1: Data Processing 1 (PI2.0)

Winter Term 2018/19
Aniko Hannak, Axel Polleres, Stefan Sobernig


Table of contents

Schedule
Organisational
Unit details
Jupyter Notebook
Supplemental Reading

Syllabus

In this course, students gain fundamental knowledge of working with different data formats and of using methods and tools to integrate data from various sources.

Schedule

Unit  Date                          Room      Topic
1     Tue 02.10.2018 14:00 – 18:00  TC.3.12   Course introduction
2     Tue 09.10.2018 14:00 – 18:00  EA.6.032  Data access
3     Tue 16.10.2018 10:00 – 14:00  D1.1.074  Data processing (basics)
4     Tue 23.10.2018 14:00 – 18:00  TC.3.05   Data processing (cont'd)
5     Tue 30.10.2018 10:00 – 14:00  D1.1.074  Advanced topics (pandas, visualisation)
6     Tue 06.11.2018 13:00 – 17:00  D4.0.022  Data storage
7     Tue 27.11.2018 10:00 – 14:00  D4.0.022  Project presentation

Organisational

Instructor(s)

Aniko Hannak

aniko.hannak@wu.ac.at

Axel Polleres

axel.polleres@wu.ac.at

Stefan Sobernig

stefan.sobernig@wu.ac.at

Konstantin Kueffner (Tutor)

konstantin.kueffner@wu.ac.at

Grading

See the authoritative details at Learn@WU.


Course Material

Unit details

Unit 1: Course Overview & Introduction

Slides: This unit is also available in a PDF format and as a single HTML Page

Readings:

Notebook of Unit 1

Unit 1: Homework

Task:

Details: Assignment 1 on Learn@WU

Submission: Via Assignment 1 on Learn@WU, until Tue, October 16 2018, 23:59.

Unit 2: Data access, formats, & encoding

Slides: This unit is also available in a PDF format and as a single HTML Page

Readings:

Notebook of Unit 2

Unit 3: Data cleaning and preparation (Basics)

Slides: This unit is also available in a PDF format and as a single HTML Page

Notebook of Unit 3

Unit 2: Homework

Task:

* number of data items
* data columns, data rows (if applicable)
* nesting level (if applicable)
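The characteristics listed above can be computed with the Python standard library. The following is only a minimal sketch, not the expected solution: the helper names (`nesting_level`, `describe_json`, `describe_csv`), the sample data, and the reading of "data items" as top-level entries of a JSON document are assumptions made for illustration.

```python
import csv
import io
import json


def nesting_level(value):
    """Return the depth of nested dicts/lists; a scalar has depth 0."""
    if isinstance(value, dict):
        children = list(value.values())
    elif isinstance(value, list):
        children = value
    else:
        return 0
    # An empty container still counts as one level.
    return 1 + max((nesting_level(child) for child in children), default=0)


def describe_json(text):
    """Count top-level items and the nesting level of a JSON document."""
    data = json.loads(text)
    items = len(data) if isinstance(data, (list, dict)) else 1
    return {"items": items, "nesting_level": nesting_level(data)}


def describe_csv(text):
    """Count columns and data rows of CSV text, assuming a header line."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return {"columns": 0, "rows": 0}
    return {"columns": len(rows[0]), "rows": len(rows) - 1}


# Hypothetical sample data, not taken from the assignment.
json_sample = '[{"name": "a", "tags": ["x", "y"]}, {"name": "b", "tags": []}]'
csv_sample = "id,name\n1,a\n2,b\n"

json_info = describe_json(json_sample)
csv_info = describe_csv(csv_sample)
print(json_info)
print(csv_info)
```

Other definitions of "data item" (for example, counting every leaf value) are equally plausible; the authoritative task description is the one on Learn@WU.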

Details: Assignment 2 on Learn@WU

Submission: Via Assignment 2 on Learn@WU, until Fri, October 26 2018, 23:59.

Unit 4: Data cleaning and preparation (Cont'd)

Slides: This unit is also available in a PDF format and as a single HTML Page

Notebooks of Unit 4

Unit 5: Data storage & Persistence

Connecting to a database system and loading data into and from it (vs. storing to and loading from a file)
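The contrast between file-based and database-backed storage can be illustrated with a short sketch. It is not taken from the course notebooks: the records, the table name `cities`, and the helper functions are hypothetical, and SQLite (via Python's built-in `sqlite3` module) merely stands in for a database system.

```python
import csv
import os
import sqlite3
import tempfile

# Hypothetical sample records, used only for illustration.
rows = [("Vienna", 1897000), ("Graz", 289000), ("Linz", 205000)]


def save_to_csv(path, records):
    """Write the records to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["city", "population"])
        writer.writerows(records)


def load_from_csv(path):
    """Read the whole file back; types must be restored by hand."""
    with open(path, newline="", encoding="utf-8") as f:
        return [(r["city"], int(r["population"])) for r in csv.DictReader(f)]


def save_to_db(conn, records):
    """Create the table if needed and insert the records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cities "
        "(city TEXT PRIMARY KEY, population INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO cities VALUES (?, ?)", records)
    conn.commit()


def load_from_db(conn, min_population=0):
    """Let the database filter and sort instead of loading everything."""
    cur = conn.execute(
        "SELECT city, population FROM cities "
        "WHERE population >= ? ORDER BY city",
        (min_population,),
    )
    return cur.fetchall()


# File round trip.
csv_path = os.path.join(tempfile.mkdtemp(), "cities.csv")
save_to_csv(csv_path, rows)
from_file = load_from_csv(csv_path)

# Database round trip (in-memory SQLite database).
conn = sqlite3.connect(":memory:")
save_to_db(conn, rows)
from_db = load_from_db(conn)
large = load_from_db(conn, min_population=250000)
```

With a file, the program reads everything and filters in Python; with a database, the query itself selects and orders the rows, and column types are kept by the schema.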

Slides: This unit is also available in a PDF format and as a single HTML Page

Readings:

Notebook of Unit 5

Unit 6: Advanced topics

Slides: This unit is also available in a PDF format and as a single HTML Page

Readings:

Notebooks of Unit 6

Jupyter Notebook

The theoretical part of the course is accompanied by practical code examples and hands-on exercises using the interactive Python environment Jupyter.

Supplemental Reading

Coding