# The CSV library

The [csv](https://docs.python.org/3/library/csv.html) module allows you to parse and generate CSV in Python.
It is part of the Python 3 standard library, so no separate installation is needed.

The function

        csv.reader(iterableObject) # iterableObject: any iterable that yields strings, e.g. an open file or a list of strings

returns a reader object that iterates over the lines of the input and yields each row as a list of strings.

1. Load the library

In [1]:
import csv
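
Because `csv.reader` accepts any iterable that yields strings, it can be tried out without a file. The following is a minimal sketch on a made-up list of lines; the course names and rooms are hypothetical and not taken from the dataset used later.

```python
import csv

# Hypothetical sample data: each list element is one line of CSV text
lines = [
    'course,room,start',
    'Data Processing 1,TC.3.01,09:00',
    '"Statistics, Advanced",D4.0.022,11:00',  # quoted field that contains a comma
]

# csv.reader yields one list of strings per input line
rows = list(csv.reader(lines))
for row in rows:
    print(row)
```

Note that the quoted field in the last line is kept as a single value even though it contains a comma.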

## An example for loading CSV from a file

Let's load a CSV file from the file system.

In [2]:
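# loading a CSV file from disk (path relative to the Jupyter notebook)
filePath='./data/allcoursesandevents16w-small.csv'

# newline='' is recommended by the csv documentation so that the reader handles line endings itself
with open(filePath, 'r', newline='') as f:
    csvreader = csv.reader(f)
    for row in csvreader:
        print(row) # each row is a list of strings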

## An example for loading CSV from a URL

Let's load a CSV file from a Web resource.

We load the course events for the Winter Semester 2016, located at the [data.wu.ac.at Open Data portal](http://data.wu.ac.at/dataset/all_course_events_2016w).

In [3]:
# loading a "CSV" file from data.wu.ac.at
import urllib.request
import csv
import codecs # we use this library to properly decode the byte stream of the HTTP response

# URL, copied from the data.wu.ac.at portal
url='http://data.wu.ac.at/portal/dataset/2ad2fec6-170c-4282-bb86-21058ce9a6ee/resource/f8b872ad-78b4-473d-a650-63644e53972a/download/allcoursesandevents17s.csv'

**Careful**: This file contains **several thousand lines**. We do not want to print all of them in the notebook, just the first N.

In [4]:
resp = urllib.request.urlopen(url) # open the connection
# We need to decode the byte stream to strings
csvfile = csv.reader(codecs.iterdecode(resp, 'utf-8')) # csv.reader requires an iterable that yields strings
c=0 # counter for the number of parsed lines
N=10 # MAX number of lines we want to output
for line in csvfile:
    c+=1
    print(line)
    if c>=N:
        break # break exits the loop (see also chapter 4 of the Python 3 tutorial)
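
The manual counter can also be replaced by `itertools.islice`, which stops after a given number of items. The sketch below is only an illustration: it uses an in-memory byte stream (`fake_resp`, a hypothetical stand-in for the HTTP response) with made-up data, so that it runs without network access.

```python
import codecs
import csv
import io
from itertools import islice

# Hypothetical stand-in for an HTTP response: a byte stream that yields one line at a time
raw = 'id,city\n1,Wien\n2,Graz\n3,St. Pölten\n4,Linz\n'.encode('utf-8')
fake_resp = io.BytesIO(raw)

# Decode the bytes lazily, line by line, and hand the strings to csv.reader
reader = csv.reader(codecs.iterdecode(fake_resp, 'utf-8'))

N = 4  # MAX number of rows we want to output
first_rows = list(islice(reader, N))  # reads at most N rows, the rest is never parsed
for row in first_rows:
    print(row)
```

Since both `codecs.iterdecode` and `csv.reader` work lazily, only the rows that are actually requested get decoded and parsed.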

## CSV is not always a "comma-separated values" file

In fact, many "CSV" files are really "character-separated values" files, meaning that the value delimiter is not ",".
Very often ";" is used instead, especially in CSV files exported from MS Excel in German-speaking countries.

## Dealing with non-',' delimiters

Let's look at some tourism statistics from the city of Innsbruck, taken from [data.gv.at](https://www.data.gv.at/katalog/dataset/touribk-1011/resource/a89d9ae3-b855-492d-81e7-bdb2f29605df).

In [5]:
# Tourism, arrivals, and overnight stays in 2010-2011
url="http://www.innsbruck.gv.at/data.cfm?vpath=diverse/ogd/statistik9/tourismus22/tourismusjahr-2010-2011csv"
resp=urllib.request.urlopen(url) # open the connection

# let's have a quick look at what the data looks like
data=resp.read().decode('utf-8') # decode the byte content to a string
c=0 # counter for the number of parsed lines
N=10 # MAX number of lines we want to output

# to iterate over the content line by line, we split the string at the "newline" character
for line in data.split('\n'):
    c+=1
    print(line)
    if c>=N:
        break # break exits the loop (see also chapter 4 of the Python 3 tutorial)

**Notice.**

The file does not appear to use ',' as its delimiter. The values seem to be separated by ';' instead.
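
Instead of inspecting the raw lines by eye, the delimiter can also be guessed with `csv.Sniffer`. The following is a rough sketch on a hypothetical sample (the column names and numbers are invented and are not the actual Innsbruck data); the sniffer is a heuristic and may guess wrong on unusual files.

```python
import csv
import io

# Hypothetical sample in the style of a semicolon-delimited export
sample = (
    'Year;Month;Arrivals;Overnight stays\n'
    '2010;11;52340;101200\n'
    '2010;12;78110;189450\n'
)

# Restrict the candidate delimiters to the ones we consider plausible
dialect = csv.Sniffer().sniff(sample, delimiters=';,\t')
print(repr(dialect.delimiter))

# The detected dialect can be passed directly to csv.reader
rows = list(csv.reader(io.StringIO(sample), dialect))
for row in rows:
    print(row)
```

Passing the `delimiters` argument narrows the search and makes the guess more reliable than letting the sniffer consider every character.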

In [6]:
# Tourism, arrivals, and overnight stays in 2010-2011
url="http://www.innsbruck.gv.at/data.cfm?vpath=diverse/ogd/statistik9/tourismus22/tourismusjahr-2010-2011csv"
resp = urllib.request.urlopen(url) # open the connection
csvfile = csv.reader(codecs.iterdecode(resp, 'utf-8'), delimiter=";") # csv.reader requires an iterable that yields strings
c=0 # counter for the number of parsed lines
N=10 # MAX number of lines we want to output
for line in csvfile:
    c+=1
    print(line)
    if c>=N:
        break # break exits the loop (see also chapter 4 of the Python 3 tutorial)

OK, with ';' as the delimiter, the reader object returns rows that contain 4 items each.
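
Once the delimiter is right, the individual values can be accessed by column name with `csv.DictReader`, which uses the first row as the header. This is a minimal sketch on hypothetical data; the column names below are assumptions and may differ from the header of the actual Innsbruck file.

```python
import csv
import io

# Hypothetical semicolon-delimited data with a header row
text = (
    'Year;Month;Arrivals;Overnight stays\n'
    '2010;11;52340;101200\n'
    '2010;12;78110;189450\n'
)

# DictReader maps each data row to a dict keyed by the header fields
reader = csv.DictReader(io.StringIO(text), delimiter=';')
records = list(reader)

# All values are read as strings, so convert them before doing arithmetic
total_arrivals = sum(int(r['Arrivals']) for r in records)
print(total_arrivals)
```

The explicit `int(...)` conversion matters because the csv module never converts types on its own.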