Commit b0ffeacc authored by Galtier Virginie

updated version of the source code

parent ce773317
%% Cell type:markdown id: tags:
# PYTHON AND DATABASE
This file shows how to interact with a database from Python:
how to create tables, insert data, and run other queries.
As the relational database management system (RDBMS) we use SQLite,
as we did in a previous tutorial of this course.
There, we used the program DB Browser for SQLite (command 'sqlitebrowser')
to create and manipulate a SQLite database;
here, we'll create and manipulate a SQLite database directly from our Python application.
%% Cell type:markdown id: tags:
To make a Python program communicate with a SQLite database,
we need the Python module sqlite3, which is part of Python's standard library.
%% Cell type:code id: tags:
```
import sqlite3
```
%% Cell type:markdown id: tags:
## Presentation of the example
We'll create a database with 2 tables:
```
City(
INSEE-code (primary key, it can contain digits and letters!),
Name,
Department (foreign key on Department number, it can contain digits and letters
such as "2B" or "974" for instance),
ZipCode-s, (if a city covers multiple zipcodes they are separated by a dash:
for instance 06000-06100-06200-06300 for Nice)
Population-1999,
Population-2010,
Estimated-population-2012,
Density,
Surface, expressed in km² (decimal number)
Longitude, (city center, decimal number, expressed in Degree, Minute, Second)
Latitude,
Elevation-Min (decimal number),
Elevation-Max)
```
```
Department (
Number (primary key, it can contain digits and letters),
Name)
```
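Both keys are described as able to contain letters, which is why a text type suits them better than an integer type. A small sketch of the point, using the values quoted above; the code `"01"` is an illustrative assumption and is not taken from the CSV files:

```python
# Codes such as department numbers are identifiers, not quantities.
codes = ["2B", "974", "01"]  # "01" is an assumed, illustrative value

def as_int_or_none(code):
    """Return the code as an int, or None when it is not purely numeric."""
    try:
        return int(code)
    except ValueError:
        return None

# "2B" cannot be converted at all, and "01" would lose its leading zero.
converted = [as_int_or_none(code) for code in codes]
print(converted)

# Several zip codes are stored in a single dash-separated string.
nice_zip_codes = "06000-06100-06200-06300".split("-")
print(nice_zip_codes)
```

Keeping such values as strings avoids both problems, at the cost of having to split the zip-code string when individual codes are needed.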
The DB will be populated with data from 2 CSV files available on Edunao:
- ```departements.csv```
- ```villes.csv```
(download them into the current directory now)
%% Cell type:markdown id: tags:
## DB Creation
%% Cell type:markdown id: tags:
```db_file``` will be the path to the file that will contain the database
%% Cell type:code id: tags:
```
db_file = "france.db"
```
%% Cell type:markdown id: tags:
Open a connection to the database:
%% Cell type:code id: tags:
```
conn = sqlite3.connect(db_file)
```
%% Cell type:markdown id: tags:
Execute the code above and notice how a file ```france.db``` is created in your workspace.
Open it with sqlitebrowser: it contains no table (yet).
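If you prefer to check this from Python rather than from sqlitebrowser, here is a minimal sketch. It works in a temporary directory with a hypothetical file name `demo.db`, so that your real `france.db` is left alone:

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.db")  # hypothetical file name
    existed_before = os.path.exists(demo_path)

    # Connecting to a path that does not exist yet creates the database file.
    demo_conn = sqlite3.connect(demo_path)
    existed_after = os.path.exists(demo_path)

    # sqlite_master is SQLite's catalogue; a new database lists no table.
    tables = demo_conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    demo_conn.close()

print(existed_before, existed_after, tables)
```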
%% Cell type:markdown id: tags:
Enable foreign key constraint support in SQLite:
%% Cell type:code id: tags:
```
conn.execute("PRAGMA foreign_keys = 1")
```
%% Output
<sqlite3.Cursor at 0x7f35243d1240>
%% Cell type:markdown id: tags:
Lastly, create the ```cursor``` object used to execute queries to the database.
%% Cell type:code id: tags:
```
cursor = conn.cursor()
```
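%% Cell type:markdown id: tags:
The `PRAGMA foreign_keys` statement executed earlier in this section matters because SQLite only checks foreign keys on connections where it is switched on. A minimal sketch of the difference, assuming hypothetical `parent` and `child` tables in a throwaway in-memory database:

```python
import sqlite3

def try_orphan_insert(enable_foreign_keys):
    """Insert a child row whose parent does not exist; tell whether SQLite accepted it."""
    demo = sqlite3.connect(":memory:")
    # Set the pragma explicitly in both cases so the outcome does not depend on defaults.
    demo.execute(f"PRAGMA foreign_keys = {1 if enable_foreign_keys else 0}")
    demo.execute("CREATE TABLE parent(id TEXT PRIMARY KEY)")
    demo.execute(
        "CREATE TABLE child(id TEXT PRIMARY KEY, parent_id TEXT REFERENCES parent(id))"
    )
    try:
        demo.execute("INSERT INTO child VALUES ('c1', 'missing')")
        accepted = True
    except sqlite3.IntegrityError:
        accepted = False
    finally:
        demo.close()
    return accepted

print("pragma off, orphan row accepted:", try_orphan_insert(False))
print("pragma on,  orphan row accepted:", try_orphan_insert(True))
```

The pragma is a per-connection setting, so it has to be executed again each time a new connection is opened.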
%% Cell type:markdown id: tags:
## Tables creation
Read and complete the code of the ```create_tables``` function below to create the 2 tables.
%% Cell type:code id: tags:
```
def create_tables(conn, cursor):
    """
    Creates the database tables
    Parameters
    ----------
    conn :
        The object used to manage the database connection.
    cursor :
        The object used to query the database.
    Returns
    -------
    bool
        True if the database tables could be created,
        False otherwise.
    """
    # Open a transaction
    # ------------------
    # A transaction is a sequence of read/write statements that
    # have a permanent result in the database only if they all succeed.
    #
    # More concretely, in this function we create two tables in the database.
    # The transaction is therefore a sequence of CREATE TABLE statements such as:
    #   BEGIN
    #   CREATE TABLE T1
    #   CREATE TABLE T2
    # If no error occurs, all the tables are permanently created in the database.
    # If an error occurs while creating a table (for instance T2), no table will be created,
    # even those for which the statement CREATE TABLE has already been executed
    # (in this example, T1).
    #
    # When we start a transaction with the statement BEGIN, we must end it with
    # either COMMIT or ROLLBACK.
    # * You usually call COMMIT when no error occurs.
    #   After calling COMMIT, the result of all the statements in the transaction
    #   is permanently written to the database.
    #   In our example, COMMIT results in actually creating the 2 tables T1 and T2.
    # * ROLLBACK is usually called when any error occurs in the transaction.
    #   Calling ROLLBACK means that the database is not modified
    #   (in our example, no table is created).
    cursor.execute("BEGIN")
    # Create the tables
    # -----------------
    try:
        # The 'cursor.execute()' function executes the SQL statement passed as a parameter;
        # it can raise a 'sqlite3.Error' exception,
        # that's why we write the code for creating the tables in a 'try...except' block.
        print("Creating Department table...")
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS Department(
                number TEXT PRIMARY KEY,
                name TEXT
            )
        ''')
        ##### TODO: COMPLETE THE CODE HERE TO CREATE THE CITY TABLE ####
        print("Creating City table...")
        ###################################################################
    # Commit or rollback
    # ------------------
    # Exception raised when something goes wrong while creating the tables.
    except sqlite3.Error as error:
        print("An error occurred while creating the tables: ", format(error))
        # IMPORTANT: rollback the transaction to avoid creating only one table in the database.
        conn.rollback()
        # Return False to indicate that something went wrong.
        return False
    # If we get here, that means that no error occurred.
    # IMPORTANT: we must COMMIT the transaction,
    # so that all tables are actually created in the database.
    conn.commit()
    print("Tables created successfully")
    # Returns True to indicate that everything went well!
    return True
```
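%% Cell type:markdown id: tags:
To see the all-or-nothing behaviour described in the comments above without touching `france.db`, here is a minimal sketch on a throwaway in-memory database. The tables `T1` and `T2` and the deliberately invalid second statement are illustrative assumptions; the sketch also assumes Python 3.6 or later, where `sqlite3` no longer commits implicitly before a `CREATE TABLE`:

```python
import sqlite3

demo = sqlite3.connect(":memory:")
cur = demo.cursor()

cur.execute("BEGIN")
try:
    cur.execute("CREATE TABLE T1(x TEXT)")
    # Deliberate error: a table cannot declare the same column name twice.
    cur.execute("CREATE TABLE T2(x TEXT, x TEXT)")
    demo.commit()
except sqlite3.Error as error:
    print("Rolling back:", error)
    demo.rollback()

# After the rollback, T1 is gone too, although its CREATE TABLE had succeeded.
tables = cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(tables)
demo.close()
```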
%% Cell type:markdown id: tags:
Execute the next cell and check the structure of your DB with sqlitebrowser.
%% Cell type:code id: tags:
```
create_tables(conn, cursor)
```
%% Cell type:markdown id: tags:
## Data insertion
Read and complete the code of the ```insert_department``` function below.
%% Cell type:code id: tags:
```
def insert_department(department, conn, cursor, silent=False):
    """
    Inserts a department into the database.
    Parameters
    ----------
    department : dict
        A dictionary holding the department data:
        department["number"], department["name"].
    conn :
        The object used to manage the database connection.
    cursor :
        The object used to query the database.
    silent : bool, optional
        if True, success message is not printed
    Returns
    -------
    bool
        True if no error occurs, False otherwise.
    """
    # remove 'pass', uncomment and complete the code
    pass
    """
    cursor.execute("BEGIN")
    try:
        # Our insert query contains two question marks (?) that indicate that
        # the values will be specified later.
        #
        # IMPORTANT:
        # * The query assumes that you called the table 'Department'.
        #   If you gave it another name, CHANGE the query accordingly.
        # * The query assumes that in the Department table the columns are defined in this order:
        #   number, name.
        #   If the order in which you created the columns is different,
        #   change this variable accordingly.
        insert_query = "INSERT INTO Department (number, name) VALUES (?, ?)"
        # A tuple with the values that will replace the '?' in 'insert_query'.
        # The values are obtained from the dictionary 'department' passed as a parameter.
        query_values = TODO
        # We pass the function 'cursor.execute()' two parameters:
        # the first is the insert_query; the second is the query_values.
        # This is called a "PARAMETERIZED QUERY",
        # where the values of the query are passed as a parameter.
        cursor.execute(
            insert_query,
            query_values
        )
    # We catch here a sqlite3.IntegrityError that is raised whenever
    # an integrity constraint is violated in the database.
    # Here the only integrity constraint that might be violated is ...
    # TODO
    except sqlite3.IntegrityError as error:
        print("Insertion of", department['name'], "failed: ", end="")
        # TODO: print the reason it fails
        print(format(error))
        conn.rollback()
        return False
    # Here we catch any other database error that can arise from this insert query.
    except sqlite3.Error as error:
        print("Insertion of", department['name'], "failed: ", end="")
        print(format(error))
        conn.rollback()
        return False
    # Everything is OK
    if not silent:
        print(department['name'], "successfully added")
    conn.commit()
    return True
    """
```
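%% Cell type:markdown id: tags:
The two ideas used in this function, a parameterized query and an `IntegrityError` handler, can be tried in isolation. The sketch below is not the solution of the exercise: it uses a hypothetical `Demo` table and made-up values in an in-memory database.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
cur = demo.cursor()
cur.execute("CREATE TABLE Demo(number TEXT PRIMARY KEY, name TEXT)")  # hypothetical table

insert_query = "INSERT INTO Demo (number, name) VALUES (?, ?)"

# The values travel separately from the SQL text,
# so the apostrophe in the name needs no escaping.
cur.execute(insert_query, ("X1", "Côte-d'Exemple"))
demo.commit()

try:
    # Same key again: the database refuses the row.
    cur.execute(insert_query, ("X1", "Duplicate"))
    demo.commit()
    outcome = "inserted"
except sqlite3.IntegrityError as error:
    demo.rollback()
    outcome = "rejected: " + str(error)

rows = cur.execute("SELECT number, name FROM Demo").fetchall()
print(outcome)
print(rows)
demo.close()
```

Building the SQL text with string concatenation instead would break on the apostrophe and would expose the query to SQL injection, which is the main reason to prefer the `?` placeholders.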
%% Cell type:markdown id: tags:
Execute the next cell and check that 'Ain' was added to the database with sqlitebrowser.
%% Cell type:code id: tags:
```
ain = {"number": 1, "name": "Ain"}
insert_department(ain, conn, cursor)
```
%% Cell type:markdown id: tags:
### Primary Key check
Try to insert another department using the same '1' number.
Check that the operation fails,
and use sqlitebrowser to check that the DB was not modified.
%% Cell type:code id: tags:
```
numerouno = {"number": 1, "name": "Numero Uno"}
insert_department(numerouno, conn, cursor)
```
%% Cell type:markdown id: tags:
### Remember Pandas?
Read and complete the code of the ```insert_all_departments``` function below.
(make sure your workspace contains the ```departements.csv``` file)
%% Cell type:code id: tags:
```
import pandas as pd
from pandas.core.frame import DataFrame

def insert_all_departments(conn, cursor, silent):
    """
    Reads the 'departements.csv' file and populates the database.
    """
    # remove 'pass', uncomment and complete the code
    pass
    """
    # use Pandas 'read_csv' function to read DataFrames from the file
    # https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
    input_df = pd.read_csv( TODO
    # use Pandas 'iterrows' function to iterate through the rows of the file
    # https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iterrows.html
    for index, row in TODO:
        # create a department data structure from the DataFrame
        department = {"number":row['numéro'], "name":row['nom']}
        # use the 'insert_department' function to add the department
        #TODO
    """
```
%% Cell type:markdown id: tags:
Execute the next cell and use sqlitebrowser to check that the departments were inserted in the DB.
%% Cell type:code id: tags:
```
insert_all_departments(conn, cursor, silent=True)
```
%% Cell type:markdown id: tags:
## Delete records
Write a ```remove_all_departments``` function that deletes all rows from the Department table.
%% Cell type:code id: tags:
```
def remove_all_departments(conn, cursor):
    """
    Removes all departments from the database.
    Parameters
    ----------
    conn :
        The object used to manage the database connection.
    cursor :
        The object used to query the database.
    Returns
    -------
    bool
        True if no error occurs, False otherwise.
    """
    # remove 'pass', uncomment and complete the code
    pass
    """
    cursor.execute("BEGIN")
    try:
        delete_query = "DELETE FROM Department"
        cursor.execute(
            delete_query
        )
    # The only integrity constraint that might be violated is ...
    #TODO
    except sqlite3.IntegrityError as error:
        #TODO: print why it fails
        conn.rollback()
        return False
    # Here we catch any other database error that can arise from this delete query.
    except sqlite3.Error as error:
        print("A database error occurred while removing the departments: ", format(error))
        conn.rollback()
        return False
    # Everything is OK
    print("All departments have been removed.")
    conn.commit()
    return True
    """
```
%% Cell type:markdown id: tags:
Execute the next cell and check the effect using sqlitebrowser.
%% Cell type:code id: tags:
```
remove_all_departments(conn, cursor)
```
%% Cell type:markdown id: tags:
## Insert data using the Pandas ```to_sql``` function
Read and complete the code of the 'insert_all_departments_v2' function.
%% Cell type:code id: tags:
```
def insert_all_departments_v2(conn, cursor):
    """
    Reads the 'departements.csv' file and populates the database
    """
    # remove 'pass', uncomment and complete the code
    pass
    """
    # use Pandas 'read_csv' function to read DataFrames from the file
    input_df = pd.read_csv(TODO
    # use Pandas 'rename' function to rename the column 'numéro' to 'number'
    # and 'nom' to 'name'
    # !!! rename will only return a new dataframe with the new headers
    # use input_df = input_df.rename(...) to change the headers of the current dataframe
    # https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rename.html
    input_df = input_df.rename(TODO
    # use Pandas 'to_sql' function to automatically iterate through the rows
    # to insert records in the database
    # https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html
    cursor.execute("BEGIN")
    try:
        input_df.to_sql(TODO
        conn.commit()
    except TypeError as error:
        print("A database error occurred while inserting departments: ", format(error))
        conn.rollback()
    except ValueError as error:
        print("A database error occurred while inserting departments: ", format(error))
        conn.rollback()
    """
```
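%% Cell type:markdown id: tags:
To get a feel for `rename` and `to_sql` before completing the function, here is a minimal sketch on a throwaway in-memory database. The `Demo` table and the two rows are made up for illustration, and a small DataFrame built by hand stands in for the CSV file; the sketch assumes Pandas is installed.

```python
import sqlite3

import pandas as pd

demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Demo(number TEXT PRIMARY KEY, name TEXT)")  # hypothetical table

# Stand-in for a CSV file: two made-up rows with French column headers.
frame = pd.DataFrame({"numéro": ["X1", "X2"], "nom": ["First", "Second"]})

# rename() returns a new DataFrame, so the result has to be reassigned.
frame = frame.rename(columns={"numéro": "number", "nom": "name"})

# if_exists="append" adds the rows to the existing table instead of replacing it;
# index=False keeps the DataFrame index out of the table.
frame.to_sql("Demo", demo, if_exists="append", index=False)

rows = demo.execute("SELECT number, name FROM Demo ORDER BY number").fetchall()
print(rows)
demo.close()
```

Because the table already exists, its own column types and primary key are kept; `to_sql` only supplies the rows.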
%% Cell type:markdown id: tags:
Execute the next cell and use sqlitebrowser to check that the departments were inserted in the DB.
%% Cell type:code id: tags:
```
insert_all_departments_v2(conn, cursor)
```
%% Cell type:markdown id: tags:
## Foreign key constraint
Complete the following function to handle the errors that might occur when inserting a city into the database:
%% Cell type:code id: tags:
```
def insert_city(city, conn, cursor, silent=False):
    """
    Inserts a city into the database.
    Parameters
    ----------
    city : dictionary
        City data: INSEE-code, Name, Department, ZipCode-s,
        Population-1999, Population-2010, Estimated-population-2012,
        Density, Surface, Longitude, Latitude, Elevation-Min, Elevation-Max
    conn :
        The object used to manage the database connection.
    cursor :
        The object used to query the database.
    silent : bool, optional
        if True, success message is not printed
    Returns
    -------
    bool
        True if no error occurs, False otherwise.
    """
    cursor.execute("BEGIN")
    try:
        insert_query = "INSERT INTO City (INSEE_code, name, department_number, zip_code_s, \
            population_1999, population_2010, estimated_population_2012, \
            density, surface, longitude, latitude, elevation_min, elevation_max) \
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"
        query_values = (city["INSEE_code"], city["name"], city["department_number"], city["zip_code_s"],
                        city["population_1999"], city["population_2010"], city["estimated_population_2012"],
                        city["density"], city["surface"], city["longitude"], city["latitude"],
                        city["elevation_min"], city["elevation_max"])
        cursor.execute(
            insert_query,
            query_values
        )
    """
    # We catch here a sqlite3.IntegrityError that is raised whenever
    # an integrity constraint is violated in the database.
    # Here 2 integrity constraints that might be violated:
    # - TODO
    # - TODO
    """
```
%% Cell type:markdown id: tags:
Execute the next cell and check the insert was made using sqlitebrowser.
%% Cell type:code id: tags:
```
metz = {"INSEE_code": 57463,
"name": "Metz",
"department_number": 57,
"zip_code_s": "57000-57050-57070",
"population_1999": 123704,
"population_2010": 120738,
"estimated_population_2012": 122800,
"density": 2878,
"surface": 41.94,
"longitude": 61037,
"latitude": 490711,
"elevation_min": 162,
"elevation_max": 256}
insert_city(metz, conn, cursor)
```
%% Cell type:markdown id: tags:
What happens if you try to remove all departments now that the City table contains Metz?
%% Cell type:code id: tags:
```
remove_all_departments(conn, cursor)
```
%% Cell type:markdown id: tags:
What happens if you try to insert a new city with the same INSEE code?
%% Cell type:code id: tags:
```
schoenau = {"INSEE_code": 57463,
"name": "Schœnau",
"department_number": 67,
"zip_code_s": 67390,
"population_1999": 474,
"population_2010": 584,
"estimated_population_2012": 500,
"density": 56,
"surface": 10.37,
"longitude": 73846,
"latitude": 481323,
"elevation_min": 164,
"elevation_max": 172}
insert_city(schoenau, conn, cursor)
```
%% Cell type:markdown id: tags:
What happens if you try to insert a new city with a department number that doesn't exist in the Department table?
%% Cell type:code id: tags:
```
joyeux = {"INSEE_code": 1198,
"name": "Joyeux",
"department_number": 1000,
"zip_code_s": 1800,
"population_1999": 206,
"population_2010": 223,
"estimated_population_2012": 200,
"density": 13,
"surface": 16.58,
"longitude": 50558,
"latitude": 455740,
"elevation_min": 272,
"elevation_max": 298}
insert_city(joyeux, conn, cursor)
```
%% Cell type:markdown id: tags:
## Data extraction
Before we can extract data from both tables, let's populate the City table:
%% Cell type:code id: tags:
```
def remove_all_cities(conn, cursor):
    """
    Removes all cities from the database.
    Parameters
    ----------
    conn :
        The object used to manage the database connection.
    cursor :
        The object used to query the database.
    Returns
    -------
    bool
        True if no error occurs, False otherwise.
    """
    cursor.execute("BEGIN")
    try:
        delete_query = "DELETE FROM City"
        cursor.execute(
            delete_query
        )
    # Here we catch any database error that can arise from this delete query.
    except sqlite3.Error as error:
        print("A database error occurred while removing the cities: ", format(error))
        conn.rollback()
        return False
    # Everything is OK
    print("All cities have been removed.")
    conn.commit()
    return True

def insert_all_cities(conn, cursor):
    """
    Reads the 'villes.csv' file and populates the database
    """
    print("Loading cities...")
    input_df = pd.read_csv("villes.csv", delimiter=',')
    # Ignore Saint-Pierre-et-Miquelon which is not part of a known department
    # (975 is a 'collectivité')
    input_df = input_df.loc[input_df["code INSEE"] != "97501"]
    input_df = input_df.rename(columns={
        "code INSEE": "INSEE_code",
        "nom": "name",
        "département": "department_number",
        "code(s) postal/taux": "zip_code_s",
        "population en 1999": "population_1999",
        "population en 2010": "population_2010",
        "population estimée en 2012": "estimated_population_2012",
        "densité": "density",
        "superficie": "surface",
        "longitude DMS": "longitude",
        "latitude DMS": "latitude",
        "altitude min": "elevation_min",
        "altitude max": "elevation_max"})
    input_df.to_sql('City', conn, if_exists="append", index=False)
    print("Loading cities DONE!")

remove_all_cities(conn, cursor)
insert_all_cities(conn, cursor)
```
%% Cell type:markdown id: tags:
We want to print the name and the 1999 population of the cities in the Gard department whose maximum elevation is below 20 m.
Expected result:
```
Le Grau-du-Roi population in 1999: 5874
Aigues-Mortes population in 1999: 6019
Saint-Laurent-d'Aigouze population in 1999: 2741
Fourques population in 1999: 2544
Aimargues population in 1999: 3440
```
Read the documentation:
https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.fetchall
https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.fetchone
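As a quick illustration of how the two methods differ, here is a minimal sketch on a throwaway in-memory database. The `Demo` table and its three rows are made up and unrelated to the City data:

```python
import sqlite3

demo = sqlite3.connect(":memory:")
cur = demo.cursor()
cur.execute("CREATE TABLE Demo(name TEXT, population INTEGER)")  # hypothetical table
cur.executemany("INSERT INTO Demo VALUES (?, ?)", [("A", 10), ("B", 20), ("C", 30)])

cur.execute("SELECT name, population FROM Demo WHERE population > ? ORDER BY name", (10,))
first = cur.fetchone()   # the next row as a tuple, or None when nothing is left
rest = cur.fetchall()    # every remaining row, as a list of tuples
after = cur.fetchone()   # the result set is now exhausted

print(first)
print(rest)
print(after)
demo.close()
```

Each row comes back as a tuple with one item per selected column, which is why a loop can unpack it directly into two variables.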
%% Cell type:code id: tags:
```
try:
    #query = TODO
    cursor.execute(
        query
    )
    #data = cursor.TODO
    for city_name, population in data:
        print(city_name, "population in 1999:", population)
# Here we catch any database error that can arise from this query.
except sqlite3.Error as error:
    print("A database error occurred while querying the database: ", format(error))
```
%% Cell type:markdown id: tags:
## Close the connection to the database
%% Cell type:code id: tags:
```
cursor.close()
conn.close()
```
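%% Cell type:markdown id: tags:
In a standalone script, where an exception could skip the two `close()` calls, one possible alternative is `contextlib.closing`, which closes the objects even when an error occurs. This is only a sketch on an in-memory database, not something this notebook requires. Note that using a connection directly as a context manager (`with conn:`) commits or rolls back the transaction but does not close the connection.

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as demo:
    with closing(demo.cursor()) as cur:
        cur.execute("SELECT 1")
        value = cur.fetchone()[0]

# Once closed, the connection refuses any further statement.
try:
    demo.execute("SELECT 1")
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False

print(value, still_open)
```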