Lesson 14: Databases
Homepage | Content | Slides | Video |
Warning
This lesson is under construction. Learn from it at your own risk. If you have any feedback, please fill out our General Feedback Survey.
Homepage | Content | Slides | Video |
Warning
This lesson is under construction. Learn from it at your own risk. If you have any feedback, please fill out our General Feedback Survey.
- About Databases.
- When to use Databases.
- When not to use Databases.
- Database Concepts.
- SQL Syntax.
- Ways to use a Database.
A program that can efficiently store and retrieve large amounts of data.
Imagine a kitchen cupboard program that stores food currently in stock, where it is, recipes using it, expiration dates, etc.
Everything has a relationship with everything else. How would you store it in the program?
- Ingredients for recipes.
- Location.
- Classifications.
<Table 1>
+---------------+-----------+-----------+
| <Primary key> | <Field 1> | <Field 2> |
+---------------+-----------+-----------+
| 1 | value | value` |
| ... | ... | ... |
+---------------+-----------+-----------+
<Table 2>
+---------------+-----------+--------------------------+
| <Primary key> | <Field 1> | <Foreign key to Table 1> |
+---------------+-----------+--------------------------+
| 1 | val | 7 |
| ... | ... | ... |
+---------------+-----------+--------------------------+
<Table 1>
+------------------+------------------+
| <Name> | <Major> |
+------------------+------------------+
| Linus Torvalds | Computer Science |
| Richard Stallman | Computer Science |
+------------------+------------------+
<Table 2>
+------------------+--------------+----------------+
| <Major> | <School> | <Advisor Name> |
+------------------+--------------+----------------+
| Computer Science | Engineering | Dennis Ritchie |
+------------------+--------------+----------------+
<Table 1> JOIN <Table 2>
+------------------+------------------+-------------+----------------+
| <Name> | <Major> | <School> | <Advisor Name> |
+------------------+------------------+-------------+----------------+
| Linus Torvalds | Computer Science | Engineering | Dennis Ritchie |
| Richard Stallman | Computer Science | Engineering | Dennis Ritchie |
+------------------+------------------+-------------+----------------+
When you have to work with a lot of well structured data.
Note: 1 PB = 1,000,000 GB
There are two broad types of databases.
Schemas are how you define what a table looks like, what data will populate it, and what each field will be called. The schema also defines relationships between tables; more or less the blueprint of your database.
CREATE TABLE nobel (
id int(11)
NOT NULL
AUTO_INCREMENT,
yr int(11),
subject varchar(15),
winner varchar(50)
)
ENGINE = InnoDB;
Migrations are the process of updating tables and fields in your database. Since databases might need to change in the future (you never know!) you can create and run a migrations to modify your schema as needed.
from django.db import migrations, models
class Migration(migrations.Migration):
dependencies = [
('app', '0001_initial')
]
operations = [
migrations.AddField("Nobel", "topic", models.CharField(80))
]
There are many tools out there that allow you to avoid writing raw SQL, but it's always good to know the syntax. One day you may need to write raw SQL queries, and at the very least you'll need to read SQL for debugging purposes.
Select statements get data from the database which matches the requirements you have.
SELECT
yr, subject, winner
FROM
nobel
WHERE
yr = 1960 AND subject='medicine';
+------+------------+-------------------------------+
| yr | subject | winner |
+------+------------+-------------------------------+
| 1960 | "medicine" | "Sir Frank Macfarlane Burnet" |
| 1960 | "medicine" | "Sir Peter Brian Medawar" |
+------+------------+-------------------------------+
Insert statements create an entry into a table and populate the fields appropriately.
INSERT INTO
nobel
VALUES
('2013','Literature','Herta Müller');
+-----+------+--------------+----------------+
| id | yr | subject | winner |
+-----+------+--------------+----------------+
| ... | ... | ... | ... |
| 873 | 2013 | "Literature" | "Herta Müller" |
| ... | ... | ... | ... |
+-----+------+--------------+----------------+
Update statements modify an existing entry in a table.
UPDATE
nobel
SET
winner='Andrew Ryan'
WHERE
subject='Peace' AND yr='1951';
+-----+------+---------+----------------+
| id | yr | subject | winner |
+-----+------+---------+----------------+
| ... | ... | ... | ... |
| 120 | 1951 | "Peace" | "Andrew Ryan" |
| ... | ... | ... | ... |
+-----+------+---------+----------------+
Delete statements... You can guess what a delete statement does I bet.
DELETE FROM
nobel
WHERE
yr = 1989 AND subject = 'peace';
Craft a query to get the following data out of our Nobel table:
Don't worry about getting it exactly right! Craft pseudo-SQL!
SELECT winner FROM nobel
WHERE yr=1952 AND subject='medicine'; #(Selman A. Wksman)
SELECT * FROM nobel
WHERE yr=1903 AND subject='physics'; #(3)
SELECT * FROM nobel
WHERE winner='Linus Pauling'; #(2)
SELECT COUNT(*) FROM nobel
AS n0 INNER JOIN nobel AS n1 on n0.winner=n1.winner
AND (n0.yr!=n1.yr or n0.subject!=n1.subject); #(16)
Now that we have belabored the theory of databases and SQL, lets actually start doing work with databases.
Throughout this exercise we will load it up with some data (nobel.sql.gz) and learn to interact with it via the command line interface.
# Create a table for Nobel prizes
$ mysqladmin -u root create nobel
# Get the database from the osl server
$ wget http://osl.io/nobel -O nobel.sql.gz
# Gunzip the file and import it into the nobel db
$ gunzip nobel.sql.gz
$ mysql nobel < nobel.sql
# OR do it in one step!
$ zcat nobel.sql.gz | mysql nobel
# Open up mysql shell to execute queries
$ mysql nobel
# List all the tables
SHOW TABLES;
# Print the layout of the database to the screen
DESCRIBE nobel;
Now that you have a working database you have a few options for how you want to use it.
We've already done this in the previous exercise. You use your choice of program to interact with the database exclusively via SQL and run the queries you want. This is rarely the way to go and isn't very useful for most applications. The SQL language is only good for doing database stuff.
mysql> SELECT subject, yr, winner FROM nobel
WHERE yr=1960;
+------+------------+-----------------------------+
| yr | subject | winner |
+------+------------+-----------------------------+
| 1960 | Chemistry | Willard F. Libby |
| 1960 | Literature | Saint-John Perse |
| ... | ... | ... |
+------+------------+-----------------------------+
See nobel.py
#!/usr/bin/python
import MySQLdb
import os
db = MySQLdb.connect(
os.environ['MYSQL_PORT_3306_TCP_ADDR'],
'root',
os.environ['MYSQL_ENV_MYSQL_ROOT_PASSWORD'],
"nobel"
)
cursor = db.cursor()
cursor.execute("SELECT subject, yr, winner FROM nobel WHERE yr = 1960")
data = cursor.fetchall()
for winner in data:
print "%s winner in %s: %s " % (winner[0], winner[1], winner[2])
db.close()
# SELECT * FROM nobel WHERE yr = 1960
for subject, yr, winner in session.query(Nobel).filter_by(yr=1960):
print "%s winner in %s: %s " % (subject, yr, winner)