Lesson 14: Databases

Homepage Content Slides Video

Warning

This lesson is under construction. Learn from it at your own risk. If you have any feedback, please fill out our General Feedback Survey.

Overview

  • About Databases.
  • When to use Databases.
  • When not to use Databases.
  • Database Concepts.
  • SQL Syntax.
  • Ways to use a Datbase.

Databases

A program that can efficiently store and retrieve large amounts of data.

Relating Data

Imagine a kitchen cupboard program that stores food currently in stock, where it is, recipes using it, expiration dates, etc.

Databases and Structure

Structure
SQL databases are based on around Relational Algebra
<Table 1>
+---------------+-----------+-----------+
| <Primary key> | <Field 1> | <Field 2> |
+---------------+-----------+-----------+
| 1             | value     | value`    |
| ...           | ...       | ...       |
+---------------+-----------+-----------+

<Table 2>
+---------------+-----------+--------------------------+
| <Primary key> | <Field 1> | <Foreign key to Table 1> |
+---------------+-----------+--------------------------+
| 1             | val       | 7                        |
| ...           | ...       | ...                      |
+---------------+-----------+--------------------------+

Concept: Relational Algebra

<Table 1>
+------------------+------------------+
| <Name>           | <Major>          |
+------------------+------------------+
| Linus Torvalds   | Computer Science |
| Richard Stallman | Computer Science |
+------------------+------------------+
<Table 2>
+------------------+--------------+----------------+
| <Major>          | <School>     | <Advisor Name> |
+------------------+--------------+----------------+
| Computer Science | Engineering  | Dennis Ritchie |
+------------------+--------------+----------------+
<Table 1> JOIN <Table 2>
+------------------+------------------+-------------+----------------+
| <Name>           | <Major>          | <School>    | <Advisor Name> |
+------------------+------------------+-------------+----------------+
| Linus Torvalds   | Computer Science | Engineering | Dennis Ritchie |
| Richard Stallman | Computer Science | Engineering | Dennis Ritchie |
+------------------+------------------+-------------+----------------+

Concept: Relational Algebra

Relational Algebra Example

When to use a Database

When you have to work with a lot of well structured data.
Databases are useful for two situations:
  1. Lots of data.
  2. High throughput.

Lots of Data

Global Internet traffic by year

Note: 1 PB = 1,000,000 GB

Concurrent Read/Writes

When not to use a Database

Databases might not be particularly useful for:
  • Storing content for a website that rarely updates
    • Alternative: Use a static site generator such as Pelican or Jekyll
  • Hosting large individual files
    • Alternative: Store the files on disk

Types of Databases

There are two broad types of databases.

SQL

Examples:
  • MySQL/MariaDB
  • PostgreSQL
  • SQLite

NoSQL

Examples:
  • MongoDB
  • Apache Casandra
  • Dynamo
  • Redis

Database Concepts

Schemas

CREATE TABLE nobel (
    id int(11)
        NOT NULL
        AUTO_INCREMENT,
    yr int(11),
    subject varchar(15),
    winner varchar(50)
)
ENGINE = InnoDB;

Migrations

from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [
        ('app', '0001_initial')
    ]

    operations = [
        migrations.AddField("Nobel", "topic", models.CharField(80))
    ]

Raw SQL Syntax

SELECT

SELECT
    yr, subject, winner
FROM
    nobel
WHERE
    yr = 1960 AND subject='medicine';
+------+------------+-------------------------------+
| yr   | subject    | winner                        |
+------+------------+-------------------------------+
| 1960 | "medicine" | "Sir Frank Macfarlane Burnet" |
| 1960 | "medicine" | "Sir Peter Brian Medawar"     |
+------+------------+-------------------------------+

INSERT

INSERT INTO
    nobel
VALUES
    ('2013','Literature','Herta Müller');
+-----+------+--------------+----------------+
| id  | yr   | subject      | winner         |
+-----+------+--------------+----------------+
| ... | ...  | ...          | ...            |
| 873 | 2013 | "Literature" | "Herta Müller" |
| ... | ...  | ...          | ...            |
+-----+------+--------------+----------------+

UPDATE

UPDATE
    nobel
SET
    winner='Andrew Ryan'
WHERE
    subject='Peace' AND yr='1951';
+-----+------+---------+----------------+
| id  | yr   | subject | winner         |
+-----+------+---------+----------------+
| ... | ...  | ...     | ...            |
| 120 | 1951 | "Peace" | "Andrew Ryan"  |
| ... | ...  | ...     | ...            |
+-----+------+---------+----------------+

DELETE

DELETE FROM
   nobel
WHERE
   yr = 1989 AND subject = 'peace';

TODO: Crafting Queries!

Craft a query to get the following data out of our Nobel table:

Don't worry about getting it exactly right! Craft pseudo-SQL!

TODO: Crafting Queries!

Answers

SELECT winner FROM nobel
WHERE yr=1952 AND subject='medicine'; #(Selman A. Wksman)

SELECT * FROM nobel
WHERE yr=1903 AND subject='physics'; #(3)

SELECT * FROM nobel
WHERE winner='Linus Pauling'; #(2)

SELECT COUNT(*) FROM nobel
AS n0 INNER JOIN nobel AS n1 on n0.winner=n1.winner
AND (n0.yr!=n1.yr or n0.subject!=n1.subject); #(16)

TODO: Using a Real Database

Installing MySQL

# Install mysql -- hit 'enter' to name your user root, and then enter
# again for password
# On Debian-based systems:
$ sudo apt update && sudo apt install mysql-server
# On Red Hat/Fedora based systems:
$ sudo yum install mysql-server

$ sudo /etc/init.d/mysql start  # Start the mysql service

$ mysql_secure_installation # Use this to set the root password

# Hit 'yes' or 'y' for all options
# Add a sensible password which you will remember
# DO NOT MAKE IT YOUR USUAL PASSWORD.

$ sudo /etc/init.d/mysql status

$ mysqladmin -u root -p ping # Ping the database

$ mysqladmin -u root -p create nobel # Create a table for Nobel prizes

Users

Login to the mysql shell with your root user credentials:

$ sudo mysql -p
mysql> CREATE USER 'me'@'localhost'
       IDENTIFIED BY 'password';

mysql> GRANT ALL PRIVILEGES ON nobel.*
       TO 'me'@'localhost'
       WITH GRANT OPTION;

mysql> exit

Importing Data

# Get the database from the osl server
$ sudo apt install wget
$ wget http://osl.io/nobel -O nobel.sql
# put the database in a file called nobel.sql
$ sudo mysql -p nobel < nobel.sql
# Open up mysql shell to execute queries
$ sudo mysql -p nobel
# List all the tables
SHOW TABLES;
# Print the layout of the database to the screen
DESCRIBE nobel;

Ways to Use a Database

Raw Queries

mysql> SELECT subject, yr, winner FROM nobel
       WHERE yr=1960;
+------+------------+-----------------------------+
| yr   | subject    | winner                      |
+------+------------+-----------------------------+
| 1960 | Chemistry  | Willard F. Libby            |
| 1960 | Literature | Saint-John Perse            |
| ...  | ...        | ...                         |
+------+------------+-----------------------------+

Native Queries

#!/usr/bin/python
import MySQLdb

db = ("localhost","testuser","test123","nobel" )

cursor = db.cursor()

cursor.execute("SELECT subject, yr, winner FROM nobel WHERE yr = 1960")

data = cursor.fetchall()

for winner in data:
    print "%s winner in %s: %s " % (winner[0], winner[1], winner[2])

db.close()

Object Relational Mappers

# SELECT * FROM nobel WHERE yr = 1960
for subject, yr, winner in session.query(Nobel).filter_by(yr=1960):
    print "%s winner in %s: %s " % (subject, yr, winner)

Further Reading

CS 340
The CS 340 course at OSU (titled "Databases") is a great introduction to this topic. If you have the option to take it you should!