The Basics

As we have already mentioned, many database management systems support SQL. Whilst we could install an industry strength system to learn SQL with, there is also a much more convenient – and cross-platform – route we can get started with: SQLite.

SQLite is a self-contained database system that doesn’t require any sort of server support, which means it’s easy to get up an running. To work through this activity, you will need to install SQLite on your computer.

Download and install SQLite: find the appropriate Precompiled Binaries link for your platform and download and unzip the corresponding file.

Starting SQLite

The L in SQL stands for language – that means you have to speak to your computer – on a command line. Don’t be afraid: the basics of the language are not too hard to learn (and for more complex things there is always internet search).

On Mac

Open the application “Terminal” and type sqlite3.

On Windows

Save the Sqlite executable in the folder you want to work with and double click it.

The command line

Sqlite starts up in a command line. You should now see some information about the version of sqlite you’re using and a prompt saying sqlite. Don’t worry you’re not stuck in here: You can always exit sqlite by typing .quit.

SQLite version 3.7.13 2012-07-17 17:46:21
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite>

The first thing we want to look at is some built in functions of sqlite. Type .help followed by enter to see some of the commands you have in there – this is useful for reference. Note: These commands are not part of the sql language.

Creating your first table

Data in databases is stored in tables. To use spreadsheets as an analogy: a database is a workbook, a table is a single sheet. As in well designed spreadsheets data is organized in rows and columns. With the single variables in columns and each row a specific data point. Unlike spreadsheets, databases are a lot more stringent when it comes to what is in a column. To store data we have to explicitly define what is to be stored in a table – let’s do this.

We’ll store the following data in the table.

Name	Mood	Age
Michael	Grumpy	33
Anders	Happy	35
Zara	Happy	28

First, we need to define the columns – we have Name (which is text), Mood (also text) and Age (a number without decimals). We’ll tell the computer: Make a table with the columns: Name Text, Mood Text, Age Number. Let’s translate this to SQL. As with natural language we use words to tell the computer what we want:

CREATE TABLE people (name char(255), mood char(255), age int);

Note several things: First our text has become a “char” – this is because names are reasonably short and we estimate the maximum length for a name at 255 characters. Char actually stands for character – here we’re telling the database we want a field that’s at max 255 characters long. The same for mood. Age has become an “Integer” that is a number without decimals – also see the “;” at the end. This tells SQL this is where our sentence stops. The sentences we’re telling SQL are called “queries”. You will have noticed that there are certain words written in all caps above. This is not required but standard convention in SQL: It makes it easier to read for humans.

Fantastic, we have now created our first table. We can find it using the sqlite command .tables which will return:

people

and look at it’s definition using .schema people, which will return our original query:

CREATE TABLE people (name char(255), mood char(255), age int);

This is a very easy and small table – for larger tables and for interactions across multiple tables we will use one or more columns to uniquely identify each row. This column is specified as the “primary key”. Having a primary key helps you to retrieve data from your database much faster if you use to access it.

Now let’s add some data to it – we will tell the database to insert a record for Michael, who is grumpy and 33 years old. In SQL this is

INSERT INTO people VALUES (’Michael’,’Grumpy’,33);

Let’s do the same for Anders and Zara:

INSERT INTO people VALUES (’Anders’,’Happy’,35);
INSERT INTO people VALUES (’Zara’,’Happy’,28);

Fantastic! We’ve entered some information into the table! Note: You can omit the column names only if you’re entering all values and in the order of the columns. Otherwise you have to specify the column. E.q. if we don’t want to enter a mood we can use

INSERT INTO people (name, age) VALUES ('Tom', 33);

Filtering and Sorting SQL style

Let’s now look at our data. To see the contents of our table we can tell the database: Show me everything in the people table. In SQL:

SELECT * FROM people;

Note the * – it is a “wildcard” character. This means as much as select all the columns.

The database will respond with:

Michael|Grumpy|33
Anders|Happy|35
Zara|Happy|28

Fantastic! We can now retrieve our data. But databases would be pretty useless if we could just get everything we stored (Why not use a file in that case…?) – then excel when it comes to sorting and filtering!

In a giant database, you might want to get all the records that correspond to a particular entry (name = ‘John’, age = 51). This is when databases become very useful, as SQL queries are very powerful filters. Let’s now only get the happy people. We’ll tell the database show me everything in the people table if the mood is happy. in SQL:

SELECT * FROM people WHERE mood=’Happy’;

Databases can do this very very quickly even if the dataset you work with is giant. Note the WHERE clause – this defines our filters (here give me all the rows where the mood is happy). You can combine multiple conditions with AND, OR and NOT.

Another thing we might want to do is sorting (and figuring out the range of values within a column).

Let’s first sort the dataset by age – youngest first. We say: give me everything ordered by age ascending. SQL:

SELECT * FROM people ORDER BY age ASC;

Wasn’t that easy? Of course you can combine filters and sorting. ORDER BY serves as another kind of filter. Note that ASC stands for Ascending (and DESC stands for descending)

Maxima and Minima

Let’s figure out the maximum and minimum age.

we’ll say give us the maximum age of the people in SQL:

SELECT max(age) FROM people;

Can you figure out how to get the minimum? Note this will only return the maximum age – not the entry with the maximum age.

A good thing about SQL is it allows us to use nested queries (queries within queries). E.g. if we want to find out which people have the maximum age we can use:

SELECT * FROM people WHERE age=(SELECT max(age) FROM people);

And see it’s Anders!

Nested Queries

A very powerful feature of SQL is that allows us to use nested queries (queries within queries). For example, if we want to find out which people have the maximum age we can use:

SELECT * FROM people WHERE age=(SELECT MAX(age) FROM people);