Lab 4

Goal

To gain experience using SQL, and a terminal interface to MySQL .

World Database Lab 1

The world database

This lab uses the mysql console program.

We are going to use a world database, which can be found on the MySQL official website.

There are two options. A world database with the MyISAM storage engine or with the InnoDB engine. We choose the latter.

There is a a copy of each in /home/class/world:

skon@Taris:~$ cd /home/class/csc3031/world/
skon@Taris:/home/class/csc3031/world$ ls
world_innodb.sql world.sql
skon@Taris:/home/class/csc3031/world$

Copy to your directory:

skon@Taris:/home/class/csc3031/world$ cp world_innodb.sql ~
skon@Taris:/home/class/csc3031/world$ cd ~
skon@Taris:/home/class/csc3031/world$ ls
world_innodb.sql 

skon@Taris:/home/class/csc3031/world$ ls world_innodb.sql </pre>

Log into the database

skon@Taris:/home/class/csc3031/world$ mysql -u yourusername -p
Enter password: 

mysql> 

Normally you would need to create a database. I have done this for you. So you don't need to do the below.

mysql> CREATE DATABASE yourusernameworld;

The world database is created.

mysql> USE yourusernameworld;

We changed to your world database. Now the world database is the current database.

mysql> source world_innodb.sql

We build the tables of the world database by executing this SQL script. It takes some time.

Examining the database

In this section, we are going to look at the tables of the world database in general.

mysql> SHOW TABLES;
+-----------------+
| Tables_in_world |
+-----------------+
| City            |
| Country         |
| CountryLanguage |
+-----------------+

We show all available tables with the SHOW TABLES statement. There are three.

mysql> DESCRIBE City;
+-------------+----------+------+-----+---------+----------------+
| Field       | Type     | Null | Key | Default | Extra          |
+-------------+----------+------+-----+---------+----------------+
| ID          | int(11)  | NO   | PRI | NULL    | auto_increment |
| Name        | char(35) | NO   |     |         |                |
| CountryCode | char(3)  | NO   | MUL |         |                |
| District    | char(20) | NO   |     |         |                |
| Population  | int(11)  | NO   |     | 0       |                |
+-------------+----------+------+-----+---------+----------------+

With the DESCRIBE statement, we can see the table structure of the City table. We see the column names and their data types. Plus other important information.

mysql> SHOW CREATE TABLE City;
+-------+-----------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                                                                                                                                                                                                                                         |
+-------+-----------------------------------------------------------------------------------------+
| City  | CREATE TABLE `City` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `Name` char(35) NOT NULL DEFAULT '',
  `CountryCode` char(3) NOT NULL DEFAULT '',
  `District` char(20) NOT NULL DEFAULT '',
  `Population` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `CountryCode` (`CountryCode`),
  CONSTRAINT `city_ibfk_1` FOREIGN KEY (`CountryCode`) REFERENCES `Country` (`Code`)
) ENGINE=InnoDB AUTO_INCREMENT=4080 DEFAULT CHARSET=latin1 |
+-------+----------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

If we wanted to find out the SQL to create the City table, we would issue the SHOW CREATE TABLE City statement.

Queries

Queries are used to look up data from the database tables.

Limiting data output

There are thousands of rows in the tables of the database. They cannot be displayed all on the screen. We can control the number of rows to be displayed with the LIMIT clause.

mysql> SELECT Id, Name, Population FROM City LIMIT 10;
+----+----------------+------------+
| Id | Name           | Population |
+----+----------------+------------+
|  1 | Kabul          |    1780000 |
|  2 | Qandahar       |     237500 |
|  3 | Herat          |     186800 |
|  4 | Mazar-e-Sharif |     127800 |
|  5 | Amsterdam      |     731200 |
|  6 | Rotterdam      |     593321 |
|  7 | Haag           |     440900 |
|  8 | Utrecht        |     234323 |
|  9 | Eindhoven      |     201843 |
| 10 | Tilburg        |     193238 |
+----+----------------+------------+

In the above query, we show three of the five columns of the City table. There are lots of rows in the table. We limit the query to the first 10 rows.

mysql> SELECT Id, Name, Population FROM City LIMIT 15, 5;
+----+-------------------+------------+
| Id | Name              | Population |
+----+-------------------+------------+
| 16 | Haarlem           |     148772 |
| 17 | Almere            |     142465 |
| 18 | Arnhem            |     138020 |
| 19 | Zaanstad          |     135621 |
| 20 | ´s-Hertogenbosch  |     129170 |
+----+-------------------+------------+

The LIMIT clause can be followed by two numbers. The first one is the offset and the second one is the number of rows to display. Our query shows rows 16-20.

mysql> pager less
PAGER set to 'less'
mysql> SELECT * FROM City;
+------------------------------------+------------+
| Name                               | Population |
+------------------------------------+------------+
| Kabul                              |    1780000 |
| Qandahar                           |     237500 |
| Herat                              |     186800 |

Since the City table has more than four thousand rows, we cannot see them in one screen.

We can use the "pager" command to show the data in a less program. We can navigate through the data with the cursor keys or page down, page up keys. If we want to use no pager program, simply hit the pager without an argument.

PAGER set to 'less'
mysql> SELECT Name FROM City;

+------------------------------------+
| Name                               |
+------------------------------------+
| Kabul                              |
| Qandahar                           |
| Herat                              |
| Mazar-e-Sharif                     |
| Amsterdam                          |
| Rotterdam                          |
...
:

Some queries result in duplicate rows:

mysql> select Region from Country;
+---------------------------+
| Region                    |
+---------------------------+
| Caribbean                 |
| Southern and Central Asia |
| Central Africa            |
| Caribbean                 |
| Southern Europe           |
| Southern Europe           |
| Caribbean                 |
....
| Middle East               |
| Southern Europe           |
| Southern Africa           |
| Eastern Africa            |
| Eastern Africa            |
+---------------------------+
239 rows in set (0.00 sec)

This can be stopped with the distinct clause:

mysql> select distinct Region from Country;
+---------------------------+
| Region                    |
+---------------------------+
| Caribbean                 |
| Southern and Central Asia |
| Central Africa            |
| Southern Europe           |
| Middle East               |
| South America             |
| Polynesia                 |
| Antarctica                |
| Australia and New Zealand |
| Western Europe            |
| Eastern Africa            |
| Western Africa            |
| Eastern Europe            |
| Central America           |
| North America             |
| Southeast Asia            |
| Southern Africa           |
| Eastern Asia              |
| Nordic Countries          |
| Northern Africa           |
| Baltic Countries          |
| Melanesia                 |
| Micronesia                |
| British Islands           |
| Micronesia/Caribbean      |
+---------------------------+
25 rows in set (0.00 sec)

You do:

  • Create and execute a SQL command to display the names of all languages. Limit it to the top 10. *
  • Do it again, limiting it to rows 21-25.
  • Create and execute a SQL command to display the names of all languages that for which "IsOffical" is true. Limit to first 10.
  • List ALL distinct government types.
The mysql command tool can be used in a non-interactive way. We specify the SQL statement after the -e option and redirect the result to the city file. Now we can use any text editor to display the data.

The COUNT(), MAX(), MIN() functions

mysql> SELECT COUNT(Id) AS 'Number of rows' FROM City;
+----------------+
| Number of rows |
+----------------+
|           4079 |
+----------------+

There are 4079 cities in the table. We use the built-in COUNT() function to find out the number of rows.

mysql> SELECT Name, Population FROM City
    -> WHERE Population = (SELECT Max(Population) FROM City);
+-----------------+------------+
| Name            | Population |
+-----------------+------------+
| Mumbai (Bombay) |   10500000 |
+-----------------+------------+

The above query shows the most populated city in the table. The SQL is a special type of a query called a subquery. The outer query uses the data returned by the inner query. The inner query is bounded by parentheses.

mysql> SELECT Name, Population FROM City 
    -> WHERE Population = (SELECT Min(Population) FROM City);
+-----------+------------+
| Name      | Population |
+-----------+------------+
| Adamstown |         42 |
+-----------+------------+

This subquery shows the least populated city in the table.

mysql> SELECT AVG(SurfaceArea) from Country;
+------------------+
| AVG(SurfaceArea) |
+------------------+
|    623248.146025 |
+------------------+
1 row in set (0.00 sec)
This subquery shows the average surface area of all the countries.
How many continents? Try:
mysql> SELECT COUNT(Continent) FROM Country;
+------------------+
| COUNT(Continent) |
+------------------+
|              239 |
+------------------+
1 row in set (0.00 sec)
Opps!  That's a lot of continents! We are counting every row, and thus counting all the countries. We need to reduce the count to the distinct values of Continent:
mysql> SELECT COUNT(DISTINCT Continent) FROM Country;
+---------------------------+
| COUNT(DISTINCT Continent) |
+---------------------------+
|                         7 |
+---------------------------+
1 row in set (0.00 sec) 
Why is this? Consider: 
SELECT DISTINCT Continent FROM Country;
+---------------+
| Continent     |
+---------------+
| North America |
| Asia          |
| Africa        |
| Europe        |
| South America |
| Oceania       |
| Antarctica    |
+---------------+
7 rows in set (0.00 sec)
How about the country with the biggest change in GNP?  Try:  
mysql> SELECT Name, MAX(GNP-GNPOld) FROM Country;
+-------+-----------------+
| Name  | MAX(GNP-GNPOld) |
+-------+-----------------+
| Aruba |       399800.00 |
+-------+-----------------+
1 row in set (0.00 sec)
Doesn't work!  Why - because an Aaggregate function simply picks the first field is a group (Aruba).  Try this:  
mysql> SELECT Name, GNP-GNPOld FROM Country WHERE GNP-GNPOld = (SELECT MAX(GNP-GNPOld) FROM Country); +---------------+------------+ | Name | GNP-GNPOld | +---------------+------------+ | United States | 399800.00 | +---------------+------------+
1 row in set (0.00 sec)
This works because FIRST we find the value, than use it as a selector to find the right country

You do:

  • What is the average population of all the counties in the world? *
  • List the name and continent wih the largest population. With the smallest population.
  • What is the average life expectancy of the world?
  • List the country name and average life span with the highest and lowest life expectancy.
  • What are the number of distinct governments types?

Selecting specific rows with the WHERE clause

The WHERE clause can be used to filter the results. It provides a selection criteria to select only specific rows from the data.

mysql> SELECT Name, Population FROM City 
    -> WHERE Population > 1000000;
+--------------------------+------------+
| Name                     | Population |
+--------------------------+------------+
| Kabul                    |    1780000 |
| Alger                    |    2168000 |
| Luanda                   |    2022000 |
| Buenos Aires             |    2982146 |
| La Matanza               |    1266461 |
| Córdoba                  |    1157507 |
...

The above SQL statement returns all cities with a population above one million people.

mysql> SELECT Name FROM City WHERE Name LIKE 'Kal%';
+-------------+
| Name        |
+-------------+
| Kalookan    |
| Kalyan      |
| Kalemie     |
| Kallithea   |
| Kalisz      |
| Kaliningrad |
| Kaluga      |
+-------------+
7 rows in set (0.00 sec)

Here we select all city names which begin with Kal. We have found seven cities in the table. We can look for a specific pattern in the column with the LIKE clause.

mysql> SELECT Name, Population FROM City 
    -> WHERE ID IN (5, 32, 344, 554);
+-------------------+------------+
| Name              | Population |
+-------------------+------------+
| Amsterdam         |     731200 |
| Alkmaar           |      92713 |
| Guarapuava        |     160510 |
| Santiago de Chile |    4703954 |
+-------------------+------------+
4 rows in set (0.00 sec)

This SQL code returns Cities and their populations for rows with 5, 32, 344 and 554 IDs.

mysql> SELECT * FROM City WHERE Name = 'Bratislava';
+------+------------+-------------+------------+------------+
| ID   | Name       | CountryCode | District   | Population |
+------+------------+-------------+------------+------------+
| 3209 | Bratislava | SVK         | Bratislava |     448292 |
+------+------------+-------------+------------+------------+
1 row in set (0.00 sec)

With the above SQL statement we select all columns for one specific city, namely Bratislava.

mysql> SELECT Name, Population FROM City 
    -> WHERE Population BETWEEN 670000 AND 700000;
+----------------+------------+
| Name           | Population |
+----------------+------------+
| Teresina       |     691942 |
| Natal          |     688955 |
| Bandar Lampung |     680332 |
| Gwalior        |     690765 |
| Kermanshah     |     692986 |
| Palermo        |     683794 |
| Toronto        |     688275 |
| Huainan        |     700000 |
| Jixi           |     683885 |
| Antananarivo   |     675669 |
| Chihuahua      |     670208 |
| Kano           |     674100 |
| Tunis          |     690600 |
+----------------+------------+
13 rows in set (0.00 sec)

Here we limit the rows to those with populations between a given value.

You do:
  • List the names of countries with with a life expectancy below 60. *
  • Using the IN clause, list the names of countries in the Caribbean or Polynesia regions.
  • List the countries in ANY of the "Europe" Regions (Use the LIKE pattern match regions with "Europe" in the region name).
  • List the countries with between 100,000 and 1,000,000 square miles of surface area.
  • List the countries with between 100,000 and 1,000,000 square miles of surface area where the official language is Spanish.
  • List the distinct languages that are official but have less than 50% of a countries citizens as speakers.
Say we wanted to find out cities with a population between two specific values. There is aBETWEEN operator for this. We have found 13 cities with a population in the range 670,000 and 700,000.

Ordering data

Ordering data can be done with the ORDER BY clause.

mysql> SELECT Name, Population FROM City
    -> ORDER BY Population DESC LIMIT 10;
+-------------------+------------+
| Name              | Population |
+-------------------+------------+
| Mumbai (Bombay)   |   10500000 |
| Seoul             |    9981619 |
| São Paulo         |    9968485 |
| Shanghai          |    9696300 |
| Jakarta           |    9604900 |
| Karachi           |    9269265 |
| Istanbul          |    8787958 |
| Ciudad de México  |    8591309 |
| Moscow            |    8389200 |
| New York          |    8008278 |
+-------------------+------------+
10 rows in set (0.00 sec)

We find the 10 most populated cities. We order the data by population from the most populated to the least populated city. We limit the output with the LIMIT clause.

mysql> SELECT Name, Population FROM City 
    -> ORDER BY Population ASC LIMIT 10;
+---------------------+------------+
| Name                | Population |
+---------------------+------------+
| Adamstown           |         42 |
| West Island         |        167 |
| Fakaofo             |        300 |
| Città del Vaticano  |        455 |
| Bantam              |        503 |
| Yaren               |        559 |
| The Valley          |        595 |
| Alofi               |        682 |
| Flying Fish Cove    |        700 |
| Kingston            |        800 |
+---------------------+------------+
10 rows in set (0.01 sec)

Here we get the least populated cities. This time we order the data in the ascending order. For this we use the ASC keyword.

mysql> SELECT Name, Population FROM City 
    -> ORDER BY Name LIMIT 10;
+------------------------+------------+
| Name                   | Population |
+------------------------+------------+
| A Coruña (La Coruña)   |     243402 |
| Aachen                 |     243825 |
| Aalborg                |     161161 |
| Aba                    |     298900 |
| Abadan                 |     206073 |
| Abaetetuba             |     111258 |
| Abakan                 |     169200 |
| Abbotsford             |     105403 |
| Abeokuta               |     427400 |
| Aberdeen               |     213070 |
+------------------------+------------+
10 rows in set (0.01 sec)

In the above SQL statement we order data by city name and get the first ten cities.

You do:
  • List the countries sorted asending by region name. *
  • List the countries sorted descending by GNP.
  • List the countries sorted asending first by region, then by GNP.

Grouping data

The GROUP BY clause is used to combine database records with identical values into a single record. It is often used with the aggregation functions (AVG, MIN, MAX, etc).

mysql> SELECT District, SUM(Population) FROM City
    -> WHERE District = 'New York' GROUP BY District;
+----------+-----------------+
| District | SUM(Population) |
+----------+-----------------+
| New York |         8958085 |
+----------+-----------------+
1 row in set (0.00 sec)

The above SQL statement returns the total number of people in the towns of the New York district, which are listed in our database.

mysql> SELECT Name, District, Population FROM City
    -> WHERE District = 'New York';
+-----------+----------+------------+
| Name      | District | Population |
+-----------+----------+------------+
| New York  | New York |    8008278 |
| Buffalo   | New York |     292648 |
| Rochester | New York |     219773 |
| Yonkers   | New York |     196086 |
| Syracuse  | New York |     147306 |
| Albany    | New York |      93994 |
+-----------+----------+------------+
6 rows in set (0.01 sec)

The previous number is a sum of these six cities.

mysql> SELECT District, SUM(Population) FROM City
    -> WHERE CountryCode = 'USA' GROUP BY District
    -> HAVING SUM(Population) > 3000000;
+------------+-----------------+
| District   | SUM(Population) |
+------------+-----------------+
| Arizona    |         3178903 |
| California |        16716706 |
| Florida    |         3151408 |
| Illinois   |         3737498 |
| New York   |         8958085 |
| Texas      |         9208281 |
+------------+-----------------+
6 rows in set (0.00 sec)

We select all districts which have population over 3 million people. When we work with groups of data, we use the HAVING clause instead of the WHERE clause.

You do:
  • What is the total surface area of all the countries on the planet? *
  • What the total surface area of each region on the planet?
  • List each region in the Europe, along with it's total population.
  • (More challenging, but possible) List the all the regions, along with the percentage of the total surface area they encompass.

Updating, deleting and inserting data

Next we will concern ourselves with updating, deleting and inserting data.

mysql> SELECT Name, HeadOfState FROM Country
    -> WHERE Name = 'United States';
+---------------+----------------+
| Name          | HeadOfState    |
+---------------+----------------+
| United States | George W. Bush |
+---------------+----------------+
1 row in set (0.00 sec)

As we have already stated, the world database is outdated. George Bush is not the president of the USA anymore.

mysql> UPDATE Country SET HeadOfState = 'Barack Obama'
    -> WHERE Name = 'United States';

With the UPDATE statement we change the row to the actual data.

mysql> SELECT Name, HeadOfState FROM Country WHERE Name = 'United States';
+---------------+--------------+
| Name          | HeadOfState  |
+---------------+--------------+
| United States | Barack Obama |
+---------------+--------------+
1 row in set (0.00 sec)

We have successfully updated the row.

mysql> CREATE TABLE First10 engine=MEMORY SELECT * FROM City LIMIT 10;

We create a temporary table in the memory. It will contain first ten cities from the City table.

mysql> SELECT * FROM First10;
+----+----------------+-------------+---------------+------------+
| ID | Name           | CountryCode | District      | Population |
+----+----------------+-------------+---------------+------------+
|  1 | Kabul          | AFG         | Kabol         |    1780000 |
|  2 | Qandahar       | AFG         | Qandahar      |     237500 |
|  3 | Herat          | AFG         | Herat         |     186800 |
|  4 | Mazar-e-Sharif | AFG         | Balkh         |     127800 |
|  5 | Amsterdam      | NLD         | Noord-Holland |     731200 |
|  6 | Rotterdam      | NLD         | Zuid-Holland  |     593321 |
|  7 | Haag           | NLD         | Zuid-Holland  |     440900 |
|  8 | Utrecht        | NLD         | Utrecht       |     234323 |
|  9 | Eindhoven      | NLD         | Noord-Brabant |     201843 |
| 10 | Tilburg        | NLD         | Noord-Brabant |     193238 |
+----+----------------+-------------+---------------+------------+
10 rows in set (0.00 sec)

This is the contents of the First10 table.

mysql> DELETE FROM First10 WHERE ID IN (2, 4, 6, 8, 10);

With the DELETE FROM statement and the WHERE clause we delete every second row from the First10 table.

mysql> SELECT * FROM First10;
+----+-----------+-------------+---------------+------------+
| ID | Name      | CountryCode | District      | Population |
+----+-----------+-------------+---------------+------------+
|  1 | Kabul     | AFG         | Kabol         |    1780000 |
|  3 | Herat     | AFG         | Herat         |     186800 |
|  5 | Amsterdam | NLD         | Noord-Holland |     731200 |
|  7 | Haag      | NLD         | Zuid-Holland  |     440900 |
|  9 | Eindhoven | NLD         | Noord-Brabant |     201843 |
+----+-----------+-------------+---------------+------------+
5 rows in set (0.00 sec)

We have five rows left in the table.

mysql> TRUNCATE TABLE First10;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM First10;
Empty set (0.00 sec)

We delete all rows from the table with the TRUNCATE statement. There is no data left.

mysql> INSERT INTO First10 VALUES(1, 'Kabul', 'AFG', 'Kabol', 1780000);

mysql> SELECT * FROM First10;
+----+-------+-------------+----------+------------+
| ID | Name  | CountryCode | District | Population |
+----+-------+-------------+----------+------------+
|  1 | Kabul | AFG         | Kabol    |    1780000 |
+----+-------+-------------+----------+------------+
1 row in set (0.00 sec)

With the INSERT INTO statement, we insert one row into the table.

mysql> DROP TABLE First10;
Query OK, 0 rows affected (0.00 sec)

We drop the table from the database.

You do:
  • Who is the listed president of Guatemala? *
  • Change the president of Guatemala to "Otto Pérez Molina". Show the result.
  • Create a temporary table called oceania made up of all countries in Oceania.
  • The ice is melting, and the water is rising, covering more land. Decrease the surface area of all countries in the new Oceania table by 5%. List the before and after results.
  • In the new table, drop all countries with less than 100 Square Miles. How many did you lose?
  • Insert into the new Oceania table a country called "Boatland", with all the other fields made up, and suitably amusing. Show the resulting row.
  • Drop the Oceania table.

What to turn in:

  1. Since you are doing this in a terminal, you can simply copy and paste the results into a document.
  2. Do all the exercises, but turn in the "You do" sections.
Topic revision: r8 - 2013-10-21 - JimSkon
 
This site is powered by the TWiki collaboration platformCopyright &© by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback