CS101b - Assignment 2

The Structured Query Language

Please keep track of the total time you spend on this assignment, and make a note of that on what you turn in. This way we can tune the assignments for future classes.

Suggested Reading

This week's assignment draws largely from Chapter 3. The sections for lectures 4 and 5 give the detailed break-down. Generally, you don't have to worry about WITH, views, or nulls for this assignment - those will be part of future assignments.

Also, the problems will only include queries that can be implemented in terms of Cartesian products, or nested subqueries in the FROM or WHERE clause. You will not need to use the extended join syntax in these problems, and you will not need outer joins.

(OPTIONAL) If you are interested in the roots of relational database systems in general, and SQL in particular, you might find this paper interesting:

System R is a really important relational database system -- it was one of the very first implementations of the relational model, and the first database system to implement SEQUEL, the predecessor to the modern SQL language. The paper includes many points regarding the implementation of relational database systems that are taken as givens, now, but had to be discovered through trial and error. Great stuff!

Book Exercises

Do these problems from the "Exercises" section of Chapter 3. These problems require you to write SQL statements to perform various queries or modifications to a database. Follow good SQL coding style.

You can save your answers into a plain-text file, and send me the file.

Problems:

Each part of a problem is worth three points. There are 17 parts in all, for a total of 51 points.

MySQL Database Exercises

These exercises use the SQL schema for the bank database that we have discussed so far. Note that the schema in many of the lecture slides is different from the schema in the book (and the SQL file), so review the SQL file before trying these exercises.

As mentioned earlier, none of these queries should require outer joins. All of the queries should only require the Cartesian product FROM syntax (i.e. FROM rel1, rel2, ...), and nested subqueries.

(Each separate query or operation is worth 3 points, as in the previous section.)

Feel free to consult the MySQL 5.0 online documentation, if you have any questions.

As you complete each of these exercises, save the commands you write into a file, marking them with the particular exercises they are for. You can use the MySQL client's "tee" feature to do this very easily. Just run the client as follows (assuming you want to be the root user in your database):

    mysql -u root --tee=hw2.txt

Now, as you use the MySQL client, every single command you type, and its output, will be stored into the file hw2.txt. You can then easily edit this into a form suitable for turning in.

Also, if you get really aggravated by the MySQL client's tendency to beep loudly at you for typing errors, look in the manual for the no-beep option. It's so useful that you might want to put it into mysql.cnf.

Exercise 1

Start the MySQL client, and create a new database using the CREATE DATABASE command. You can call it bank, or if you are using someone else's installation of MySQL 5, you should come up with your own unique name for it. (Do NOT use someone else's database; you should do all of these steps yourself.) Make sure to switch to this new database with the USE command before doing anything else.

Import the make-banking.sql schema and data using the source command in the client. Note that this is not a SQL command per se; it is simply a command that the client provides for importing SQL files. (Use the help command to learn about source and the other client commands you can use.)

If the file imports correctly, you should see no errors. You may see a few warnings for the very first commands; these are because the SQL schema file will try to drop the bank database tables if they already exist; if they don't exist, you get a little warning.

Exercise 2 (6 parts)

Here are some general exercises to try out. Write a SQL query for each of these questions. Make sure your query runs against the database and generates a correct result. You can include the result if you like, but the main point is to turn in the SQL query itself.

Exercise 3

Try out your answers to the book problem 3.15 on the bank database. Make sure that the answers to both parts a and b are the same.

(If you have to tweak your SQL syntax from what the book specifies, make sure to note what the differences are. MySQL doesn't follow the standards perfectly...)

Exercise 4 (2 parts)

In class we discussed using a Cartesian product to find the loan with the largest amount. This involved a two step process, where the first step is to find all loans with an amount that is smaller than some other loan. That part of the SQL statement looked like this:

    SELECT DISTINCT loan.loan_number FROM loan, loan AS test
      WHERE loan.amount < test.amount;

The rest of the query involves taking the set-difference of this result from the set of all loan numbers. We know that the SQL EXCEPT operation will do this for us; however, MySQL 5.0 doesn't implement the EXCEPT operation! Fortunately, you can work around this limitation using the set-membership tests and a nested query.

Write a SQL query that reports the loan-number and the amount of the loan with the largest amount. Your answer should use the above SELECT as a nested query, and it shouldn't use any aggregation functions.

Once you have completed that query and tried it against the database, write a second query that uses aggregate functions to show the loan-number and the amount of the loan with the largest amount.

Exercise 5 (4 parts)

The bank wants to do some things with its branches. Write SQL commands for the following queries and operations, and run them against the database.

Exercise 6

The bank wants a dense rank of its bank branches, based on the amount of assets the branch holds. A dense rank assigns a 1 to the top-most value, a 2 to the second highest value, and so forth. If several entries have the same value, they are assigned the same rank, but there is a corresponding gap to the next rank value. For example, if you had the following values:

Their dense rank would be as follows:

In particular, note that there is no value with a rank of 3, because two values have a rank of 2.

If a value has a dense rank of 2, this means there is one other value that is greater. If a value has a dense rank of 8, there are seven other values that are greater. Given this, you can actually use grouping and aggregation, against a Cartesian product, to compute a dense rank.

Write a SQL query that computes a dense rank of the bank branches, based on the amount of assets that each branch holds. You should take note that the trickiest part of computing a dense rank is handling the situations where multiple values are assigned the same rank, so test your query against data where multiple branches have the same amount of assets. (A good candidate would be to set the Mianus branch to have the same amount of assets as the Central branch.)

Hint: You may find that you need to compute the dense rank of all entries with rank 2 and lower, in one query, and then UNION in the entry (or entries!) with rank 1 in a separate query.

Hint: You can select values, just like you can select attributes or mathematical expressions using attributes. For example, SELECT 10 AS ten; is a valid SQL statement.


Copyright (c) 2007, California Institute of Technology.
Last updated January 18, 2007.