CS101 - Intro to Relational Databases
Final Exam
The final exam consists of two problems, a design problem and a functional
dependency problem. The exam is designed to be completed in under 3 hours, but
you can have up to 4 hours, just in case you end up needing it.
You may consult any course materials, including lecture slides, the text book
(and errata on the book website!), graded homeworks and solution sets, and the
MySQL user's manual. You may also try your SQL DDL and/or DML commands in MySQL
if you wish.
You are not allowed to discuss the contents of the final exam with anyone else,
until after you have submitted your completed exam.
If you run into some ambiguity on the final exam, note the ambiguity and make a
reasonable choice about how to resolve it. If it really is an issue, I won't
penalize you for my inability to write a final exam. Please also let me know so
that I can make an update.
Problem 1
The first problem is a simple database schema design problem. Following is a
specification for the database schema.
The database schema you will be designing is intended to be used in a photo and
video sharing website. Users can create accounts on the website, and then
upload photos or video files to the website to share with others. Photos and
videos can be organized into groups as well. Finally, users can leave comments
on other people's photos and videos.
Initial Notes
None of the specifications below state that you must create numeric IDs for any
of these items, although you may find it useful to do so. Feel free to add
numeric ID fields wherever you see fit.
In addition, if you see a need to add any other attributes to any of these
entities, feel free to do so. Simply comment briefly as to why you are adding
each extra attribute.
Finally, you may notice a number of constraints that one might want to
enforce, if this were going to be a real database schema worthy of a production
site, that aren't stated below. Don't worry about that. You probably don't
realize how much is actually missing here!
User Accounts
User accounts must have the following attributes:
-
A username that is no more than 20 characters. Every user must have a
unique username.
-
Every user must also specify an e-mail address. User e-mail addresses don't
need to be unique.
Yes, no passwords! (You can add a password for each account if you like, but it
is not a requirement for the problem.)
Photos and Videos
Photos and videos must be represented within the database. The main difference
between the two is that each photo's data will be stored within the database (as
blobs), but the data for each video will be stored on the filesystem, and only a
path to the video file is stored within the database. Both photos and video
files must have these details:
-
Every photo or video must have a one-line title. This title doesn't have
to be unique, but some value must be specified.
-
Every photo or video may also optionally have a text description. This
description is expected to be no larger than a single paragraph.
-
Each photo or video has an "upload time" specifying when it was added to the
website. This may not be null.
-
Every photo or video will also have a total number of views, which is the
number of times it has been downloaded from the website. This value should
default to 0 when the item is first uploaded to the website, and should be
present on all items.
-
Of course, every photo or video must be associated with the user who
uploaded the item. This is required.
-
Each photo or video may be put into at most one group. Items are not
required to appear within a group, but if they are, they may only be stored
within one group.
In addition to the above properties, photos must also specify these additional
values:
- The filename of the photo. This value excludes any path information.
-
The data for the photo itself. You can use a BLOB column to store
the photo data.
Videos, on the other hand, have these properties:
-
The total length of the video in seconds. This value can be a whole
number; fractional seconds are not necessary. The value will be formatted
into a more human-readable format in the presentation layer. This value
must be specified for each video.
-
A path and filename for where the video file is stored on the server. This
should be at least 1K characters, since most filesystems support paths of
at least this length. Again, this value must of course be specified.
Groups
Each user can create some number of groups, to organize their photos. There is
no requirement to store media items in groups, but they are available if a
user wants to use them. Each group has these characteristics:
-
A group name must be specified. The group name is not required to be
unique. (It would be nice to enforce that a particular user's groups are
all uniquely named, but don't worry about doing this.)
-
An optional group description, which may be up to a paragraph in length.
Each group must be associated with the user who created it. Also, as mentioned
before, each photo or video may be stored in at most one group; an item is not
allowed to appear in multiple groups.
Each group may also optionally specify a "summary photo" for the group. This is
a photo that will be displayed for the group when a user's groups are listed on
their webpage. Videos may not be used for this, only photos.
NOTE: Don't worry about enforcing that a group can only contain items
uploaded by the group's owner. Also, don't worry about enforcing that a group's
summary photo was uploaded by the group's owner. These constraints definitely
make the database more complex...
Comments
Users are allowed to leave comments on the photos or videos on the website. The
database should store:
- The user who left the comment
- The time the comment was created
- The item (photo or video) that the comment is for
-
The text of the comment itself, which should not be more than a few
sentences.
A user should be able to leave multiple comments on the same item; all comments
for an item will be listed in time order. Each comment should be associated
with only one photo or video.
Your Tasks
-
Create an E-R diagram for a database schema that satisfies the above
specification. Make sure to indicate mapping cardinalities, participation
constraints, and any other constraints required by your design. Missing
constraints will be penalized. Also, make sure your diagram is neat and
readable; sloppiness will also be penalized. (10 points)
-
Map the E-R diagram to a relational model schema. Make sure to indicate
all primary keys in your schemas. In addition, make a note of any other
candidate keys that must be enforced. (6 points)
-
Create the sequence of SQL DDL commands necessary for creating the database
schema. (You must use valid SQL syntax. You can constrain yourself to
MySQL syntax, if you are so inclined.) Make sure to specify reasonable
types, sizes, and nullity constraints on each column. Also, specify all
primary keys, foreign keys, and other candidate keys in your SQL DDL. Bad
type choices or missing constraints will be penalized. (8 points)
-
Finally, write SQL queries against your schema, for the following tasks:
-
Report the title, time uploaded, and the number of views for the top 20
photos and videos uploaded by the user "bopeep", in descending order of
number of views. (Clarification: Top 20 items; you don't need to
compute the top 20 photos, and then also top 20 videos.)
-
For every group created by the user "garth", report the group's name,
its description, the number of photos and videos in the group, and the
filename of the group's summary photo (if any). (Clarification:
As before, I mean the total number of media items; no need to treat
photos and videos separately.)
-
Write a query that reports the top comment-posters on the website. The
result should contain each user's username, and the total number of
comments that user has posted. List only the top 10 users, in
descending order of comments posted.
(6 points)
Problem 2
Given the relation-schema R(A, B, C,
D, E, G)
and a set of functional dependencies
F = { D → BCE,
BG → A,
C → BE,
CE → B }.
(The schema has no attribute F to avoid confusion...)
-
Find a candidate key for R, and demonstrate that it is a superkey
for R. (5 points)
-
Compute a canonical cover of F. Make sure to show all steps in
your computation. (9 points)
-
Create a 3NF decomposition of R. (6 points)
Last updated March 19, 2007.