
Institutt for datateknikk og informasjonsvitenskap

Norges teknisk-naturvitenskapelige universitet

TDT4145 Data modeling and database management systems

Exercise 4: Normalization, B+-trees and hashing


Problem 1: Normalization theory
(a) Define the following concepts: key, superkey, and functional dependency.
(b) Explain what is meant by the closure, X+, of a set of attributes, X, with respect to a set of
functional dependencies. Show an algorithm to find X+.
(c) Assume F = {a → e, ac → d, b → c}. Calculate the following closures: a+, ab+, e+. (A small
sketch of the closure algorithm from (b) is given after this problem.)
(d) How do you decide if a set of attributes is a superkey for a table? How do you decide that a
superkey is also a key?
(e) How do you decide if a decomposition of a table (R) into two projections (R1 and R2) has the
lossless join property?
(f) Assume a table R(a, b, c, d) and the functional dependency F = {b → c}. Three decompositions
of R are shown below. Determine which decompositions have the lossless join property. You must
explain your answer.
1. R1(a, b, c) and R2(b, c, d)
2. R1(a, b, d) and R2(b, c, d)
3. R1(a, b, d) and R2(b, c)
(g) Give a definition of the third normal form (3NF).
(h) The table R(A B C D) is not in 3NF when the functional dependencies are F = {A → B, C → D}.
A possible decomposition of R is R1(A B), R2(C D) and R3(A C). Is this decomposition a good
solution? Explain your answer.
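As promised above, here is a minimal Python sketch of the closure algorithm from (b), useful for
checking the answers to (c) by hand. The FD set below encodes F as given in (c); treat it as an
illustration, not a prescribed implementation.

    # Attribute closure: repeatedly apply every FD whose left-hand side
    # is already contained in the closure, until nothing more is added.
    def closure(attrs, fds):
        result = set(attrs)
        changed = True
        while changed:
            changed = False
            for lhs, rhs in fds:
                if lhs <= result and not rhs <= result:
                    result |= rhs
                    changed = True
        return result

    # F = {a -> e, ac -> d, b -> c} from (c)
    F = [({"a"}, {"e"}), ({"a", "c"}, {"d"}), ({"b"}, {"c"})]
    print(closure({"a"}, F))       # {'a', 'e'}
    print(closure({"a", "b"}, F))  # all of {'a', 'b', 'c', 'd', 'e'}
    print(closure({"e"}, F))       # {'e'}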
Problem 2: 16.31 Static hashing
A PARTS file with Part# as the hash key includes records with the following Part# values: 2369,
3760, 4692, 4871, 5659, 1821, 1074, 7115, 1620, 2428, 3943, 4750, 6975, 4981, and 9208. The file
uses 8 buckets, numbered 0–7. Each bucket is one disk block and holds two records. Load these
records into the file in the given order, using the hash function h(K) = K mod 8. Calculate the average
number of block accesses for a random retrieval on Part#.
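For self-checking, a small Python sketch of this load. The overflow scheme is an assumption:
overflow records are chained from the home bucket, overflow blocks also hold two records, and each
overflow hop costs one extra block access.

    # Static hashing, Problem 2: 8 buckets of capacity 2, h(K) = K mod 8.
    parts = [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115,
             1620, 2428, 3943, 4750, 6975, 4981, 9208]
    CAP = 2
    buckets = {i: [] for i in range(8)}   # home bucket -> records, in load order
    for k in parts:
        buckets[k % 8].append(k)

    total = 0
    for recs in buckets.values():
        for pos in range(len(recs)):
            # 1 access for the home block, plus one per overflow block hop
            total += 1 if pos < CAP else 2 + (pos - CAP) // CAP
    print(total / len(parts))  # average accesses; 17/15 = 1.133... for these data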
Problem 3: 16.32 Extendible hashing
Load the records of problem 2 (16.31) into expandable hash files based on extendible hashing. Show
the structure of the directory at each step, and the global and local depths. Use the hash function h(K)
= K mod 128. Start with global depth 2 and four blocks.
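Extendible hashing is easiest to trace from the binary form of the hash values, so a small helper
like the following (an aid, not part of the required answer) prints h(K) = K mod 128 as 7-bit
strings. It assumes the common textbook convention of indexing the directory with the high-order
bits of h(K); if your course uses the low-order bits, read the strings from the right instead.

    # Print each Part# with its 7-bit hash value for the hand trace.
    for k in [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115,
              1620, 2428, 3943, 4750, 6975, 4981, 9208]:
        print(f"{k}: h = {k % 128:3d} = {k % 128:07b}")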

Problem 4: 16.33 Linear hashing
Load the records of problem 2 (16.31) into an expandable hash file based on linear hashing. Start with
one block, using the hash function h0 = K mod 2^0, and show how the file grows and how the hash
function changes as the records are inserted. Assume that blocks are split whenever an overflow
occurs, and show the value of n at each stage.
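A compact Python sketch of this process, under the problem's assumptions (block capacity 2, split of
bucket n on every overflow). It simplifies one detail: overflow records are kept in the same Python
list as their bucket rather than in a separate overflow block, since only the bucket contents and the
values of n and the level matter for the hand trace.

    from collections import defaultdict

    parts = [2369, 3760, 4692, 4871, 5659, 1821, 1074, 7115,
             1620, 2428, 3943, 4750, 6975, 4981, 9208]
    CAP = 2          # records per block
    level, n = 0, 0  # current hash level and next bucket to split
    buckets = defaultdict(list)

    def addr(k):
        # h_level applies, except buckets already split use h_{level+1}
        a = k % (2 ** level)
        return k % (2 ** (level + 1)) if a < n else a

    for k in parts:
        buckets[addr(k)].append(k)
        if len(buckets[addr(k)]) > CAP:      # overflow: split bucket n
            for r in buckets.pop(n, []):     # rehash with h_{level+1}
                buckets[r % (2 ** (level + 1))].append(r)
            n += 1
            if n == 2 ** level:              # every bucket of this level split
                level, n = level + 1, 0
        print(f"after {k}: level={level}, n={n}, buckets={dict(buckets)}")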
Problem 5: B+-trees
(a) Assume we are to create a B+-tree index for a student database with 300,000 students. The
primary key for a student is the student number, an integer with a 4-byte representation. Assume a
block identifier to be 4 bytes. On the leaf level there is a pointer to the block where the student
record is stored. This pointer is a block identifier as well. Assume a block to be 4096 bytes. How
many blocks will exist at the different levels of the B+-tree? Explain the assumptions that you
make. (A sketch of one possible calculation is given after this problem.)
(b) If you are to insert student 300001 into the student database, how many disk accesses does this
require? Explain the assumptions you make.
(c) If you are to find and update a leaf-level record of the B+-tree, how long does it take?
Assume reasonable disk access timing and describe these assumptions.
(d) Assume there is space for 3 records in each block in a B+-tree. Assume the B+-tree to initially be
empty. Show the B+-tree after each insertion of records with the following keys: 12, 9, 3, 18, 22, 1,
37, 11.
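For (a), here is the promised back-of-envelope script, using the stated sizes (4-byte keys, 4-byte
block identifiers, 4096-byte blocks) plus one simplifying assumption of our own: every node is
completely full. Real B+-tree nodes are typically around 67–70% full, so state your own fill-factor
assumption in the answer.

    import math

    BLOCK, KEY, PTR = 4096, 4, 4
    STUDENTS = 300_000

    # An internal node with n child pointers holds n - 1 keys:
    # n*PTR + (n - 1)*KEY <= BLOCK  =>  n = 512
    fanout = (BLOCK + KEY) // (PTR + KEY)
    # A leaf entry is one search key plus one pointer to the record's block
    # (a next-leaf sibling pointer would cost roughly one entry more).
    leaf_entries = BLOCK // (KEY + PTR)

    blocks = math.ceil(STUDENTS / leaf_entries)
    print("leaf blocks:", blocks)              # 586
    while blocks > 1:
        blocks = math.ceil(blocks / fanout)
        print("blocks one level up:", blocks)  # 2, then 1 (the root)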

