Database Normalization – Sample Paper

Database normalization refers to the process of organizing data in a database. Normalization tries to attain two goals. It strives to eliminate redundant data and ensuring data dependencies can make sense. Storing the same data in one table is one of the examples of eliminating redundant data. On the other hand, storing only related data in one table is one of the methods of making sure that data dependencies make sense. In most instances, normalization involves dividing large tables into smaller tables. The relationship between the tables is then defined. The aim of dividing the data into small tables is to isolate the data for any additions, deletions, or modifications to be made within a certain field easily. The alterations are then propagated throughout the rest of the database according to the defined relationships (Kedar, 2009).

Read also Installing The Adventureworks LT2012 Database Into The SQL Environment

To normalize database, one must follow a series of guidelines. The guidelines are referred to as normal forms. The normal forms are numbered from one – first normal form, 1NF – to five –fifth normal form, 5NF. The first normal form is the lowest form of normalization whereas the fifth normal form is the highest form of normalization. Data in unnormalized form can repeat within the same column. Therefore, to convert data to the first normal form one should eliminate duplicate columns that may exist in the same table. In addition, one should create a separate table for all types of data that are related. In addition, one should identify each row using a distinctive  column or set of columns, which are referred to as primary keys. To convert database to the second normal form one should remove subsets of data that are applicable to different rows of the table and insert them in separate tables.

Read also Database Consultant Services – Capacity Planning And Upgrade

One should then use foreign keys to define the relationship between the new tables and the tables where they came from. To convert database to the third normal form, one should remove columns that are not dependent on the primary key (Coronel, Morris & Rob, 2012).

A college may have data that represent the courses it offers, students and tutors that are in certain of certain course(s). Students may take several courses. In addition, tutors may teach more than one course. Therefore, it is vital for the college to have a record that would depict the above more efficiently. Below is an example of the record of the college.

Course code P101 Course Name Introduction to philosophy
Tutor ID 123544 Tutor Name Kaushik Khan
Student ID Student name Date of birth Gender Last attendance date
P2433 Peter Declan 09/ 12/1991 M
P4940 Tracy Morgan 04/ 12/1992 F
P8494 Morris Stevens 09/ 2/1993 M
P8763 Robert Michaels 18/ 6/1992 M
P2756 Cathleen Johnson 20/ 7/1993 F
P7464 Victoria James 30/ 8/1991 F

From the above data, it is clear that the data repeats itself more than once. Representation of the above data is complex since each course would have different number of students. Therefore, it is vital to convert the structure of the data into a form that would be easy to represent. Conversion of the data should be undertaken in stages. Since different courses would have different number of students, this would pose several problems. Converting the data into the first normal form would help in solving this problem. Conversion would make the students be represented by the course code. This would solve the problem of having different number of students for each course.

However, conversion of the data into the first normal form would present another problem. If the students enroll in another course, the details of the student would be repeated. However, the student detailed recorded in the database only once. The above situation occurred due to the fact that the table’s key instead of the whole key has referenced non-key attributes of the table. To convert the data to second normal form, the course code would reference the course name, tutor id, and tutor name. The course code and the student number would reference the last attendance name. The student number would reference the student name, date of birth and gender.

From the data, there is a key and dependant relationship between the tutor id and tutor name. Therefore, they should be removed to make another table. Therefore, the third normal form would prevent the duplication of the tutor’s details. In the third normal form, the course code would reference the course name and tutor id. The tutor id would have a foreign link that would link it to the original table. The course code and the student number would reference the last attendance name. The student number would reference the student name, date of birth and gender. The tutor id would reference the tutor’s name.

Un-normalized form First normal form Second normal form Third normal form entity
  • Course code
  • Course code
  • Course code
  • Course code
Course
Course name Course name Course name Course name
Tutor id Tutor id Tutor id *Tutor id
Tutor name Tutor name Tutor name
  • Student no
Student name
  • Course code
  • Course code
  • Course code
classlist
Date of birth
  • Student no
  • Student no
  • Student no
 Gender Student name Last attendance date Last attendance date
Last attendance date Date of birth
 Gender
  • Student no
  • Student no
Student
Last attendance date Student name Student name
Date of birth Date of birth
 Gender  Gender
Last attendance date Last attendance date
  • Tutor id
tutor
Tutor name

Normalization leads stores information in several separate logical tables. Therefore, the database query would have to draw information from the large number of logical tables, which would make the operations be slow. Denormalization helps in improving the query response. However, one should ensure that denormalization does not make the database become inconsistent. A company may denormalize its database to allow customers to alter their names or address. Denormalization would enable the company to retrieved customers information if there are changes in their details.

The policies of an organization would determine how database tables are normalized and demormalized. The rules determine how data items would be defined or used. Therefore, prior to the normalization or denormalization of data, it is vital to have a consensus on the definition and use of various data items. For example, there must be a consensus on the meaning of customer. Different people within the same company may have different definitions of customer. Business rules define the meaning and use of various terms (Coronel, Morris & Rob, 2012).

Scroll to Top