Database Normalization – Sample Paper

Database normalization refers to the process of organizing data in a database. Normalization tries to attain two goals. It strives to eliminate redundant data and ensuring data dependencies can make sense. Storing the same data in one table is one of the examples of eliminating redundant data. On the other hand, storing only related data in one table is one of the methods of making sure that data dependencies make sense. In most instances, normalization involves dividing large tables into smaller tables. The relationship between the tables is then defined. The aim of dividing the data into small tables is to isolate the data for any additions, deletions, or modifications to be made within a certain field easily. The alterations are then propagated throughout the rest of the database according to the defined relationships (Kedar, 2009).

To normalize database, one must follow a series of guidelines. The guidelines are referred to as normal forms. The normal forms are numbered from one – first normal form, 1NF – to five –fifth normal form, 5NF. The first normal form is the lowest form of normalization whereas the fifth normal form is the highest form of normalization. Data in unnormalized form can repeat within the same column. Therefore, to convert data to the first normal form one should eliminate duplicate columns that may exist in the same table. In addition, one should create a separate table for all types of data that are related. In addition, one should identify each row using a distinctive  column or set of columns, which are referred to as primary keys. To convert database to the second normal form one should remove subsets of data that are applicable to different rows of the table and insert them in separate tables.

Read also Database Consultant Services – Capacity Planning And Upgrade

One should then use foreign keys to define the relationship between the new tables and the tables where they came from. To convert database to the third normal form, one should remove columns that are not dependent on the primary key (Coronel, Morris & Rob, 2012).

A college may have data that represent the courses it offers, students and tutors that are in certain of certain course(s). Students may take several courses. In addition, tutors may teach more than one course. Therefore, it is vital for the college to have a record that would depict the above more efficiently. Below is an example of the record of the college.

Course codeP101Course NameIntroduction to philosophy
    
Tutor ID123544Tutor NameKaushik Khan
    
Student IDStudent nameDate of birthGenderLast attendance date
P2433Peter Declan09/ 12/1991M 
P4940Tracy Morgan04/ 12/1992F 
P8494Morris Stevens09/ 2/1993M 
P8763Robert Michaels18/ 6/1992M 
P2756Cathleen Johnson20/ 7/1993F 
P7464Victoria James30/ 8/1991F 

From the above data, it is clear that the data repeats itself more than once. Representation of the above data is complex since each course would have different number of students. Therefore, it is vital to convert the structure of the data into a form that would be easy to represent. Conversion of the data should be undertaken in stages. Since different courses would have different number of students, this would pose several problems. Converting the data into the first normal form would help in solving this problem. Conversion would make the students be represented by the course code. This would solve the problem of having different number of students for each course.

However, conversion of the data into the first normal form would present another problem. If the students enroll in another course, the details of the student would be repeated. However, the student detailed recorded in the database only once. The above situation occurred due to the fact that the table’s key instead of the whole key has referenced non-key attributes of the table. To convert the data to second normal form, the course code would reference the course name, tutor id, and tutor name. The course code and the student number would reference the last attendance name. The student number would reference the student name, date of birth and gender.

From the data, there is a key and dependant relationship between the tutor id and tutor name. Therefore, they should be removed to make another table. Therefore, the third normal form would prevent the duplication of the tutor’s details. In the third normal form, the course code would reference the course name and tutor id. The tutor id would have a foreign link that would link it to the original table. The course code and the student number would reference the last attendance name. The student number would reference the student name, date of birth and gender. The tutor id would reference the tutor’s name.

Un-normalized formFirst normal formSecond normal formThird normal formentity
  • Course code
  • Course code
  • Course code
  • Course code
Course
Course nameCourse nameCourse nameCourse name 
Tutor idTutor idTutor id*Tutor id 
Tutor nameTutor nameTutor name  
  • Student no
    
Student name
  • Course code
  • Course code
  • Course code
classlist
Date of birth
  • Student no
  • Student no
  • Student no
 
 GenderStudent nameLast attendance dateLast attendance date 
Last attendance dateDate of birth   
  Gender
  • Student no
  • Student no
Student
 Last attendance dateStudent nameStudent name 
  Date of birthDate of birth 
   Gender Gender 
  Last attendance dateLast attendance date 
     
   
  • Tutor id
tutor
   Tutor name 

Normalization leads stores information in several separate logical tables. Therefore, the database query would have to draw information from the large number of logical tables, which would make the operations be slow. Denormalization helps in improving the query response. However, one should ensure that denormalization does not make the database become inconsistent. A company may denormalize its database to allow customers to alter their names or address. Denormalization would enable the company to retrieved customers information if there are changes in their details.

The policies of an organization would determine how database tables are normalized and demormalized. The rules determine how data items would be defined or used. Therefore, prior to the normalization or denormalization of data, it is vital to have a consensus on the definition and use of various data items. For example, there must be a consensus on the meaning of customer. Different people within the same company may have different definitions of customer. Business rules define the meaning and use of various terms (Coronel, Morris & Rob, 2012).

Get Your Custom Paper From Professional Writers. 100% Plagiarism Free, No AI Generated Content and Good Grade Guarantee. We Have Experts In All Subjects.

Place Your Order Now
Scroll to Top