
304
10 DATABASE AND FILE CONCEPTS
10
10.10 Normalising data
Normalisation is a technique used to reduce the duplication of data in a
relational database. It helps to:
» organise data in an efficient way
» remove redundant (duplicated) data
» make sure that only related data is stored in a table.
This technique is a multi-step process, where each step has a rule that improves
the efficiency of the database. These rules are called the Normal Forms,
numbered from 0 to 5. We will only study un-normalised data (0NF) to Third
Normal Form (3NF). Fourth and Fifth Normal Form (4NF and 5NF) are
beyond the scope of this book and exam.
10.10.1 Un-normalised Form (0NF or UNF)
If a database is not normalised it is called an un-normalised database, often
shortened to 0NF or sometimes UNF. This is often a flat-file database (a
single table) that contains duplicated data (which is called redundant data) and
complex data structures (more than one item of data, otherwise known as non-
atomic data) stored within a single field.
Task 10ae
Changethefollowingdataonstudents,theirhouseandtheircourses,intoFirst,
Second and Third Normal Form.
Student name House Course
Jane Smith York Computing, Mr Brown, Room 53, Maths, Miss White,
Room 104, Statistics, Miss White, Room 104
Ruksana Patel Lancaster Business, Mrs Grey, Room 3, Maths, Miss White, Room
104, Science, Mr Green, Room 24
Jamal Aziz York Computing, Mr Brown, Room 53, Maths, Miss Black, Room 102
Jet Li Xa Lancaster Maths, Miss Black, Room 102, Science, Mr Green, Room 24
Jane Smith Edinburgh Statistics, Miss White, Room 104
Each teacher teaches in a single classroom.
Save your work at each stage as Task 10ae_1, Task 10ae_2 and Task 10ae_3.
Create a new word-processed document which contains the data structures for
this task. Save this as Task 10ae_1.
10.10.2 First Normal Form (1NF)
The rules for a database normalised to 1NF are:
» All data is stored in a database table.
» A unique key must exist in each table (often a primary or compound key).
» Only atomic data is stored (which is data stored to the lowest level of data
and cannot be broken down any further).
» Each field has a unique name.
» Each record is unique (so there are no repeated rows in any table and a
primary/compound key exists in each table).
» There are no repeating groups of columns.
To convert the data in Task 10ae into 1NF we must first ensure that a unique
key exists in each table. There are two students with the same name. Looking
at this data it is possible (although unlikely) that two students with identical
names, houses and courses could exist, so a new field is required as a primary
key field. We will call this field
St_ID. Each St_ID will be a unique number.