Normalization -Part 1

First Normal Form (1NF):

The values should be atomic . There should be no repeating data in a single column. And a Primary Key should be assigned to the table.

For example : Taking example from Ramaz Elmasri book , consider a company with multiple department locations in USA.

1

The DNumber is the Primary Key . We cannot have multiple values as they are in DLocations column  {Bellaire, Sugarland ,Houston}. They should be atomic. To normalize it to 1NF , we can easily apply the following technique:

  1. As we see, DNumber and DLocations uniquely identify each row , so we can make {DNumber and DLocations} as the combined Primary Key.

2

Second Normal Form (2NF):

For the table to be in 2NF , it should be in 1NF and secondly it should not have any partial dependency (means every non key column should not be dependent partially on the Primary Key) . We can say that every non key column should be fully dependent on the Primary Key.

Consider the following table EMP_PROJ:

3

To achieve it into 1NF , make SSN and PNumber the Primary Key .

But what about the functional dependencies ? Remember we told that there should be no partial dependency on part of the Primary Key . Here EName is dependent on SSN !

To further elaborate , consider the following table :

5

Consider the SSN and PNumber as the Primary Key . The functional dependency FD1 , Hours is dependent fully on SSN and PNumber , which is fine according to the 2NF rule as it is fully dependent on PK. But look at FD2 and FD3 , EName is dependent only on SSN which violates our 2NF rule of being fully dependent on whole Key . Also FD3 , PName and PLocation is also dependent on PNumber which also violates the 2NF rule .

To achieve 2NF , we break the tables as the following :

6

The first table above : EP1 consists of the FD1 column Hours which was fully dependent on whole PK {SSN,PNumber}.

EP2 table we made for FD2 , Ename which was dependent on SSN .

EP3 table we made for FD3 , PName and PLocation which was dependent on PNumber .

So in this way we can achieve the 2NF. The EMP_Proj table is now in 2NF .

 

Third Normal Form (3NF):

For the table to be in 3NF ,it should be in 2NF and there should be no transitive dependency . Which means that any non key column should not be transitively dependent on Primary key.

Transitive dependency : X->Y so Y->Z.

Consider the following table:

7

To explain , see the DNumber column which dependent on SSN. So consider , SSN as X , DNumber as Y  and Dname as Z .  Or we can say the DName and DMGRSSN is dependent on DNumber which is a non Key column .

To achieve , 3NF , we can do the following :

8

We separated the DName and DMGRSSN in a separate table whose PK is DNumber !

 

Thanks

 

Leave a comment