Category: 4. Normalization

Normalization -Part 1

First Normal Form (1NF):

The values should be atomic . There should be no repeating data in a single column. And a Primary Key should be assigned to the table.

For example : Taking example from Ramaz Elmasri book , consider a company with multiple department locations in USA.

1

The DNumber is the Primary Key . We cannot have multiple values as they are in DLocations column  {Bellaire, Sugarland ,Houston}. They should be atomic. To normalize it to 1NF , we can easily apply the following technique:

  1. As we see, DNumber and DLocations uniquely identify each row , so we can make {DNumber and DLocations} as the combined Primary Key.

2

Second Normal Form (2NF):

For the table to be in 2NF , it should be in 1NF and secondly it should not have any partial dependency (means every non key column should not be dependent partially on the Primary Key) . We can say that every non key column should be fully dependent on the Primary Key.

Consider the following table EMP_PROJ:

3

To achieve it into 1NF , make SSN and PNumber the Primary Key .

But what about the functional dependencies ? Remember we told that there should be no partial dependency on part of the Primary Key . Here EName is dependent on SSN !

To further elaborate , consider the following table :

5

Consider the SSN and PNumber as the Primary Key . The functional dependency FD1 , Hours is dependent fully on SSN and PNumber , which is fine according to the 2NF rule as it is fully dependent on PK. But look at FD2 and FD3 , EName is dependent only on SSN which violates our 2NF rule of being fully dependent on whole Key . Also FD3 , PName and PLocation is also dependent on PNumber which also violates the 2NF rule .

To achieve 2NF , we break the tables as the following :

6

The first table above : EP1 consists of the FD1 column Hours which was fully dependent on whole PK {SSN,PNumber}.

EP2 table we made for FD2 , Ename which was dependent on SSN .

EP3 table we made for FD3 , PName and PLocation which was dependent on PNumber .

So in this way we can achieve the 2NF. The EMP_Proj table is now in 2NF .

 

Third Normal Form (3NF):

For the table to be in 3NF ,it should be in 2NF and there should be no transitive dependency . Which means that any non key column should not be transitively dependent on Primary key.

Transitive dependency : X->Y so Y->Z.

Consider the following table:

7

To explain , see the DNumber column which dependent on SSN. So consider , SSN as X , DNumber as Y  and Dname as Z .  Or we can say the DName and DMGRSSN is dependent on DNumber which is a non Key column .

To achieve , 3NF , we can do the following :

8

We separated the DName and DMGRSSN in a separate table whose PK is DNumber !

 

Thanks

 

Higher Normal Forms

BOYCE-CODD NORMAL FORM

  • For a table to be in BCNF , it should have non trivial functional dependency (X->Y so that Y is subset of X).
  • So, X is the superKey.

Super Key :

Super Key is a set of columns that uniquely identifies a row.

For instance : Consider the following table : Customer , having the columns :

SSN, CustomerID, CustomerName,DOB, Address, Amount

Here the set of columns that can form the super key are:

{SSN, CustomerName,DOB}

{SSN,CustomerID,CustomerName}

{SSN,CustomerName}

They can easily uniquely identify the row.

 

Click to access BCNF.pdf

Functional Dependencies

It is essential to have understandings of functional dependencies in normalizing the tables.

“A value X determines value of Y .”  or X->Y.

Which means that it is a constraint that Y is determined by the value of X but this is not Vice versa !

It does not necessarily means that Y determines X.

For instance, In USA , every person has a unique social security No ,referred to as SSN , uniquely identifies each person. So , according to the definition , consider X being the SSN and Y being a person name .

Example :

SSN : 11223344 is assigned to the person ABC . This satisfies the functional dependency SSN -> personName

But vice versa is not necessary that it fulfills the condition . There can be hundreds or thousands of Mr ABC in the USA ! so vice versa cannot satisfy the condition.

Remember the following functional dependency Rules :

  1. Reflexivity : If Y is subset of X , then X determines Y  that is X->Y.

For example : Consider {EmpID,EmpName} -> EmpID , this satisfies the rule of reflexivity because EmpID is a subset of {EmpID,EmpName} .

  2.  Augmentation : If X-> Y , then XZ->YZ .

For example : Consider EmplD being X , EmpName being Y and Z being EmpBirthDate . EmpID -> EmpName , so {EmpID, EmpBirthDate} -> {EmpName , EmpBirthDate} .

      3. Transitive :  If X -> Y and Y -> Z is true, then X-> Z is a transitive dependency.

MovieID, ListID, Genre and Price.

MovieID-> ListID

ListID-> Genre

   4. Decomposition : {X→YZ} |=X→Y.

If a table has two columns which are determined by the same primary key then break them up into two different tables.

5. Union : {X→Y, X→Z} |=X→YZ .

Which means that if X determines Y and X determines Z then Y and Z can be put together in a table depending on the situation because they can be determined by same Primary Key X.

  6. Pseudotransitive : If X-> Y and YW->Z ,then XW->Z .

 

Thank You

Normalization Exercise # 1

Here we will work on exercise to break the table up till 3NF:

Original Table :

1NF

Making CustNo and PropNo as Primary Keys:

To make things easier , find functional dependencies , so it will make easier for us to break and understand the table :

We have three relations having the following functional dependencies:


– CustNo, PropNo -> RentStart, RentFinish
– CustNo -> CustName
– PropNo ->Paddress, Rent, OwnerName, Oname

As we see the functional dependencies , we can come up with the 2NF:

– Customer(CustNo, Cname)
– Rental(CustNo, PropNo, RentStart, RentFinish)
– Property_Owner(PropNo, Paddress, Rent, OwnerNo,OName).

2NF:

In 2NF, there the columns should not be dependent on partial or part of Primary Key.

3NF

No non-primary-key attributes that are transitively dependent on the primary key.

So, We have 4 relations:
• Customer(CustNo, Cname)
• Rental(CustNo, PropNo, RentStart,RentFinish)
• Property_For_Rent(PropNo, Paddress, Rent, OwnerNo)
• Owner(OwnerNo, OName)