FD8 Normalization OR Normal Forms in DBMS
Normal Forms are used to eliminate or reduce redundancy in database tables.
1. First Normal Form (1NF):
Ensures that each column in a table contains atomic (indivisible) values, and there are no repeating groups or arrays.
Let’s illustrate 1NF using a “Student” table with columns for student number (st.no.), student name (st.name), phone number (st.phone), state (st.state), and country (st.country). We’ll also consider the case where a student can have multiple phone numbers.
*Original Student Table:
+--------+---------+--------------+----------+-----------+
| st.no. | st.name | st.phone | st.state | st.country|
+--------+---------+--------------+----------+-----------+
| 1 | Alice | 123-456-7890 | CA | USA |
| 2 | Bob | 987-654-3210 | NY | USA |
| 3 | Charlie | 555-123-4567 | TX | USA |
| | | 567-123-4556 | | | |
+--------+---------+--------------+----------+-----------+
In this table, each column appears to contain atomic values, but the “st.phone” column violates 1NF because it contains multiple phone numbers for some students.
*Modified Student Table for 1NF:
+--------+---------+--------------+----------+-----------+
| st.no. | st.name | st.phone | st.state | st.country|
+--------+---------+--------------+----------+-----------+
| 1 | Alice | 123-456-7890 | CA | USA |
| 2 | Bob | 987-654-3210 | NY | USA |
| 3 | Charlie | 555-123-4567 | TX | USA |
| 3 | Charlie | 555-987-6543 | TX | USA |
+--------+---------+--------------+----------+-----------+
-- Additional phone number for Charlie
In this modified table, we have repeated the row for Charlie with an additional phone number to adhere to 1NF. Now, each cell contains an atomic value, and the table does not have repeating groups.
Explanation:
- The “st.no.” and “st.name” columns remain the same, representing the student number and name.
- The “st.phone” column now includes multiple rows for students with multiple phone numbers, preserving atomicity.
- The “st.state” and “st.country” columns remain unaffected.
This modification aligns the table with 1NF principles, ensuring that each column contains atomic values and no repeating groups are present. In practical terms, when a student has multiple phone numbers, additional rows are added to capture each distinct phone number, maintaining a normalized structure.
2. Second Normal Form (2NF):
Builds on 1NF and requires that all non-key attributes are fully functionally dependent on the primary key. In other words, there should be no partial dependencies.
Let’s consider a “Student” table and go through the process of achieving Second Normal Form (2NF) step by step.
*Student Table Structure (Before 2NF):
+-------+---------+------------+-------+---------+
| St.No | St.Name | St.Phone | St.State | St.Country |
+-------+---------+------------+----------+------------+
| 1 | Alice | 123-4567 | NY | USA |
| 2 | Bob | 987-6543 | CA | USA |
| 3 | Charlie | 111-2222 | TX | USA |
+-------+---------+------------+----------+------------+
Identifying Partial Dependency:
In the initial structure, the primary key is “St.No.” Now, let’s identify the partial dependencies:
- “St.State” and “St.Country” are partially dependent on the primary key (“St.No”), as they depend on a part of the primary key rather than the entire primary key.
*Restructured Student Table (2NF):
To achieve 2NF, we need to create two separate tables: one for student details and another for country details.
StudentDetails Table:
+-------+---------+------------+----------+
| St.No | St.Name | St.Phone | St.State |
+-------+---------+------------+----------+
| 1 | Alice | 123-4567 | NY |
| 2 | Bob | 987-6543 | CA |
| 3 | Charlie | 111-2222 | TX |
+-------+---------+------------+----------+
CountryDetails Table:
+-------+------------+
| St.No | St.Country |
+-------+------------+
| 1 | USA |
| 2 | USA |
| 3 | USA |
+-------+------------+
Explanation:
- In the “StudentDetails” table, we’ve removed the columns “St.State” and “St.Country” because they were partially dependent on the primary key “St.No.”
- The “CountryDetails” table contains only the columns “St.No” and “St.Country,” establishing a separate table for country details.
- Now, both tables adhere to 2NF, and there are no partial dependencies on the primary key.
This restructuring ensures that each table represents a single logical entity and eliminates partial dependencies on the primary key, aligning with the principles of Second Normal Form.
3. Third Normal Form (3NF):
Builds on 2NF and eliminates transitive dependencies. A transitive dependency occurs when a non-key attribute depends on another non-key attribute.
The Third Normal Form (3NF) is a level of normalization in relational database design that builds upon the Second Normal Form (2NF). To achieve 3NF, a table must meet the following conditions:
- It must be in 2NF.
- There should be no transitive dependencies.
In simpler terms, every non-prime attribute (an attribute that is not part of the primary key) must be dependent only on the primary key and not on other non-prime attributes.
Let’s illustrate this with a “Student” table example:
*Student Table Structure (Before 2NF):
+-------+---------+------------+----------+------------+
| St.No | St.Name | St.Phone | St.State | St.Country |
+-------+---------+------------+----------+------------+
| 1 | Alice | 123-4567 | NY | USA |
| 2 | Bob | 987-6543 | CA | USA |
| 3 | Charlie | 111-2222 | TX | USA |
+-------+---------+------------+----------+------------+
In this table, the primary key is “St.No.” The non-prime attributes include “St.Name,” “St.Phone,” “St.State,” and “St.Country.”
Identifying Transitive Dependency:
- Transitive Dependency: In this case, “St.State” is transitively dependent on the primary key “St.No” through “St.Country.” “St.Country” is not part of the primary key but determines the value of “St.State.”
*Restructured Student Table (2NF):
To achieve 3NF, we create three separate tables: one for student details, another for country details, and a third for state details.
StudentDetails Table:
+-------+---------+------------+
| St.No | St.Name | St.Phone |
+-------+---------+------------+
| 1 | Alice | 123-4567 |
| 2 | Bob | 987-6543 |
| 3 | Charlie | 111-2222 |
+-------+---------+------------+
CountryDetails Table:
+------------+----------+
| St.Country | CountryID|
+------------+----------+
| USA | 1 |
+------------+----------+
StateDetails Table:
+-------+----------+------------+
| St.No | St.State | StateID |
+-------+----------+------------+
| 1 | NY | 1 |
| 2 | CA | 1 |
| 3 | TX | 1 |
+-------+----------+------------+
Explanation:
In this 3NF structure, each table serves a specific purpose, and there are no transitive dependencies. “StateDetails” and “CountryDetails” are linked to “StudentDetails” through their respective primary keys, resolving the transitive dependency issue.
4. Boyce-Code Normal Form (BCNF):
A stricter form of 3NF that eliminates all non-trivial functional dependencies. In BCNF, every determinant is a superkey.
Boyce-Codd Normal Form (BCNF) is a higher level of normalization compared to the Third Normal Form (3NF). To achieve BCNF, a table must meet the following conditions:
- It must be in 3NF.
- There should be no non-trivial functional dependencies where a non-prime attribute is functionally dependent on another non-prime attribute.
In simpler terms, BCNF addresses situations where there might be overlapping candidate keys and ensures that all functional dependencies are based on superkeys. Now, let’s use the “Student” table example to illustrate BCNF:
*Student Table Structure (Before 2NF):
+-------+---------+------------+----------+------------+
| St.No | St.Name | St.Phone | St.State | St.Country |
+-------+---------+------------+----------+------------+
| 1 | Alice | 123-4567 | NY | USA |
| 2 | Bob | 987-6543 | CA | USA |
| 3 | Charlie | 111-2222 | TX | USA |
+-------+---------+------------+----------+------------+
Identifying Functional Dependency:
- The primary key is “St.No.”
- “St.State” is functionally dependent on “St.No.”
- “St.Country” is functionally dependent on “St.No.”
*Restructured Student Table (2NF):
StudentDetails Table:
+-------+---------+------------+----------+
| St.No | St.Name | St.Phone | St.State |
+-------+---------+------------+----------+
| 1 | Alice | 123-4567 | NY |
| 2 | Bob | 987-6543 | CA |
| 3 | Charlie | 111-2222 | TX |
+-------+---------+------------+----------+
CountryDetails Table:
+-------+------------+
| St.No | St.Country |
+-------+------------+
| 1 | USA |
| 2 | USA |
| 3 | USA |
+-------+------------+
Explanation:
- In BCNF, we have eliminated the partial dependency of “St.State” and “St.Country” on the primary key “St.No.”
- Each table is now in BCNF, and all functional dependencies are based on superkeys (in this case, the primary key “St.No”).
- This eliminates potential anomalies and ensures that the tables are structurally sound, adhering to the principles of BCNF.
5. Fourth Normal Form (4NF):
Addresses certain types of multivalued dependencies, ensuring that a table is free from nontrivial multivalued dependencies.
The Fourth Normal Form (4NF) is a level of normalization in relational database design that addresses certain types of multivalued dependencies. A table is in 4NF if it is already in the Third Normal Form (3NF) and if it has no nontrivial multivalued dependencies.
In simpler terms, 4NF ensures that there are no dependencies between non-key attributes and sets of multivalued attributes.
Let’s illustrate this with a “Student” table example:
*Student Table Structure (Before 4NF):
+-------+---------+------------+--------------+
| St.No | St.Name | St.Phone | St.Courses |
+-------+---------+------------+--------------+
| 1 | Alice | 123-4567 | Math, English|
| 2 | Bob | 987-6543 | History, Math|
| 3 | Charlie | 111-2222 | Science |
+-------+---------+------------+--------------+
In this table, the primary key is “St.No.” The non-prime attributes include “St.Name,” “St.Phone,” and “St.Courses,” which is a multivalued attribute.
Identifying Multivalued Dependency:
Multivalued Dependency: The attribute “St.Courses” is a set of values that can be associated with a student.
*Restructured Student Table (4NF):
To achieve 4NF, we decompose the table to separate the multivalued dependency. We create two tables: one for student details and another for course details.
StudentDetails Table:
+-------+---------+------------+
| St.No | St.Name | St.Phone |
+-------+---------+------------+
| 1 | Alice | 123-4567 |
| 2 | Bob | 987-6543 |
| 3 | Charlie | 111-2222 |
+-------+---------+------------+
CourseDetails Table:
+-------+---------+
| St.No | Course |
+-------+---------+
| 1 | Math |
| 1 | English |
| 2 | History |
| 2 | Math |
| 3 | Science |
+-------+---------+
Explanation:
In this restructured form, the “StudentDetails” table contains information about each student, and the “CourseDetails” table contains information about the courses associated with each student. This eliminates the multivalued dependency and ensures that the database is in the Fourth Normal Form.