intermediate By StackPractices

Database Normalization — 1NF to 5NF Explained

A visual guide to database normalization: learn 1NF through 5NF with practical examples, when to apply each form, and how to balance normalization with performance.

Note: This guide follows English-language naming conventions and terminology standards common in international development teams. Examples use English identifiers and comments to maximize compatibility across codebases and tooling.

Overview

Database normalization is the process of organizing data to minimize redundancy and eliminate anomalies during insert, update, and delete operations. The normal forms, from 1NF through 5NF (with BCNF sitting between 3NF and 4NF), provide progressively stricter rules for structuring relational databases. Good schema design means knowing when to apply each form and when to break one deliberately for performance.

When to Use

  • Designing new relational schemas from scratch
  • Refactoring legacy databases with duplicate data
  • Preparing schemas for transactional workloads (OLTP)
  • Before deciding what to denormalize for reporting (OLAP)

1NF — Atomic Values

Rule: Every column contains only atomic (indivisible) values. No repeating groups.

Before (violates 1NF):

| order_id | customer | products |
|---|---|---|
| 1 | Alice | Apple, Banana, Cherry |

After (1NF compliant):

| order_id | customer | product |
|---|---|---|
| 1 | Alice | Apple |
| 1 | Alice | Banana |
| 1 | Alice | Cherry |
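The split above can be sketched in a few lines of Python. This is a minimal illustration, assuming the products arrive as one comma-separated string; the names `raw_orders` and `to_first_normal_form` are made up for the example.

```python
# Hypothetical un-normalized input: one order with a comma-separated product list.
raw_orders = [
    {"order_id": 1, "customer": "Alice", "products": "Apple, Banana, Cherry"},
]


def to_first_normal_form(rows):
    """Emit one row per (order_id, product) so every column holds a single value."""
    atomic_rows = []
    for row in rows:
        for product in row["products"].split(","):
            atomic_rows.append(
                {
                    "order_id": row["order_id"],
                    "customer": row["customer"],
                    "product": product.strip(),
                }
            )
    return atomic_rows


orders_1nf = to_first_normal_form(raw_orders)
for row in orders_1nf:
    print(row)
```

After the split, order_id alone no longer identifies a row; the natural key becomes (order_id, product).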

2NF — No Partial Dependencies

Rule: The table is in 1NF and every non-key attribute depends on the whole primary key, not on just part of it. This only comes into play when the key is composite.

Before (violates 2NF):

| course_id | student_id | course_name | student_name | grade |
|---|---|---|---|---|
| CS101 | S1 | Intro to CS | Alice | A |

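The key is (course_id, student_id), but course_name depends only on course_id and student_name only on student_id. Both are partial dependencies; only grade depends on the full key.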

After (2NF compliant):

Enrollments:

| course_id | student_id | grade |
|---|---|---|
| CS101 | S1 | A |

Courses:

| course_id | course_name |
|---|---|
| CS101 | Intro to CS |

Students:

| student_id | student_name |
|---|---|
| S1 | Alice |
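As a rough sketch of how this decomposition might look in SQL, the snippet below builds the three tables in an in-memory SQLite database and joins them back into the original wide row. The lowercase table names, column types, and constraints are assumptions chosen for the example, not part of the guide's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE courses (
        course_id   TEXT PRIMARY KEY,
        course_name TEXT NOT NULL
    );
    CREATE TABLE students (
        student_id   TEXT PRIMARY KEY,
        student_name TEXT NOT NULL
    );
    CREATE TABLE enrollments (
        course_id  TEXT NOT NULL REFERENCES courses(course_id),
        student_id TEXT NOT NULL REFERENCES students(student_id),
        grade      TEXT,
        PRIMARY KEY (course_id, student_id)
    );
    INSERT INTO courses VALUES ('CS101', 'Intro to CS');
    INSERT INTO students VALUES ('S1', 'Alice');
    INSERT INTO enrollments VALUES ('CS101', 'S1', 'A');
    """
)

# Joining on the two key columns rebuilds the pre-2NF row without storing
# course_name or student_name more than once.
rows_2nf = conn.execute(
    """
    SELECT e.course_id, e.student_id, c.course_name, s.student_name, e.grade
    FROM enrollments AS e
    JOIN courses  AS c ON c.course_id  = e.course_id
    JOIN students AS s ON s.student_id = e.student_id
    """
).fetchall()
print(rows_2nf)
```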

3NF — No Transitive Dependencies

Rule: Non-key attributes depend only on the primary key, not on other non-key attributes.

Before (violates 3NF):

| employee_id | name | department_id | department_name | department_head |
|---|---|---|---|---|
| E1 | Bob | D1 | Engineering | Carol |

department_name and department_head depend on department_id, which in turn depends on employee_id, so they depend on the key only transitively.

After (3NF compliant):

Employees:

| employee_id | name | department_id |
|---|---|---|
| E1 | Bob | D1 |

Departments:

| department_id | department_name | department_head |
|---|---|---|
| D1 | Engineering | Carol |
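One way to see the benefit is to change a department head after the split. The sketch below uses an in-memory SQLite database; the second employee (Dana), the new head (Erin), and the column types are hypothetical additions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (
        department_id   TEXT PRIMARY KEY,
        department_name TEXT NOT NULL,
        department_head TEXT NOT NULL
    );
    CREATE TABLE employees (
        employee_id   TEXT PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id TEXT NOT NULL REFERENCES departments(department_id)
    );
    INSERT INTO departments VALUES ('D1', 'Engineering', 'Carol');
    INSERT INTO employees VALUES ('E1', 'Bob', 'D1');
    INSERT INTO employees VALUES ('E2', 'Dana', 'D1');  -- hypothetical second employee
    """
)

# The head is stored once, so a single-row UPDATE is enough.
updated_rows = conn.execute(
    "UPDATE departments SET department_head = ? WHERE department_id = ?",
    ("Erin", "D1"),
).rowcount

heads = conn.execute(
    """
    SELECT e.name, d.department_head
    FROM employees AS e
    JOIN departments AS d ON d.department_id = e.department_id
    ORDER BY e.employee_id
    """
).fetchall()
print(updated_rows, heads)
```

In the pre-3NF table the same change would have to touch every employee row in the department, and missing one would leave two different heads on record.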

BCNF — Boyce-Codd Normal Form

Rule: For every non-trivial functional dependency X → Y, X must be a superkey.

Before (violates BCNF):

| student | course | professor |
|---|---|---|
| Alice | CS101 | Prof. Smith |
| Bob | CS101 | Prof. Smith |

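course → professor, but course alone is not a superkey: the only candidate key is (student, course). Because course is part of that key, this is also a partial dependency, so the table already fails 2NF. A table that is in 3NF but not BCNF typically has the dependency running the other way (professor → course, with each professor teaching exactly one course).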

After (BCNF compliant):

Enrollments:

| student | course |
|---|---|
| Alice | CS101 |
| Bob | CS101 |

CourseAssignments:

| course | professor |
|---|---|
| CS101 | Prof. Smith |
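The superkey test in the rule can be checked mechanically with an attribute closure. The following is a minimal sketch, not a full normalization tool: the helper names `closure` and `bcnf_violations` are invented for this example, and the dependencies for each decomposed table are supplied by hand rather than derived.

```python
def closure(attributes, dependencies):
    """Return every attribute determined by `attributes` under `dependencies`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependent in dependencies:
            if set(determinant) <= result and not set(dependent) <= result:
                result |= set(dependent)
                changed = True
    return result


def bcnf_violations(relation, dependencies):
    """List the non-trivial dependencies whose left-hand side is not a superkey."""
    violations = []
    for determinant, dependent in dependencies:
        is_trivial = set(dependent) <= set(determinant)
        is_superkey = closure(determinant, dependencies) >= set(relation)
        if not is_trivial and not is_superkey:
            violations.append((determinant, dependent))
    return violations


course_to_professor = [(("course",), ("professor",))]

# Original table: course does not determine student, so it is not a superkey.
before_bcnf = bcnf_violations(("student", "course", "professor"), course_to_professor)
# Decomposed tables: course is the key of CourseAssignments; Enrollments has no
# dependency other than the trivial ones.
after_assignments = bcnf_violations(("course", "professor"), course_to_professor)
after_enrollments = bcnf_violations(("student", "course"), [])

print(before_bcnf, after_assignments, after_enrollments)
```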

4NF — No Multi-Valued Dependencies

Rule: For every non-trivial multi-valued dependency X →→ Y, X must be a superkey.

Before (violates 4NF):

| employee | skill | language |
|---|---|---|
| Alice | Java | English |
| Alice | Java | Spanish |
| Alice | Python | English |
| Alice | Python | Spanish |

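Skills and languages are independent multi-valued facts about an employee (employee →→ skill and employee →→ language), so the table has to store every skill paired with every language.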

After (4NF compliant):

EmployeeSkills:

| employee | skill |
|---|---|
| Alice | Java |
| Alice | Python |

EmployeeLanguages:

| employee | language |
|---|---|
| Alice | English |
| Alice | Spanish |
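To make the redundancy concrete, the sketch below rebuilds the combined table from the two 4NF tables and counts what one extra skill costs in each design. The helper `combined_rows` and the added skill (Go) are hypothetical and exist only for this illustration.

```python
from itertools import product

employee_skills = {("Alice", "Java"), ("Alice", "Python")}
employee_languages = {("Alice", "English"), ("Alice", "Spanish")}


def combined_rows(skills, languages):
    """Rebuild the single-table form: each skill paired with each language per employee."""
    return {
        (emp_s, skill, language)
        for (emp_s, skill), (emp_l, language) in product(skills, languages)
        if emp_s == emp_l
    }


before_4nf = combined_rows(employee_skills, employee_languages)
print(sorted(before_4nf))

# One new skill is one new row in EmployeeSkills, but one row per language
# in the combined table.
skills_plus_one = employee_skills | {("Alice", "Go")}
extra_rows = len(combined_rows(skills_plus_one, employee_languages)) - len(before_4nf)
print(extra_rows)
```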

5NF — Join Dependencies (Project-Join Normal Form)

Rule: Every non-trivial join dependency is implied by the candidate keys.

Before (violates 5NF):

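| agent | company | product |
|---|---|---|
| Smith | Ford | Truck |
| Smith | Ford | Car |
| Smith | Toyota | Car |
| Jones | Toyota | Car |

This table violates 5NF only if a three-way business rule holds: whenever an agent represents a company, the agent sells a product, and the company makes that product, the agent sells that product for that company. Under that rule the rows can be rebuilt from the three pairwise tables, so storing the triples is redundant.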

After (5NF compliant):

AgentCompany:

| agent | company |
|---|---|
| Smith | Ford |
| Smith | Toyota |
| Jones | Toyota |

AgentProduct:

| agent | product |
|---|---|
| Smith | Truck |
| Smith | Car |
| Jones | Car |

CompanyProduct:

| company | product |
|---|---|
| Ford | Truck |
| Ford | Car |
| Toyota | Car |
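A quick way to check that this decomposition is lossless is to join the three projections and compare the result with the original rows. The sketch below uses plain Python sets; the function name `three_way_join` is made up for the example.

```python
original = {
    ("Smith", "Ford", "Truck"),
    ("Smith", "Ford", "Car"),
    ("Smith", "Toyota", "Car"),
    ("Jones", "Toyota", "Car"),
}

# Project the original rows onto each pair of columns.
agent_company = {(a, c) for a, c, _ in original}
agent_product = {(a, p) for a, _, p in original}
company_product = {(c, p) for _, c, p in original}


def three_way_join(ac, ap, cp):
    """Keep (agent, company, product) only when all three pairings exist."""
    return {
        (a, c, p)
        for a, c in ac
        for a2, p in ap
        if a == a2 and (c, p) in cp
    }


rejoined = three_way_join(agent_company, agent_product, company_product)
print(rejoined == original)

# Joining only two of the projections produces a row that was never recorded.
two_way = {(a, c, p) for a, c in agent_company for a2, p in agent_product if a == a2}
spurious = two_way - original
print(spurious)
```

The spurious row from the two-table join is why this case needs all three projections: no pair of them reconstructs the original on its own.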

Normalization Summary

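| Form | Rule | Eliminates |
|---|---|---|
| 1NF | Atomic values | Repeating groups |
| 2NF | Full key dependency | Partial dependencies |
| 3NF | Key-only dependency | Transitive dependencies |
| BCNF | Superkey determinant | Remaining functional-dependency anomalies |
| 4NF | No non-trivial multi-valued deps | Independent multi-valued facts in one table |
| 5NF | Join dependencies implied by keys | Redundancy from join dependencies |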

When to Stop Normalizing

  • 3NF/BCNF is the practical stopping point for most OLTP systems
  • 4NF matters when one entity has two or more independent multi-valued facts stored in the same table (rare in practice)
  • 5NF is mostly theoretical for production applications
  • Denormalize intentionally when read performance matters more than simple writes, and decide up front how the duplicated data will be kept consistent

Common Mistakes

  • Over-normalizing to 5NF — adds complexity with minimal practical benefit
  • Under-normalizing to 1NF — leads to update anomalies and data inconsistency
  • Normalizing before understanding queries — the schema should serve the workload
  • Ignoring BCNF — 3NF does not handle all anomalies; BCNF is the stricter standard

FAQ

Do NoSQL databases need normalization? Not in the same way. Document databases often embed related data (a form of denormalization) and rely on application-level logic to keep duplicated data consistent.

Should I always aim for 3NF? Aim for BCNF in transactional systems. For read-heavy analytics, denormalize deliberately.

How does normalization affect indexing? Normalized schemas need more joins, so the columns used in those joins (typically foreign keys) should be indexed. Denormalized schemas need fewer joins but more storage and more update logic.
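As an illustration of indexing a join column, the sketch below adds an index on a foreign key in SQLite and asks the planner how it would run a lookup on that column. The schema and the index name `idx_employees_department` are assumptions for this example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (
        department_id   TEXT PRIMARY KEY,
        department_name TEXT NOT NULL
    );
    CREATE TABLE employees (
        employee_id   TEXT PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id TEXT NOT NULL REFERENCES departments(department_id)
    );
    -- SQLite does not index foreign-key columns automatically.
    CREATE INDEX idx_employees_department ON employees(department_id);
    """
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employees WHERE department_id = ?",
    ("D1",),
).fetchall()

# The fourth column of each plan row is the human-readable step description.
plan_details = [row[3] for row in plan]
print(plan_details)
```

If the index name appears in the plan, the lookup is served by the index rather than a full table scan.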