DP-900: Microsoft Azure Data Fundamentals

Structured Data

Normalizing Data

Welcome back to the Azure Data Fundamentals course. In this lesson, we’ll dive into how Relational Database Management Systems (RDBMS) rely on structured data, why normalization is essential, and how to design an efficient, normalized database schema.

Structured Data and Relational Databases

Relational databases organize information into tables, where each table consists of rows (records) and columns (fields). Every column has a defined data type, and each row adheres to the same column structure.

Example: Customer Table

  • Customer ID: Primary key (unique identifier)
  • First Name: Text / string
  • Last Name: Text / string

Each row represents one customer entity.

Note

The diagram below illustrates the Customer table structure, highlighting the primary key.

The image illustrates a table normalization concept with columns for Customer ID, First Name, and Last Name, highlighting the Customer ID as the primary key.

Why Normalization Matters

Normalization is the process of organizing columns and tables to minimize redundancy and prevent update anomalies. Its main objectives are:

  1. Entity Separation: Each table contains data for a single entity (e.g., Customer, Order, Product).
  2. Eliminate Duplicates: No two rows should have identical values across all columns.

Without normalization, you risk:

  • Data Redundancy: Repeated storage of the same information.
  • Inconsistent Updates: Changing a customer’s address requires updates in multiple places.

Denormalized Example

A denormalized Sales Order table might look like this:

  • Sales Order ID
  • Name
  • Address
  • Product
  • Quantity

Here, a customer’s name and address are stored with every order.

Note

Below is a denormalized sales order table showing repeated customer details.

The image is a table illustrating why entities matter, showing columns for Sales Order ID, Name, Address, Product, and Quantity with sample data.

Designing a Normalized Schema

To normalize your data:

  1. Identify Entities: Separate data into tables like Customer, Product, and Sales Order.
  2. Define Keys: Assign a primary key for each table and use foreign keys for relationships.
  3. Specify Data Types: Choose appropriate types (e.g., numeric(10,2) for prices).

Normalized Schema Overview

Note

This diagram shows the normalized schema with three tables and their relationships.

The image shows a table schema with three sections: customer information, product details, and sales order data, each with corresponding fields and values. It illustrates how data is organized in a database table format.

TablePrimary KeyForeign KeysKey Columns
CustomerCustomer IDFirst Name, Last Name
ProductProduct IDPrice (numeric(10,2))
Sales OrderSales Order IDCustomer ID → Customer,Quantity
Product ID → Product

How It Works
To find all orders for “Peter Vogel,” you simply join the Sales Order table with the Customer table on Customer ID. Normalized design:

  • Reduces storage costs
  • Enforces data integrity
  • Simplifies maintenance and queries

Further Reading and References

Watch Video

Watch video content

Previous
Demo Creating Azure SQL