DP-900: Microsoft Azure Data Fundamentals

Cosmos DB

Introduction

Welcome to Azure Data Fundamentals Module 4. In this lesson, we’ll dive into Azure Cosmos DB, Microsoft’s globally distributed, multi-model database service. Cosmos DB delivers turnkey global distribution, elastic scalability, and comprehensive SLAs for throughput, availability, latency, and consistency.

Key Features of Azure Cosmos DB

Azure Cosmos DB extends the familiar semi-structured model of Azure Table Storage and adds two powerful capabilities:

  1. Global Distribution
  2. Multi-Model APIs

Global Distribution Overview

Cosmos DB’s global distribution lets you transparently replicate your data to any number of Azure regions. This ensures:

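  • High availability, with an SLA of up to 99.999% for reads and writes on multi-region accounts (99.99% for a single region)
  • Single-digit-millisecond latency for reads and writes at the 99th percentile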
  • Built-in disaster recovery and failover
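
To put the 99.999% figure in perspective, here is a minimal sketch that converts an availability percentage into a yearly downtime budget. It is plain arithmetic over a 365-day year, not the formal calculation used in the SLA document, and the function name is an illustrative assumption.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the maximum yearly downtime an availability target permits."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


# Compare a "four nines" target with a "five nines" target.
for target in (99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% -> {minutes:.2f} minutes/year")
```

Each extra nine cuts the budget by a factor of ten: roughly 52.56 minutes per year at 99.99% versus roughly 5.26 minutes at 99.999%.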

The image is an illustration explaining Cosmos DB, a semi-structured database that can distribute data to multiple regions, with a world map showing various locations.

In Azure, a region consists of one or more data centers connected by a low-latency network. Once your databases and containers are replicated across regions, Cosmos DB can serve each request from the replica closest to the caller, typically based on the preferred regions the application configures.

Note

Global distribution not only reduces latency but also improves resilience by automatically failing over to another region if one goes offline.
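
The failover behavior in the note can be pictured as walking an ordered list of preferred regions and using the first one that is online. The sketch below is a simplified model in plain Python, not the Cosmos DB SDK. The function name and the status dictionary are hypothetical, and the region names are only examples.

```python
from typing import Dict, List, Optional


def pick_region(preferred: List[str], online: Dict[str, bool]) -> Optional[str]:
    """Return the first preferred region that is currently online, or None."""
    for region in preferred:
        if online.get(region, False):
            return region
    return None


# Regions listed in order of preference (closest first).
preferred_regions = ["Central India", "South India"]
status = {"Central India": True, "South India": True}

print(pick_region(preferred_regions, status))  # closest region is online

# Simulate an outage of the closest region; the next one takes over.
status["Central India"] = False
print(pick_region(preferred_regions, status))
```

The ordering of the list is the design choice here: it encodes proximity, so a failover always lands on the next-closest healthy region.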

Multi-Model API Support

Cosmos DB lets you choose the API that best fits your workload or existing codebase. Each API provides a well-documented surface area for data operations, so your application doesn’t need to know about the service internals.

The image is a diagram illustrating the concept of APIs (Application Programming Interfaces), showing two applications connected by dotted lines, with a label indicating "Application Programming Interface."

Supported Cosmos DB APIs

API Model        Description                                          Query Language / Surface
Document API     JSON-based documents                                 Core (SQL) API & MongoDB native
Table API        Key-value store compatible with Azure Table Storage  OData / Azure Storage SDK
Gremlin API      Property graph with vertices and edges               Gremlin
Cassandra API    Wide-column store with tunable consistency           Cassandra Query Language (CQL)
PostgreSQL API   Relational model for Postgres workloads              PostgreSQL wire protocol

The image illustrates the integration of Cosmos DB with various database types, including documents, key-value tables, relational databases, graphs, and column-family stores, featuring technologies like MongoDB, NoSQL, Table Storage, PostgreSQL, Gremlin, and Cassandra.
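
As a concrete picture of the document model, the sketch below shows a telemetry reading as a JSON document, next to a parameterized query in the SQL-like syntax of the Core (SQL) API. The field names, values, and the `matches` helper are hypothetical. The helper only mimics the filter locally and is not the service's query engine.

```python
import json

# A hypothetical telemetry item; documents are schema-free JSON objects.
telemetry = {
    "id": "reading-001",
    "plantId": "mumbai-01",
    "sensor": "temperature",
    "value": 71.5,
    "tags": ["line-3", "furnace"],
}

# A parameterized query: the text plus a list of named parameter values.
query_spec = {
    "query": "SELECT * FROM c WHERE c.plantId = @plantId",
    "parameters": [{"name": "@plantId", "value": "mumbai-01"}],
}


def matches(doc: dict, spec: dict) -> bool:
    """Local stand-in for the filter above: compare plantId to the parameter."""
    wanted = {p["name"]: p["value"] for p in spec["parameters"]}
    return doc.get("plantId") == wanted["@plantId"]


print(json.dumps(telemetry, indent=2))
print(matches(telemetry, query_spec))
```

Passing values as parameters instead of concatenating them into the query text keeps the query reusable and avoids injection problems.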

Global Data Access Example

To see global distribution in action, imagine a manufacturing plant in Mumbai and a customer portal in Apollo. By configuring Cosmos DB to replicate between the South India and Central India regions, both employees and clients get low-latency reads and writes from a replica near them.

The image shows a world map with icons representing data distribution locations, illustrating global data sharing with Cosmos DB for low latency access.

Workflow

  1. A sensor in Mumbai writes telemetry to the nearest Cosmos DB endpoint.
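  2. Cosmos DB replicates the data to the replica serving Apollo. The lag depends on the distance between regions and the consistency level, while reads and writes against the local replica stay under 10 ms at the 99th percentile.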
  3. A customer dashboard in Apollo reads the latest data from the local replica.
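
The three steps above can be sketched as a toy simulation in plain Python. It models a local write, asynchronous replication, and a local read. It is not how Cosmos DB is implemented, and the class and method names are illustrative assumptions.

```python
from collections import deque


class ReplicatedContainer:
    """Toy model: one in-memory store per region plus a replication queue."""

    def __init__(self, regions):
        self.stores = {region: {} for region in regions}
        self.pending = deque()  # entries of (target_region, key, value)

    def write(self, region, key, value):
        """Step 1: apply the write locally, then queue it for other regions."""
        self.stores[region][key] = value
        for other in self.stores:
            if other != region:
                self.pending.append((other, key, value))

    def replicate(self):
        """Step 2: drain the queue, applying each write at its target region."""
        while self.pending:
            target, key, value = self.pending.popleft()
            self.stores[target][key] = value

    def read(self, region, key):
        """Step 3: read from the local replica only."""
        return self.stores[region].get(key)


container = ReplicatedContainer(["Mumbai", "Apollo"])
container.write("Mumbai", "sensor-7", {"temperature": 71.5})

print(container.read("Apollo", "sensor-7"))  # not replicated yet
container.replicate()
print(container.read("Apollo", "sensor-7"))  # now visible locally
```

The gap between the two reads is the window in which a remote reader can see stale data, which is why the consistency level matters.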

Warning

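Choose your consistency level deliberately. Cosmos DB offers five levels (Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual). Stronger levels make reads more predictable across regions, while weaker levels give lower latency and higher availability.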
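
One way to reason about this trade-off is to order the five consistency levels from strongest to weakest. The sketch below also encodes the rule that a single request may relax the account's default level but not strengthen it. The enum and function names are illustrative assumptions, not SDK types.

```python
from enum import IntEnum


class Consistency(IntEnum):
    """Consistency levels; a higher value means a stronger guarantee."""

    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def can_override(account_default: Consistency, requested: Consistency) -> bool:
    """A request may use the default level or a weaker one, never a stronger one."""
    return requested <= account_default


default = Consistency.SESSION
print(can_override(default, Consistency.EVENTUAL))  # relaxing is allowed
print(can_override(default, Consistency.STRONG))    # strengthening is not
```

A latency-sensitive dashboard might relax to Eventual for its reads, while a workload that needs stricter guarantees would have to raise the account default instead.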
