DP-900: Microsoft Azure Data Fundamentals

File Based Storage

In this lesson, we’ll explore how to store and manage file-based (object) data in Azure as part of the DP-900: Microsoft Azure Data Fundamentals certification.

What Is File-Based Storage?

File-based storage—also known as object storage or unstructured data storage—treats entire files (images, documents, archives) as single objects. You upload, download, or access the complete file, not fragments.

The image features a folder icon with a lock symbol and asterisks, suggesting secure storage, accompanied by the text "Access as one thing."

Since each file object encapsulates all of its content—whether mixed media, text, or binary—Azure handles it as an opaque unit. You won’t retrieve half an image or half a Word document.

The image shows a document labeled "Whole object" above an open briefcase, with the text "What Are We Storing?"

Microsoft terms this unstructured data because a single file can contain any type of content.

The image features the text "What Are We Storing?" and an icon of a briefcase with the label "Object storage."

For example, an audio file might combine voice, music, and sound effects into one file object.

The image is a presentation slide titled "What Are We Storing?" featuring an icon of an audio file with symbols of a microphone, a person speaking, a music note, and sound waves, labeled as "Unstructured data."

Similarly, a Word document could include emails, images, and text, yet you still handle it as a single upload or download operation.

The image is a diagram showing a document with email addresses and a short story, labeled as "unstructured data," under the heading "What Are We Storing?"

Note

Whether you refer to it as object storage or unstructured data, the key concept is that the file is managed as a single object.

The image is a diagram showing the concept of storing a "whole file," which is uploaded and downloaded as a single object.

Kinds of File Data

Although all files are stored as binary data in Azure, we categorize them by structure:

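| Category   | Description                                 | Common Extensions        |
|------------|---------------------------------------------|--------------------------|
| Text-based | Editable with any text editor               | .csv, .json, .xml, .txt  |
| Binary     | Requires specialized software to interpret  | .pdf, .jpeg, .png, .exe  |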

The image shows two overlapping documents with binary numbers on them, labeled "Binary data," under the heading "Kinds of Data."
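The text/binary distinction above can be approximated in code. The following is a minimal sketch, assuming that "text-based" means "decodes cleanly as UTF-8 and contains no NUL bytes"; the function name `looks_like_text` and the sample byte strings are illustrative, and real tools use more elaborate detection.

```python
def looks_like_text(data: bytes) -> bool:
    """Heuristic: treat data as text if it has no NUL bytes and decodes as UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Sample inputs (illustrative): a small CSV payload and the first bytes of a PNG file.
csv_bytes = b"First Name,Last Name\r\nPeter,Vogel\r\n"
png_header = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"

print(looks_like_text(csv_bytes))
print(looks_like_text(png_header))
```

The CSV payload passes the check, while the PNG bytes fail it because they contain NUL bytes and a byte sequence that is not valid UTF-8.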

Popular text-based formats use delimiters or tags:

  • CSV (comma-separated values)
  • JSON (JavaScript Object Notation)
  • XML (eXtensible Markup Language)

The image illustrates different kinds of data formats, including a text editor (Notepad) and delimited text-based files such as CSV, JSON, and XML.

Corporate File Types and Export Formats

Our global organization maintains a variety of file assets:

  • PDF catalogs for product listings
  • Word documents with specifications
  • Image files (PNG, GIF, JPEG)

When sharing data externally, we select formats that recipients can readily consume.

The image shows icons representing different file types: PDF, Images, CSV, JSON, and XML, labeled under "Our Company's Files" and "Export."

Below are detailed examples of three common export formats.

CSV Example

CSV files store one record per line and use commas to separate fields. RFC 4180 specifies CRLF (\r\n) line endings, although many tools also accept LF (\n):

First Name,Last Name,City,Province,Country
Peter,Vogel,London,Ontario,Canada
Jason,van de Velde,Toronto,Ontario,Canada
  • The first row often contains column headers.
  • Each subsequent row represents a record.

JSON Example

JSON uses arrays ([ ]) and objects ({ }) with key-value pairs for nested structures:

[
  {
    "person": {
      "First Name": "Peter",
      "Last Name": "Vogel",
      "Address": {
        "City": "London",
        "Country": "Canada"
      }
    }
  },
  {
    "person": {
      "First Name": "Peter",
      "Middle Name": "Hunter",
      "Last Name": "Vogel"
    }
  }
]
  • Flexible schema; fields can vary between records.
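Because fields can vary between records, code that reads this JSON should tolerate missing keys. The sketch below uses Python's standard `json` module on the two records shown above; the helper name `city_of` is hypothetical.

```python
import json
from typing import Optional

JSON_TEXT = """
[
  {"person": {"First Name": "Peter", "Last Name": "Vogel",
              "Address": {"City": "London", "Country": "Canada"}}},
  {"person": {"First Name": "Peter", "Middle Name": "Hunter",
              "Last Name": "Vogel"}}
]
"""


def city_of(record: dict) -> Optional[str]:
    """Return the city if the record has an Address, otherwise None."""
    # Chained .get() calls with defaults tolerate records that omit fields.
    return record.get("person", {}).get("Address", {}).get("City")


records = json.loads(JSON_TEXT)
cities = [city_of(r) for r in records]
print(cities)
```

The second record has no `Address`, so `city_of` returns `None` for it instead of raising a `KeyError`.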

XML Example

XML employs tags to define elements and hierarchy:

<People>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
    <Address>
      <City>London</City>
      <Country>Canada</Country>
    </Address>
  </Person>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
    <!-- Additional elements can go here -->
  </Person>
</People>
  • Ideal for document-centric and hierarchical data.
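To show how the hierarchy is navigated, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. A well-formed XML document has exactly one root element, so this example assumes the person records are wrapped in a single root, here a hypothetical `<People>` element; the function name `summarize` is also illustrative.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """
<People>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
    <Address>
      <City>London</City>
      <Country>Canada</Country>
    </Address>
  </Person>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
  </Person>
</People>
"""


def summarize(root: ET.Element) -> list:
    """Return (full name, city) pairs; city is None when there is no Address."""
    result = []
    for person in root.findall("Person"):
        first = person.findtext("FirstName", default="")
        last = person.findtext("LastName", default="")
        city = person.findtext("Address/City")  # None if the path is absent
        result.append((f"{first} {last}", city))
    return result


root = ET.fromstring(XML_TEXT)
print(summarize(root))
```

The path `Address/City` walks the nested elements, which is the hierarchical access pattern that tags make possible.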

Other Format: Avro

Avro files combine a JSON schema header with a binary payload:

The image describes the Avro data format, highlighting that it consists mostly of binary data and begins with a JSON header that describes the structure of the binary data.

This hybrid approach is optimized for big data systems like Hadoop.
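The idea of a JSON header that describes a binary payload can be illustrated with a toy container. This is not the Avro file format: real Avro files have their own header layout and binary encoding, and in practice you would read them with an Avro library. The layout here (a 4-byte length, a JSON header, then fixed-size records) and all names are assumptions made purely for illustration.

```python
import json
import struct


def pack(schema: dict, rows: list) -> bytes:
    """Write a JSON header followed by binary records described by that header."""
    header = json.dumps(schema).encode("utf-8")
    fmt = schema["struct_format"]
    body = b"".join(struct.pack(fmt, *row) for row in rows)
    # 4-byte big-endian header length, then the JSON header, then the records.
    return struct.pack(">I", len(header)) + header + body


def unpack(blob: bytes) -> tuple:
    """Read the JSON header first, then use it to decode the binary records."""
    (header_len,) = struct.unpack_from(">I", blob, 0)
    schema = json.loads(blob[4:4 + header_len].decode("utf-8"))
    rows = list(struct.iter_unpack(schema["struct_format"], blob[4 + header_len:]))
    return schema, rows


# Toy schema: a 4-byte integer id and an 8-byte float per record.
schema = {"fields": ["id", "temperature"], "struct_format": ">id"}
blob = pack(schema, [(1, 21.5), (2, 19.0)])
decoded_schema, decoded_rows = unpack(blob)
print(decoded_schema["fields"], decoded_rows)
```

The reader needs no prior knowledge of the record layout because the file carries its own description, which is the property the header gives Avro.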

Storing File-Based Data in Azure

Azure Storage Accounts host file-based objects in Blob Storage. You can:

  • Upload/download entire files via REST API, SDK, or CLI
  • Configure access tiers (Hot, Cool, Cold, Archive)
  • Secure with Azure RBAC and SAS tokens
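The whole-file upload and download described above might look like the following with the Azure SDK for Python (the `azure-storage-blob` package). This is a sketch under assumptions: the helper names, the connection string, and the container and blob names are all placeholders, and authentication via a connection string is only one of several options.

```python
def make_container_client(connection_string: str, container: str):
    """Create a client for one container (requires `pip install azure-storage-blob`)."""
    # Imported lazily so the helpers below can be exercised without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    return service.get_container_client(container)


def upload_file(container_client, local_path: str, blob_name: str) -> None:
    """Upload a local file as a single blob, replacing any existing blob."""
    with open(local_path, "rb") as fh:
        container_client.upload_blob(name=blob_name, data=fh, overwrite=True)


def download_file(container_client, blob_name: str) -> bytes:
    """Download the complete blob content as bytes."""
    return container_client.download_blob(blob_name).readall()
```

A call sequence could be `client = make_container_client(conn_str, "exports")`, then `upload_file(client, "catalog.pdf", "catalogs/catalog.pdf")`, where `conn_str` and both names are hypothetical values you would supply. Both helpers move the entire file in one operation, matching the "whole object" model.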

Warning

Azure Blob Storage does not enforce any schema on your files. Always validate and parse unstructured data at the application level.
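Application-level validation could be as simple as the sketch below, which checks downloaded bytes before the rest of the program trusts them. The required keys and the function name `parse_people` are assumptions for this example, chosen to match the JSON sample earlier in the lesson.

```python
import json

# Hypothetical rule: every person must have these keys.
REQUIRED_KEYS = {"First Name", "Last Name"}


def parse_people(raw: bytes) -> list:
    """Parse blob bytes as a JSON array of person records, or raise ValueError."""
    try:
        records = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"blob is not valid UTF-8 JSON: {exc}") from exc
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    for i, record in enumerate(records):
        person = record.get("person") if isinstance(record, dict) else None
        if not isinstance(person, dict):
            raise ValueError(f"record {i} has no 'person' object")
        missing = REQUIRED_KEYS - person.keys()
        if missing:
            raise ValueError(f"record {i} is missing {sorted(missing)}")
    return records


good = b'[{"person": {"First Name": "Peter", "Last Name": "Vogel"}}]'
print(len(parse_people(good)))
```

Storage accepts any bytes, so a check like this is the point where malformed or unexpected content gets rejected.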
