DP-900: Microsoft Azure Data Fundamentals
File Based Storage
In this lesson, we’ll explore how to store and manage file-based (object) data in Azure as part of the DP-900: Microsoft Azure Data Fundamentals certification.
What Is File-Based Storage?
File-based storage—also known as object storage or unstructured data storage—treats entire files (images, documents, archives) as single objects. You upload, download, or access the complete file, not fragments.
Since each file object encapsulates all of its content—whether mixed media, text, or binary—Azure handles it as an opaque unit. You won’t retrieve half an image or half a Word document.
Microsoft terms this unstructured data because a single file can contain any type of content.
For example, an audio file might combine voice, music, and sound effects into one file object.
Or a Word document could include emails, images, and text—yet you handle it as a single upload or download operation.
Note
Whether you refer to it as object storage or unstructured data, the key concept is that the file is managed as a single object.
Kinds of File Data
Although all files are stored as binary data in Azure, we categorize them by structure:
| Category | Description | Common Extensions |
|---|---|---|
| Text-based | Editable with any text editor | .csv, .json, .xml, .txt |
| Binary | Requires specialized software to interpret | .pdf, .jpeg, .png, .exe |
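To make the two categories concrete, here is a minimal sketch that guesses whether a file's bytes are text or binary. The function name `sniff` and the lookup table are hypothetical, chosen for illustration. The check is only a heuristic: it looks for the well-known leading bytes of a few binary formats, then falls back to testing whether the content decodes as UTF-8.

```python
# Leading "magic" bytes of a few binary formats from the table.
MAGIC_NUMBERS = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def sniff(data: bytes) -> str:
    """Return a rough guess: a known binary format, 'text', or 'binary'."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    try:
        data.decode("utf-8")  # text-based formats decode cleanly
        return "text"
    except UnicodeDecodeError:
        return "binary"
```

A real application would usually also consider the file extension or a declared content type, since a short byte prefix can be misleading.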
Popular text-based formats use delimiters or tags:
- CSV (comma-separated values)
- JSON (JavaScript Object Notation)
- XML (eXtensible Markup Language)
Corporate File Types and Export Formats
Our global organization maintains a variety of file assets:
- PDF catalogs for product listings
- Word documents with specifications
- Image files (PNG, GIF, JPEG)
When sharing data externally, we select formats that recipients can readily consume.
Below are detailed examples of three common export formats.
CSV Example
CSV files end each row with CRLF (\r\n) and separate fields with commas:
First Name,Last Name,City,Province,Country
Peter,Vogel,London,Ontario,Canada
Jason,van de Velde,Toronto,Ontario,Canada
- The first row often contains column headers.
- Each subsequent row represents a record.
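As a sketch of how an application might read this layout, the snippet below parses CSV text with Python's standard `csv` module. The sample data is modeled on the example, with the second row's Province and Country values assumed to be Ontario and Canada. The helper name `read_people` is hypothetical.

```python
import csv
import io

# Sample data modeled on the example; "\r\n" is the CRLF row terminator.
CSV_TEXT = (
    "First Name,Last Name,City,Province,Country\r\n"
    "Peter,Vogel,London,Ontario,Canada\r\n"
    "Jason,van de Velde,Toronto,Ontario,Canada\r\n"
)


def read_people(text: str) -> list[dict[str, str]]:
    """Parse CSV text into one dict per record, keyed by the header row."""
    # newline="" leaves the CRLF terminators for the csv module to handle.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


people = read_people(CSV_TEXT)
print(people[1]["Last Name"])  # prints: van de Velde
```

Because the header row supplies the keys, each record can be addressed by column name rather than by position.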
JSON Example
JSON uses arrays ([ ]) and objects ({ }) with key-value pairs for nested structures:
[
  {
    "person": {
      "First Name": "Peter",
      "Last Name": "Vogel",
      "Address": {
        "City": "London",
        "Country": "Canada"
      }
    }
  },
  {
    "person": {
      "First Name": "Peter",
      "Middle Name": "Hunter",
      "Last Name": "Vogel"
    }
  }
]
- Flexible schema; fields can vary between records.
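Because fields can vary between records, code that reads JSON has to tolerate missing keys. The following is a minimal sketch using Python's standard `json` module. The `summarize` helper and the "unknown city" fallback are assumptions made for illustration.

```python
import json

# The two records from the example: one has an Address, one a Middle Name.
JSON_TEXT = """
[
  {"person": {"First Name": "Peter", "Last Name": "Vogel",
              "Address": {"City": "London", "Country": "Canada"}}},
  {"person": {"First Name": "Peter", "Middle Name": "Hunter",
              "Last Name": "Vogel"}}
]
"""


def summarize(record: dict) -> str:
    """Build a one-line summary, tolerating optional fields."""
    person = record["person"]
    # .get() returns None (or the given default) when a field is absent.
    middle = person.get("Middle Name")
    city = person.get("Address", {}).get("City", "unknown city")
    names = [person["First Name"], middle, person["Last Name"]]
    full_name = " ".join(n for n in names if n)
    return f"{full_name} ({city})"


summaries = [summarize(r) for r in json.loads(JSON_TEXT)]
```

Here `summaries` holds "Peter Vogel (London)" for the first record and "Peter Hunter Vogel (unknown city)" for the second.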
XML Example
XML employs tags to define elements and hierarchy:
<!-- A well-formed XML document has exactly one root element; People is used here for illustration -->
<People>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
    <Address>
      <City>London</City>
      <Country>Canada</Country>
    </Address>
  </Person>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
    <!-- Additional elements can go here -->
  </Person>
</People>
- Ideal for document-centric and hierarchical data.
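A short sketch of walking this kind of hierarchy with Python's standard `xml.etree.ElementTree` module follows. It assumes the `Person` elements sit inside a single root element, named `People` here for illustration, because an XML parser expects exactly one root.

```python
import xml.etree.ElementTree as ET

# The People root element is an assumption added so the text parses as one document.
XML_TEXT = """
<People>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
    <Address>
      <City>London</City>
      <Country>Canada</Country>
    </Address>
  </Person>
  <Person>
    <FirstName>Peter</FirstName>
    <LastName>Vogel</LastName>
  </Person>
</People>
"""

root = ET.fromstring(XML_TEXT.strip())

# findtext() follows the nested path and falls back to the default
# when a Person has no Address element.
cities = [
    person.findtext("Address/City", default="unknown")
    for person in root.findall("Person")
]
```

After this runs, `cities` is `["London", "unknown"]`, which shows how optional elements can be handled without raising an error.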
Other Format: Avro
Avro files combine a schema written in JSON, stored in the file header, with a compact binary payload.
This hybrid approach is optimized for big data systems like Hadoop.
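To illustrate the split between a readable schema and a binary payload, here is a simplified sketch of how a single record with two string fields could be encoded. The `Person` schema and all function names are hypothetical. The sketch covers only the record body: a real Avro file also wraps the schema and data in a container format with its own header and sync markers, and in practice an Avro library handles all of this.

```python
import json

# Hypothetical schema for illustration; Avro schemas are written in JSON.
SCHEMA = {
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "FirstName", "type": "string"},
        {"name": "LastName", "type": "string"},
    ],
}


def encode_long(n: int) -> bytes:
    """Zig-zag encode a 64-bit integer, then emit it as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set: more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """A string is its UTF-8 byte length (as a long) followed by the bytes."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw


def encode_record(record: dict) -> bytes:
    """Fields are written in schema order, with no names or delimiters."""
    return b"".join(encode_string(record[f["name"]]) for f in SCHEMA["fields"])


header = json.dumps(SCHEMA)  # human-readable schema text
payload = encode_record({"FirstName": "Peter", "LastName": "Vogel"})
```

The payload carries no field names at all, which is why a reader needs the schema from the header to interpret the bytes.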
Storing File-Based Data in Azure
Azure Storage Accounts host file-based objects in Blob Storage. You can:
- Upload/download entire files via REST API, SDK, or CLI
- Configure access tiers (Hot, Cool, Archive)
- Secure with Azure RBAC and SAS tokens
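The options above can be sketched with the Blob Storage REST API and a SAS token, using only Python's standard library. The URL below is a hypothetical placeholder, and the helper names are assumptions for illustration. The sketch only builds the requests and does not send them. It assumes a SAS URL that grants read and write permission on the blob, and that the optional access-tier header is supported by the service version in use.

```python
import urllib.request

# Hypothetical placeholder; a real SAS URL is generated for your own account.
BLOB_SAS_URL = (
    "https://exampleaccount.blob.core.windows.net/"
    "examplecontainer/people.csv?sp=rw&sig=PLACEHOLDER"
)


def build_put_blob_request(blob_sas_url, data, access_tier=None):
    """Build a request that uploads the whole file as one block blob."""
    headers = {"x-ms-blob-type": "BlockBlob"}
    if access_tier:
        # Optional tier such as "Hot" or "Cool" (assumed to be supported).
        headers["x-ms-access-tier"] = access_tier
    return urllib.request.Request(
        blob_sas_url, data=data, headers=headers, method="PUT"
    )


def build_get_blob_request(blob_sas_url):
    """Build a request that downloads the whole blob."""
    return urllib.request.Request(blob_sas_url, method="GET")


upload = build_put_blob_request(BLOB_SAS_URL, b"a,b\r\n1,2\r\n", access_tier="Cool")
download = build_get_blob_request(BLOB_SAS_URL)
# To send a request against a real account: urllib.request.urlopen(upload)
```

For production code the Azure SDK or CLI is usually more convenient, since it handles authentication, retries, and large uploads.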
Warning
Azure Blob Storage does not enforce any schema on your files. Always validate and parse unstructured data at the application level.
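As one example of application-level validation, the sketch below checks that every row of a downloaded CSV file has the same number of fields as its header. The function name `find_malformed_rows` is hypothetical, and the input is assumed to be UTF-8 unless another encoding is passed.

```python
import csv
import io


def find_malformed_rows(raw: bytes, encoding: str = "utf-8") -> list[int]:
    """Return 1-based row numbers whose field count differs from the header."""
    text = raw.decode(encoding)
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return []
    expected = len(rows[0])  # the header row defines the expected width
    return [
        number
        for number, row in enumerate(rows[1:], start=2)
        if len(row) != expected
    ]


sample = (
    b"First Name,Last Name,City\r\n"
    b"Peter,Vogel,London\r\n"
    b"Jason,van de Velde\r\n"  # missing the City field
)
bad_rows = find_malformed_rows(sample)
```

With this sample, `bad_rows` is `[3]`, because the third row has two fields where the header declares three.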