AZ-400: Designing and Implementing Microsoft DevOps Solutions
Configuring and Managing Repositories
Purge Data from Source Control
In this article, we explore a critical aspect of repository management: purging data from source control. Purging involves removing unnecessary, large, or sensitive files from your repository to maintain a clean, efficient, and secure codebase. This process not only optimizes repository performance but also ensures that committed sensitive information, such as passwords or API keys, is completely expunged from your version history.
Best Practice
Always rehearse the purge on a fresh clone or backup copy, and verify the result there, before permanently rewriting the history of the shared repository.
Why Purge Files?
Purging files from your repository can offer several benefits:
- Improved Performance: Reduces repository size and speeds up operations.
- Accidental Commits: Removes large files committed by mistake.
- Security: Eliminates files containing sensitive data like passwords or API keys.
Repository Cleanup Tools
There are two primary tools for repository cleanup:
- git filter-repo – A versatile tool that allows for selective rewriting of repository history.
- BFG Repo-Cleaner – A faster, simpler tool that suits straightforward cases such as deleting files or replacing text.
Practical Examples
Below are some practical examples of how to remove large, unnecessary, or sensitive files from your repository.
Removing Large or Unnecessary Files
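To delete a large or unnecessary file from every commit and reduce repository size, use one of the following commands. BFG matches on the file name alone, while git filter-repo matches on the path relative to the repository root:
bfg --delete-files archive.tar.gz
or
git filter-repo --path archive.tar.gz --invert-paths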
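Before deleting anything, it helps to know which files are actually inflating the repository. The following is a minimal sketch that uses only standard Git plumbing commands; the function name list_largest_blobs is hypothetical and is not part of either cleanup tool.

```shell
# Hypothetical helper: print the largest blobs in the full history as "<size-in-bytes> <path>".
# Usage: list_largest_blobs [count]   (count defaults to 10)
list_largest_blobs() {
  # rev-list emits "<object-id> <path>" for every object reachable from any ref;
  # cat-file --batch-check then reports each object's type and size, and
  # %(rest) carries the path through unchanged.
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |   # keep blobs only and drop the type column
    sort -k1,1 -n -r |       # largest first
    head -n "${1:-10}"
}
```

Running list_largest_blobs 5 inside a clone lists the five largest blobs, which can then be passed to the commands above. Both tools also accept a --strip-blobs-bigger-than option (for example, with a value such as 10M) when a size threshold is more convenient than naming files.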
Removing Sensitive Content
If a file containing sensitive content (such as API keys or credentials) has been accidentally committed, remove the whole file from history with one of these commands:
bfg --delete-files yourfile.ext
or
git filter-repo --path yourfile.ext --invert-paths
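To remove sensitive text from your repository history while keeping the files themselves, list the strings to replace in an expressions file (passwords.txt here) and run: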
bfg --replace-text passwords.txt
or
git filter-repo --replace-text passwords.txt
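The expressions file lists one pattern per line. As a rough sketch, the snippet below writes a sample passwords.txt; the values hunter2 and AKIAEXAMPLEKEY are made-up placeholders, not real secrets. In both tools a plain line is treated as literal text and replaced with ***REMOVED*** by default, a trailing ==> chooses a different replacement, and a regex: prefix enables pattern matching.

```shell
# Path of the expressions file passed to --replace-text (placeholder name from the article).
expressions_file="passwords.txt"

# Line 1: literal string, replaced with the default ***REMOVED***
# Line 2: literal string with a custom replacement after "==>"
# Line 3: regular expression that blanks whatever follows "password="
cat > "$expressions_file" <<'EOF'
hunter2
AKIAEXAMPLEKEY==>REDACTED_KEY
regex:password=\w+==>password=
EOF
```

The regular-expression dialect differs between the tools (BFG runs on the JVM, git filter-repo on Python), so it is safer to keep patterns simple and test them on a scratch clone first.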
These examples highlight the flexibility and power of both cleanup tools.
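After a rewrite, it is worth checking that the sensitive string no longer appears in any reachable commit. One possible check, sketched here with Git's pickaxe search, is shown below; the function name secret_in_history is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) if any commit reachable from any ref
# adds or removes the given string, and fails otherwise.
# Usage: secret_in_history 'string-to-look-for'
secret_in_history() {
  needle="$1"
  # -S selects commits that change the number of occurrences of the string;
  # --all searches every branch and tag, not just the current one.
  [ -n "$(git log --all --oneline -S"$needle")" ]
}
```

For example, `if secret_in_history 'hunter2'; then echo "still present"; fi` prints a warning when the placeholder string is still found. This only inspects commits reachable from refs in the local clone, so it complements rather than replaces a check on the remote.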
Final Steps After Purging
After purging the unwanted data, complete these final steps to ensure consistency and integrity across your development environment:
Force Push the Changes:
Update your remote repository with the cleaned history by running:
git push --force
Notify Your Collaborators:
Ask your team to reclone the repository rather than pull, so that everyone's local copy reflects the revised history and no one pushes the old commits back.
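The force-push step can be sketched as a small function. This is an illustration under stated assumptions, not a prescribed procedure: the function name publish_rewritten_history is hypothetical, the remote is assumed to be called origin, and the URL is supplied by the caller. It accounts for two tool behaviors: git filter-repo removes the origin remote after rewriting, and the BFG documentation recommends expiring reflogs and running garbage collection so the old objects are dropped locally.

```shell
# Hypothetical helper: publish a rewritten history to the remote named "origin".
# Usage: publish_rewritten_history <remote-url>
publish_rewritten_history() {
  remote_url="$1"

  # git filter-repo removes "origin" as a safety measure, so re-add it if it is missing.
  if ! git remote get-url origin >/dev/null 2>&1; then
    git remote add origin "$remote_url"
  fi

  # Drop local references to the old history so the purged objects can be pruned.
  git reflog expire --expire=now --all
  git gc --prune=now --aggressive

  # --all and --tags are pushed separately: branches first, then tags.
  git push --force --all origin
  git push --force --tags origin
}
```

Pushing every branch and tag matters because both tools rewrite all refs; force pushing only the current branch would leave the old data reachable from the others on the remote.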
Important Reminder
After purging data and force pushing the changes, it is crucial to communicate with your team immediately to prevent potential conflicts or outdated references.