Demo Integrations Entity Provider

Backstage can automatically discover entities by scanning GitHub repositories for catalog-info.yaml files. In this guide, you’ll install and configure the GitHub Entity Provider to import components, systems, APIs, and more without registering each one manually.

1. Install the GitHub Discovery Plugin

Backstage provides a dedicated backend module for GitHub discovery. From your repo root, run:

yarn --cwd packages/backend add @backstage/plugin-catalog-backend-module-github

2. Register the Plugin in the Backend

Edit packages/backend/src/index.ts to include the GitHub module:

// packages/backend/src/index.ts

import { createBackend } from '@backstage/backend-defaults';

// createBackend() returns a backend instance with the default core services
const backend = createBackend();

// Core plugins
backend.add(import('@backstage/plugin-search-backend'));
backend.add(import('@backstage/plugin-search-backend-module-pg'));
backend.add(import('@backstage/plugin-search-backend-module-catalog'));
backend.add(import('@backstage/plugin-search-backend-module-techdocs'));
backend.add(import('@backstage/plugin-kubernetes-backend'));
backend.add(import('@backstage/plugin-catalog-backend'));

// Add GitHub discovery (a module that extends the catalog plugin above)
backend.add(import('@backstage/plugin-catalog-backend-module-github'));

backend.start();

Note

If you see duplicate registrations for the GitHub module, remove the extras to avoid startup errors.
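
One way to spot duplicate registrations before starting the backend is to scan the source of index.ts for repeated backend.add(import('...')) calls. The helper below is a minimal sketch: findDuplicateRegistrations is a hypothetical name, not part of Backstage, and the regular expression only recognizes the single-line registration style used in this guide.

```typescript
// Hypothetical helper (not a Backstage API): returns the package names that
// appear in more than one backend.add(import('...')) call in the given source.
function findDuplicateRegistrations(source: string): string[] {
  const pattern = /backend\.add\(\s*import\(\s*['"]([^'"]+)['"]\s*\)\s*\)/g;
  const counts: Record<string, number> = {};
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const name = match[1];
    counts[name] = (counts[name] ?? 0) + 1;
  }
  // Keep only the packages registered more than once
  return Object.keys(counts).filter(name => counts[name] > 1);
}

// Sample input with the GitHub module registered twice
const sampleIndexSource = `
backend.add(import('@backstage/plugin-catalog-backend'));
backend.add(import('@backstage/plugin-catalog-backend-module-github'));
backend.add(import('@backstage/plugin-catalog-backend-module-github'));
`;

console.log(findDuplicateRegistrations(sampleIndexSource));
// prints [ '@backstage/plugin-catalog-backend-module-github' ]
```

In practice you would read packages/backend/src/index.ts from disk and pass its contents to the helper.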

Now restart the backend. In the default app template the dev script is defined in the root package.json, so run it from the repo root:

yarn dev

3. Configure the GitHub Entity Provider

Add a github provider under catalog.providers in app-config.yaml:

catalog:
  providers:
    github:
      personalAccount:
        organization: 'Sanjeev-Thiyagarajan'
        catalogPath: '/catalog-info.yaml'
        filters:
          branch: 'main'
          repository: '.*'
        schedule:
          frequency: { minutes: 20 }
          timeout: { minutes: 3 }

  import:
    entityFilename: catalog-info.yaml
    pullRequestBranchName: backstage-integration
  rules:
    - allow: [Component, System, API, Resource, Location, Group, User, Domain]

| Configuration Key | Description |
| --- | --- |
| providerId | Unique identifier (personalAccount in this example) |
| organization | GitHub user/org name (where to scan for entities) |
| catalogPath | Path to each repo’s catalog-info.yaml |
| filters.branch | Branch to scan (e.g., main) |
| filters.repository | Regex for repo names (e.g., .* for all) |
| schedule.frequency | How often to poll GitHub (e.g., every 20 minutes) |
| schedule.timeout | Max time per scan (e.g., 3 minutes) |

Warning

Ensure catalogPath matches your catalog-info.yaml location exactly. A mismatch prevents discovery.
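
A few of these mistakes can be caught before Backstage starts. The sketch below models one provider entry as a plain TypeScript object and runs three sanity checks on it. The GithubProviderSettings type and the checkProvider function are illustrative names, not types or functions exported by the GitHub module, and the checks themselves (leading slash, valid regex, timeout shorter than frequency) are assumptions about what a sensible configuration looks like rather than rules Backstage enforces.

```typescript
// Illustrative types only; they mirror the configuration keys used in this
// guide and are not the types exported by the Backstage GitHub module.
interface Duration { minutes?: number; seconds?: number }

interface GithubProviderSettings {
  organization: string;
  catalogPath: string;
  filters: { branch: string; repository: string };
  schedule: { frequency: Duration; timeout: Duration };
}

function toSeconds(d: Duration): number {
  return (d.minutes ?? 0) * 60 + (d.seconds ?? 0);
}

// Returns human-readable problems; an empty array means no problem was found.
function checkProvider(id: string, p: GithubProviderSettings): string[] {
  const problems: string[] = [];
  // Assumed convention: the path is written relative to the repo root with a leading slash
  if (p.catalogPath.charAt(0) !== '/') {
    problems.push(`${id}: catalogPath should start with "/"`);
  }
  try {
    new RegExp(p.filters.repository);
  } catch {
    problems.push(`${id}: filters.repository is not a valid regular expression`);
  }
  // Assumed sanity check: a scan should finish before the next one is due
  if (toSeconds(p.schedule.timeout) >= toSeconds(p.schedule.frequency)) {
    problems.push(`${id}: schedule.timeout should be shorter than schedule.frequency`);
  }
  return problems;
}

const personalAccount: GithubProviderSettings = {
  organization: 'Sanjeev-Thiyagarajan',
  catalogPath: '/catalog-info.yaml',
  filters: { branch: 'main', repository: '.*' },
  schedule: { frequency: { minutes: 20 }, timeout: { minutes: 3 } },
};

console.log(checkProvider('personalAccount', personalAccount)); // prints []
```

Note that a check like this cannot confirm that the file actually exists at catalogPath in each repository; that still has to be verified against GitHub.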

4. Verify Automatic Discovery

Restart Backstage so the backend loads the updated app-config.yaml:

yarn dev

Backstage will poll your GitHub account and import any entities it finds. For example, you might see:

[Screenshot: the "My Company Catalog" page in Backstage listing three components, "auth-service," "example-website," and "shopping-cart," with their system, owner, type, lifecycle, and tags.]

Under the hood, it scans each repo’s main branch for catalog-info.yaml:

catalog:
  providers:
    github:
      personalAccount:
        organization: 'Sanjeev-Thiyagarajan'
        catalogPath: '/catalog-info.yaml'
        filters:
          branch: 'main'
          repository: '.*'
        schedule:
          frequency: { minutes: 20 }
          timeout: { minutes: 2 }

  import:
    entityFilename: catalog-info.yaml
    pullRequestBranchName: backstage-integration
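
To make the scanning step concrete, here is a simplified model of how the filters and catalogPath settings could be combined into candidate file locations. It is only a sketch: the real provider talks to the GitHub API, while this version works on an in-memory list of repositories. The Repo type, the candidateLocations function, the second repository name, and the repository URLs are all made up for illustration, and treating filters.repository as an unanchored regular expression is an assumption.

```typescript
// Simplified, in-memory model of discovery. Names and URLs are illustrative.
interface Repo { name: string; url: string; defaultBranch: string }

interface DiscoverySettings {
  catalogPath: string;
  filters: { branch?: string; repository?: string };
}

// Builds one candidate descriptor URL per repository that passes the name filter.
function candidateLocations(repos: Repo[], settings: DiscoverySettings): string[] {
  const repoPattern = new RegExp(settings.filters.repository ?? '.*');
  return repos
    .filter(repo => repoPattern.test(repo.name))
    .map(repo => {
      // Fall back to the repository's default branch when no branch filter is set
      const branch = settings.filters.branch ?? repo.defaultBranch;
      return `${repo.url}/blob/${branch}${settings.catalogPath}`;
    });
}

const sampleRepos: Repo[] = [
  {
    name: 'backstage-shopping-cart',
    url: 'https://github.com/Sanjeev-Thiyagarajan/backstage-shopping-cart',
    defaultBranch: 'main',
  },
  {
    name: 'scratch-notes', // hypothetical second repository
    url: 'https://github.com/Sanjeev-Thiyagarajan/scratch-notes',
    defaultBranch: 'main',
  },
];

const sampleSettings: DiscoverySettings = {
  catalogPath: '/catalog-info.yaml',
  filters: { branch: 'main', repository: '.*' },
};

for (const location of candidateLocations(sampleRepos, sampleSettings)) {
  console.log(location);
}
```

With repository set to '.*' every repository yields a candidate location. Narrowing the pattern (for example to '^backstage-') would limit discovery to repositories whose names match.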

For example, the backstage-shopping-cart repo contains:

[Screenshot: the GitHub page for the private "backstage-shopping-cart" repository, showing its files and folders. The repository is mostly JavaScript and has no README.]

5. Importing from a GitHub Organization

To import from another GitHub org (e.g., shopping-hub), add a second provider block:

catalog:
  providers:
    github:
      personalAccount:
        organization: 'Sanjeev-Thiyagarajan'
        catalogPath: '/catalog-info.yaml'
        filters:
          branch: 'main'
          repository: '.*'
        schedule:
          frequency: { minutes: 20 }
          timeout: { minutes: 2 }

      shoppingHub:
        organization: 'shopping-hub'
        catalogPath: '/catalog-info.yaml'
        filters:
          branch: 'main'
          repository: '.*'
        schedule:
          frequency: { minutes: 20 }
          timeout: { minutes: 2 }

  import:
    entityFilename: catalog-info.yaml
    pullRequestBranchName: backstage-integration
  rules:
    - allow: [Component, System, API, Resource, Location, Group, User, Domain]
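
The rules entry controls which entity kinds the catalog accepts, so a discovered catalog-info.yaml whose kind is not listed will not be ingested. The check can be pictured roughly as follows. This is a sketch, not Backstage's implementation: CatalogRule and isKindAllowed are illustrative names, and comparing kinds case-insensitively is an assumption made here.

```typescript
// Illustrative model of an allow-list rule; not a Backstage type.
interface CatalogRule { allow: string[] }

// Assumption for this sketch: kind names are compared case-insensitively.
function isKindAllowed(rules: CatalogRule[], kind: string): boolean {
  const wanted = kind.toLowerCase();
  return rules.some(rule => rule.allow.some(k => k.toLowerCase() === wanted));
}

// The allow list from the configuration in this guide
const catalogRules: CatalogRule[] = [
  { allow: ['Component', 'System', 'API', 'Resource', 'Location', 'Group', 'User', 'Domain'] },
];

console.log(isKindAllowed(catalogRules, 'Component')); // prints true
console.log(isKindAllowed(catalogRules, 'Template'));  // prints false
```

If an expected entity never shows up after discovery, checking its kind against this list is a reasonable first step.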

Here’s the GitHub UI showing three private repos in shopping-hub:

[Screenshot: the shopping-hub repository list on GitHub with three private JavaScript repositories, app3, app2, and app1, all updated on January 4th.]

After restarting, Backstage imports all three:

[Screenshot: the "My Company Catalog" page in Backstage listing the imported components with their name, system, owner, type, lifecycle, description, and tags.]

Each repo has its own catalog-info.yaml. For instance, app3 contains YAML and JSON definitions:

[Screenshot: the GitHub page for the private "app3" repository, showing its files and folders, including YAML and JSON files. The main language is JavaScript.]
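
The contents of these descriptor files are not reproduced in this guide, but the general shape of a Component entity can be sketched as a TypeScript object. The field names follow the Backstage descriptor format (apiVersion, kind, metadata, spec), while every value below is hypothetical and chosen only for illustration. The missingFields helper is likewise a made-up name for a minimal completeness check.

```typescript
// Shape of a Component descriptor, expressed as a TypeScript type for illustration.
interface ComponentEntity {
  apiVersion: string;
  kind: string;
  metadata: { name: string; description?: string; tags?: string[] };
  spec: { type: string; lifecycle: string; owner: string; system?: string };
}

// Hypothetical values; the real app3 descriptor is not shown in this guide.
const app3Entity: ComponentEntity = {
  apiVersion: 'backstage.io/v1alpha1',
  kind: 'Component',
  metadata: { name: 'app3', tags: ['javascript'] },
  spec: { type: 'service', lifecycle: 'production', owner: 'team-a' },
};

// Lists required fields that are empty, as a quick completeness check.
function missingFields(entity: ComponentEntity): string[] {
  const missing: string[] = [];
  if (!entity.metadata.name) missing.push('metadata.name');
  if (!entity.spec.type) missing.push('spec.type');
  if (!entity.spec.lifecycle) missing.push('spec.lifecycle');
  if (!entity.spec.owner) missing.push('spec.owner');
  return missing;
}

console.log(missingFields(app3Entity)); // prints []
```

The same fields appear as columns (owner, type, lifecycle, tags) in the catalog table once the entity has been imported.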

Next Steps

Backstage supports additional entity providers such as GitLab, Amazon S3, and more.
Refer to the Backstage documentation for full integration guides and advanced configuration.
