This lesson shows how to extract AI-driven insights from video and audio using custom models. You will learn how to recognize people, improve transcriptions with domain-aware language models, and identify brands. Together, these improve searchability, content understanding, and personalization for video assets.

What you can build

  • Face recognition and tracking across video timelines for people analytics and personalization.
  • Domain-customized transcription for industry-specific vocabulary and multilingual audiences.
  • Brand detection (logos, product mentions) to enable content classification and rights management.

Custom model types and use cases

Model type | Typical use case | Benefit
People (facial recognition) | Identify and track individuals across footage | Personalization, credits, analytics
Language (custom transcription/translation) | Recognize domain-specific terms and translate transcripts | Higher accuracy, multilingual distribution
Brand detection | Locate logos or named products in video frames | Rights tracking, advertising analytics
You can combine these models to create richer metadata for indexing, search, and downstream automation.
To enable facial recognition, create a Face resource in Azure AI Services and connect it to Video Indexer so that indexed videos can use face models.
Face recognition may require special approval from Microsoft and may be restricted in some regions or for certain accounts. Request and obtain the required access before using face recognition features.
[Slide: "Building Insights", with three numbered panels: 01 People (facial recognition), 02 Language (domain-specific transcription/terminology), and 03 Brand (detect product/company names).]
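The idea of combining model outputs into richer metadata can be sketched as follows. This is only an illustration: the `(label, start_seconds)` tuples, the `build_search_terms` helper, and the sample names are hypothetical simplifications and do not reflect the actual Video Indexer insights schema.

```python
from collections import defaultdict


def build_search_terms(people, transcript_terms, brands):
    """Merge per-model detections into one searchable index.

    Each argument is a list of (label, start_seconds) tuples. That shape is
    an assumption made for this sketch, not the Video Indexer schema.
    Returns a dict mapping "source:label" to a sorted list of start times.
    """
    index = defaultdict(set)
    sources = (("person", people), ("term", transcript_terms), ("brand", brands))
    for source, detections in sources:
        for label, start in detections:
            index[f"{source}:{label.lower()}"].add(start)
    return {key: sorted(times) for key, times in index.items()}


# Hypothetical detections from the three custom model types.
people = [("Jane Doe", 48.5), ("Jane Doe", 12.0)]
terms = [("responsible AI", 3.2)]
brands = [("Contoso", 48.5)]

search_terms = build_search_terms(people, terms, brands)
```

Keying entries by source keeps a person named "Contoso" distinct from a brand of the same name, which matters once the index feeds search or rights tracking.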

Indexer options: widgets vs REST API

Video Indexer provides two primary integration patterns:
  • Widgets (embed/iframe): Quick, interactive visualization of insights (topics, people, scenes, transcripts) you can drop into web pages. Use this when you want a low-code front-end integration and interactive playback.
  • REST API: Programmatic access to metadata, management, and automation. Choose the API for CI/CD, custom dashboards, batch processing, or workflows that integrate with other services.
Use widgets when you want to embed the full insight UI. If the video is private, make sure viewers are authenticated and have permission before you embed it. For automated scenarios, such as extracting metadata in pipelines, call the REST API to fetch structured data and orchestrate processing.
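For the automation path, a common pattern is to poll a video's metadata until indexing finishes. The sketch below assumes the metadata exposes a `processingProgress` field that reads "100%" when done, as in the sample response in this lesson. The `wait_until_indexed` helper and the injected `fetch_metadata` callable are hypothetical; a real pipeline would replace the canned data with an authenticated REST call.

```python
import time
from typing import Callable, Dict


def wait_until_indexed(
    fetch_metadata: Callable[[], Dict],
    poll_seconds: float = 10.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until processingProgress reports "100%", then return the metadata."""
    for attempt in range(max_attempts):
        metadata = fetch_metadata()
        if metadata.get("processingProgress") == "100%":
            return metadata
        if attempt < max_attempts - 1:
            sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"Video not indexed after {max_attempts} attempts")


# Stand-in for a real API call: replays a canned sequence of responses.
_canned = iter([
    {"processingProgress": "40%"},
    {"processingProgress": "100%", "id": "a12345bc6"},
])
result = wait_until_indexed(lambda: next(_canned), sleep=lambda _seconds: None)
```

Injecting the fetch and sleep functions keeps the loop testable without network access or real delays.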
[Slide: "Video Indexer Widgets and API", with a "REST API for Automation" callout that reads "Retrieve video metadata, including account details, duration, processing status, and language." The slide carries a "© Copyright KodeKloud" note.]

Sample REST API response (illustrative)

The JSON shown below is a simplified example of metadata returned by the Video Indexer APIs. Actual responses may include additional fields depending on the request parameters and enabled features.
{
  "results": [
    {
      "accountId": "1234abcd-9876fghi-0156kihb-00123",
      "id": "a12345bc6",
      "name": "Responsible AI",
      "description": "Microsoft Responsible AI video",
      "created": "2021-01-05T15:33:58.918+00:00",
      "lastModified": "2021-01-05T15:50:03.123+00:00",
      "lastIndexed": "2021-01-05T15:34:08.007+00:00",
      "processingProgress": "100%",
      "durationInSeconds": 114,
      "sourceLanguage": "en-US"
    }
  ]
}
Key metadata fields to use in automation and analytics:
  • accountId, id: identify the account and video resource.
  • name, description: human-readable labels for UI and reports.
  • created, lastModified, lastIndexed: timeline for processing and audits.
  • processingProgress: track indexing status for orchestration.
  • durationInSeconds, sourceLanguage: media properties for players and translations.
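As a rough illustration of using these fields in automation, the sketch below parses a trimmed copy of the sample response into a typed record. The `VideoSummary` class and `parse_video` function are hypothetical names introduced here, and the code assumes `processingProgress` is always a percentage string such as "100%", as in the sample.

```python
import json
from dataclasses import dataclass
from datetime import datetime

# Trimmed copy of the illustrative response shown above.
SAMPLE = """
{
  "results": [
    {
      "id": "a12345bc6",
      "name": "Responsible AI",
      "lastIndexed": "2021-01-05T15:34:08.007+00:00",
      "processingProgress": "100%",
      "durationInSeconds": 114,
      "sourceLanguage": "en-US"
    }
  ]
}
"""


@dataclass
class VideoSummary:
    video_id: str
    name: str
    duration_seconds: int
    source_language: str
    progress_percent: int
    last_indexed: datetime


def parse_video(item: dict) -> VideoSummary:
    """Convert one entry of "results" into a typed summary."""
    return VideoSummary(
        video_id=item["id"],
        name=item["name"],
        duration_seconds=int(item["durationInSeconds"]),
        source_language=item["sourceLanguage"],
        # Assumes a string like "100%"; strip the sign to get a number.
        progress_percent=int(item["processingProgress"].rstrip("%")),
        last_indexed=datetime.fromisoformat(item["lastIndexed"]),
    )


summaries = [parse_video(video) for video in json.loads(SAMPLE)["results"]]
ready = [s for s in summaries if s.progress_percent == 100]
```

Converting the progress string and timestamps once, at the boundary, lets downstream code filter and sort on real numbers and dates instead of raw strings.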

Embedding, access, and automation tips

  • Embedding: copy the widget iframe code from Video Indexer to display the indexed video and its insights on your web pages. A direct link to a video in the portal looks like this:
https://www.videoindexer.ai/accounts/0ae29563-3796-4210-b2f0-0590d4a45948/videos/slut1smuc
  • Access control: private videos require viewer authentication and permission; public videos can be embedded broadly.
  • Automation: use the REST API for retrieving metadata, downloading assets (transcripts, thumbnails), and integrating insights into search indexes, CMS platforms, and analytics pipelines.
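When a pipeline needs to link back to indexed videos, the direct URL pattern shown above can be built from the `accountId` and `id` metadata fields. This is a minimal sketch that assumes the pattern holds for every video; the `video_url` helper is a hypothetical name, and the identifiers are the placeholder values from the sample response. For embedding, copy the iframe code from Video Indexer itself rather than constructing it by hand.

```python
from urllib.parse import quote

PORTAL_BASE = "https://www.videoindexer.ai"


def video_url(account_id: str, video_id: str) -> str:
    """Build a direct portal link following the URL pattern in this lesson."""
    # Percent-encode both identifiers so unexpected characters cannot
    # alter the path structure.
    account = quote(account_id, safe="")
    video = quote(video_id, safe="")
    return f"{PORTAL_BASE}/accounts/{account}/videos/{video}"


# Placeholder identifiers taken from the illustrative sample response.
url = video_url("1234abcd-9876fghi-0156kihb-00123", "a12345bc6")
```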
For full endpoint details, authentication methods, and examples, see the Video Indexer REST API reference.
This concludes the lesson on building insights with Video Indexer. Use custom models and the API to transform raw media into searchable, actionable intelligence.
