# Module Introduction

> Guide to building custom document classifiers and named entity extraction models covering labeling, training, evaluation, deployment, and post-deployment monitoring and best practices.

## Custom classification and named-entity extraction

Text analytics platforms provide powerful prebuilt capabilities, such as entity recognition and document classification, that work well out of the box and require no training. These prebuilt features are excellent for general scenarios like detecting people's names, dates, or common PII in documents.

When your application needs to recognize domain-specific items (for example, medical terminology, contract clauses, or proprietary product SKUs) or apply your organization's own document categories, you'll need custom models trained on labeled examples. This module walks through the full lifecycle: labeling data, training models, evaluating performance, and deploying production endpoints for real-time inference.

Below are the learning objectives for this lesson.

A presentation slide titled "Learning Objectives" listing three numbered items: 01 Document labeling and model training, 02 Performance evaluation, and 03 Model deployment.

## Learning objectives (overview)

* Document labeling and model training\
  Learn how to label documents and annotate text spans for both classification and named-entity extraction. Labeling is the manual process of tagging documents or text fragments with the categories and entity types you want the model to learn. Those labeled examples form the training set for a custom machine-learning model.
* Performance evaluation\
  Learn to evaluate custom models with standard metrics such as precision, recall, and F1 score. These metrics quantify model behavior on held-out test data, reveal weaknesses (for example, poor recall on rare classes), and guide iterative improvements.
* Model deployment\
  Learn how to deploy a trained model as a REST API endpoint so your application can call it in real time to classify new documents or extract custom entities.

## Quick-reference: objectives and outcomes

| Objective | Key activities | Deliverable / outcome | Example use case |
| --- | --- | --- | --- |
| Document labeling & training | Define labels, annotate examples, prepare datasets, run training | A trained custom model ready for evaluation | Classify invoices vs. contracts; extract medication names from clinical notes |
| Performance evaluation | Split data, calculate precision/recall/F1, analyze confusion matrix | Metrics and error analysis guiding data/model improvements | Identify low-performing classes and add more labeled examples |
| Model deployment | Create REST endpoint, secure access, monitor predictions | Production endpoint for real-time inference and monitoring | Integrate into ingestion pipeline to tag documents on arrival |

Use prebuilt text analytics features when they meet your needs (e.g., general entity recognition for common named entities).
Choose custom models when you must detect domain-specific entities or apply organization-specific classifications that prebuilt models cannot capture.

## Best practices covered in this module

* Labeling guidelines: tips for creating high-quality, consistent annotations (for example: label spans consistently, write clear label definitions, and include edge cases).
* Balanced datasets: approaches to handling class imbalance, such as targeted labeling, data augmentation, or sampling strategies.
* Iterative evaluation: how to use metrics and error analysis to prioritize where to add more labeled data or adjust modeling choices.
* Monitoring after deployment: methods for tracking model drift, collecting real-world feedback, and scheduling retraining.

## Links and references

* [Introduction to Named Entity Recognition (NER)](https://en.wikipedia.org/wiki/Named-entity_recognition): conceptual overview of entity extraction.
* [Evaluation metrics for classification](https://developers.google.com/machine-learning/crash-course/classification/precision-and-recall): primer on precision, recall, and F1.
* [Text analytics and custom model guidance](https://learn.microsoft.com/azure/cognitive-services/text-analytics/): vendor documentation and examples for deploying text analytics solutions.

Throughout this module you will learn practical steps and tools to create robust custom text models, plus workflows to maintain model quality after deployment.
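
To make the performance evaluation objective concrete, here is a minimal sketch of how per-class precision, recall, and F1 could be computed for a single-label document classifier. It is not tied to any particular platform: the function name `per_class_metrics` and the sample labels are hypothetical, and a real project would more likely rely on the metrics reported by its training service or an established library.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gained a false positive
            fn[truth] += 1  # the true class gained a false negative

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]  # documents predicted as this class
        actual = tp[label] + fn[label]     # documents truly in this class
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical held-out test set: true labels vs. model predictions.
y_true = ["invoice", "invoice", "invoice", "contract", "contract", "contract"]
y_pred = ["invoice", "invoice", "contract", "contract", "contract", "contract"]

metrics = per_class_metrics(y_true, y_pred)
for label, (precision, recall, f1) in metrics.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up sample, one invoice is misclassified as a contract, so the invoice class loses recall while the contract class loses precision. Reading the metrics per class, rather than as a single average, is what points to where more labeled examples are needed.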