This guide demonstrates the core text-analysis capabilities available in Azure AI Language Services. It covers language detection, key phrase extraction, sentiment analysis, named entity recognition (NER), entity linking, summarization, and PII detection, with REST-style JSON payload examples and concise Python SDK snippets that illustrate common usage patterns. Use these features to build multilingual, privacy-aware, and searchable applications that extract meaningful information from unstructured text.
Never hard-code secrets (endpoint, keys) in production code. Store credentials in environment variables or a secure secrets store and load them at runtime.
For local testing, put your Azure Language endpoint and key in environment variables (or a .env file) and load them at runtime. The samples below assume you already have those values available.
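A minimal sketch of that loading pattern follows. The variable names `AZURE_LANGUAGE_ENDPOINT` and `AZURE_LANGUAGE_KEY` and the helper function are illustrative, not part of any SDK; the demo fallback values exist only so the sketch runs standalone.

```python
import os

# Demo-only fallbacks so the sketch runs standalone; in practice these come
# from your shell environment or a .env file, never from source code.
os.environ.setdefault("AZURE_LANGUAGE_ENDPOINT", "https://example.cognitiveservices.azure.com/")
os.environ.setdefault("AZURE_LANGUAGE_KEY", "placeholder-key")

def load_language_config():
    """Read the endpoint and key from the environment, failing fast if missing."""
    endpoint = os.environ.get("AZURE_LANGUAGE_ENDPOINT")
    key = os.environ.get("AZURE_LANGUAGE_KEY")
    if not endpoint or not key:
        raise RuntimeError(
            "Set AZURE_LANGUAGE_ENDPOINT and AZURE_LANGUAGE_KEY before running."
        )
    return endpoint, key

endpoint, key = load_language_config()
print(f"Using endpoint: {endpoint}")
```

Failing fast with a clear message beats letting the SDK raise a confusing authentication error later.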
Language detection identifies the language of a piece of text and returns a confidence score. It works across many scripts (Latin, Arabic, Chinese, and so on). You can optionally provide a country hint to influence detection, but it is not required.
Example request payload (REST-style):
```json
{
  "documents": [
    { "id": "1", "countryHint": "ES", "text": "Hola, ¿cómo estás?" },
    { "id": "2", "text": "Guten Morgen, wie geht es Ihnen?" }
  ]
}
```
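If you call the REST endpoint directly, a payload like this can be assembled programmatically. A minimal sketch, where the helper name is hypothetical and the country hint is an ISO country code:

```python
import json

def build_detect_language_payload(texts, country_hints=None):
    """Build a language-detection request body; country_hints maps a
    document index to an optional ISO country code."""
    country_hints = country_hints or {}
    documents = []
    for i, text in enumerate(texts):
        doc = {"id": str(i + 1), "text": text}
        if i in country_hints:
            doc["countryHint"] = country_hints[i]
        documents.append(doc)
    return {"documents": documents}

payload = build_detect_language_payload(
    ["Hola, ¿cómo estás?", "Guten Morgen, wie geht es Ihnen?"],
    country_hints={0: "ES"},
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Keeping payload construction in one helper makes it easy to batch documents and to keep IDs unique across a request.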
Key phrase extraction pulls out the prominent words and short phrases (topics) in a text. This is helpful for search indexing, content tagging, and summarization, and works best on longer passages.
Request payload example:
```json
{
  "documents": [
    { "id": "1", "language": "en", "text": "Artificial intelligence is transforming industries with automation and analytics." },
    { "id": "2", "language": "en", "text": "Climate change is a critical issue that affects global economies and ecosystems." }
  ]
}
```
Python SDK example — extract key phrases from a single long document:
```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"
key = "<your-key>"

documents = [
    "Golden retrievers are one of the most popular dog breeds, known for their friendly, "
    "intelligent, and devoted nature. They are excellent family pets and are often used as "
    "guide dogs, therapy dogs, and in search-and-rescue operations due to their trainability "
    "and gentle temperament."
]

client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
response = client.extract_key_phrases(documents=documents)

for idx, doc in enumerate(response):
    print(f"\nText: {documents[idx]}")
    if not doc.is_error:
        print("\nKey Phrases:")
        for phrase in doc.key_phrases:
            print(f"  - {phrase}")
    else:
        print(f"Document error: {doc.error}")
```
Sentiment analysis classifies documents (and individual sentences) as positive, neutral, negative, or mixed and returns confidence scores. It is useful for product feedback, social-media analysis, and customer support automation.
Request example:
```json
{
  "documents": [
    { "id": "1", "language": "en", "text": "I love the new design! However, the app crashes frequently, which is frustrating." }
  ]
}
```
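The service also scores each sentence, which is how a document like this one can end up labeled mixed. Here is a sketch of walking a sentence-level breakdown, using a hand-written response shaped like the REST sentiment output; the labels below are illustrative, not actual service output:

```python
# Hand-written response mirroring the REST sentiment output shape (illustrative values)
response = {
    "documents": [{
        "id": "1",
        "sentiment": "mixed",
        "sentences": [
            {"text": "I love the new design!", "sentiment": "positive"},
            {"text": "However, the app crashes frequently, which is frustrating.", "sentiment": "negative"},
        ],
    }]
}

def sentence_breakdown(resp):
    """Collect (sentence, label) pairs from a sentiment response."""
    doc = resp["documents"][0]
    return [(s["text"], s["sentiment"]) for s in doc["sentences"]]

for text, label in sentence_breakdown(response):
    print(f"{label:>8}: {text}")
```

Sentence-level labels are what make mixed documents actionable: you can route the negative sentences to a bug tracker while counting the positive ones toward a satisfaction metric.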
NER extracts entities such as people, organizations, locations, datetimes, addresses, emails, and URLs from text. Use this to populate structured metadata, build knowledge graphs, or enhance search relevance.
Request example:
```json
{
  "documents": [
    { "id": "1", "language": "en", "text": "Elon Musk announced a new Tesla model in California last Friday." }
  ]
}
```
Python SDK example — recognize and list named entities:
```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"
key = "<your-key>"

documents = ["The capital of the United States is Washington, D.C."]

client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
response = client.recognize_entities(documents=documents)

for idx, doc in enumerate(response):
    print(f"\nText: {documents[idx]}")
    if not doc.is_error:
        print("\nNamed Entities:")
        for entity in doc.entities:
            print(f"- {entity.text} ({entity.category}, Confidence: {entity.confidence_score:.2f})")
    else:
        print(f"Error: {doc.error}")
```
Entity linking (or entity resolution) maps recognized mentions to entries in an external knowledge base (for example, Wikipedia). This disambiguates mentions such as “Paris” (city) vs “Paris” (person) and provides authoritative metadata (IDs and URLs).
Request example:
```json
{
  "documents": [
    { "id": "1", "language": "en", "text": "Apple launched a new iPhone." }
  ]
}
```
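A linked-entity response attaches each mention in the text to a knowledge-base entry. Here is a sketch of flattening that into a mention-to-URL map, using a hand-written response shaped like the REST entity-linking output; the confidence values are illustrative:

```python
# Hand-written response for "Apple launched a new iPhone." (illustrative values)
response = {
    "documents": [{
        "id": "1",
        "entities": [
            {
                "name": "Apple Inc.",
                "url": "https://en.wikipedia.org/wiki/Apple_Inc.",
                "dataSource": "Wikipedia",
                "matches": [{"text": "Apple", "confidenceScore": 0.49}],
            },
            {
                "name": "iPhone",
                "url": "https://en.wikipedia.org/wiki/IPhone",
                "dataSource": "Wikipedia",
                "matches": [{"text": "iPhone", "confidenceScore": 0.78}],
            },
        ],
    }]
}

def link_mentions(resp):
    """Map each matched mention to the URL of the entity it resolved to."""
    links = {}
    for doc in resp["documents"]:
        for entity in doc["entities"]:
            for match in entity["matches"]:
                links[match["text"]] = entity["url"]
    return links

print(link_mentions(response))
```

Note how the mention "Apple" resolves to the company, not the fruit: that disambiguation is exactly what entity linking adds over plain NER.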
Summarization creates concise representations of long documents. You can choose:
Extractive summarization — select the most important sentences verbatim.
Abstractive summarization — generate a rewritten, shorter summary.
Summarization is useful for overviews of documents, slide decks, and long reports. Implementation details vary by SDK version and whether the operation is long-running (poller-based). Consult the Azure docs for the exact method name and behavior in your installed package.
Input example:
```json
{
  "documents": [
    { "id": "1", "language": "en", "text": "Artificial intelligence is shaping the future. AI helps in automation, decision-making and improving efficiency in various industries." }
  ]
}
```
Illustrative extractive summary output:
```json
{
  "documents": [
    {
      "id": "1",
      "sentences": [
        { "text": "Artificial intelligence is shaping the future.", "rankScore": 0.80 },
        { "text": "AI helps in automation, decision-making, and improving efficiency in various industries.", "rankScore": 0.75 }
      ]
    }
  ]
}
```
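Downstream code typically keeps only the highest-ranked sentences. A minimal sketch, consuming a response shaped like the illustrative extractive output (the helper name is hypothetical):

```python
def top_sentences(summary_response, max_sentences=1):
    """Return the top-ranked sentences from an extractive-summarization response."""
    doc = summary_response["documents"][0]
    ranked = sorted(doc["sentences"], key=lambda s: s["rankScore"], reverse=True)
    return [s["text"] for s in ranked[:max_sentences]]

response = {
    "documents": [{
        "id": "1",
        "sentences": [
            {"text": "Artificial intelligence is shaping the future.", "rankScore": 0.80},
            {"text": "AI helps in automation, decision-making, and improving efficiency in various industries.", "rankScore": 0.75},
        ],
    }]
}

print(top_sentences(response))
# → ['Artificial intelligence is shaping the future.']
```

The rank scores let you trade summary length against coverage without re-calling the service.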
Personally Identifiable Information (PII) detection and redaction
PII detection identifies sensitive data such as names, phone numbers, emails, and Social Security numbers. After detection you can redact or mask values to help meet privacy and compliance requirements (for example, GDPR or HIPAA).
Request example:
```json
{
  "documents": [
    { "id": "1", "language": "en", "text": "Contact me at john.doe@email.com or call me at +1-555-123-4567." }
  ]
}
```
Illustrative redacted output:
```json
{
  "documents": [
    {
      "id": "1",
      "redactedText": "Contact me at *************** or call me at ***************.",
      "entities": [
        { "text": "john.doe@email.com", "category": "Email", "confidenceScore": 0.99 },
        { "text": "+1-555-123-4567", "category": "PhoneNumber", "confidenceScore": 0.98 }
      ]
    }
  ]
}
```
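The service returns redactedText for you, but you can also apply masking yourself from the detected entities, for example to enforce a different masking policy. A minimal sketch (the helper is hypothetical, not part of the SDK, and masks each entity with characters of equal length):

```python
def redact(text, entity_texts, mask_char="*"):
    """Replace each detected entity string with mask characters of equal length."""
    for entity_text in entity_texts:
        text = text.replace(entity_text, mask_char * len(entity_text))
    return text

sample = "Contact me at john.doe@email.com or call me at +1-555-123-4567."
detected = ["john.doe@email.com", "+1-555-123-4567"]
print(redact(sample, detected))
```

For production redaction, prefer the offsets and lengths the service returns over string replacement, which can accidentally mask unrelated occurrences of the same substring.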
Python SDK example — detect PII entities:
```python
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"
key = "<your-key>"

documents = [
    "My name is John Doe, and my phone number is (555) 123-4567. My SSN is 123-45-6789."
]

client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
response = client.recognize_pii_entities(documents=documents)

for idx, doc in enumerate(response):
    print(f"\nText: {documents[idx]}")
    if not doc.is_error:
        print("\nDetected PII Entities:")
        for entity in doc.entities:
            print(f"  - {entity.text} ({entity.category}, Confidence: {entity.confidence_score:.2f})")
    else:
        print(f"Error: {doc.error}")
```
For legal or compliance-sensitive scenarios, combine detection with secure redaction and follow organizational privacy controls.
Example: single Flask app that runs multiple analyses
You can combine multiple analyses in a single application, keeping each call focused and handling errors per document. The example below shows how to load credentials, initialize a client, and call several analyzers (language, key phrases, sentiment, NER, entity linking, and PII) from a Flask route. This compact pattern is suitable for demos and small apps — for production, add proper error handling, rate limiting, and secrets management.
```python
import os

from flask import Flask, request, render_template
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential
from dotenv import load_dotenv

# Load Azure credentials from environment
load_dotenv()
endpoint = os.getenv("AZURE_LANGUAGE_ENDPOINT")
key = os.getenv("AZURE_LANGUAGE_KEY")

# Initialize client
client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def index():
    result = {}
    text = ""
    if request.method == "POST":
        text = request.form.get("text", "")
        if text:
            # Language detection
            lang_resp = client.detect_language(documents=[text])
            if not lang_resp[0].is_error:
                result["language"] = {
                    "name": lang_resp[0].primary_language.name,
                    "iso": lang_resp[0].primary_language.iso6391_name,
                    "confidence": lang_resp[0].primary_language.confidence_score,
                }

            # Key phrases
            kp_resp = client.extract_key_phrases(documents=[text])
            if not kp_resp[0].is_error:
                result["key_phrases"] = kp_resp[0].key_phrases

            # Sentiment
            s_resp = client.analyze_sentiment(documents=[text])
            if not s_resp[0].is_error:
                cs = s_resp[0].confidence_scores
                result["sentiment"] = {
                    "label": s_resp[0].sentiment,
                    "scores": {"positive": cs.positive, "neutral": cs.neutral, "negative": cs.negative},
                }

            # NER
            ner_resp = client.recognize_entities(documents=[text])
            if not ner_resp[0].is_error:
                result["entities"] = [
                    {"text": e.text, "category": e.category, "confidence": e.confidence_score}
                    for e in ner_resp[0].entities
                ]

            # Entity linking
            link_resp = client.recognize_linked_entities(documents=[text])
            if not link_resp[0].is_error:
                result["linked_entities"] = [
                    {
                        "name": e.name,
                        "url": e.url,
                        "source": e.data_source,
                        "matches": [{"text": m.text, "confidence": m.confidence_score} for m in e.matches],
                    }
                    for e in link_resp[0].entities
                ]

            # PII
            pii_resp = client.recognize_pii_entities(documents=[text])
            if not pii_resp[0].is_error:
                result["pii"] = [
                    {"text": p.text, "category": p.category, "confidence": p.confidence_score}
                    for p in pii_resp[0].entities
                ]
    return render_template("index.html", text=text, result=result)

if __name__ == "__main__":
    app.run(debug=True)
```
A simple web demo built on this pattern accepts text input, runs these analyses, and displays the results (language, sentiment, key phrases, and named/PII entities).
That completes this overview of Azure AI Language Services text-analysis features. Use these tools to extract structure, meaning, and privacy-aware metadata from unstructured text across languages and domains.