Demo notebook showing LLM-driven text processing tasks such as summarization, sentiment analysis, poetic translation, and text-to-structure conversion, with examples, code snippets, and best practices.
We’re starting with a fresh Jupyter notebook to demonstrate common text-processing tasks using a chat-based LLM: summarization, bullets, sentiment analysis, translation, and converting plain text into structured formats. These small utilities show practical patterns for building reproducible prompts and integrating model outputs into downstream workflows.
Overview: tasks and examples
- Summarization: produce concise or length-limited summaries (e.g., a 500-word summary or bullet points)
- Sentiment analysis: classify text as Positive / Negative / Neutral (e.g., labeling customer reviews)
- Translation (tone-preserving): translate while keeping the poetic or rhetorical tone (e.g., a French poem rendered as English poetry)
- Format conversion: convert semi-structured text to JSON / XML / JSONL (e.g., states and capitals into structured records)
Setup
Load your API key from an environment variable and define a helper that wraps the ChatCompletion API. Note that these snippets use the legacy openai.ChatCompletion interface (openai Python library versions before 1.0); newer library versions expose the equivalent client.chat.completions.create call instead. Keep API keys out of source code and follow your organization’s secret management policies.
Store sensitive credentials (like OPENAI_API_KEY) in environment variables or a secrets manager; avoid hard-coding keys in notebooks.
# In [1]
import os

import openai

openai.api_key = os.getenv("OPENAI_API_KEY")


def get_word_completion(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Send a chat-style prompt and return the assistant's content string."""
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ]
    response = openai.ChatCompletion.create(model=model, messages=messages)
    return response.choices[0].message.content
Best practices
Separate context (source text) from instructions (the prompt). This makes prompts reusable and easier to test.
Delimit large contexts (e.g., using triple backticks) so the model can clearly distinguish input data from the instruction.
When requesting structured outputs (JSON, XML, CSV), be explicit about the required schema to minimize parsing errors.
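The context/instruction separation above can be sketched as a small helper. Here, build_prompt is a hypothetical name introduced for illustration, not part of any library:

```python
# Sketch: keep the instruction and the source text separate, and delimit the
# context with triple backticks so the model can tell them apart.

def build_prompt(instruction: str, context: str) -> str:
    """Combine an instruction with triple-backtick-delimited source text."""
    return f"{instruction}\n```{context}```"


prompt = build_prompt(
    "Summarize the content delimited by triple backticks in one sentence.",
    "Steve Jobs shared three stories at Stanford in 2005.",
)
print(prompt)
```

Because the instruction and context are separate arguments, the same instruction can be reused and tested against many different source texts.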
Summarization
Keep the source text and the instruction separate. Here’s an excerpt from Steve Jobs’ 2005 Stanford commencement address. We ask for a 500-word summary, then show how to request a bullet-point summary for scannability.
# In [2]: Context (excerpt)
context = '''Steve Jobs' 2005 Stanford Commencement Address

I am honored to be with you today at your commencement from one of the finest universities in the world. I never graduated from college.

The first story is about connecting the dots.

I dropped out of Reed College after the first 6 months, but then stayed around as a drop-in for another 18 months or so'''
500-word summary prompt and invocation:
# In [3]: Prompt asking for a 500-word summary
prompt = f"""Create a summary capturing the main points and key details in 500 words based on the content delimited by triple backticks.
```{context}```"""
response = get_word_completion(prompt)
print(response)
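Models often approximate rather than hit an exact word target, so it can help to verify the length before using a summary downstream. A minimal sketch, with a placeholder string standing in for the model's response:

```python
# Sketch: check whether a summary is close to the requested 500-word target.
# `summary` is a placeholder here; in the notebook it would be the value
# returned by get_word_completion.

summary = "word " * 480  # placeholder for the model's response

n_words = len(summary.split())
if 450 <= n_words <= 550:
    print(f"Summary length OK: {n_words} words.")
else:
    print(f"Summary has {n_words} words; consider re-prompting with a stricter limit.")
```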
Bullet summary (quick, scannable):
# In [4]: Bullet summary prompt
prompt_bullets = f"""Create a summary capturing the main points and key details as bullets based on the content delimited by triple backticks.
```{context}```"""
response_bullets = get_word_completion(prompt_bullets)
print(response_bullets)
Sample bullet-form output (example):
- Steve Jobs delivered a commencement address at Stanford University in 2005 and shared three stories from his life.
- First story: connecting the dots — dropping out led him to learn calligraphy, which later influenced Macintosh design.
- Second story: love and loss — getting fired from Apple enabled him to start anew (NeXT, Pixar) and eventually return.
- Third story: death — facing mortality focused his priorities; follow your intuition and live authentically.
- Closing advice: “Stay Hungry. Stay Foolish.” — remain curious and brave in pursuing your work.
Sentiment analysis
Use the same structure: pass the text as context, then instruct the model how to label each item. This pattern is useful for generating labeled datasets for downstream model training or analysis.
# In [5]: Sentiment analysis context
context_reviews = '''1. If you sometimes like to go to the movies to have fun, Wasabi is a good place to start.
2. An idealistic love story that brings out the latent 15-year-old romantic in everyone.
3. The story loses its bite in a last-minute happy ending that's even less plausible than the rest of the picture.'''

prompt_sentiment = f"""Analyze the sentiment of the reviews delimited in triple backticks.
First show the actual review and then add the sentiment - Positive, Negative, or Neutral.
```{context_reviews}```"""
response_sentiment = get_word_completion(prompt_sentiment)
print(response_sentiment)
Expected output example:
If you sometimes like to go to the movies to have fun, Wasabi is a good place to start.
Sentiment: Positive
An idealistic love story that brings out the latent 15-year-old romantic in everyone.
Sentiment: Positive
The story loses its bite in a last-minute happy ending that’s even less plausible than the rest of the picture.
Sentiment: Negative
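To use these labels downstream, the model's free-text output has to be parsed back into structured pairs. A minimal sketch, assuming the exact "review line followed by a Sentiment: line" format the prompt requests; parse_sentiment_output is a hypothetical helper introduced for illustration:

```python
# Sketch: parse "review / Sentiment: <label>" output into (review, label) pairs.

def parse_sentiment_output(text: str) -> list[tuple[str, str]]:
    pairs = []
    current_review = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Sentiment:"):
            if current_review is not None:
                pairs.append((current_review, line.split(":", 1)[1].strip()))
                current_review = None
        else:
            current_review = line
    return pairs


sample = """If you sometimes like to go to the movies to have fun, Wasabi is a good place to start.
Sentiment: Positive
The story loses its bite in a last-minute happy ending.
Sentiment: Negative"""
print(parse_sentiment_output(sample))
```

Parsers like this are brittle by nature; if the model deviates from the requested format, the pairs will be wrong, which is another reason to be explicit about output structure in the prompt.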
Note: LLMs can be used to generate labeled data (for example, labeling customer reviews or social media posts) which you can then use for downstream analysis or to train supervised models.

Translation (poetic translation)
LLMs can translate and preserve tone. Provide the poem as context and request a tone-preserving English rendering.
# In [6]: Poem translation context
context_poem = """Demain, dès l'aube, à l'heure où blanchit la campagne,
Je partirai. Vois-tu, je sais que tu m'attends.
J'irai par la forêt, j'irai par la montagne.
Je ne puis demeurer loin de toi plus longtemps.

Je marcherai les yeux fixés sur mes pensées,
Sans rien voir au dehors, sans entendre aucun bruit,
Seul, inconnu, le dos courbé, les mains croisées,
Triste, et le jour pour moi sera comme la nuit.

Je ne regarderai ni l'or du soir qui tombe,
Ni les voiles au loin descendant vers Harfleur,
Et quand j'arriverai, je mettrai sur ta tombe
Un bouquet de houx vert et de bruyère en fleur."""

prompt_translate = f"""Write an English poem based on the French poem delimited in triple backticks.
```{context_poem}```"""
response_translate = get_word_completion(prompt_translate)
print(response_translate)
Sample poetic translation (example):
Tomorrow, at dawn’s early light,
I shall depart, for I know you await.
Through forest and mountain, I’ll take flight,
For I cannot bear this distance, this weight.

With eyes fixed on my thoughts, I’ll tread,
Unseeing of the world, deaf to its sound.
Alone, unknown, stooped with a heavy head,
Gloomy, for me, day will be night unbound.

I’ll not gaze upon the evening’s golden hue,
Nor watch distant sails descend to Harfleur.
And when I arrive, a bouquet I’ll bestrew,
Of green holly and blooming heather pure.

And there, upon your grave, my tribute laid,
I’ll feel your presence, though you’ve been away.

Format conversion (plain text → JSON / XML / JSONL)
Convert semi-structured plain text into structured formats for ingestion into pipelines and databases. Be explicit about the desired output schema (keys, types) to reduce ambiguity.
# In [7]: States and capitals context
context_states = """1. Alabama - Montgomery
2. California - Sacramento
3. Florida - Tallahassee
4. Georgia - Atlanta
5. Illinois - Springfield
6. Massachusetts - Boston
7. New York - Albany
8. Texas - Austin
9. Pennsylvania - Harrisburg
10. Washington - Olympia"""

prompt_formats = f"""From the content delimited in triple backticks, format it in JSON, XML, and JSONL.
```{context_states}```"""
response_formats = get_word_completion(prompt_formats)
print(response_formats)
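As a sketch of how a downstream pipeline might produce and consume the JSONL variant of this output, using only Python's standard json module:

```python
import json

# Sketch: JSONL means one JSON object per line. Serialize a couple of
# state/capital records to JSONL, then parse them back line by line, as a
# streaming consumer would.

records = [
    {"state": "Alabama", "capital": "Montgomery"},
    {"state": "California", "capital": "Sacramento"},
]

jsonl_text = "\n".join(json.dumps(r) for r in records)
print(jsonl_text)

parsed = [json.loads(line) for line in jsonl_text.splitlines()]
assert parsed == records
```

Because each line is independently valid JSON, a consumer can process a JSONL file record by record without loading the whole file into memory.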
Note: JSONL (also called “newline-delimited JSON” or “NDJSON”) is not the same as JSON-LD. In JSONL, each line is a separate valid JSON object, which is convenient for streaming and line-by-line processing.

Summary
What we covered:
Summarization: fixed-length summaries and bullet-style output, emphasizing separation of context and prompt.
Sentiment analysis: labeling text as Positive / Negative / Neutral for downstream use.
Translation: preserving tone (poetic translation example).
Format conversion: converting semi-structured text into JSON, XML, and JSONL for pipelines.