Guide to building a recruiter multi-agent system that extracts job keywords, scans PDF resumes, transcribes interviews, analyzes alignment, and orchestrates tools into a consolidated report
Welcome back. In this lesson we’ll build a practical multi-agent system that helps a recruiter automate screening and interview analysis.

What is a multi-agent system?

A multi-agent system is composed of multiple specialized agents (or tools), each with a narrow role. Agents coordinate by passing tasks or data downstream, while a coordinator (or orchestrator) agent controls the overall workflow and composes a final result.

Project overview

We’ll assemble a recruiter-focused system that does the following:
Extract relevant skills and responsibilities from a job description.
Scan local PDF resumes for matches to those skills.
Transcribe an interview audio file and analyze whether the interview questions align with the job posting.
Produce a consolidated report with extracted keywords, resume matches, and interview relevance feedback.
This guide contains the end-to-end implementation and an example runner to execute the workflow.
Ensure your environment variables are configured (for example via a .env file). Set your OpenAI API key at a minimum. Also update RESUME_DIR and INTERVIEW_AUDIO_PATH to match your local filesystem.
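A minimal sketch of validating that configuration up front, using only the standard library. The helper name load_settings, the idea of reading RESUME_DIR and INTERVIEW_AUDIO_PATH from environment variables, and the default paths are assumptions for illustration; the lesson itself may define these as plain Python constants.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read and validate the settings the tools rely on (hypothetical helper)."""
    env = os.environ if env is None else env

    # The OpenAI client library reads OPENAI_API_KEY from the environment.
    if not env.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file or shell")

    # Assumed variable names and defaults; adjust to your own layout.
    resume_dir = Path(env.get("RESUME_DIR", "./resumes")).expanduser()
    audio_path = Path(env.get("INTERVIEW_AUDIO_PATH", "./audio_interview.mp3")).expanduser()

    return {"resume_dir": resume_dir, "interview_audio_path": audio_path}
```

Failing early with a clear message is usually easier to debug than an authentication error raised in the middle of an agent run.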
This tool opens each PDF in RESUME_DIR, searches for each keyword (case-insensitive), and returns matches containing filename, keyword, surrounding snippet, and page number.
@function_tool(name_override="scan_resumes_for_keywords")
def scan_resumes_for_keywords(keywords: list[str]) -> list[dict]:
    """
    Scan all PDF resumes in RESUME_DIR for keyword occurrences.

    Returns a list of dicts:
    [
        {
            "filename": "resume.pdf",
            "keyword": "Python",
            "match_snippet": "...context around the match...",
            "page": 2
        },
        ...
    ]
    """
    results: list[dict] = []
    for file in RESUME_DIR.glob("*.pdf"):
        try:
            doc = fitz.open(str(file))
        except Exception:
            # Skip files that can't be opened
            continue
        for page in doc:
            text = page.get_text() or ""
            lowered = text.lower()
            for kw in keywords:
                # Record the first occurrence of each keyword on this page
                idx = lowered.find(kw.lower())
                if idx >= 0:
                    start = max(0, idx - 75)
                    snippet = text[start:start + 250].strip()
                    results.append({
                        "filename": file.name,
                        "keyword": kw,
                        "match_snippet": snippet,
                        "page": page.number + 1
                    })
        doc.close()
    return results
Best practices:
Normalize keywords before searching to improve match quality.
Consider using more advanced NLP (lemmatization, fuzzy matching) for improved recall.
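As one possible way to act on these two suggestions, the sketch below normalizes text and falls back to approximate matching with the standard library's difflib. The function names and the 0.85 threshold are illustrative assumptions, not part of the original tool, and difflib is a lightweight stand-in for a full NLP pipeline with lemmatization.

```python
import difflib
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, apply Unicode NFKC normalization, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", text).strip()


def fuzzy_contains(text: str, keyword: str, threshold: float = 0.85) -> bool:
    """Return True if keyword, or a close variant of it, appears in text."""
    norm_text, norm_kw = normalize(text), normalize(keyword)
    if not norm_kw:
        return False
    if norm_kw in norm_text:
        return True
    # Compare the keyword against every window with the same word count.
    words = norm_text.split()
    size = len(norm_kw.split())
    for i in range(len(words) - size + 1):
        window = " ".join(words[i:i + size])
        if difflib.SequenceMatcher(None, norm_kw, window).ratio() >= threshold:
            return True
    return False
```

With these settings a spelling variant such as "containerisation" matches the keyword "containerization", while unrelated words do not. A lower threshold increases recall at the cost of more false positives, so it is worth tuning against a few real resumes.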
Use the LLM to extract 10–15 focused skills, tools, and responsibilities. Provide a clear system instruction and parse the model output into a clean list.
@function_tool(name_override="extract_keywords_from_job_description")
def extract_keywords_from_job_description(job_text: str) -> list[str]:
    """
    Use the LLM to extract 10-15 key skills/tools/responsibilities from the job_text.
    Returns a list of keywords.
    """
    # openai>=1.0 removed openai.ChatCompletion; use chat.completions instead
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": (
                    "Extract 10–15 key skills, tools, and responsibilities from this job description. "
                    "Return one item per line with no extra commentary."
                )
            },
            {"role": "user", "content": job_text},
        ],
        temperature=0.3
    )
    response_text = response.choices[0].message.content or ""
    lines = response_text.splitlines()
    # Strip bullets, numbering, and any leading/trailing whitespace.
    # Caveat: this also trims digits at either end of a keyword (e.g. "S3" becomes "S").
    keywords = [line.strip(" -•*0123456789.").strip() for line in lines]
    # Drop blank lines and lines that contained only bullet characters
    return [kw for kw in keywords if kw]
Tip: If the LLM returns multi-word phrases, keep them as-is (e.g., REST APIs, containerization) to preserve context for resume scanning.
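A hedged sketch of a line parser that keeps multi-word phrases intact and removes only a leading bullet or list number, so keywords that contain digits (for example "AWS S3") survive. The name parse_keyword_lines and the de-duplication step are assumptions added for illustration.

```python
import re

# Matches one leading bullet ("-", "•", "*") or list number ("1.", "2)") followed by spaces.
_LIST_PREFIX = re.compile(r"^\s*(?:[-•*]+|\d+[.)])\s+")


def parse_keyword_lines(response_text: str) -> list[str]:
    """Turn a bulleted or numbered LLM reply into a de-duplicated keyword list."""
    keywords: list[str] = []
    seen: set[str] = set()
    for line in response_text.splitlines():
        cleaned = _LIST_PREFIX.sub("", line).strip()
        key = cleaned.lower()
        # Skip blank lines and case-insensitive repeats.
        if cleaned and key not in seen:
            seen.add(key)
            keywords.append(cleaned)
    return keywords
```

Because the pattern is anchored at the start of the line and requires a separator after the number, an entry such as "3D modeling" is left untouched.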
Transcribe interviews using OpenAI’s speech-to-text model. This function returns the transcription text extracted from the audio file.
@function_tool(name_override="transcribe_interview")
def transcribe_interview(file_path: str) -> str:
    """
    Transcribe an audio file (wav, mp3, etc.) using OpenAI's Whisper model.
    Returns the transcript text.
    """
    with open(file_path, "rb") as audio_file:
        # openai>=1.0 removed openai.Audio.transcribe; use audio.transcriptions.create
        transcript = openai.audio.transcriptions.create(model="whisper-1", file=audio_file)
    # The response object exposes the transcript in its 'text' attribute
    return (transcript.text or "").strip()
Note: Transcription quality depends on audio clarity, sampling rate, and accents. Preprocessing (noise reduction, splitting long files) can improve results.
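For the "splitting long files" step, one simple approach is to plan overlapping time windows first and then cut the audio with whatever tool you prefer. The sketch below covers only the planning arithmetic; the function name, the 10-minute chunk length, and the 5-second overlap are illustrative assumptions, and the actual cutting of the audio is not shown.

```python
def plan_chunks(duration_s: float, max_chunk_s: float = 600.0,
                overlap_s: float = 5.0) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds that together cover the recording."""
    if duration_s <= 0:
        return []
    if not 0 <= overlap_s < max_chunk_s:
        raise ValueError("overlap_s must be non-negative and smaller than max_chunk_s")
    chunks: list[tuple[float, float]] = []
    start = 0.0
    while True:
        end = min(start + max_chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            break
        # Step back slightly so words cut at a boundary appear in both chunks.
        start = end - overlap_s
    return chunks
```

For a 25-minute recording (1500 seconds) with the defaults, this yields three windows: 0 to 600, 595 to 1195, and 1190 to 1500. The overlap means a few words may be transcribed twice, so the joined transcript may need light de-duplication at the seams.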
Compare the transcript against the job description and return a human-readable assessment that highlights areas that were strong, missing, or overemphasized, plus actionable suggestions.
@function_tool(name_override="analyze_interview_relevance")
def analyze_interview_relevance(interview_text: str, job_description: str) -> str:
    """
    Using the LLM, evaluate how well the interview questions align with the job description.
    Return a detailed, human-readable assessment.
    """
    system_msg = (
        "You are an HR assistant. Evaluate how well the interview questions align with the job description. "
        "Be specific and helpful. Mention which areas were strong, which were missing, and provide actionable suggestions."
    )
    # openai>=1.0 removed openai.ChatCompletion; use chat.completions instead
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_msg},
            {
                "role": "user",
                "content": f"Interview:\n\n{interview_text}\n\nJob Description:\n\n{job_description}"
            }
        ],
        temperature=0.4
    )
    return (response.choices[0].message.content or "").strip()
Suggestion: For more structured outputs, ask the LLM to return a JSON object with keys like strengths, gaps, and recommendations, then parse it programmatically.
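A minimal sketch of the parsing side of that suggestion, assuming the model was asked to reply with a JSON object whose strengths, gaps, and recommendations values are lists of strings. The helper name parse_assessment and the "raw" fallback key are hypothetical.

```python
import json

EXPECTED_KEYS = ("strengths", "gaps", "recommendations")


def parse_assessment(raw: str) -> dict:
    """Parse the model's JSON reply; on failure, return empty lists plus the raw text."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Keep the unparsed reply so nothing is lost.
        return {"strengths": [], "gaps": [], "recommendations": [], "raw": raw}
    result: dict = {}
    for key in EXPECTED_KEYS:
        value = data.get(key, [])
        # Accept a single string as a one-item list.
        if isinstance(value, str):
            value = [value]
        result[key] = [str(item) for item in value] if isinstance(value, list) else []
    return result
```

Models sometimes wrap JSON in Markdown code fences or add a sentence before it; this sketch treats such replies as unparseable and returns them under "raw", so the caller can decide whether to retry or show the text as-is.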
Now compose the tools into a coordinator Agent that orchestrates the full workflow. The agent pulls together keyword extraction, resume scanning, transcription, and interview analysis, and returns a consolidated report.
recruiter_agent = Agent(
    name="Ai Recruiter Assistant",
    instructions="""You are helping a recruiter. Workflow:
1) Extract keywords from the job description.
2) Scan local resumes for keyword matches.
3) Transcribe the interview audio file.
4) Analyze how well the interview questions align with the job description.

Return a single consolidated report containing:
- Extracted keywords
- Resume keyword matches (filename, keyword, snippet, page)
- Interview transcript summary and alignment feedback

Be concise but thorough; include actionable suggestions where appropriate.""",
    tools=[
        extract_keywords_from_job_description,
        scan_resumes_for_keywords,
        transcribe_interview,
        analyze_interview_relevance
    ],
    model="gpt-4",
    model_settings=ModelSettings(truncation="auto")
)
Design note: Keeping each @function_tool narrow and focused makes it easy to test, reuse, and replace components (for example, swapping Whisper for another transcription service).
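One way to make a step swappable is to pass the backend in as a plain function. The sketch below is an illustration under assumed names (Transcriber, fake_transcriber, build_interview_summary), not the lesson's implementation; it shows how a canned transcriber can stand in for a real service during tests.

```python
from typing import Callable

# Any function that maps an audio file path to transcript text.
Transcriber = Callable[[str], str]


def fake_transcriber(file_path: str) -> str:
    """Stand-in used in tests; returns canned text instead of calling an API."""
    return f"[transcript of {file_path}]"


def build_interview_summary(file_path: str, transcribe: Transcriber) -> dict:
    """Run the transcription step through whichever backend is passed in."""
    text = transcribe(file_path)
    return {"source": file_path, "transcript": text, "word_count": len(text.split())}
```

Calling build_interview_summary("a.mp3", fake_transcriber) exercises the surrounding logic without network access; replacing the second argument with a function that calls a real transcription service requires no other changes.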
Create the job description and set the interview audio path. Update paths and job text to match your use case.
JOB_DESCRIPTION = """We're hiring a full-stack engineer with experience in React, Python, REST APIs, and deployment on cloud platforms like AWS or GCP.
The role involves building scalable services, collaborating with product and design, and occasionally supporting data engineering tasks.
Experience with containerization, CI/CD, and monitoring is a plus."""

INTERVIEW_AUDIO_PATH = "/Users/gavinridgeway/Documents/Anaconda/AiAgent/Resume/audio_interview.MP3"

prompt = f"""Please process this job description:

{JOB_DESCRIPTION}

Then scan local resumes for matches using scan_resumes_for_keywords. Finally, transcribe the audio file at:
{INTERVIEW_AUDIO_PATH}
and analyze whether the interview questions align with the job description."""
The Runner interface is asynchronous. Use an async entrypoint to execute the agent and print the final report. Modify this to fit your runtime or Runner API if necessary.
async def main():
    result = await Runner.run(recruiter_agent, input=prompt)
    # The Runner returns a structured result; print the final output from the agent.
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
When the run completes, the report should include:
A list of extracted keywords from the job description (10–15 items).
Resume matches found in your PDF files, each with filename, keyword, snippet, and page number.
A transcript of the interview audio.
A detailed analysis explaining which interview questions aligned with the job description and which areas were under- or over-emphasized, including actionable suggestions.
Example scenario: The system might identify candidates matching “React” and “REST APIs” while noting the interview focused heavily on data-analysis topics (SQL, Excel), indicating a misalignment with the software engineering role.
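To make the resume matches easier to read in a report, the list of match dicts can be grouped per file and ordered by how many distinct keywords each resume hit. This is a sketch under the assumption that matches use the filename and keyword fields described for the scanning tool; the name rank_resumes is hypothetical.

```python
from collections import defaultdict


def rank_resumes(matches: list[dict]) -> list[tuple[str, list[str]]]:
    """Group matches by filename and sort by number of distinct keywords, descending."""
    found: dict[str, set[str]] = defaultdict(set)
    for match in matches:
        found[match["filename"]].add(match["keyword"])
    # Ties are broken alphabetically by filename for a stable order.
    ranked = sorted(found.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(filename, sorted(keywords)) for filename, keywords in ranked]
```

A keyword that appears on several pages of the same resume counts once, so the ranking reflects breadth of coverage rather than repetition.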
Be mindful of API usage and costs. Transcribing long audio files and making multiple LLM calls can incur charges, so batch and rate-limit requests where possible. Also ensure you have consent and comply with relevant privacy requirements when processing candidate data.
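A minimal sketch of client-side rate limiting: a throttle that enforces a minimum gap between successive API calls. The class name and the injectable clock and sleep parameters are assumptions made for illustration and testability; real provider rate limits vary by account and model, so the interval should be chosen from your own quota.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Block so that successive calls are at least min_interval_s apart."""

    def __init__(self, min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Sleep if needed; return the number of seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval_s - (now - self._last_call))
        if delay > 0:
            self._sleep(delay)
        # Record when this call is allowed to proceed.
        self._last_call = now + delay
        return delay
```

Calling throttle.wait() immediately before each request spaces calls out without changing the tools themselves.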