Journal Summary
Generated by ChatGPT
The project officially began with a deliberate decision to consolidate effort into a single integrated medical pharmacy initiative combining architecture, code, presentation material, and operational narrative. The user organized the effort into six chapters covering business capabilities, business assessment findings, value creation levers, agentic architecture, a deep dive into Auth-to-Claim Match (ATCM), and future considerations. Drawing on prior healthcare consulting experience, the user narrowed focus to the ATCM process because it represented a high-friction operational reconciliation layer involving prior authorizations, claims adjudication, manual review, and complex client-specific rules. The day was heavily grounded in business decomposition, operational pain-point analysis, and domain mapping across utilization management, claims integrity, and clinical governance workflows.
From a technical perspective, the user generated synthetic healthcare datasets and implemented a deterministic Python matching engine capable of identifying clean matches, quantity mismatches, date mismatches, provider mismatches, and missing authorizations. The user then designed a layered agentic architecture separating deterministic systems from reasoning systems, concluding that LLMs should not perform authoritative matching logic but instead provide interpretation and operational guidance. This led to the successful implementation of an initial CrewAI-based “Exception Explanation Agent” that translated deterministic outputs into reviewer-friendly narratives and routed claims to specialist investigative paths such as authorization discovery and date/window validation.
Starting Point
This is the first post of many which will focus on business domain exploration (healthcare initially) and the design and build of AI-enabled solutions. To start, I decided to learn more about the medical pharmacy domain given my professional experience with a specific client.
I started the day by getting my development environment set up using Cursor/Code. I’ve also decided to keep track of my progress using this engineering (build) journal, as it will help me think more clearly about what problem(s) I’m solving, where I’m struggling, and what I need to do to move forward.
The business analysis that I led in 2023 spanned a wide range of business areas and capabilities:
- Delegated Prior Authorization
- Non-Delegated Prior Authorization
- Specialty Clinical Programs
- Physician Clinical Review
- Expert Clinical Network
- Site of Service
- Dosage Optimization and Drug Wastage
- Oncology
- Clinical Policy Development
- Clinical Algorithm Development
- Medical Necessity Criteria Catalog
- Underwriting
- Post-Service Claims Edits
- Auth to Claim Match
- Patient Specific Edits
These can largely be organized into these 5 business domains:
- Utilization Management
- Clinical Decisioning & Governance
- Specialty & High Cost Therapy Management
- Claims Integrity & Revenue Protection
- Financial Risk & Underwriting
After doing some research, I will focus my attention on the Auth-to-Claim Match program which falls within Utilization Management. My goal is to get something working quickly and then expand and adapt as necessary; I’ll only introduce additional complexity when it is warranted.
Here’s a high-level overview:
In a Medical Pharmacy context, “Auth to Claim Match” refers to a process that involves verifying the authorization status of a healthcare service or prescription medication against the corresponding insurance or payer claim.
Essentially, any agentic solution will fall between the prior authorization (pre-service object) and the claim (post-service reality). This reconciliation layer is what I need to build.
The goal is to pay, deny, or flag the claim in a way that is consistent with the authorization, clinical rules, client configuration, and patient-specific constraints.
Admittedly, that’s a lot to consider. So, let’s start piece-by-piece. A simple way of understanding the Auth-to-Claim process is the following:
Step #1. Collect authorizations and claims from source systems.
Step #2. Use pre-defined rules (a deterministic approach) to attempt a match.
- If there is a match, then pay the claim.
- If there isn’t a match, then move the claim to a work queue for manual review.
Step #3. Perform a manual review to adjust, deny, or reprocess the claim.
Step #4. Generate reports to measure performance.
Data Sources
Since our process starts with data, let’s take inventory of all applicable data sources even if we don’t need them all right now:
Core Data Sources
- Prior Authorization System: Source of truth for what was approved
- Claims Adjudication System: Source of truth for what was billed
- Member / Eligibility System: Identity + coverage validation
- Provider Data Systems: Normalize provider identity across systems
Supporting Data Sources
- Drug Reference Data: Normalize drug representation
- Clinical Policy / Authorization Rules: Interpret allowed variance
- Contract / Benefit Configuration: Client-specific rules
Operational / Reconciliation Data
- Claims Edit / Exception Logs: Historical reconciliation outcomes
- Audit & Case Management System: Human review traceability
Cross-System Enablers
- Master Data / Identity Resolution Layer: Link everything together
- Temporal / Event Data: Sequence validation
As a starting point, I’ve generated the following 5 synthetic datasets:
- Prior Authorization System — authorizations.csv
- Claims Adjudication System — claims.csv
- Drug Reference - drug_reference.csv
- Member / Eligibility System — member_eligibility.csv
- Provider Data System — providers.csv
Using these exported CSVs, I built a deterministic matching engine in Python (“Auth-to-Claim Matching.ipynb”) that applies basic rules to handle the following 5 use cases:
- Use Case #1 (Clean match): member, provider, drug, quantity, date, and site align
- Use Case #2 (Quantity mismatch): claim billed 3 vials, auth approved 2
- Use Case #3 (Date mismatch): service date after auth end date
- Use Case #4 (Provider mismatch): provider not found / non-normalized
- Use Case #5 (Missing authorization): no matching authorization found for member/drug/date
This deterministic engine identifies any potential mismatches. I don’t use an LLM for this step because the matching result must be reproducible and auditable. However, an LLM can add value afterward by generating a narrative that explains the results of the matching process.
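The five use cases above can be sketched as a small rule function. This is a minimal illustration, not the code in the notebook: the field names (`member_id`, `drug_code`, `provider_id`, `quantity`, `service_date`, `start_date`, `end_date`), the status labels, and the order in which the rules fire are all assumptions, and the site-of-service check is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Auth:
    auth_id: str
    member_id: str
    drug_code: str
    provider_id: str
    quantity: int
    start_date: date
    end_date: date


@dataclass
class Claim:
    claim_id: str
    member_id: str
    drug_code: str
    provider_id: str
    quantity: int
    service_date: date


def match_claim(claim: Claim, auths: List[Auth]) -> str:
    """Return one of five outcomes for a claim (hypothetical rule order)."""
    # Candidate authorizations share the claim's member and drug.
    candidates = [a for a in auths
                  if a.member_id == claim.member_id
                  and a.drug_code == claim.drug_code]
    if not candidates:
        return "MISSING_AUTHORIZATION"
    auth = candidates[0]  # simplification: only the first candidate is checked
    if not (auth.start_date <= claim.service_date <= auth.end_date):
        return "DATE_MISMATCH"
    if claim.provider_id != auth.provider_id:
        return "PROVIDER_MISMATCH"
    if claim.quantity > auth.quantity:
        return "QUANTITY_MISMATCH"
    return "MATCH"


# Hypothetical records: 3 vials billed against 2 authorized.
auths = [Auth("AUTH-1", "M1", "DRUG-A", "P1", 2,
              date(2026, 1, 1), date(2026, 2, 28))]
claim = Claim("CLM-1", "M1", "DRUG-A", "P1", 3, date(2026, 2, 1))
print(match_claim(claim, auths))  # QUANTITY_MISMATCH
```

Because every branch is a plain comparison, the same inputs always produce the same status, which is the property that makes this step auditable.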
NLP Experimentation
Before creating an AI-enabled application to do this, I fed a sample dataset to ChatGPT along with the following prompt:
The attached dataset contains output from an “auth-to-claim” matching application that indicates whether there was a match, and if not, what was the exception. Please generate natural language guidance for each record based on the data provided.
Here is the result:
(Clean Match) No action required. Claim aligns with authorization across member, provider, drug, quantity, and service date.
(Quantity) Verify billed quantity against authorized amount. Check for unit conversion issues (e.g., mg vs vial) or dosing differences. Adjust claim or request auth update if clinically appropriate.
(Date) Validate service date falls within authorized date range. If service occurred outside the window, request retro authorization or correct claim date if entered incorrectly.
(Provider) Confirm rendering/billing provider matches authorized provider. If different, validate delegation rules or update authorization to reflect servicing provider.
(No match) No matching authorization found. Confirm if prior authorization was required. If yes, locate or create authorization; otherwise review policy for exception or denial handling.
Concisely, we have the following logic:
- MATCH: No action.
- Quantity Mismatch: Validate units/dose; reconcile or update auth.
- Date Mismatch: Check service date vs. auth window; correct or request retro auth.
- Provider Mismatch: Align rendering/billing provider with auth or update auth.
- Missing Authorization: Confirm PA requirement; locate/create auth or route to denial/exception workflow.
This is essentially the information that would appear in a work queue for manual processing, but it is too generic to act on. We need to know what to do about each determination (what happens next?), and for that we need additional guidance.
Even when a particular claim and authorization “match”, there are baseline validation and optimization opportunities that we can take advantage of.
For example:
- We could perform quality assurance checks post-adjudication which would validate the allowed amount, contract alignment, and downstream edits; thus, catching silent leakage.
- We could identify high-frequency approvals for potential “gold-carding” or rule simplification.
- Note: Gold carding is selective exemption from prior authorizations based on historical performance. In essence, the payer trusts the provider’s clinical and documentation behavior, which permits them to submit claims without obtaining prior authorization for certain services/drugs.
- We could process the claim immediately with periodic audit sampling to ensure integrity.
And we can leverage agents to realize these select opportunities:
| Capability | Agent Action | Required Data Inputs |
|---|---|---|
| Straight-through processing | Auto-close claim and mark as fully reconciled | claim record, authorization record, match flags |
| Post-pay audit sampling | Select claim for QA review based on thresholds | claim data, contract rules, audit policies, historical error rates |
| Gold-carding identification | Flag provider/member patterns for reduced future auth friction | historical claims/auth matches, provider performance metrics |
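As a rough illustration of the gold-carding row, the sketch below flags providers whose historical claims almost always matched their authorizations. The input shape (pairs of provider ID and a clean-match flag) and both thresholds are hypothetical; a real program would set them by policy and likely consider more signals.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def gold_card_candidates(history: Iterable[Tuple[str, bool]],
                         min_claims: int = 20,
                         min_match_rate: float = 0.95) -> List[str]:
    """Return provider IDs with enough volume and a high clean-match rate."""
    totals = defaultdict(int)  # claims seen per provider
    clean = defaultdict(int)   # clean matches per provider
    for provider_id, was_clean_match in history:
        totals[provider_id] += 1
        if was_clean_match:
            clean[provider_id] += 1
    return sorted(p for p, n in totals.items()
                  if n >= min_claims and clean[p] / n >= min_match_rate)


# Hypothetical history: P1 qualifies, P2 has too few claims, P3 matches too rarely.
history = ([("P1", True)] * 20 + [("P2", True)] * 10
           + [("P3", True)] * 15 + [("P3", False)] * 5)
print(gold_card_candidates(history))  # ['P1']
```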
This is valuable information and it gives me an indication of what I can start building. But first let’s examine sample guidance (and agent recommendations) for the remaining 4 exception categories - i.e., what should happen next?
QUANTITY Mismatch → Clinical + normalization reconciliation
- Unit normalization → Convert billed units (e.g., mg ↔ vial) and re-evaluate match
- → Inputs: claim quantity + units, auth quantity + units, drug packaging reference data
- Clinical dosing validation → Compare billed dose against protocol (weight-based/regimen)
- → Inputs: drug reference data, clinical protocols, member weight/diagnosis, billed quantity
- Authorization limit check → Validate against approved quantity/frequency/cumulative caps
- → Inputs: authorization limits, benefit rules, claim quantity, prior utilization history
- Correction recommendation → Suggest claim adjustment vs. auth update
- → Inputs: results from normalization + dosing + limits, payer rules
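The unit normalization step can be sketched as a conversion from billed milligrams to whole vials before re-running the quantity comparison. The function names and the round-up rule (a partially used vial counts as a full vial) are assumptions for illustration; actual wastage and rounding policy would come from the drug reference data and payer rules.

```python
def billed_vials(billed_mg: int, mg_per_vial: int) -> int:
    """Convert a billed amount in mg to whole vials, rounding up."""
    if mg_per_vial <= 0:
        raise ValueError("mg_per_vial must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-billed_mg // mg_per_vial)


def quantity_within_auth(billed_mg: int, mg_per_vial: int,
                         authorized_vials: int) -> bool:
    """Re-evaluate the quantity match after normalizing units."""
    return billed_vials(billed_mg, mg_per_vial) <= authorized_vials


# Hypothetical drug packaged as 100 mg vials, with 2 vials authorized.
print(billed_vials(250, 100))             # 3
print(quantity_within_auth(200, 100, 2))  # True
print(quantity_within_auth(250, 100, 2))  # False
```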
DATE Mismatch → Temporal validation + policy routing
- Auth window validation → Check if service date falls within effective/expiration dates
- → Inputs: claim service date, auth effective/expiration dates
- Grace period evaluation → Apply payer-specific buffer windows
- → Inputs: payer policy rules, claim date, auth dates
- Treatment cadence validation → Ensure service aligns with expected therapy intervals
- → Inputs: clinical protocol schedules, prior claim dates, drug/regimen data
- Retro authorization routing → Determine eligibility and trigger retro workflow
- → Inputs: payer retro auth policy, claim date, submission lag timestamps
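The auth window and grace period checks can be sketched together as a single classification function. The labels and the choice to apply the grace period only after the authorization end date are assumptions; payer-specific policy could just as well allow a buffer before the start date.

```python
from datetime import date, timedelta


def classify_service_date(service_date: date, auth_start: date,
                          auth_end: date, grace_days: int = 0) -> str:
    """Place a service date relative to the auth window (hypothetical labels)."""
    if auth_start <= service_date <= auth_end:
        return "WITHIN_WINDOW"
    if service_date < auth_start:
        return "BEFORE_WINDOW"
    # Past the end date: check the payer-specific buffer.
    if service_date <= auth_end + timedelta(days=grace_days):
        return "WITHIN_GRACE"
    return "AFTER_WINDOW"


# Hypothetical window ending 2026-02-28 with a service date of 2026-03-05.
start, end = date(2026, 1, 1), date(2026, 2, 28)
print(classify_service_date(date(2026, 3, 5), start, end, grace_days=7))  # WITHIN_GRACE
print(classify_service_date(date(2026, 3, 5), start, end, grace_days=3))  # AFTER_WINDOW
```

A result of `AFTER_WINDOW` is the case that would feed the retro authorization routing step.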
PROVIDER Mismatch → Identity + network resolution
- Provider normalization → Resolve NPI/group mismatches (billing vs rendering)
- → Inputs: claim NPIs, auth provider IDs, provider master data
- Delegation rule evaluation → Check if alternate provider is allowed
- → Inputs: payer delegation rules, provider network data, auth constraints
- Facility vs rendering validation → Align site-of-service vs individual provider authorization
- → Inputs: site-of-service, facility IDs, auth scope, claim provider roles
- Correction recommendation → Suggest updating claim or authorization
- → Inputs: normalized provider mapping, policy constraints, mismatch classification
MISSING AUTHORIZATION Mismatch → Discovery + decisioning
- PA requirement determination → Check if authorization is required for service
- → Inputs: claim procedure/drug/diagnosis, benefit plan rules
- Cross-system auth search → Look for existing auth across systems/vendors
- → Inputs: authorization systems, member ID, provider ID, service details
- Exception classification → Identify emergency, retrospective, or policy-based exceptions
- → Inputs: policy rules, claim context (site, timing, diagnosis)
- Resolution routing → Decide: create auth, deny, or pend for review
- → Inputs: outputs from requirement check + search + policy evaluation
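The cross-system auth search can be sketched as a scan over several authorization sources for records that share the claim's member and drug. The dictionary keys and source names below are hypothetical, and matching on only two fields is a deliberate simplification; a real search would also weigh provider and service details.

```python
from typing import Dict, List


def find_candidate_auths(claim: Dict[str, str],
                         systems: Dict[str, List[Dict[str, str]]]
                         ) -> List[Dict[str, str]]:
    """Collect auths from every source that match the claim's member and drug."""
    hits = []
    for system_name, auths in systems.items():
        for auth in auths:
            if (auth["member_id"] == claim["member_id"]
                    and auth["drug_code"] == claim["drug_code"]):
                # Record where the candidate was found for reviewer traceability.
                hits.append({**auth, "source_system": system_name})
    return hits


# Hypothetical sources: the primary PA system and a vendor portal.
systems = {
    "pa_system": [{"auth_id": "A1", "member_id": "M1", "drug_code": "DRUG-A"}],
    "vendor_portal": [
        {"auth_id": "A2", "member_id": "M1", "drug_code": "DRUG-A"},
        {"auth_id": "A3", "member_id": "M2", "drug_code": "DRUG-A"},
    ],
}
claim = {"claim_id": "C1", "member_id": "M1", "drug_code": "DRUG-A"}
for hit in find_candidate_auths(claim, systems):
    print(hit["auth_id"], hit["source_system"])
```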
I asked ChatGPT to identify the top 5-7 agents with associated complexity level to help me determine next steps.
| Agent | Primary Role | Complexity |
|---|---|---|
| Exception Explanation Agent | Converts match results and exception codes into plain-language guidance for reviewers. | Low |
| Authorization Discovery Agent | Searches for potentially matching authorizations when the engine reports “missing authorization.” | Low–Medium |
| Quantity Reconciliation Agent | Normalizes units, checks billed quantity against authorized amount, and flags dosing/packaging issues. | Medium |
| Date / Window Validation Agent | Validates service date against auth effective/expiration dates, grace periods, and retro-auth eligibility. | Medium |
| Provider Resolution Agent | Resolves billing/rendering/provider-group mismatches and evaluates whether provider substitution is allowed. | Medium–High |
| Policy Exception Agent | Interprets benefit, PA, site-of-service, emergency, retro, and exception policies to recommend routing. | High |
| Resolution Orchestration Agent | Synthesizes outputs from other agents and recommends: auto-resolve, pend, deny, update claim, update auth, or escalate. | High |
And it recommended that I start with the following three:
- Exception Explanation Agent
  - Category: General Mismatch
  - Complexity: Low
  - Type: Light reasoning
  - Notes: Fastest to build; makes deterministic engine output usable.
- Authorization Discovery Agent
  - Category: Missing Authorization Mismatch
  - Complexity: Low–Medium
  - Type: Heuristic -> Reasoning
  - Notes: Strong demo value for “missing auth” scenarios without deep clinical complexity.
- Date / Window Validation Agent
  - Category: Date Mismatch
  - Complexity: Medium
  - Type: Deterministic
  - Notes: Easy to explain; rules are intuitive and auditable.
Followed by these final four (more complex):
| Agent | Notes | Complexity | Type |
|---|---|---|---|
| Quantity Reconciliation Agent | Adds clinical/pharmacy credibility, but requires drug/unit logic. | Medium | Primarily rule-based + structured logic |
| Provider Resolution Agent | Useful but depends on provider master/network data quality. | Medium-High | Heuristic -> Reasoning |
| Policy Exception Agent | Powerful, but policy interpretation is the hardest part. | High | True reasoning agent |
| Resolution Orchestration Agent | Best introduced after several specialist agents exist. | High | True reasoning agent (decision synthesis) |
From an agentic architecture perspective, it’s important to note that I don’t have 7 reasoning agents. Rather, these 7 agents are spread across 3 distinct agentic layers:
Deterministic Layer (existing system)
- Matching engine (Phase 1)
- Date validation (Phase 1)
- Unit normalization
Analytical / Heuristic Layer
- Auth discovery (Phase 1)
- Provider resolution
- Quantity reconciliation (partially)
Reasoning Layer (true agents)
- Policy Exception Agent
- Resolution Orchestration Agent
I learned that it’s critical that one does not treat everything as reasoning. Here’s why:
If you treat everything as reasoning:
- You increase cost and latency
- You lose determinism and auditability
- You overcomplicate simple problems
If you separate them:
- Deterministic layer = fast, cheap, reliable
- Reasoning layer = targeted, high-value, explainable where needed
Given this information, Phase 1 will consist of the deterministic matching application combined with three agents: the Exception Explanation Agent, the Authorization Discovery Agent, and the Date / Window Validation Agent.
Specifically:
Step #1: Deterministic matching engine (Python) runs first.
It outputs:
- match status
- exception type
- relevant claim/auth fields
- confidence or rule result
Step #2: Exception Explanation Agent runs next.
It acts as the first agentic interpretation layer:
- explains the exception in reviewer-friendly language
- classifies the issue
- determines whether additional investigation is needed
- selects the next specialist agent, if any
Step #3: Specialist agent is called only when relevant
| Exception Type | Next Agent |
|---|---|
| Match | No specialist agent |
| Missing Authorization | Authorization Discovery Agent |
| Date Mismatch | Date / Window Validation Agent |
| Quantity Mismatch | Defer or basic explanation only for phase 1 |
| Provider Mismatch | Defer or basic explanation only for phase 1 |
Represented visually:
Claim + Auth Data
↓
Deterministic Matching Engine
↓
Exception Explanation Agent
↓
IF exception_type = Missing Authorization
→ call Authorization Discovery Agent
ELSE IF exception_type = Date Mismatch
→ call Date / Window Validation Agent
ELSE
→ generate explanation only
This works well because the Exception Explanation Agent (which I first demonstrated with a ChatGPT prompt) becomes a lightweight “router”. With this initial scope defined, Phase 1 will show that:
- the deterministic engine produces the truth signal
- the explanation agent (“exception explanation agent”) makes the output operationally understandable
- the two specialist agents (“missing authorization” and “date mismatch”) add deeper investigation only when needed
- the architecture stays simple and demoable
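The routing step in the flow above can be sketched as a lookup table keyed by exception type. The function name and the use of `None` to mean "explanation only" are assumptions; the exception labels and agent names are taken from the routing table.

```python
from typing import Optional

# Exception types that have a Phase 1 specialist agent.
NEXT_AGENT = {
    "Missing Authorization": "Authorization Discovery Agent",
    "Date Mismatch": "Date / Window Validation Agent",
}


def next_agent(exception_type: str) -> Optional[str]:
    """Return the specialist agent to call, or None for explanation only."""
    return NEXT_AGENT.get(exception_type)


print(next_agent("Missing Authorization"))  # Authorization Discovery Agent
print(next_agent("Quantity Mismatch"))      # None
```

Keeping the mapping in data rather than in branching logic means a later phase can add the quantity and provider agents by adding entries, without touching the router.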
At this stage, it is important that we revisit the agentic solution build process so that I can accelerate implementation efforts. The full process is as follows:
- Problem Framing (Decision Surface First) - Ensure this is worth solving before designing anything.
- Process Decomposition - Identify where an agent can intervene meaningfully.
- Agentic Opportunity Identification - Define what the agent actually does
- Populate the Agentic Solution Template - Force completeness and clarity.
- Define Autonomy + Control Model - Avoid over/under-automation.
- Knowledge & Data Mapping - Ensure the agent can reason correctly.
- Agentic Architecture Design - Translate logic into system components.
- Workflow + Interaction Design (UX Layer) - Make the agent usable and trustworthy.
- Prototype / Demo Build (Fast Path) - Make it real quickly (this is where most value is unlocked).
- Evaluation Framework - Prove it works (beyond “looks good”).
- Risk, Governance, and Controls - Make it enterprise-safe.
- Pilot Design - Move from demo to real usage.
- Scale & Operationalization - Turn into a repeatable capability.
A simplified approach is as follows:
- Define decision + KPI
- Map 5-step workflow
- Pick 1–2 agent roles
- Fill template (light version)
- Sketch architecture
- Design 1 screen (workbench)
- Build demo (mock data)
- Script narrative
I’ve populated an Agentic Solution Snapshot template in another file which contains full details about the Exception Explanation Agent I need to build. For this phase, I’ll use CrewAI, a relatively straightforward agentic framework that I learned in Udemy’s “AI Engineer Agentic Track: The Complete Agent & MCP Course”.
A few core concepts that will be relevant here:
- Agent: An autonomous unit, with an LLM, a role, a goal, a backstory, memory, tools
- Task: A specific assignment to be carried out, with a description, expected output, agent
- Crew: A team of Agents and Tasks, run in one of two ways:
  - Sequential: Run tasks in the order they are defined.
  - Hierarchical: Use a Manager LLM to assign tasks.
Agents and tasks can be created by code, setting the backstory, description, expected output, etc., or you can define each in a YAML file as shown below:

    researcher:
      role: >
        Senior Financial Researcher
      goal: >
        Research companies, news and potential
      backstory: >
        You're a seasoned financial researcher with a talent for finding the most relevant information.
      llm: openai/gpt-4o-mini

The agent is then created in code from that configuration entry:

    agent = Agent(config=self.agents_config['researcher'])

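We’ll also have a crew.py where our configuration comes together (using Python decorators):

    from crewai import Agent, Crew, Process, Task
    from crewai.project import CrewBase, agent, crew, task

    @CrewBase
    class MyCrew():
        """Wires the YAML-defined agents and tasks into a crew."""

        @agent
        def my_agent(self) -> Agent:
            # Built from the 'my_agent' entry in the agents YAML file
            return Agent(config=self.agents_config['my_agent'])

        @task
        def my_task(self) -> Task:
            # Built from the 'my_task' entry in the tasks YAML file
            return Task(config=self.tasks_config['my_task'])

        @crew
        def crew(self) -> Crew:
            return Crew(
                agents=self.agents,  # collected from the @agent methods
                tasks=self.tasks,    # collected from the @task methods
                process=Process.sequential,  # or Process.hierarchical
            )
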
CrewAI also uses LiteLLM under the hood to interface with almost any LLM; API keys are set in a .env file. This means that we can use multiple LLMs if necessary.
For our Exception Explanation Agent, we’ll have a single sequential crew comprising a single agent that will complete a specific task. The agent will be associated with a single role: “summarizer”

    summarizer:
      role: >
        An Authorization-to-Claim Exception Summarizer
      goal: >
        Convert {match-results} into plain-language guidance for reviewers.
      backstory: >
        You're a seasoned utilization management expert with a skill
        for concisely describing mismatched prior authorizations and medical
        claims found in {match-results}.

This agent will be responsible for a single task: “summarizing_task”

    summarizing_task:
      description: >
        For each of the {match-results} generate narrative guidance.
        If the mismatch is caused by a missing authorization, then
        indicate that the "Authorization Discovery Agent" should be called and reference the claim ID.
        If the mismatch is caused by a date mismatch, then indicate
        that the "Date-Window Validation Agent" should be called and reference the claim ID.
        For all other mismatches, indicate that a manual review should be performed.
      expected_output: >
        A list of 1-2 sentence narratives for each mismatch.
        Formatted as markdown without '```'
      agent: summarizer
      output_file: output/summarization.md

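One way to produce the text that fills the `{match-results}` placeholder is to serialize the engine's exception rows as JSON. This is a sketch under assumptions: the column names (`claim_id`, `match_status`, `exception_type`) and the decision to drop clean matches before calling the agent are hypothetical, not taken from the project.

```python
import csv
import io
import json


def build_match_results(csv_text: str) -> str:
    """Keep only non-matching rows and serialize them for the prompt."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    exceptions = [row for row in rows if row["match_status"] != "MATCH"]
    return json.dumps(exceptions, indent=2)


# Hypothetical engine output with one clean match and one exception.
sample = (
    "claim_id,match_status,exception_type\n"
    "CLM-1,MATCH,\n"
    "CLM-2,EXCEPTION,Date Mismatch\n"
)
payload = build_match_results(sample)
print(payload)
# The string could then be supplied when the crew is started, e.g.
# kickoff(inputs={"match-results": payload})
```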
I made additional modifications to the project this evening, and it appears that this first agent is working:
Final Output:
- For Claim ID CLM-9002, there is a Quantity Mismatch. Please perform a manual review to assess the billed quantity of 3 vials against the authorized quantity.
- For Claim ID CLM-9003, there is a Date Mismatch. The Date-Window Validation Agent should be called to evaluate the service date of 2026-03-05 in relation to the authorization period.
- For Claim ID CLM-9004, there is a Provider Mismatch. A manual review should be conducted due to the discrepancy in the rendering provider for the service date 2026-03-01.
- For Claim ID CLM-9005, there is a Missing Authorization. Please contact the Authorization Discovery Agent and reference this claim ID for further investigation.
This is great progress. Tomorrow I’ll start building the Authorization Discovery Agent, which I believe will be implemented as a tool within the project’s tools directory.