Build Journal (Project Foundation & First Agent)

Journal Summary

Generated by ChatGPT

The project officially began with a deliberate decision to consolidate effort into a single integrated medical pharmacy initiative combining architecture, code, presentation material, and operational narrative. The user organized the effort into six chapters covering business capabilities, business assessment findings, value creation levers, agentic architecture, a deep dive into Auth-to-Claim Match (ATCM), and future considerations. Drawing on prior healthcare consulting experience, the user narrowed focus to the ATCM process because it represented a high-friction operational reconciliation layer involving prior authorizations, claims adjudication, manual review, and complex client-specific rules. The day was heavily grounded in business decomposition, operational pain-point analysis, and domain mapping across utilization management, claims integrity, and clinical governance workflows.

From a technical perspective, the user generated synthetic healthcare datasets and implemented a deterministic Python matching engine capable of identifying clean matches, quantity mismatches, date mismatches, provider mismatches, and missing authorizations. The user then designed a layered agentic architecture separating deterministic systems from reasoning systems, concluding that LLMs should not perform authoritative matching logic but instead provide interpretation and operational guidance. This led to the successful implementation of an initial CrewAI-based “Exception Explanation Agent” that translated deterministic outputs into reviewer-friendly narratives and routed claims to specialist investigative paths such as authorization discovery and date/window validation.

Starting Point

This is the first post of many which will focus on business domain exploration (healthcare initially) and the design and build of AI-enabled solutions. To start, I decided to learn more about the medical pharmacy domain given my professional experience with a specific client.

I started the day by getting my development environment set up using Cursor/Code. I’ve also decided to keep track of my progress using this engineering (build) journal, as it will help me think more clearly about what problem(s) I’m solving, where I’m struggling, and what I need to do to move forward.

The business analysis that I led in 2023 spanned a wide range of business areas and capabilities:

Delegated Prior Authorization
Non-Delegated Prior Authorization
Specialty Clinical Programs
Physician Clinical Review
Expert Clinical Network
Site of Service
Dosage Optimization and Drug Wastage
Oncology
Clinical Policy Development
Clinical Algorithm Development
Medical Necessity Criteria Catalog
Underwriting
Post-Service Claims Edits
Auth to Claim Match
Patient Specific Edits

These can largely be organized into these 5 business domains:

Utilization Management
Clinical Decisioning & Governance
Specialty & High Cost Therapy Management
Claims Integrity & Revenue Protection
Financial Risk & Underwriting

After doing some research, I will focus my attention on the Auth-to-Claim Match program which falls within Utilization Management. My goal is to get something working quickly and then expand and adapt as necessary; I’ll only introduce additional complexity when it is warranted.

Here’s a high-level overview:

In a Medical Pharmacy context, “Auth to Claim Match” refers to a process that involves verifying the authorization status of a healthcare service or prescription medication against the corresponding insurance or payer claim.

Essentially, any agentic solution will fall between the prior authorization (pre-service object) and the claim (post-service reality). This reconciliation layer is what I need to build.

The goal is to pay, deny, or flag the claim in a way that is consistent with the authorization, clinical rules, client configuration, and patient-specific constraints.

Admittedly, that’s a lot to consider. So, let’s start piece-by-piece. A simple way of understanding the Auth-to-Claim process is the following:

Step #1. Collect authorizations and claims from source systems.

Step #2. Use pre-defined rules (a deterministic approach) to attempt a match.

If there is a match, then pay the claim.
If there isn’t a match, then move the claim to a work queue for manual review.

Step #3. Perform a manual review to adjust, deny, or reprocess the claim.

Step #4. Generate reports to measure performance.

Data Sources

Since our process starts with data, let’s take inventory of all applicable data sources even if we don’t need them all right now:

Core Data Sources

Prior Authorization System: Source of truth for what was approved
Claims Adjudication System: Source of truth for what was billed
Member / Eligibility System: Identity + coverage validation
Provider Data Systems: Normalize provider identity across systems

Supporting Data Sources

Drug Reference Data: Normalize drug representation
Clinical Policy / Authorization Rules: Interpret allowed variance
Contract / Benefit Configuration: Client-specific rules

Operational / Reconciliation Data

Claims Edit / Exception Logs: Historical reconciliation outcomes
Audit & Case Management System: Human review traceability

Cross-System Enablers

Master Data / Identity Resolution Layer: Link everything together
Temporal / Event Data: Sequence validation

As a starting point, I’ve generated the following 5 synthetic datasets:

Prior Authorization System — authorizations.csv
Claims Adjudication System — claims.csv
Drug Reference - drug_reference.csv
Member / Eligibility System — member_eligibility.csv
Provider Data System — providers.csv

Using these exported CSVs, I generated a deterministic matching engine written in Python (“Auth-to-Claim Matching.ipynb”) which applies basic rules to classify the following 5 use cases:

Use Case #1 (Clean match): member, provider, drug, quantity, date, and site align
Use Case #2 (Quantity mismatch): claim billed 3 vials, auth approved 2
Use Case #3 (Date mismatch): service date after auth end date
Use Case #4 (Provider mismatch): provider not found / non-normalized
Use Case #5 (Missing authorization): no matching authorization found for member/drug/date

This deterministic engine identifies any potential mismatches. LLMs should not be used in this step because the matching result must be reproducible and auditable. However, an LLM can add value afterwards by generating a narrative that explains the results of the matching process.
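To make the five use cases concrete, here is a minimal sketch of what an ordered, rule-based matcher might look like. It is not the notebook’s actual code: the field names, the status strings, the rule order, and the "first candidate wins" simplification are all assumptions made for illustration, and site of service is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Auth:  # hypothetical shape of a row in authorizations.csv
    member_id: str
    provider_id: str
    drug_code: str
    approved_qty: int
    start: date
    end: date


@dataclass(frozen=True)
class Claim:  # hypothetical shape of a row in claims.csv
    claim_id: str
    member_id: str
    provider_id: str
    drug_code: str
    billed_qty: int
    service_date: date


def match_claim(claim: Claim, auths: list) -> str:
    """Return exactly one status per claim using fixed, ordered rules."""
    # Candidate authorizations share the claim's member and drug.
    candidates = [a for a in auths
                  if a.member_id == claim.member_id
                  and a.drug_code == claim.drug_code]
    if not candidates:
        return "MISSING_AUTHORIZATION"
    auth = candidates[0]  # assumption: at most one auth per member/drug
    if not (auth.start <= claim.service_date <= auth.end):
        return "DATE_MISMATCH"
    if claim.provider_id != auth.provider_id:
        return "PROVIDER_MISMATCH"
    if claim.billed_qty > auth.approved_qty:
        return "QUANTITY_MISMATCH"
    return "MATCH"
```

Because the rules are plain comparisons evaluated in a fixed order, the same inputs always produce the same status, which is the property that makes this layer auditable.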

NLP Experimentation

Before creating an AI-enabled application to do this, I fed a sample dataset to ChatGPT along with the following prompt:

The attached dataset contains output from an “auth-to-claim” matching application that indicates whether there was a match, and if not, what was the exception. Please generate natural language guidance for each record based on the data provided.

Here is the result:

(Clean Match) No action required. Claim aligns with authorization across member, provider, drug, quantity, and service date.

(Quantity) Verify billed quantity against authorized amount. Check for unit conversion issues (e.g., mg vs vial) or dosing differences. Adjust claim or request auth update if clinically appropriate.

(Date) Validate service date falls within authorized date range. If service occurred outside the window, request retro authorization or correct claim date if entered incorrectly.

(Provider) Confirm rendering/billing provider matches authorized provider. If different, validate delegation rules or update authorization to reflect servicing provider.

(No match) No matching authorization found. Confirm if prior authorization was required. If yes, locate or create authorization; otherwise review policy for exception or denial handling.

Concisely, we have the following logic:

MATCH: No action.
Quantity Mismatch: Validate units/dose; reconcile or update auth.
Date Mismatch: Check service date vs. auth window; correct or request retro auth.
Provider Mismatch: Align rendering/billing provider with auth or update auth.
Missing Authorization: Confirm PA requirement; locate/create auth or route to denial/exception workflow.

This is essentially the information that would be shared in a work queue for manual processing, but it isn’t yet actionable. We need to know what to do about each determination (what happens next?), and for that we need additional guidance.

For example, even when a particular claim and authorization “match”, there are several baseline validation / optimization opportunities that we can take advantage of:

We could perform quality assurance checks post-adjudication which would validate the allowed amount, contract alignment, and downstream edits; thus, catching silent leakage.
We could identify high-frequency approvals for potential “gold-carding” or rule simplification.
- Note: Gold-carding is a selective exemption from prior authorization based on historical performance. In essence, the payer trusts the provider’s clinical and documentation behavior, which permits the provider to submit claims without obtaining prior authorization for certain services/drugs.
We could process the claim immediately with periodic audit sampling to ensure integrity.

And we can leverage agents to realize these select opportunities:

Capability	Agent Action	Required Data Inputs
Straight-through processing	Auto-close claim and mark as fully reconciled	claim record, authorization record, match flags
Post-pay audit sampling	Select claim for QA review based on thresholds	claim data, contract rules, audit policies, historical error rates
Gold-carding identification	Flag provider/member patterns for reduced future auth friction	historical claims/auth matches, provider performance metrics
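As an illustration of the gold-carding row, the sketch below flags providers whose historical clean-match rate is high enough to consider reduced authorization friction. The function name, the `(provider_id, status)` input shape, and both thresholds are assumptions chosen for the example, not payer policy.

```python
from collections import defaultdict


def gold_card_candidates(history, min_claims=20, min_clean_rate=0.95):
    """Return provider IDs whose clean-match rate meets both thresholds.

    `history` is an iterable of (provider_id, status) pairs, where status
    is the deterministic engine's result for one historical claim.
    """
    totals = defaultdict(int)
    clean = defaultdict(int)
    for provider_id, status in history:
        totals[provider_id] += 1
        if status == "MATCH":
            clean[provider_id] += 1
    # Require enough volume so a handful of clean claims is not over-trusted.
    return sorted(
        provider_id for provider_id, n in totals.items()
        if n >= min_claims and clean[provider_id] / n >= min_clean_rate
    )
```

In practice the flag would only be an input to a human decision; the sketch covers just the pattern-detection step.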

This is valuable information and it gives me an indication of what I can start building. But first let’s examine sample guidance (and agent recommendations) for the remaining 4 exception categories - i.e., what should happen next?

QUANTITY Mismatch → Clinical + normalization reconciliation

Unit normalization → Convert billed units (e.g., mg ↔ vial) and re-evaluate match
- → Inputs: claim quantity + units, auth quantity + units, drug packaging reference data
Clinical dosing validation → Compare billed dose against protocol (weight-based/regimen)
- → Inputs: drug reference data, clinical protocols, member weight/diagnosis, billed quantity
Authorization limit check → Validate against approved quantity/frequency/cumulative caps
- → Inputs: authorization limits, benefit rules, claim quantity, prior utilization history
Correction recommendation → Suggest claim adjustment vs. auth update
- → Inputs: results from normalization + dosing + limits, payer rules
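The unit normalization step can be sketched as follows. This assumes a hypothetical `mg_per_vial` value taken from drug packaging reference data and assumes that a partially used vial is billed as a whole vial; both would need to be confirmed against real drug reference and payer rules.

```python
import math


def to_vials(quantity: float, unit: str, mg_per_vial: float) -> int:
    """Express a billed or authorized quantity in whole vials."""
    if unit == "vial":
        return math.ceil(quantity)
    if unit == "mg":
        # Assumption: a partially used vial still counts as one vial.
        return math.ceil(quantity / mg_per_vial)
    raise ValueError(f"unsupported unit: {unit}")


def quantity_within_auth(billed_qty, billed_unit, auth_qty, auth_unit, mg_per_vial):
    """Re-evaluate a quantity mismatch after putting both sides in vials."""
    billed = to_vials(billed_qty, billed_unit, mg_per_vial)
    authorized = to_vials(auth_qty, auth_unit, mg_per_vial)
    return billed <= authorized
```

With this in place, a claim billed as 200 mg against an authorization for 2 vials of 100 mg each would no longer be reported as a mismatch, while 3 vials against 2 still would.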

DATE Mismatch → Temporal validation + policy routing

Auth window validation → Check if service date falls within effective/expiration dates
- → Inputs: claim service date, auth effective/expiration dates
Grace period evaluation → Apply payer-specific buffer windows
- → Inputs: payer policy rules, claim date, auth dates
Treatment cadence validation → Ensure service aligns with expected therapy intervals
- → Inputs: clinical protocol schedules, prior claim dates, drug/regimen data
Retro authorization routing → Determine eligibility and trigger retro workflow
- → Inputs: payer retro auth policy, claim date, submission lag timestamps
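The first two checks (auth window and grace period) are simple date arithmetic. A minimal sketch, assuming a single symmetric `grace_days` buffer per payer and hypothetical result labels:

```python
from datetime import date, timedelta


def classify_service_date(service_date: date, auth_start: date,
                          auth_end: date, grace_days: int = 0) -> str:
    """Classify a service date relative to the authorization window."""
    if auth_start <= service_date <= auth_end:
        return "WITHIN_WINDOW"
    grace = timedelta(days=grace_days)
    if auth_start - grace <= service_date <= auth_end + grace:
        return "WITHIN_GRACE_PERIOD"
    # Outside both windows: a candidate for retro-authorization routing.
    return "OUTSIDE_WINDOW"
```

For example, a service date of 2026-03-05 against a made-up authorization ending 2026-02-28 falls outside the window, but would be tolerated under a 7-day grace period.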

PROVIDER Mismatch → Identity + network resolution

Provider normalization → Resolve NPI/group mismatches (billing vs rendering)
- → Inputs: claim NPIs, auth provider IDs, provider master data
Delegation rule evaluation → Check if alternate provider is allowed
- → Inputs: payer delegation rules, provider network data, auth constraints
Facility vs rendering validation → Align site-of-service vs individual provider authorization
- → Inputs: site-of-service, facility IDs, auth scope, claim provider roles
Correction recommendation → Suggest updating claim or authorization
- → Inputs: normalized provider mapping, policy constraints, mismatch classification
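Provider normalization could start as a lookup against provider master data. The sketch below assumes a hypothetical `npi_to_group` mapping (NPI to group or practice ID) and hypothetical result labels; delegation rules and site-of-service checks are not covered.

```python
def resolve_provider(claim_npi: str, auth_npi: str, npi_to_group: dict) -> str:
    """Classify how the claim's provider relates to the authorized provider."""
    if claim_npi == auth_npi:
        return "EXACT_MATCH"
    claim_group = npi_to_group.get(claim_npi)
    auth_group = npi_to_group.get(auth_npi)
    if claim_group is None or auth_group is None:
        # Provider missing from master data: cannot be normalized.
        return "UNKNOWN_PROVIDER"
    if claim_group == auth_group:
        # Billing vs. rendering NPI within the same group.
        return "SAME_GROUP"
    return "DIFFERENT_GROUP"
```

A "SAME_GROUP" result is the case most likely to be resolved automatically, whereas "DIFFERENT_GROUP" would still need the delegation rule evaluation described above.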

MISSING AUTHORIZATION Mismatch → Discovery + decisioning

PA requirement determination → Check if authorization is required for service
- → Inputs: claim procedure/drug/diagnosis, benefit plan rules
Cross-system auth search → Look for existing auth across systems/vendors
- → Inputs: authorization systems, member ID, provider ID, service details
Exception classification → Identify emergency, retrospective, or policy-based exceptions
- → Inputs: policy rules, claim context (site, timing, diagnosis)
Resolution routing → Decide: create auth, deny, or pend for review
- → Inputs: outputs from requirement check + search + policy evaluation
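The cross-system auth search is essentially a relaxed version of the matching rules. A minimal heuristic sketch, assuming dictionary records with hypothetical keys and an arbitrary scoring scheme (drug match weighted above provider match, dates allowed to miss by up to `max_days_outside`):

```python
from datetime import date


def days_outside_window(service_date: date, start: date, end: date) -> int:
    """Return 0 inside the auth window, else days to the nearest edge."""
    if service_date < start:
        return (start - service_date).days
    if service_date > end:
        return (service_date - end).days
    return 0


def find_candidate_auths(claim: dict, auths: list, max_days_outside: int = 30) -> list:
    """Rank same-member authorizations that might belong to an unmatched claim."""
    scored = []
    for auth in auths:
        if auth["member_id"] != claim["member_id"]:
            continue  # never suggest another member's authorization
        score = 0
        if auth["drug_code"] == claim["drug_code"]:
            score += 2
        if auth["provider_id"] == claim["provider_id"]:
            score += 1
        gap = days_outside_window(claim["service_date"], auth["start"], auth["end"])
        if score > 0 and gap <= max_days_outside:
            scored.append((score, gap, auth))
    # Highest score first; ties broken by the smallest date gap.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [auth for _, _, auth in scored]
```

The output is a ranked list of candidates for a reviewer (or a downstream agent) to confirm, not an authoritative match.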

I asked ChatGPT to identify the top 5-7 agents with associated complexity level to help me determine next steps.

Agent	Primary Role	Complexity
Exception Explanation Agent	Converts match results and exception codes into plain-language guidance for reviewers.	Low
Authorization Discovery Agent	Searches for potentially matching authorizations when the engine reports “missing authorization.”	Low–Medium
Quantity Reconciliation Agent	Normalizes units, checks billed quantity against authorized amount, and flags dosing/packaging issues.	Medium
Date / Window Validation Agent	Validates service date against auth effective/expiration dates, grace periods, and retro-auth eligibility.	Medium
Provider Resolution Agent	Resolves billing/rendering/provider-group mismatches and evaluates whether provider substitution is allowed.	Medium–High
Policy Exception Agent	Interprets benefit, PA, site-of-service, emergency, retro, and exception policies to recommend routing.	High
Resolution Orchestration Agent	Synthesizes outputs from other agents and recommends: auto-resolve, pend, deny, update claim, update auth, or escalate.	High

And it recommended that I start with the following three:

Exception Explanation Agent
- Category: General Mismatch
- Complexity: Low
- Type: Light reasoning
- Notes: Fastest to build; makes deterministic engine output usable.
Authorization Discovery Agent
- Category: Missing Authorization Mismatch
- Complexity: Low–Medium
- Type: Heuristic -> Reasoning
- Notes: Strong demo value for “missing auth” scenarios without deep clinical complexity.
Date / Window Validation Agent
- Category: Date Mismatch
- Complexity: Medium
- Type: Deterministic
- Notes: Easy to explain; rules are intuitive and auditable.

Followed by these final four (more complex):

Agent	Notes	Complexity	Type
Quantity Reconciliation Agent	Adds clinical/pharmacy credibility, but requires drug/unit logic.	Medium	Primarily rule-based + structured logic
Provider Resolution Agent	Useful but depends on provider master/network data quality.	Medium-High	Heuristic -> Reasoning
Policy Exception Agent	Powerful, but policy interpretation is the hardest part.	High	True reasoning agent
Resolution Orchestration Agent	Best introduced after several specialist agents exist.	High	True reasoning agent (decision synthesis)

From an agentic architecture perspective, it’s important to note that I don’t have 7 reasoning agents. Rather, these 7 agents are spread across 3 distinct agentic layers:

Deterministic Layer (existing system)

Matching engine (Phase 1)
Date validation (Phase 1)
Unit normalization

Analytical / Heuristic Layer

Auth discovery (Phase 1)
Provider resolution
Quantity reconciliation (partially)

Reasoning Layer (true agents)

Policy Exception Agent
Resolution Orchestration Agent

I learned that it’s critical that one does not treat everything as reasoning. Here’s why:

If you treat everything as reasoning:

You increase cost and latency

You lose determinism and auditability

You overcomplicate simple problems

If you separate them:

Deterministic layer = fast, cheap, reliable

Reasoning layer = targeted, high-value, explainable where needed

Given this information, Phase 1 will consist of the deterministic matching application combined with 3 agents: the Exception Explanation Agent and 2 specialist agents.

Specifically:

Step #1: Deterministic matching engine (Python) runs first.

It outputs:

match status
exception type
relevant claim/auth fields
confidence or rule result

Step #2: Exception Explanation Agent runs next.

It acts as the first agentic interpretation layer:

explains the exception in reviewer-friendly language
classifies the issue
determines whether additional investigation is needed
selects the next specialist agent, if any

Step #3: Specialist agent is called only when relevant

Exception Type	Next Agent
Match	No specialist agent
Missing Authorization	Authorization Discovery Agent
Date Mismatch	Date / Window Validation Agent
Quantity Mismatch	Defer or basic explanation only for phase 1
Provider Mismatch	Defer or basic explanation only for phase 1

Represented visually:

Claim + Auth Data
   ↓
Deterministic Matching Engine
   ↓
Exception Explanation Agent
   ↓
IF exception_type = Missing Authorization
   → call Authorization Discovery Agent
ELSE IF exception_type = Date Mismatch
   → call Date / Window Validation Agent
ELSE
   → generate explanation only
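The routing decision itself is deterministic, so it can be expressed as a plain lookup rather than left to the LLM. A minimal sketch of the Phase 1 routing, where the dictionary and function names are hypothetical:

```python
# Phase 1 routing: only two exception types have a specialist agent.
SPECIALIST_BY_EXCEPTION = {
    "Missing Authorization": "Authorization Discovery Agent",
    "Date Mismatch": "Date / Window Validation Agent",
}


def next_agent(exception_type: str):
    """Return the specialist agent to call, or None for explanation only."""
    return SPECIALIST_BY_EXCEPTION.get(exception_type)
```

Quantity and provider mismatches return `None` here, which corresponds to the "generate explanation only" branch until those agents are built.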

This works well because the Exception Explanation Agent (something that I demonstrated initially using a ChatGPT prompt) becomes a lightweight “router”. Thus, with this initial scope defined, phase 1 will show:

the deterministic engine produces the truth signal
the explanation agent makes the output operationally understandable
- “exception explanation agent”
the (2) specialist agents add deeper investigation only when needed
1. “missing authorization”
2. “date mismatch”
the architecture stays simple and demoable

At this stage, it is important that we revisit the agentic solution build process so that I can accelerate implementation efforts. The full process is as follows:

Problem Framing (Decision Surface First) - Ensure this is worth solving before designing anything.
Process Decomposition - Identify where an agent can intervene meaningfully.
Agentic Opportunity Identification - Define what the agent actually does.
Populate the Agentic Solution Template - Force completeness and clarity.
Define Autonomy + Control Model - Avoid over/under-automation.
Knowledge & Data Mapping - Ensure the agent can reason correctly.
Agentic Architecture Design - Translate logic into system components.
Workflow + Interaction Design (UX Layer) - Make the agent usable and trustworthy.
Prototype / Demo Build (Fast Path) - Make it real quickly (this is where most value is unlocked).
Evaluation Framework - Prove it works (beyond “looks good”).
Risk, Governance, and Controls - Make it enterprise-safe.
Pilot Design - Move from demo to real usage.
Scale & Operationalization - Turn into a repeatable capability.

A simplified approach is as follows:

Define decision + KPI
Map 5-step workflow
Pick 1–2 agent roles
Fill template (light version)
Sketch architecture
Design 1 screen (workbench)
Build demo (mock data)
Script narrative

I’ve populated an Agentic Solution Snapshot template in another file which contains full details about the Exception Explanation Agent I need to build. For this phase, I’ll use CrewAI (crew.ai), a relatively straightforward agentic framework that I learned about in Udemy’s “AI Engineer Agentic Track: The Complete Agent & MCP Course”.

A few core concepts that will be relevant here:

Agent - An autonomous unit, with an LLM, a role, a goal, a backstory, memory, tools
Task - A specific assignment to be carried out, with a description, expected output, agent
Crew - A team of Agents and Tasks; either:
- Sequential - Run tasks in order they are defined.
- Hierarchical - Use a Manager LLM to assign.

Agents and tasks can be created by code, setting the backstory, description, expected output, etc., or you can define each in a YAML file as shown below:

researcher:
  role: >
    Senior Financial Researcher
  goal: >
    Research companies, news and potential
  backstory: >
    You're a seasoned financial researcher with a talent for finding the most relevant information.
  llm: openai/gpt-4o-mini

agent = Agent(config=self.agents_config['researcher'])

We’ll also have a crew.py where our configuration comes together (using decorators):

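from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task


@CrewBase
class MyCrew():

    # Each @agent method builds an Agent from its entry in the agents YAML.
    @agent
    def my_agent(self) -> Agent:
        return Agent(
            config=self.agents_config['my_agent'])

    # Each @task method builds a Task from its entry in the tasks YAML.
    @task
    def my_task(self) -> Task:
        return Task(
            config=self.tasks_config['my_task'])

    # The @crew method assembles the decorated agents and tasks.
    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,  # or Process.hierarchical (uses a manager LLM)
        )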

CrewAI also uses LiteLLM under the hood to interface with almost any LLM; API keys are set in a .env file. This means that we can use multiple LLMs if necessary.

For our Exception Explanation Agent, we’ll have a single sequential crew comprising a single agent that will complete a specific task. The agent will be associated with a single role: “summarizer”.

summarizer:
  role: >
    An Authorization-to-Claim Exception Summarizer
  goal: >
    Convert {match-results} into plain-language guidance for reviewers.
  backstory: >
    You're a seasoned utilization management expert with a skill
    for concisely describing mismatched prior authorizations and medical
    claims found in {match-results}.

This agent will be responsible for a single task: “summarizing_task”

summarizing_task:
  description: >
    For each of the {match-results} generate narrative guidance.
    If the mismatch is caused by a missing authorization, then
    indicate that the "Authorization Discovery Agent" should be called and reference the claim ID.
    If the mismatch is caused by a date mismatch, then indicate
    that the "Date-Window Validation Agent" should be called and reference the claim ID.
    For all other mismatches, indicate that a manual review should be performed.
  expected_output: >
    A list of 1-2 sentence narratives for each mismatch.
    Formatted as markdown without '```'
  agent: summarizer
  output_file: output/summarization.md
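The `{match-results}` placeholder in both YAML entries is filled from the inputs passed when the crew is kicked off. Here is a minimal sketch of preparing that input from the engine’s output, assuming a hypothetical CSV export with `claim_id` and `status` columns, made-up sample rows, and the choice to pass only exceptions to the agent:

```python
import csv
import io
import json


def build_crew_inputs(csv_text: str) -> dict:
    """Turn the engine's CSV output into the inputs dict for {match-results}."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Assumption: clean matches need no narrative, so only exceptions are sent.
    exceptions = [row for row in rows if row["status"] != "Match"]
    return {"match-results": json.dumps(exceptions, indent=2)}


# Hypothetical sample of the engine's output.
sample = """claim_id,status
CLM-9001,Match
CLM-9003,Date Mismatch
"""
inputs = build_crew_inputs(sample)
# With CrewAI installed and an API key configured, a crew class like the
# MyCrew example could then be run with:
#   MyCrew().crew().kickoff(inputs=inputs)
```

Serializing the rows as JSON keeps the field names attached to each value, which gives the agent less room to misread a column than a raw CSV paste would.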

I made additional modifications to the project this evening, and it appears that this first agent is working:

Final Output:

- For Claim ID CLM-9002, there is a Quantity Mismatch. Please perform a manual review to assess the billed quantity of 3 vials against the authorized quantity.

- For Claim ID CLM-9003, there is a Date Mismatch. The Date-Window Validation Agent should be called to evaluate the service date of 2026-03-05 in relation to the authorization period.

- For Claim ID CLM-9004, there is a Provider Mismatch. A manual review should be conducted due to the discrepancy in the rendering provider for the service date 2026-03-01.

- For Claim ID CLM-9005, there is a Missing Authorization. Please contact the Authorization Discovery Agent and reference this claim ID for further investigation.

This is great progress. Tomorrow I’ll start building the Authorization Discovery Agent, which I believe will be implemented as a tool within the tools directory.