Perspectives (2018-26)

Build Journal (Project Foundation & First Agent)

Journal Summary

Generated by ChatGPT

The project officially began with a deliberate decision to consolidate effort into a single integrated medical pharmacy initiative combining architecture, code, presentation material, and operational narrative. The user organized the effort into six chapters covering business capabilities, business assessment findings, value creation levers, agentic architecture, a deep dive into Auth-to-Claim Match (ATCM), and future considerations. Drawing on prior healthcare consulting experience, the user narrowed focus to the ATCM process because it represented a high-friction operational reconciliation layer involving prior authorizations, claims adjudication, manual review, and complex client-specific rules. The day was heavily grounded in business decomposition, operational pain-point analysis, and domain mapping across utilization management, claims integrity, and clinical governance workflows.

From a technical perspective, the user generated synthetic healthcare datasets and implemented a deterministic Python matching engine capable of identifying clean matches, quantity mismatches, date mismatches, provider mismatches, and missing authorizations. The user then designed a layered agentic architecture separating deterministic systems from reasoning systems, concluding that LLMs should not perform authoritative matching logic but instead provide interpretation and operational guidance. This led to the successful implementation of an initial CrewAI-based “Exception Explanation Agent” that translated deterministic outputs into reviewer-friendly narratives and routed claims to specialist investigative paths such as authorization discovery and date/window validation.

Starting Point

This is the first post of many which will focus on business domain exploration (healthcare initially) and the design and build of AI-enabled solutions. To start, I decided to learn more about the medical pharmacy domain given my professional experience with a specific client.

I started the day by getting my development environment set up using Cursor/Code. I’ve also decided to track my progress in this engineering (build) journal, as it will help me think more clearly about what problem(s) I’m solving, where I’m struggling, and what I need to do to move forward.

The business analysis that I led in 2023 spanned a wide range of business areas and capabilities:

These can largely be organized into five business domains:

After doing some research, I will focus my attention on the Auth-to-Claim Match program which falls within Utilization Management. My goal is to get something working quickly and then expand and adapt as necessary; I’ll only introduce additional complexity when it is warranted.

Here’s a high-level overview:

In a Medical Pharmacy context, “Auth to Claim Match” refers to a process that involves verifying the authorization status of a healthcare service or prescription medication against the corresponding insurance or payer claim.

Essentially, any agentic solution will fall between the prior authorization (pre-service object) and the claim (post-service reality). This reconciliation layer is what I need to build.

The goal is to pay, deny, or flag the claim in a way that is consistent with the authorization, clinical rules, client configuration, and patient-specific constraints.

Admittedly, that’s a lot to consider. So, let’s start piece-by-piece. A simple way of understanding the Auth-to-Claim process is the following:

Step #1. Collect authorizations and claims from source systems.

Step #2. Use pre-defined rules (a deterministic approach) to attempt a match.

Step #3. Perform a manual review to adjust, deny, or reprocess the claim.

Step #4. Generate reports to measure performance.

Data Sources

Since our process starts with data, let’s take inventory of all applicable data sources even if we don’t need them all right now:

Core Data Sources

Supporting Data Sources

As a starting point, I’ve generated the following 5 synthetic datasets:

  1. Prior Authorization System — authorizations.csv
  2. Claims Adjudication System — claims.csv
  3. Drug Reference - drug_reference.csv
  4. Member / Eligibility System — member_eligibility.csv
  5. Provider Data System — providers.csv

Using the boldfaced exported CSVs, the solution that I generated is a deterministic matching engine written in Python (“Auth-to-Claim Matching.ipynb”) which uses basic rules to respond to the following 5 use cases:

This deterministic engine identifies any potential mismatches. An LLM should not perform this step because the matching logic must be deterministic and auditable. However, an LLM can add value by generating a narrative that explains the results of the matching process.
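To make the idea of rule-based matching concrete, here is a minimal sketch of how such an engine might classify a claim into one of the five outcomes. It is not the code from the notebook: the field names, the outcome labels, the rule order, and the choice to flag quantity only when the billed amount exceeds the authorized amount are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Authorization:
    # Hypothetical fields; the real authorizations.csv may differ.
    auth_id: str
    member_id: str
    drug_code: str
    provider_id: str
    authorized_qty: int
    start_date: date
    end_date: date


@dataclass
class Claim:
    # Hypothetical fields; the real claims.csv may differ.
    claim_id: str
    member_id: str
    drug_code: str
    provider_id: str
    billed_qty: int
    service_date: date


def match_claim(claim: Claim, auths: list[Authorization]) -> str:
    """Return one of five outcome labels using fixed, ordered rules."""
    # Candidate authorizations share the claim's member and drug.
    candidates = [
        a for a in auths
        if a.member_id == claim.member_id and a.drug_code == claim.drug_code
    ]
    if not candidates:
        return "MISSING_AUTHORIZATION"
    auth = candidates[0]  # simplification: only the first candidate is checked
    if not (auth.start_date <= claim.service_date <= auth.end_date):
        return "DATE_MISMATCH"
    if auth.provider_id != claim.provider_id:
        return "PROVIDER_MISMATCH"
    if claim.billed_qty > auth.authorized_qty:  # assumption: over-billing only
        return "QUANTITY_MISMATCH"
    return "CLEAN_MATCH"


auths = [
    Authorization("AUTH-1", "M-1", "J-0001", "PRV-1", 2,
                  date(2026, 1, 1), date(2026, 2, 28)),
]
claim = Claim("CLM-1", "M-1", "J-0001", "PRV-1", 3, date(2026, 2, 1))
print(match_claim(claim, auths))  # QUANTITY_MISMATCH
```

Because every rule is an explicit comparison, the same inputs always produce the same label, which is the property the reconciliation step depends on.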

NLP Experimentation

Before I create an AI-enabled application to do this, I fed in a sample data set to ChatGPT along with the following prompt:

The attached dataset contains output from an “auth-to-claim” matching application that indicates whether there was a match, and if not, what was the exception. Please generate natural language guidance for each record based on the data provided.

Here is the result:

(Clean Match) No action required. Claim aligns with authorization across member, provider, drug, quantity, and service date.

(Quantity) Verify billed quantity against authorized amount. Check for unit conversion issues (e.g., mg vs vial) or dosing differences. Adjust claim or request auth update if clinically appropriate.

(Date) Validate service date falls within authorized date range. If service occurred outside the window, request retro authorization or correct claim date if entered incorrectly.

(Provider) Confirm rendering/billing provider matches authorized provider. If different, validate delegation rules or update authorization to reflect servicing provider.

(No match) No matching authorization found. Confirm if prior authorization was required. If yes, locate or create authorization; otherwise review policy for exception or denial handling.

Concisely, we have the following logic:

This is essentially information that would be shared in a work queue that would otherwise require manual processing, but it’s not telling us anything meaningful. We need to know what to do about each determination (what happens next?), and for that we need additional guidance.

Even when a particular claim and authorization “match”, there are numerous baseline validation / optimization opportunities that we can take advantage of.

For example:

And we can leverage agents to realize these select opportunities:

| Capability | Agent Action | Required Data Inputs |
| --- | --- | --- |
| Straight-through processing | Auto-close claim and mark as fully reconciled | claim record, authorization record, match flags |
| Post-pay audit sampling | Select claim for QA review based on thresholds | claim data, contract rules, audit policies, historical error rates |
| Gold-carding identification | Flag provider/member patterns for reduced future auth friction | historical claims/auth matches, provider performance metrics |

This is valuable information and it gives me an indication of what I can start building. But first let’s examine sample guidance (and agent recommendations) for the remaining 4 exception categories - i.e., what should happen next?

QUANTITY Mismatch → Clinical + normalization reconciliation

DATE Mismatch → Temporal validation + policy routing

PROVIDER Mismatch → Identity + network resolution

MISSING AUTHORIZATION Mismatch → Discovery + decisioning

I asked ChatGPT to identify the top 5-7 agents with associated complexity level to help me determine next steps.

| Agent | Primary Role | Complexity |
| --- | --- | --- |
| Exception Explanation Agent | Converts match results and exception codes into plain-language guidance for reviewers. | Low |
| Authorization Discovery Agent | Searches for potentially matching authorizations when the engine reports “missing authorization.” | Low–Medium |
| Quantity Reconciliation Agent | Normalizes units, checks billed quantity against authorized amount, and flags dosing/packaging issues. | Medium |
| Date / Window Validation Agent | Validates service date against auth effective/expiration dates, grace periods, and retro-auth eligibility. | Medium |
| Provider Resolution Agent | Resolves billing/rendering/provider-group mismatches and evaluates whether provider substitution is allowed. | Medium–High |
| Policy Exception Agent | Interprets benefit, PA, site-of-service, emergency, retro, and exception policies to recommend routing. | High |
| Resolution Orchestration Agent | Synthesizes outputs from other agents and recommends: auto-resolve, pend, deny, update claim, update auth, or escalate. | High |

And it recommended that I start with the following three:

  1. Exception Explanation Agent
    • Category: General Mismatch
    • Complexity: Low
    • Type: Light reasoning
    • Notes: Fastest to build; makes deterministic engine output usable.
  2. Authorization Discovery Agent
    • Category: Missing Authorization Mismatch
    • Complexity: Low–Medium
    • Type: Heuristic -> Reasoning
    • Notes: Strong demo value for “missing auth” scenarios without deep clinical complexity.
  3. Date / Window Validation Agent
    • Category: Date Mismatch
    • Complexity: Medium
    • Type: Deterministic
    • Notes: Easy to explain; rules are intuitive and auditable.

Followed by these final four (more complex):

| Agent | Notes | Complexity | Type |
| --- | --- | --- | --- |
| Quantity Reconciliation Agent | Adds clinical/pharmacy credibility, but requires drug/unit logic. | Medium | Primarily rule-based + structured logic |
| Provider Resolution Agent | Useful but depends on provider master/network data quality. | Medium–High | Heuristic -> Reasoning |
| Policy Exception Agent | Powerful, but policy interpretation is the hardest part. | High | True reasoning agent |
| Resolution Orchestration Agent | Best introduced after several specialist agents exist. | High | True reasoning agent (decision synthesis) |

From an agentic architecture perspective, it’s important to note that I don’t have 7 reasoning agents. Rather, these 7 agents are spread across 3 distinct agentic layers:

Deterministic Layer (existing system)

Analytical / Heuristic Layer

Reasoning Layer (true agents)

I learned that it’s critical that one does not treat everything as reasoning. Here’s why:

If you treat everything as reasoning:

  • You increase cost and latency
  • You lose determinism and auditability
  • You overcomplicate simple problems

If you separate them:

  • Deterministic layer = fast, cheap, reliable
  • Reasoning layer = targeted, high-value, explainable where needed
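As an example of logic that belongs in the deterministic layer rather than the reasoning layer, a date-window check can be written as a few plain comparisons. The sketch below is only an illustration: the function name, the outcome labels, and the `grace_days` parameter are assumptions, and real grace-period or retro-authorization rules would come from client configuration.

```python
from datetime import date, timedelta


def validate_service_date(
    service_date: date,
    auth_start: date,
    auth_end: date,
    grace_days: int = 0,  # hypothetical, client-configurable grace period
) -> str:
    """Classify a service date relative to an authorization window."""
    if auth_start <= service_date <= auth_end:
        return "WITHIN_WINDOW"
    # After expiration, but still inside the allowed grace period.
    if auth_end < service_date <= auth_end + timedelta(days=grace_days):
        return "WITHIN_GRACE_PERIOD"
    if service_date < auth_start:
        return "BEFORE_WINDOW"
    return "AFTER_WINDOW"


result = validate_service_date(
    date(2026, 3, 5), date(2026, 1, 1), date(2026, 2, 28), grace_days=7
)
print(result)  # WITHIN_GRACE_PERIOD
```

No model call is needed here, so the check costs nothing per claim and its result can be reproduced exactly during an audit.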

Given this information, Phase 1 will consist of the deterministic matching engine combined with the three recommended agents.

Specifically:

Step #1: Deterministic matching engine (Python) runs first.

It outputs:

Step #2: Exception Explanation Agent runs next.

It acts as the first agentic interpretation layer:

Step #3: Specialist agent is called only when relevant

| Exception Type | Next Agent |
| --- | --- |
| Match | No specialist agent |
| Missing Authorization | Authorization Discovery Agent |
| Date Mismatch | Date / Window Validation Agent |
| Quantity Mismatch | Defer or basic explanation only for phase 1 |
| Provider Mismatch | Defer or basic explanation only for phase 1 |

Represented visually:

Claim + Auth Data
   ↓
Deterministic Matching Engine
   ↓
Exception Explanation Agent
   ↓
IF exception_type = Missing Authorization
      → call Authorization Discovery Agent
ELSE IF exception_type = Date Mismatch
      → call Date / Window Validation Agent
ELSE
      → generate explanation only
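The routing step in the flow above can also be sketched as a small lookup. This is an illustrative sketch, not the project's code: the `route_exception` name and the exact exception-type strings are assumptions based on the routing table.

```python
from typing import Optional

# Phase 1 routing: exception type -> specialist agent name.
# Any exception type not listed here gets an explanation only.
SPECIALIST_BY_EXCEPTION = {
    "Missing Authorization": "Authorization Discovery Agent",
    "Date Mismatch": "Date / Window Validation Agent",
}


def route_exception(exception_type: str) -> Optional[str]:
    """Return the specialist agent to call, or None for explanation only."""
    return SPECIALIST_BY_EXCEPTION.get(exception_type)


for exception_type in ("Missing Authorization", "Date Mismatch", "Quantity Mismatch"):
    specialist = route_exception(exception_type)
    print(exception_type, "->", specialist or "generate explanation only")
```

Keeping the routing in a dictionary means that enabling a deferred agent later, such as one for quantity mismatches, is a one-line change.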

This works well because the Exception Explanation Agent (something that I demonstrated initially using a ChatGPT prompt) becomes a lightweight “router”. Thus, with this initial scope defined, phase 1 will show:

At this stage, it is important that we revisit the agentic solution build process so that I can accelerate implementation efforts. The full process is as follows:

  1. Problem Framing (Decision Surface First) - Ensure this is worth solving before designing anything.
  2. Process Decomposition - Identify where an agent can intervene meaningfully.
  3. Agentic Opportunity Identification - Define what the agent actually does
  4. Populate the Agentic Solution Template - Force completeness and clarity.
  5. Define Autonomy + Control Model - Avoid over/under-automation.
  6. Knowledge & Data Mapping - Ensure the agent can reason correctly.
  7. Agentic Architecture Design - Translate logic into system components.
  8. Workflow + Interaction Design (UX Layer) - Make the agent usable and trustworthy.
  9. Prototype / Demo Build (Fast Path) - Make it real quickly (this is where most value is unlocked).
  10. Evaluation Framework - Prove it works (beyond “looks good”).
  11. Risk, Governance, and Controls - Make it enterprise-safe.
  12. Pilot Design - Move from demo to real usage.
  13. Scale & Operationalization - Turn into a repeatable capability.

A simplified approach is as follows:

  1. Define decision + KPI
  2. Map 5-step workflow
  3. Pick 1–2 agent roles
  4. Fill template (light version)
  5. Sketch architecture
  6. Design 1 screen (workbench)
  7. Build demo (mock data)
  8. Script narrative

I’ve populated an Agentic Solution Snapshot template in another file, which contains full details about the Exception Explanation Agent I need to build. For this phase, I’ll use CrewAI, a relatively straightforward agent framework that I learned in Udemy’s “AI Engineer Agentic Track: The Complete Agent & MCP Course”.

A few core concepts that will be relevant here:

Agents and tasks can be created in code (setting the backstory, description, expected output, etc.), or you can define each in a YAML file as shown below:

researcher:
  role: >
    Senior Financial Researcher
  goal: >
    Research companies, news and potential
  backstory: >
    You're a seasoned financial researcher with a talent for finding the most relevant information.
  llm: openai/gpt-4o-mini

agent = Agent(config=self.agents_config['researcher'])

We’ll also have a crew.py where our configuration comes together (using Python decorators):

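from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, task

@CrewBase
class MyCrew():
    # Agent and task settings are read from the YAML config files.

    @agent
    def my_agent(self) -> Agent:
        return Agent(
            config=self.agents_config['my_agent'])

    @task
    def my_task(self) -> Task:
        return Task(
            config=self.tasks_config['my_task'])

    @crew
    def crew(self) -> Crew:
        # self.agents and self.tasks are collected from the decorated methods.
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential  # or Process.hierarchical
        )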

CrewAI also uses LiteLLM, a lightweight library, under the hood to interface with almost any LLM; API keys are set in an .env file. This means that we can use multiple LLMs if necessary.

For our Exception Explanation Agent, we’ll have a single sequential crew comprising a single agent that will complete a specific task. The agent will be associated with a single role: “summarizer”

summarizer:
  role: >
    An Authorization-to-Claim Exception Summarizer
  goal: >
    Convert {match-results} into plain-language guidance for reviewers.
  backstory: >
    You're a seasoned utilization management expert with a skill
    for concisely describing mismatched prior authorizations and medical
    claims found in {match-results}.

This agent will be responsible for a single task: “summarizing_task”

summarizing_task:
  description: >
    For each of the {match-results} generate narrative guidance.
    If the mismatch is caused by a missing authorization, then
    indicate that the "Authorization Discovery Agent" should be called and reference the claim ID.
    If the mismatch is caused by a date mismatch, then indicate
    that the "Date-Window Validation Agent" should be called and reference the claim ID.
    For all other mismatches, indicate that a manual review should be performed.
  expected_output: >
    A list of 1-2 sentence narratives for each mismatch.
    Formatted as markdown without '```'
  agent: summarizer
  output_file: output/summarization.md
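The `{match-results}` placeholder in the YAML above has to be filled with the deterministic engine's output when the crew runs. The sketch below shows one way that input could be prepared. The record fields and the `build_crew_inputs` helper are hypothetical, and the commented-out `kickoff` call assumes a crew class like the `MyCrew` example shown earlier.

```python
import json


def build_crew_inputs(match_results: list[dict]) -> dict[str, str]:
    """Serialize engine output into the value for the {match-results} placeholder."""
    # The dictionary key must equal the placeholder name used in the YAML.
    return {"match-results": json.dumps(match_results, indent=2)}


# Hypothetical engine output; real field names may differ.
sample_results = [
    {"claim_id": "CLM-9003", "exception_type": "Date Mismatch",
     "service_date": "2026-03-05"},
    {"claim_id": "CLM-9005", "exception_type": "Missing Authorization"},
]

inputs = build_crew_inputs(sample_results)
print(inputs["match-results"])

# Assumed usage inside a CrewAI project (not executed here):
# MyCrew().crew().kickoff(inputs=inputs)
```

Passing the results as JSON text keeps the claim IDs and exception types explicit, so the agent can reference them in each narrative.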

I made additional modifications to the project this evening, and it appears that this first agent is working:

Final Output:

- For Claim ID CLM-9002, there is a Quantity Mismatch. Please perform a manual review to assess the billed quantity of 3 vials against the authorized quantity.

- For Claim ID CLM-9003, there is a Date Mismatch. The Date-Window Validation Agent should be called to evaluate the service date of 2026-03-05 in relation to the authorization period.

- For Claim ID CLM-9004, there is a Provider Mismatch. A manual review should be conducted due to the discrepancy in the rendering provider for the service date 2026-03-01.

- For Claim ID CLM-9005, there is a Missing Authorization. Please contact the Authorization Discovery Agent and reference this claim ID for further investigation.

This is great progress. Tomorrow I’ll start building the Authorization Discovery Agent, which I believe will be implemented as a tool within the tools directory.

