OpenAI-Powered Documentation Quality Feedback System

Background & Challenge

In our clinical setting, hourly technicians—many with only a high school diploma—were expected to produce medical documentation that would withstand scrutiny from payers, physicians, and regulators. Turnover was high and provider time for coaching was limited. While we had Power BI dashboards to flag missing signatures or CPT mismatches, nothing assessed whether the narrative truly conveyed medical necessity, complexity, or a coherent story of the visit.

I shadowed technicians on the floor and ran quick interviews, uncovering that no one had ever built a tool to teach why notes failed audits—only that they failed. It became clear: we didn't just need another report. We needed scalable coaching.

How I Discovered the Real Pain Point

• Spent days shadowing frontline staff, observing their note-writing struggles.
• Conducted brief "What's the hardest part of writing notes?" interviews—all pointed to lack of feedback on narrative quality.
• Realized the opportunity: use an LLM not for transcription or billing checks, but to coach users on crafting a clear, compliant clinical story.

Technical Highlights

Secure ETL Pipeline

• Python + SQL Server extracted and de-identified notes (redacting names, dates, identifiers) under HIPAA guardrails.

Prompt Engineering for BCBS

• Rapid sprints with auditors and BCBAs to craft a GPT-4 prompt that evaluates medical necessity, CPT justification, narrative clarity, and treatment rationale.

Role-Based Feedback

• Technicians received guidance on structure, intervention details, and credentials; providers got recommendations on clinical rationale, goal alignment, and regulatory consistency.

Power BI Dashboard

• Visualized audit-readiness metrics by author, location, and time—enabling targeted coaching.

Azure Automation

• Nightly batch processing via Azure Functions, with Teams notifications through Microsoft Graph; alerts triggered when systemic risks emerged.

Design Thinking & Iteration

I led three rapid prompt-tuning sprints in one week with cross-functional partners (technicians, auditors, BCBAs):

Draft Prompt → tested on sample notes → gathered user reactions
Tone Adjust → balanced critique with encouragement → measured "feedback acknowledged" rates
Final Prompt → pilot rollout → monitored adoption and morale

This cycle ensured the LLM's voice felt supportive, not punitive.

Data-Driven Refinement

Beyond tracking audit exceptions, I monitored:

• Teams Message Opens: +40% after shifting to a balanced tone
• Revision Turnaround Time: –30% once feedback highlighted both strengths and gaps

These KPIs guided each iteration toward both compliance and learning.

Before vs. After: Audit Enforcer → Coaching Engine

🔍 Old: Audit Enforcer	🌱 New: Skill Developer
Flagged only what was missing	Highlighted strengths and gaps
Delivered blunt, clinical feedback	Delivered empathetic, balanced feedback
Focused on passing audits	Focused on building documentation skill
Seen as punitive by technicians	Seen as a collaborative mentor
Treated the model like a grader	Treated the model like a coach
Designed for outcomes	Designed for growth

Human Touch

Technicians said, "It's really helpful to see why good notes matter." Providers noted the AI surfaced details they sometimes overlooked—boosting both efficiency and care continuity. This validated that the tool's true power lies in human–AI collaboration.

Impact & Results

Metric	Result	Timeline
Audit Exceptions	50% reduction	Post-implementation
Compliance Review Time	20+ hours/week saved	Automated processing
Teams Message Engagement	+40% after tone adjustment	Balanced feedback approach
Revision Turnaround	-30% improvement	Strengths-focused feedback

• Improved technician confidence and skill development
• Modular, payer-agnostic architecture ready for Aetna, UnitedHealthcare, and beyond

Novelty & Transferability

This pattern—identify a manual compliance pain point, reframe it as a coaching challenge, then build an LLM-driven mentor—can be applied across healthcare (e.g., imaging protocols, pharmacy reviews, discharge summaries) and beyond. Wherever passive checks exist, there's opportunity to teach and uplift.

Takeaway

I refuse to accept "that's just the way it is." By combining empathy, rapid iteration, and LLM expertise, I transformed a routine compliance complaint into a scalable learning engine—demonstrating that the right technology, aligned with human needs, can continuously improve complex workflows.