For Doctors

How Doctora Generates Clinical Notes

Understanding the AI pipeline from recording to structured clinical data

6 min read · Updated April 7, 2026

When you finish an encounter and Doctora produces a populated exam, it can feel like magic. It is not. This article walks through exactly what happens between your narration and the structured data that appears in the review screen, so you can understand the system's strengths, its limits, and how to get the best results.

The pipeline at a glance

Every encounter flows through five stages:

  1. Recording -- Doctora captures audio from the exam room.
  2. Transcription -- A medical speech-to-text model converts audio to text in real time.
  3. AI extraction -- A large language model reads the transcript and fills a structured medical schema.
  4. Review -- You see the populated exam in Doctora's editor and make corrections.
  5. EHR sync -- Approved data is written into your EHR system.

The AI extraction step is where the heavy lifting happens, and where understanding the mechanics pays off.
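The five stages can be pictured as a simple chain of functions. The sketch below is a toy illustration only; every function is a placeholder, not Doctora's actual internals.

```python
# Toy end-to-end sketch of the five stages. Every function here is an
# illustrative placeholder, not Doctora's actual code.
def transcribe(audio: str) -> str:
    # Stage 2: stand-in for the real-time medical speech-to-text model
    return audio

def extract(transcript: str) -> dict:
    # Stage 3: stand-in for the LLM filling the structured schema
    return {"cornea_od": "clear"} if "cornea clear" in transcript.lower() else {}

def run_encounter(audio: str) -> dict:
    transcript = transcribe(audio)  # stage 2: transcription
    exam = extract(transcript)      # stage 3: AI extraction
    return exam                     # handed to review (4), then EHR sync (5)

print(run_encounter("Cornea clear OD"))  # {'cornea_od': 'clear'}
```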

What the AI actually does

Doctora does not generate free-text notes. Instead, the AI receives your transcript and fills a structured JSON schema with hundreds of defined fields--visual acuity measurements, anterior segment findings per eye, HPI entries, tonometry readings, assessment flags, and so on. Every field has a type (number, enum, text), constraints, and a description that tells the model exactly what belongs there.

Think of it as a very detailed form. The AI reads your narration and fills in each box. If you did not mention a finding, the field stays empty. If you said "cornea clear OU," the model selects the "clear" option for both OD and OS cornea fields.

This schema-driven approach is why Doctora's output is consistent and structured enough to sync directly into your EHR, rather than producing a narrative blob that still needs manual entry.
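To make the "detailed form" idea concrete, here is a minimal sketch of what one schema field and its fill step might look like. The field name, enum values, and matching logic are illustrative assumptions, not Doctora's real schema.

```python
# A minimal sketch of one schema field and a toy "form filling" step.
# Field names, enum values, and matching logic are illustrative assumptions.
CORNEA_FIELD = {
    "type": "string",
    "enum": ["clear", "trace SPK", "edema", "scar"],
    "description": "Cornea finding for this eye. Leave empty if not narrated.",
}

def fill_cornea(narration: str) -> dict:
    exam = {"cornea_od": None, "cornea_os": None}  # empty form boxes
    text = narration.lower()
    if "cornea clear" in text:
        if " ou" in text:      # "OU" = both eyes
            exam["cornea_od"] = "clear"
            exam["cornea_os"] = "clear"
        elif " od" in text:    # "OD" = right eye only
            exam["cornea_od"] = "clear"
    return exam

print(fill_cornea("cornea clear OU"))
# {'cornea_od': 'clear', 'cornea_os': 'clear'}
```

Unmentioned fields stay `None` (empty), exactly as the article describes: no narration, no entry.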

How custom instructions work

When you set a custom instruction on a field--say, you tell Doctora to "always use SPK grade 2 when trace staining is mentioned"--that instruction is injected directly into the schema definition the AI follows. It becomes part of the field's description, tagged as a doctor preference. The model treats it as a high-priority constraint for that specific field.

Global instructions (the ones that apply to all sections) are included in the system prompt, but field-specific instructions take priority. This means you can set a general preference and override it for particular fields when needed.
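The injection step can be sketched as appending the preference to the field's description before the model sees it. The tag wording and function below are assumptions for illustration, not Doctora's exact format.

```python
# Sketch of injecting a field-level custom instruction into the schema
# description the model reads. The tag format is an assumption.
def apply_custom_instruction(field_schema: dict, instruction: str) -> dict:
    """Append a doctor preference to the field's description."""
    merged = dict(field_schema)  # leave the base schema untouched
    merged["description"] = (
        f"{field_schema['description']} "
        f"[DOCTOR PREFERENCE - high priority]: {instruction}"
    )
    return merged

spk = {"type": "string", "description": "SPK grade for this eye."}
custom = apply_custom_instruction(
    spk, "Always use SPK grade 2 when trace staining is mentioned."
)
print(custom["description"])
```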

Custom instructions are the single most effective way to shape your output. More on this below.

Normal values and "all normal" detection

When you say "anterior segment unremarkable" or "fundus healthy OU," Doctora detects these as "all normal" signals. The AI sets a flag, and a post-processing step fills every anterior or posterior structure with its clinically appropriate normal value--cornea "clear", lens "clear", optic nerve "clear/pink/distinct 360", macula "clear", periphery "flat, no holes or breaks", and so on.

If you say "anterior normal except trace SPK OD," Doctora sets the all-normal flag and records the SPK finding. Post-processing fills every other structure with normals while preserving your specific finding. This means you do not have to narrate every normal structure individually.
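The behavior described above can be sketched as a merge that fills defaults without overwriting explicit findings. The normal values mirror those listed above; the field names and flag mechanics are assumptions.

```python
# Sketch of the "all normal" post-processing pass. Normal values mirror the
# article's examples; field names and flag mechanics are assumptions.
NORMALS = {
    "cornea": "clear",
    "lens": "clear",
    "optic_nerve": "clear/pink/distinct 360",
    "macula": "clear",
    "periphery": "flat, no holes or breaks",
}

def fill_normals(extracted: dict, all_normal: bool) -> dict:
    """Fill unmentioned structures with normals, preserving explicit findings."""
    if not all_normal:
        return extracted
    filled = dict(NORMALS)
    filled.update({k: v for k, v in extracted.items() if v is not None})
    return filled

# "anterior normal except trace SPK OD" -> flag set, one explicit finding
result = fill_normals({"cornea": "trace SPK"}, all_normal=True)
print(result["cornea"], "/", result["lens"])  # trace SPK / clear
```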

What the AI is good at

  • Structured clinical findings -- Slit lamp findings, posterior segment observations, visual acuity, tonometry, refraction values. The schema has precise fields for these, and the model maps narration to them reliably.
  • Laterality -- OD, OS, OU. The model tracks which eye you are discussing and maps findings to the correct side. It handles "right eye... left eye..." transitions and explicit laterality callouts well.
  • Medical history extraction -- HPI, medications, allergies, family history, review of systems. The model distinguishes patient-reported history from today's exam findings.
  • Distinguishing imaging from physical exam -- When you transition from slit lamp to reviewing OCT scans or photos, the AI tracks that context shift and routes findings to the correct section (special testing results vs. posterior segment).
  • Contact lens parameters -- Brand names, base curves, sphere/cylinder/axis, fit assessments. A dedicated specialist model handles contact lens extraction separately for higher accuracy.

What the AI can struggle with

  • Very specific formatting preferences -- If you want pressures written as "14/16" rather than separate OD/OS fields, the schema may not accommodate that exact format. Custom instructions help, but the output must fit the structured schema.
  • Ambiguous narration -- "Looks about the same" without context. The model does not have access to prior exams, so relative statements without a baseline are hard to interpret.
  • Uncommon abbreviations -- Standard ophthalmic shorthand (SPK, EBMD, NS, DFE) is well understood. Practice-specific or highly regional abbreviations may not be recognized.
  • Implicit clinical reasoning -- If you think it but do not say it, the model cannot extract it. The AI only works from what is in the transcript.
  • Long pauses and off-topic conversation -- Extended discussions about non-clinical topics can occasionally confuse context boundaries, though the model generally handles casual conversation well.

ICD-10 code generation

ICD-10 coding is a separate AI step that runs after the main extraction. The ICD model receives the structured exam data--not just the raw transcript--along with a curated catalog of optometry-relevant ICD-10 families. It selects diagnosis families and resolves axis values (laterality, severity, stage) to produce specific codes.

This matters because ICD codes are derived from your documented findings, not from keywords in the audio. If you narrate "dry eye" but the exam findings show normal tear film and cornea, the disconnect will surface. The ICD model also generates care plan suggestions per diagnosis from a configurable library.
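Axis resolution can be sketched as appending the resolved axis character to a family stem. The H04.12- (dry eye syndrome) family really does encode laterality in its final character; the function and lookup table themselves are illustrative assumptions.

```python
# Minimal sketch of resolving a laterality axis into a specific ICD-10 code.
# H04.12- (dry eye syndrome) encodes laterality in the final character;
# the function and mapping below are illustrative assumptions.
LATERALITY_AXIS = {"right": "1", "left": "2", "bilateral": "3", "unspecified": "9"}

def resolve_icd(family: str, laterality: str) -> str:
    return family + LATERALITY_AXIS[laterality]

print(resolve_icd("H04.12", "bilateral"))  # H04.123
```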

CPT code selection

CPT codes are determined algorithmically, not by AI. The system evaluates:

  • New vs. established patient -- From the assessment flags.
  • Exam completeness -- Whether anterior and posterior segments were both documented (comprehensive vs. intermediate).
  • Medical necessity -- Whether the encounter is routine, borderline, or medical, based on documented findings and clinical indicators.
  • Special testing -- Each documented test (OCT, fundus photography, visual field, ERG) adds its corresponding CPT code, with screening vs. diagnostic distinction.
  • Refraction -- Added when best-corrected visual acuity is documented.

Because this is rule-based rather than AI-generated, CPT codes are deterministic--same inputs always produce the same codes.
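A deterministic rule of this kind can be sketched with the standard ophthalmological exam codes (92002/92004 for new patients, 92012/92014 for established); the decision logic below is a simplified assumption, not Doctora's full rule set.

```python
# Deterministic rule sketch using the standard ophthalmological exam codes
# (92002/92004 new, 92012/92014 established). The decision logic is a
# simplified assumption, not Doctora's full rule set.
def exam_cpt(new_patient: bool, comprehensive: bool) -> str:
    if new_patient:
        return "92004" if comprehensive else "92002"
    return "92014" if comprehensive else "92012"

# Same inputs always yield the same code -- no model involved.
print(exam_cpt(new_patient=True, comprehensive=True))    # 92004
print(exam_cpt(new_patient=False, comprehensive=False))  # 92012
```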

Why output can vary slightly

The main AI extraction runs at temperature 0 (no randomness), so identical inputs should produce identical outputs. In practice, minor variation can occur because:

  • Transcription itself can vary slightly between recordings of the same narration, due to background noise, speaking pace, or enunciation.
  • Model updates from OpenAI can subtly shift behavior, though the structured schema constrains output heavily.
  • Different template configurations change which fields the AI is asked to fill, which can affect how it interprets ambiguous narration.

These variations are typically minor--a word choice in a comment field, not a clinical finding changing sides.

How to get better output

  1. Use custom instructions. This is the highest-leverage tool. If Doctora consistently gets something wrong for your workflow, a field-level instruction fixes it permanently. "Always grade NS on a 1-4 scale." "Use 'trace' instead of 'mild' for minimal findings." These are injected directly into the AI's schema.

  2. Be explicit about laterality. Say "right eye" or "OD" before describing findings. The model handles transitions well, but explicit callouts eliminate ambiguity.

  3. Name your findings. "Cornea clear" is better than "looks good." "Two-plus nuclear sclerosis" is better than "some cataract changes." The more precisely you describe a finding, the more precisely the AI can map it to the structured field.

  4. Distinguish imaging from exam. When you transition to reviewing scans, say so. "Looking at the OCT..." or "On fundus photos..." helps the model route findings to the correct section.

  5. State what you mean, not what you assume. The AI has no access to prior visits. If a patient is established and you are monitoring glaucoma, say "glaucoma stable, IOP at target, continuing timolol" rather than just checking pressures silently.

  6. Speak at a natural pace. You do not need to slow down or use special phrasing. Narrate the way you normally would during a teaching exam--describing what you see as you see it. The transcription model is tuned for medical speech.

The bottom line: Doctora is a structured extraction system, not a creative writer. The more structured and explicit your narration, the more accurately the AI fills in the form. And when it does not get something right, custom instructions let you teach it your preferences permanently.