Autonomy you can put your name on.
Every action an agent takes carries its reasoning, its confidence, and the evidence behind it — and a person reviews anything that can't be undone. So when a surveyor, a client, or your board asks how a file was decided, the answer is already assembled.
Traceable by design
Every decision shows its work.
Agent decision
every action · inspectableConfidence
96%
- ActionVerify RN license · TX board
- Source evidenceboard result · captured 9:41 linked
- ReasoningActive, unencumbered, matches file
- Policy gatepassed · read-only lookup ok
- Supervisor reviewAI supervisor · consistent ok
One-way doors
The irreversible gets heightened scrutiny.
Reversible actions proceed. Anything that can't be undone or changes external state escalates: the primary agent is reviewed by an AI supervisor with a lower threshold for flagging, and a human gives the final approval.
Reversible
Draft a message, save a note, read a portal. Easily undone — the agent proceeds.
proceeds automatically
Irreversible or external
Submit to a facility, send to a board. Heightened scrutiny.
Human oversight
Built for people to review fast and intervene.
A purpose-built review surface presents each item with its evidence and a recommendation, so a decision takes seconds. Approve, correct, redirect — or pause the whole operation and hand it back to a person.
Review queue
3 waiting- approved
Flagged background check
evidence + recommendation attached
Low-confidence document read
handwritten card · 61%
License discrepancy
name mismatch across sources
Injection & abuse defense
We assume hostile input will show up.
The agents read messages, documents, and external sources — so content is treated as data, never instruction, and the blast radius of anything misunderstood is kept small.
Direct prompt injection
Instruction hierarchy, relevance classification, and tool mediation stop a message from ever becoming authority.
Indirect injection
Documents and external text are labeled untrusted and quoted as evidence — never followed as instructions.
Data extraction
Role-bound retrieval, tenant isolation, and redaction keep data inside the user's authority.
Excessive agency
Agents get only the granular tools a task needs; high-impact and irreversible writes require extra checks.
Knowledge pollution
Uploaded knowledge passes source tracking, versioning, and review before it can become agent context.
Unbounded consumption
Request budgets, rate limits, and anomaly detection protect spend and availability.
And people, too
A human team reviews the agents' work — sampling live operations, catching drift, and keeping the system compliant.
We built this as an applied research system: useful autonomy, constrained by software boundaries, evals, and human review.
Questions
How is safety built into every LLM call?
Every meaningful output carries a confidence score, a reasoning summary, the source evidence it drew on, the policy that gated it, and any supervisor or human review — all written to a durable, reproducible audit trail. Nothing the agents do is a black box a person can't inspect after the fact.
How do you handle irreversible actions differently?
We distinguish two-way doors (reversible — draft a message, read a portal) from one-way doors (irreversible or externally visible — submit to a facility, send to a board). One-way doors get heightened scrutiny: the primary agent is reviewed by an AI supervisor with a lower threshold for flagging, and a human gives final approval before anything executes.
What kinds of decisions require human review?
By default: flagged background checks, low-confidence document reads, discrepancies between sources, and irreversible or externally visible actions. Beyond that, you can place custom checkpoints anywhere — by facility, requirement type, or risk level. A human team also samples live work to catch drift and keep the system compliant.
How do you defend against prompt injection?
Content the agents read — messages, documents, external pages — is treated as data, never instruction. An instruction hierarchy, relevance classification, tenant isolation, and complete tool mediation keep untrusted input from becoming authority, and least-privilege tools keep the blast radius small. See How we built it for the full model.
Does review create a bottleneck?
No. Items arrive pre-assembled with evidence and a recommendation, so most decisions take seconds — and the rest of the file keeps moving in parallel while an item waits.
See it work a real file.
Thirty minutes, one placement, worked live — start to submit-ready.