
PII Redaction for AI Agents: Why It Can't Be an Afterthought

The PII problem in AI memory is not what you think it is

Most engineering teams building AI agents understand that they shouldn't store raw PII. Ask any developer and they'll say: of course we're not storing social security numbers in the vector store.

But PII leaks through AI memory systems in ways that are less obvious — and the developer's mental model of "just don't store the sensitive parts" is not sufficient.

This post explains how PII actually leaks, why afterthought redaction approaches fail, and what architectural PII redaction for AI agents looks like.


How PII enters AI memory without anyone intending it to

Implicit extraction

AI agents don't just store what you explicitly tell them to store. They extract facts. When a language model processes a conversation and derives memories from it, the extracted facts often contain PII the agent never explicitly received.

Example: a user says "my appointment is next Tuesday at the clinic on Maple Street." The agent may extract and store: user_name: Sarah Chen, medical_appointment: 2026-04-29, location: Maple Street Clinic. The user never stated their name — the model inferred it from earlier context. The memory now contains PHI the developer didn't intentionally write.
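The gap between what the user said and what gets stored can be made concrete. This is a toy sketch with hypothetical field names, not Mnemonic's extraction format:

```javascript
// What the user actually said in this turn:
const utterance = "my appointment is next Tuesday at the clinic on Maple Street";

// What an extraction step may derive, pulling the name and a resolved
// date from earlier context (hypothetical shape, for illustration):
const extracted = {
  user_name: "Sarah Chen",           // inferred from prior turns, never stated here
  medical_appointment: "2026-04-29", // resolved from "next Tuesday"
  location: "Maple Street Clinic",
};

// A filter that only screens what the user typed misses all of it:
// none of the stored values appear verbatim in the utterance.
const missed = Object.values(extracted).filter(v => !utterance.includes(v));
```

Every extracted value fails a verbatim-substring check against the input, which is why "scan the user's message" is not the same as "scan the memory write."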

Contextual embedding

Vector embeddings encode semantic meaning. A memory stored as "the patient prefers morning appointments" embeds differently when it was derived from a conversation that included the patient's name, diagnosis, and insurance information. The embedding itself doesn't contain raw PII — but depending on how retrieval is implemented, that PII-adjacent context can contaminate the retrieved result.

Summarization artifacts

Long-context summarization is a common memory strategy: compress a long conversation into a summary, store the summary. The problem is that LLM summarization is nondeterministic and can preserve PII that was incidental rather than salient. "User discussed their daughter's college admission and mentioned she attends Roosevelt High School in Austin" is a privacy leak, even if the developer only meant to capture a preference.

Third-party data passthrough

When agents access external tools — CRMs, EMRs, financial databases — the data they retrieve gets incorporated into context. That context shapes what gets stored as memory. A healthcare agent that queries a patient record to answer a question may inadvertently store a memory that contains PHI derived from the record, even if the developer's memory write logic only intended to store a preference or intent.


Why afterthought PII redaction fails

The intuitive solution is to add a redaction layer: before writing to the memory store, scan for PII and remove it.

This is better than nothing. It is not sufficient.

The scanning problem: Named entity recognition (NER) and regex-based PII detection have meaningful false negative rates. They miss non-standard formats, contextual PII, and domain-specific identifiers: a medical record number that looks like a random string, a financial account reference that doesn't match common patterns, a name in a non-Latin script.
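A toy illustration of the false-negative problem: a typical SSN regex catches the canonical format and silently misses everything else (the patterns and identifiers below are made up for illustration).

```javascript
// A common SSN pattern: three digits, hyphen, two digits, hyphen, four digits.
const SSN_RE = /\b\d{3}-\d{2}-\d{4}\b/;

const caught       = SSN_RE.test("SSN: 123-45-6789"); // canonical format: detected
const missedSpaces = SSN_RE.test("SSN: 123 45 6789"); // same number, space-separated
const missedPlain  = SSN_RE.test("SSN: 123456789");   // same number, no separators
const missedMrn    = SSN_RE.test("MRN: A83ZK2210");   // domain-specific identifier
```

Only the first test returns `true`; the other three are false negatives, and nothing in the pipeline signals that they were missed.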

The granularity problem: Redaction that removes identifiers but preserves context can still be identifying. "The patient with the rare genetic condition who lives in the small town near the manufacturing plant" is not anonymized just because the name was stripped.

The timing problem: If redaction happens after the AI model has already processed the data, the window for a leak is open. An agent that crashes between inference and redaction may write unredacted data. A logging system that captures intermediate states may retain PII that was later removed from the primary store.
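The ordering difference can be sketched in a few lines. The helper names here are hypothetical and the redactor is a toy regex, but the contrast in what reaches storage is the point:

```javascript
// Toy redactor and in-memory store, for illustration only.
const redact = async (fact) =>
  fact.replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[REDACTED]");

function makeStore() {
  const writes = [];
  return { write: async (fact) => writes.push(fact), writes };
}

// Fragile ordering: the raw fact reaches storage first; redaction is a
// best-effort cleanup step that a crash between the two writes can skip.
async function rememberThenRedact(store, fact) {
  await store.write(fact); // raw PII is persisted here
  // ...async cleanup would run later, if the process survives
}

// Robust ordering: redaction runs synchronously in the write path, so
// only the redacted version can ever persist.
async function redactThenRemember(store, fact) {
  await store.write(await redact(fact));
}
```

In the first version there is always a window in which the unredacted fact exists in storage; in the second, no such state is reachable.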

The HIPAA AI agents problem specifically: HIPAA requires that PHI protections apply to all forms of PHI, not just the formats you anticipated. An afterthought redaction system that covers the cases you thought of is not a HIPAA-compliant data protection strategy.


What architectural PII redaction looks like

The right approach moves PII protection from an application-layer concern to an infrastructure-layer invariant.

Scan before storage, not after

In Mnemonic's architecture, every memory write is intercepted at the API layer before it reaches storage. The redaction pipeline runs synchronously on the memory content — not asynchronously, not on a best-effort basis.

// Developer writes a memory
await mnemonic.remember({
  agent: "intake-bot",
  fact: "Patient prefers morning appointments. DOB: 1978-04-15.",
  retention: "365d",
  access: ["clinical-ops"]
});

// Before any storage occurs:
// 1. PII scanner runs (DOB detected)
// 2. DOB is redacted: "Patient prefers morning appointments. DOB: [REDACTED]."
// 3. Redaction event is logged with: original field type, redaction timestamp, agent ID
// 4. Redacted version is stored
// 5. Original is never written to persistent storage

The developer doesn't manage the redaction pipeline. It runs for every write, on every memory, with no opt-out path for individual entries.
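The no-opt-out property falls out of where the pipeline sits. This is a sketch of an API-layer interceptor with hypothetical helper names, not Mnemonic's internals: callers only ever hold the wrapped API, never a handle to the raw store.

```javascript
// Every remember() call is forced through scan -> redact -> audit -> store.
function makeMemoryApi(store, scanPii) {
  return {
    async remember(entry) {
      const findings = scanPii(entry.fact);            // 1. scan before storage
      let fact = entry.fact;
      for (const f of findings) {
        fact = fact.replaceAll(f.value, "[REDACTED]"); // 2. redact synchronously
      }
      await store.writeAudit(findings);                // 3. log redaction events
      await store.write({ ...entry, fact });           // 4. persist redacted copy
      // 5. the original entry.fact is never handed to store.write
    },
  };
}
```

Because `makeMemoryApi` is the only constructor the application sees, there is no code path that writes an unredacted memory.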

Typed PII detection, not just regex

Mnemonic's redaction engine detects PII by type: names, emails, phone numbers, SSNs, dates of birth, account numbers, addresses, medical record numbers, and custom patterns configurable per tenant.

This matters because the appropriate handling of different PII types differs. An email address in a memory about user preferences is different from an email address in a memory derived from a clinical document. Typed detection enables policy-based handling — not just blanket removal.

Redaction events are first-class audit objects

Every redaction creates an immutable audit record: what type of PII was detected, in which memory, from which agent, at what time, and what action was taken. This audit trail is separate from the memory store itself.

When an auditor asks "how many times did your AI agents encounter SSNs last quarter, and what happened each time?" — that is a queryable answer, not an educated guess.
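A minimal sketch of what "redaction events as first-class audit objects" means in practice. The record shape and helper names are illustrative, not Mnemonic's actual schema:

```javascript
// Append-only in-process log; a real system would use append-only storage.
const auditLog = [];

function logRedaction({ piiType, memoryId, agentId, action, at }) {
  // Freeze each record so it cannot be mutated after the fact.
  auditLog.push(Object.freeze({ piiType, memoryId, agentId, action, at }));
}

// The auditor's question becomes a query over the log, not a guess:
// "how many SSN encounters between these dates, and what happened?"
function encounters(log, piiType, fromIso, toIso) {
  return log.filter(e => e.piiType === piiType && e.at >= fromIso && e.at <= toIso);
}
```

Because timestamps are ISO 8601 strings, the date-range filter is a plain lexicographic comparison.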

Tenant-configurable detection rules

Different regulated environments have different PII definitions. A healthcare tenant needs PHI detection. A financial services tenant needs financial identifier detection. A legal firm needs privilege-related pattern detection.

Mnemonic's detection rules are configurable at the tenant level, within the platform's baseline ruleset. The baseline always runs. Tenants can add custom patterns for domain-specific identifiers.
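The "baseline always runs" guarantee can be expressed as a merge that only ever appends. Rule names and patterns here are illustrative:

```javascript
// Fixed baseline: frozen so it cannot be edited or replaced.
const BASELINE_RULES = Object.freeze([
  { type: "ssn",   re: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: "email", re: /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g },
]);

function rulesForTenant(customRules = []) {
  // Tenants can only append domain-specific patterns, never remove
  // or override the baseline.
  return [...BASELINE_RULES, ...customRules];
}

// A healthcare tenant adds a medical-record-number pattern on top:
const healthcareRules = rulesForTenant([
  { type: "mrn", re: /\bMRN-\d{6}\b/g },
]);
```

Putting the baseline first and exposing only the append path means no tenant configuration can produce a rule set weaker than the platform's floor.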


The AI data protection architecture that works

Wrong approach: Build AI agent → add memory → add redaction as a cleanup step → discover gaps in production.

Right approach: Use a memory infrastructure layer where redaction is the pipeline, not a plugin. Governance is not something you add to AI memory. It is the condition under which AI memory operates.

This is especially true for:

  • HIPAA AI agents — PHI protection must be demonstrable, not asserted. You need audit logs showing that redaction ran, when it ran, and what it caught.
  • GDPR-compliant AI — Data minimization is a GDPR principle. Storing more PII than necessary, even if encrypted, is a compliance risk. Redaction at write time enforces minimization.
  • SOC 2 Type II — Auditors want to see that access controls and data protections are enforced systematically, not by developer convention.

What Mnemonic provides

Mnemonic is designed for regulated environments. PII redaction is not a feature — it is an architectural primitive that every memory operation passes through.

  • Pre-storage redaction — runs before any write to persistent storage
  • Typed detection — 15+ PII categories out of the box
  • Tenant-configurable rules — extend the baseline for domain-specific identifiers
  • Immutable redaction audit log — queryable record of every redaction event
  • Zero opt-out — redaction cannot be bypassed at the API level

The result: your AI agents can operate in regulated environments without requiring a separate PII-scrubbing pass, a manual review process, or the assumption that the LLM will "just not store" the sensitive parts.

