feat(logs): redact PII from workflow logs via configurable rules by TheodoreSpeaks · Pull Request #5136 · simstudioai/sim

TheodoreSpeaks · 2026-06-19T02:35:55Z

Summary

Adds enterprise PII redaction for workflow execution logs, configured under Data Retention as org-scoped rules — each rule selects which PII entity types to mask and which workspaces it applies to (empty workspaces = all; empty entity types = redact nothing).
Reuses the guardrails Microsoft Presidio engine in mask mode, applied at the single log-persist choke point (completeWorkflowExecution), covering both the inline and externalized write paths.
Redaction is byte-budgeted/chunked (one batched subprocess per persist, not per block), structure-preserving (masks string leaves, rebuilds via structuredClone), and fail-safe (scrubs rather than leaks on error).
Adds a check-digit-validated VIN recognizer so it won't false-match arbitrary 17-char codes.
Adds per-workspace data-retention-hours overrides (resolved workspace ?? org ?? plan default) via a new nullable workspace.data_retention_settings column.
UI mirrors Access Control: a rules list + ChipModal (grouped checkbox grid for entity types, workspace multiselect), persisting on save with an unsaved-changes guard.

Type of Change

New feature

Testing

Unit tests for the redaction transform (recursion, batched substitution, structure preservation, fail-safe scrub, oversized-string skip) — passing.
Verified Presidio masking end-to-end locally (email/phone/credit-card/VIN), incl. VIN check-digit precision (valid VINs masked, random 17-char tokens left alone).
bun run lint, bun run check:api-validation:strict, and bun run check:migrations origin/staging all pass. Migration is a single additive nullable column (expand-phase, backward-compatible).

Checklist

Code follows project style guidelines
Self-reviewed my changes
Tests added/updated and passing
No new warnings introduced
I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

Enterprise PII redaction for workflow execution logs, configured under Data Retention as org-scoped rules (each rule picks entity types + which workspaces it applies to). Reuses the guardrails Presidio engine in mask mode at the log-persist choke point, with a check-digit-validated VIN recognizer. Also adds per-workspace data-retention-hours overrides.

vercel · 2026-06-19T02:36:01Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
docs	Skipped		Jun 19, 2026 3:05am

cursor · 2026-06-19T02:36:05Z

PR Summary

High Risk
Changes how execution logs are stored (PII masking) and how retention is resolved for cleanup; failures scrub content but Presidio/timeouts could affect log completion latency or content usability.

Overview
Adds enterprise PII redaction for workflow execution logs: org-level rules (Presidio entity types + workspace scope) are stored on piiRedaction, exposed via org data-retention GET/PUT, and applied at completeWorkflowExecution before inline or externalized storage. Masking is batched (maskPIIBatch / Python mask_batch), structure-preserving, and fail-safe ([REDACTION_FAILED] on errors or size limits).

Extends data retention with per-workspace hour overrides (workspace.data_retention_settings migration), a new /api/workspaces/[id]/data-retention API, shared resolveEffectiveRetentionHours, and cleanup dispatch that resolves workspace ?? org ?? plan instead of org-only.

Data Retention UI gains a PII rules section (list + modal with searchable entity grid and workspace targeting); retention hours and rules save separately. Guardrails gain a client-safe pii-entities catalog, a check-digit VIN recognizer, and refactored Python/TS subprocess helpers for batch masking.

^{Reviewed by Cursor Bugbot for commit 1635ce1. Bugbot is set up for automated code reviews on this repo. Configure here.}

TheodoreSpeaks · 2026-06-19T02:36:11Z

@greptile review

…rkflow-log # Conflicts: # packages/db/migrations/meta/0241_snapshot.json # packages/db/migrations/meta/_journal.json # scripts/check-api-validation-contracts.ts

cursor · 2026-06-19T02:41:20Z

+): number | null {
+  const { workspaceSettings, orgSettings, key, fallback } = params
+  return workspaceSettings?.[key] ?? orgSettings?.[key] ?? fallback
+}


Null retention treated as unset

High Severity

resolveEffectiveRetentionHours chains values with ??, so an explicit null retention hour (Forever in the UI) is skipped and replaced by the org value or enterprise plan fallback. Enterprise cleanup then schedules deletion when it previously skipped, and workspace overrides to Forever never win over org numeric defaults.

Additional Locations (2)

apps/sim/lib/billing/cleanup-dispatcher.ts#L271-L278

apps/sim/app/api/workspaces/[id]/data-retention/route.ts#L37-L61

^{Reviewed by Cursor Bugbot for commit d9a4dd4. Configure here.}

cursor · 2026-06-19T02:41:20Z

+    // Settings are only writable by enterprise orgs, but re-verify at read time
+    // (e.g. a plan downgrade) before doing the work.
+    if (!(await isWorkspaceOnEnterprisePlan(workspaceId))) return payload
+


Enterprise check skips active redaction

Medium Severity

When org PII redaction rules are active, applyPiiRedaction still returns the raw payload if isWorkspaceOnEnterprisePlan is false. That helper returns false on lookup errors, so a transient billing/workspace failure bypasses masking and can persist unredacted PII despite enabled rules.

^{Reviewed by Cursor Bugbot for commit d9a4dd4. Configure here.}

greptile-apps · 2026-06-19T02:44:20Z

Greptile Summary

This PR adds enterprise PII redaction for workflow execution logs, applying Microsoft Presidio in mask mode at the completeWorkflowExecution persist choke point, plus per-workspace retention-hours overrides and a settings UI mirroring the existing Access Control panel.

PII redaction engine: two-pass (collect → batch-mask) transform in pii-redaction.ts with a fail-safe scrub on Presidio failure; the Python mask_batch endpoint reuses one spaCy engine pair per subprocess invocation and adds a check-digit-validated VIN recognizer.
Retention resolution: resolveEffectivePiiRedaction unions entity types across applicable org-level rules per workspace; resolveEffectiveRetentionHours applies a workspace ?? org ?? plan cascade for hours overrides.
Schema & migration: additive nullable json column workspace.data_retention_settings in a single expand-phase migration; consistent with the existing organization.data_retention_settings type.

Confidence Score: 4/5

Safe to merge with one fix: applyPiiRedaction needs a try-catch so a transient DB failure during PII settings lookup cannot abort log persistence.

applyPiiRedaction performs a db.select() with no error handling. A transient DB failure there propagates through completeWorkflowExecution — which has no local catch around the call — and would prevent the entire execution log from being written. The rest of the change (transform logic, fail-safe scrub in redactPIIFromExecution, migration, UI) is well-structured.

apps/sim/lib/logs/execution/logger.ts — the applyPiiRedaction method needs defensive error handling around its DB query.

Important Files Changed

Filename	Overview
apps/sim/lib/logs/execution/logger.ts	Adds `applyPiiRedaction` on the hot completion path; missing try-catch around the DB query means a transient DB failure can cause log persistence to fail entirely.
apps/sim/lib/logs/execution/pii-redaction.ts	New two-pass (collect + substitute) PII redaction transform with fail-safe scrub on Presidio error; oversized-string handling and marker reuse have been flagged in prior review threads.
apps/sim/lib/billing/retention.ts	Introduces `resolveEffectivePiiRedaction` and `resolveEffectiveRetentionHours`; semantics of `entityTypes: []` discussed in prior threads are correctly documented inline here (empty = contributes nothing).
apps/sim/lib/guardrails/validate_pii.py	Adds check-digit-validated VIN recognizer and batch mask endpoint; ISO 3779 algorithm is correctly implemented with zero-weighted check-digit position.
apps/sim/lib/guardrails/validate_pii.ts	Extracts `runPythonScript` helper and adds `maskPIIBatch`; `executePythonPIIDetection` is not refactored to use the new helper, leaving duplicated subprocess-spawn boilerplate.
apps/sim/app/api/workspaces/[id]/data-retention/route.ts	New per-workspace retention-hours override endpoint; correctly gates writes behind enterprise check and correctly excludes PII rules (org-only concern) from workspace settings.
apps/sim/ee/data-retention/components/data-retention-settings.tsx	Adds PII rules UI with unsaved-changes guard; hours and rules are saved independently, which correctly avoids unintentional field clearing.
packages/db/migrations/0243_data_retention_pii_redaction_and_workspace_override.sql	Single additive nullable `json` column on `workspace`, consistent with the existing `organization.data_retention_settings` type; backward-compatible expand-phase migration.
apps/sim/lib/logs/execution/pii-redaction.test.ts	Unit tests cover recursion, batched substitution, structure preservation, fail-safe scrub, and oversized-string skip; `applyPiiRedaction` (DB path) is not tested.

_{Reviews (3): Last reviewed commit: "fix(logs): widen PII entity visibleValue..." | Re-trigger Greptile}

greptile-apps · 2026-06-19T02:44:23Z

+      rule?.appliesToAllWorkspaces === true ||
+      (Array.isArray(rule?.workspaceIds) && rule.workspaceIds.includes(params.workspaceId))
+  )
+
+  const union = new Set<string>()
+  for (const rule of applicable) {
+    if (!Array.isArray(rule.entityTypes)) continue
+    for (const t of rule.entityTypes) {
+      if (typeof t === 'string') union.add(t)
+    }
+  }
+  if (union.size === 0) return DEFAULT_PII_REDACTION
+  return { enabled: true, entityTypes: [...union] }
+}


entityTypes: [] silently means "redact nothing", contradicting the schema JSDoc

resolveEffectivePiiRedaction contributes nothing to the union when a rule has entityTypes: [], so if all applicable rules have empty entity-type lists the union stays empty and the function returns enabled: false — no redaction. However, every site that documents entityTypes carries the JSDoc "Empty = redact all detected PII" (schema.ts PiiRedactionRule, primitives.ts piiRedactionRuleSchema, pii-redaction.ts PiiRedactionOptions). An enterprise operator who creates a catch-all rule with no entity types selected, expecting Presidio to mask every recognised PII type, will instead get zero redaction — a silent privacy leak. The Zod schema accepts entityTypes: [] without error, so the bug is reachable via the API. Either the resolution logic should treat an empty entity-type list as "mask all" (pass [] through, which Presidio interprets as all types), or the JSDoc needs to be corrected to "Empty = skip rule (redact nothing)".

greptile-apps · 2026-06-19T02:44:24Z

+  return value.length > 0 && Buffer.byteLength(value, 'utf8') <= PII_MAX_STRING_BYTES
+}
+
+/**
+ * Rebuild `value` replacing every eligible string leaf with `handle(leaf)`.
+ * Used for both collection (handle records and returns the input) and
+ * substitution (handle returns the masked value), so traversal order and


Oversized strings are never replaced with REDACTION_FAILED_MARKER on the scrub path

When the total payload exceeds PII_MAX_TOTAL_BYTES, all eligible strings are replaced with REDACTION_FAILED_MARKER. But strings larger than PII_MAX_STRING_BYTES (128 KB) are excluded from collected entirely, so they pass through the substitution walk unchanged — even in the scrub/fail-safe branch. A block that emits a 200 KB string containing an email address or SSN will never be masked, regardless of whether the overall ceiling is hit or Presidio throws. The comment acknowledges this ("almost always base64 blobs") but does not log a warning when an oversized string is skipped, making it invisible to operators.

greptile-apps · 2026-06-19T02:44:25Z

    }
  }

+  /**
+   * Mask PII from log content before persistence when the execution's workspace
+   * (via workspace override or org default) has enterprise PII redaction enabled.
+   * Resolved at persist time so both the inline and externalized write paths are
+   * covered. Returns the payload unchanged when disabled or non-enterprise.
+   */
+  private async applyPiiRedaction(
+    workspaceId: string | null,
+    payload: RedactablePayload
+  ): Promise<RedactablePayload> {
+    if (!workspaceId) return payload
+
+    const [row] = await db
+      .select({ orgSettings: organization.dataRetentionSettings })
+      .from(workspace)
+      .leftJoin(organization, eq(organization.id, workspace.organizationId))
+      .where(eq(workspace.id, workspaceId))
+      .limit(1)
+    if (!row) return payload
+
+    const config = resolveEffectivePiiRedaction({ orgSettings: row.orgSettings, workspaceId })
+    if (!config.enabled) return payload
+
+    // Settings are only writable by enterprise orgs, but re-verify at read time
+    // (e.g. a plan downgrade) before doing the work.
+    if (!(await isWorkspaceOnEnterprisePlan(workspaceId))) return payload
+
+    return redactPIIFromExecution(payload, { entityTypes: config.entityTypes })
+  }
+
  async completeWorkflowExecution(params: {
    executionId: string


Unconditional DB join on every workflow completion

applyPiiRedaction always fires a workspace → LEFT JOIN organization query, plus a second isWorkspaceOnEnterprisePlan call when redaction is configured, regardless of whether the org has any PII rules or is even on an enterprise plan. For deployments where the majority of executions are personal/non-enterprise workspaces this adds two round-trips to the hot path of every workflow. A lightweight guard (e.g. caching the org's enterprise status or checking a feature-flag ahead of the query) would let the common case skip both calls entirely.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

greptile-apps · 2026-06-19T02:45:01Z

Greptile Summary

This PR adds enterprise-grade PII redaction for workflow execution logs, a new per-workspace data-retention-hours override, and a check-digit-validated VIN recognizer. The core redaction pipeline is well-engineered: a deterministic two-pass collect/substitute approach batches all strings from a single execution into one chunked Presidio call, and a fail-safe ensures PII is never persisted when masking fails.

PII redaction: hooked at completeWorkflowExecution after filterForDisplay and redactApiKeys; org-scoped rules are resolved at persist time covering both inline and externalized write paths; oversized payloads (>16 MB) are scrubbed rather than leaked.
Workspace retention overrides: new nullable workspace.data_retention_settings column with a clean workspace ?? org ?? plan-default resolution chain; a new /api/workspaces/[id]/data-retention endpoint mirrors the org-level API with proper admin + enterprise gating.
Three copies of the comment \"Empty = redact all detected PII\" on entityTypes (in schema.ts, primitives.ts, and pii-redaction.ts) directly contradict retention.ts, which treats an empty entity-type list as "contribute nothing, rule is inactive." The PR description confirms the correct semantic is "empty = redact nothing", so those three comments need updating to avoid operators mistakenly believing empty rules provide full coverage.

Confidence Score: 3/5

The redaction pipeline and workspace-override logic are structurally sound, but three copies of a wrong comment on a security-sensitive field create a genuine risk of operators deploying empty-entity-type rules under the false impression that they cover all PII.

The core machinery — two-pass collect/substitute, byte-chunked Presidio calls, fail-safe scrubbing, and the workspace ?? org ?? plan-default retention resolution — is implemented correctly. Three conflicting entityTypes comments in schema.ts, primitives.ts, and pii-redaction.ts say the opposite of what retention.ts implements. An enterprise admin who follows those comments and creates a rule with no entity types selected will have no PII masked at all — a silent failure in a privacy-protection feature.

packages/db/schema.ts, apps/sim/lib/api/contracts/primitives.ts, and apps/sim/lib/logs/execution/pii-redaction.ts all carry the wrong entityTypes comment that contradicts retention.ts.

Important Files Changed

Filename	Overview
apps/sim/lib/logs/execution/pii-redaction.ts	New PII redaction transform: deterministic two-pass collect/substitute approach is sound; fail-safe scrubbing on error is correct; misleading entityTypes comment and reuse of REDACTION_FAILED_MARKER for the byte-ceiling case need fixing.
apps/sim/lib/logs/execution/logger.ts	PII redaction correctly hooked at the single persist choke point after filterForDisplay + redactApiKeys; adds 2 extra DB queries per execution on the hot path that could be consolidated into the existing log fetch.
apps/sim/lib/billing/retention.ts	New file with clean resolution logic for both retention hours (workspace ?? org ?? fallback) and PII redaction (union of applicable rule entity types). Correctly returns disabled when union is empty, though this contradicts comments elsewhere.
packages/db/schema.ts	Adds DataRetentionSettings interface and workspace.dataRetentionSettings column; PiiRedactionRule.entityTypes comment says Empty = redact all detected PII which contradicts the actual retention.ts behavior (empty = redact nothing).
apps/sim/lib/api/contracts/primitives.ts	Adds well-validated piiRedactionRuleSchema and piiRedactionSettingsSchema; same misleading entityTypes comment as schema.ts.
apps/sim/lib/guardrails/validate_pii.ts	New maskPIIBatch function correctly chunks texts by byte budget (256KB), rejects on subprocess failure so callers can apply fail-safe; existing validatePII single-text path unchanged.
apps/sim/lib/guardrails/validate_pii.py	Adds mask_batch function reusing a single AnalyzerEngine + AnonymizerEngine across all texts in a chunk; VinRecognizer with ISO 3779 check-digit validation is well-implemented.
apps/sim/lib/billing/cleanup-dispatcher.ts	Adds workspace-level settings to the cleanup scope query; fallback is null for all enterprise jobs so behavior is identical to the previous code for unconfigured orgs.
apps/sim/app/api/workspaces/[id]/data-retention/route.ts	New workspace-level data retention GET/PUT endpoints; auth gating (session, admin permission, enterprise plan) is correct.
apps/sim/lib/logs/execution/pii-redaction.test.ts	Good coverage: structure preservation, immutability, fail-safe scrub on error, oversized-string skip, and no-op when no strings collected.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant E as Executor
    participant L as ExecutionLogger
    participant R as retention.ts
    participant P as pii-redaction.ts
    participant V as validate_pii.ts
    participant Py as validate_pii.py
    participant DB as Postgres

    E->>L: completeWorkflowExecution
    L->>L: filterForDisplay + redactApiKeys
    L->>DB: SELECT org.dataRetentionSettings
    DB-->>L: orgSettings
    L->>R: resolveEffectivePiiRedaction
    R-->>L: enabled + entityTypes
    alt PII enabled
        L->>DB: isWorkspaceOnEnterprisePlan
        DB-->>L: true
        L->>P: redactPIIFromExecution
        P->>P: collect string leaves
        loop Per 256KB chunk
            P->>V: maskPIIBatch
            V->>Py: spawn subprocess
            Py-->>V: masked strings
            V-->>P: masked strings
        end
        P-->>L: redacted payload
    end
    L->>DB: UPDATE workflowExecutionLogs
    L-->>E: WorkflowExecutionLog

%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant E as Executor
    participant L as ExecutionLogger
    participant R as retention.ts
    participant P as pii-redaction.ts
    participant V as validate_pii.ts
    participant Py as validate_pii.py
    participant DB as Postgres

    E->>L: completeWorkflowExecution
    L->>L: filterForDisplay + redactApiKeys
    L->>DB: SELECT org.dataRetentionSettings
    DB-->>L: orgSettings
    L->>R: resolveEffectivePiiRedaction
    R-->>L: enabled + entityTypes
    alt PII enabled
        L->>DB: isWorkspaceOnEnterprisePlan
        DB-->>L: true
        L->>P: redactPIIFromExecution
        P->>P: collect string leaves
        loop Per 256KB chunk
            P->>V: maskPIIBatch
            V->>Py: spawn subprocess
            Py-->>V: masked strings
            V-->>P: masked strings
        end
        P-->>L: redacted payload
    end
    L->>DB: UPDATE workflowExecutionLogs
    L-->>E: WorkflowExecutionLog

_{Reviews (2): Last reviewed commit: "Merge remote-tracking branch 'origin/sta..." | Re-trigger Greptile}

greptile-apps · 2026-06-19T02:45:04Z

+export interface PiiRedactionOptions {
+  /** Presidio entity types to mask. Empty = redact all detected PII. */
+  entityTypes: string[]
+  language?: string
+}


Contradictory entityTypes: [] semantics across the codebase

This comment says "Empty = redact all detected PII", but resolveEffectivePiiRedaction in retention.ts explicitly documents and implements the opposite: an empty union returns DEFAULT_PII_REDACTION (disabled), so empty entityTypes means redact nothing. The same misleading comment also appears on PiiRedactionRule.entityTypes in schema.ts and primitives.ts.

In practice, if an admin creates a rule with no entity types selected (believing the schema comment), resolveEffectivePiiRedaction returns enabled: false and applyPiiRedaction returns the payload unchanged — no PII is masked. The inconsistency is security-relevant: an operator who reads the schema or this interface and acts on it would mistakenly believe their logs are protected.

greptile-apps · 2026-06-19T02:45:05Z

+/** Replaces text we could not safely mask, so PII is never persisted on failure. */
+export const REDACTION_FAILED_MARKER = '[REDACTION_FAILED]'


The REDACTION_FAILED_MARKER label is reused for a deliberate performance-safety scrub (payload exceeds the 16 MB ceiling), which is not a failure of the Presidio service. An operator who sees [REDACTION_FAILED] in a log will reasonably assume the masking subprocess crashed, triggering unnecessary incident investigation. A distinct constant makes the two cases distinguishable without any logic change.

Suggested change

/** Replaces text we could not safely mask, so PII is never persisted on failure. */

export const REDACTION_FAILED_MARKER = '[REDACTION_FAILED]'

/** Replaces text we could not safely mask, so PII is never persisted on failure. */

export const REDACTION_FAILED_MARKER = '[REDACTION_FAILED]'

/**

* Replaces text when the payload exceeds {@link PII_MAX_TOTAL_BYTES} and NER

* was deliberately skipped. Distinct from {@link REDACTION_FAILED_MARKER} so

* operators can tell a size-based scrub from a Presidio service failure.

*/

export const REDACTION_OVERSIZED_MARKER = '[REDACTED_OVERSIZED]'

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

greptile-apps · 2026-06-19T02:45:06Z

+  private async applyPiiRedaction(
+    workspaceId: string | null,
+    payload: RedactablePayload
+  ): Promise<RedactablePayload> {
+    if (!workspaceId) return payload
+
+    const [row] = await db
+      .select({ orgSettings: organization.dataRetentionSettings })
+      .from(workspace)
+      .leftJoin(organization, eq(organization.id, workspace.organizationId))
+      .where(eq(workspace.id, workspaceId))
+      .limit(1)
+    if (!row) return payload
+
+    const config = resolveEffectivePiiRedaction({ orgSettings: row.orgSettings, workspaceId })
+    if (!config.enabled) return payload
+
+    // Settings are only writable by enterprise orgs, but re-verify at read time
+    // (e.g. a plan downgrade) before doing the work.
+    if (!(await isWorkspaceOnEnterprisePlan(workspaceId))) return payload
+
+    return redactPIIFromExecution(payload, { entityTypes: config.entityTypes })
+  }


Two extra DB round-trips per execution on the hot completion path

applyPiiRedaction always issues a JOIN query for org settings, then — if rules are active — calls isWorkspaceOnEnterprisePlan (another query). For the vast majority of executions (no PII rules configured, or non-enterprise workspace), this is wasted work on the critical log-persist path. The existing existingLog query at the top of completeWorkflowExecution already loads the workspace row; adding organization.dataRetentionSettings to that initial SELECT would remove the need for this second query entirely.

greptile-apps · 2026-06-19T02:45:07Z

+  /** Presidio entity types to mask. Empty = redact all detected PII. */
+  entityTypes: string[]


This comment contradicts how resolveEffectivePiiRedaction in retention.ts actually behaves. A rule with an empty entityTypes array contributes nothing to the applicable union, which makes the effective policy enabled: false — meaning nothing is redacted. The intended semantics (from retention.ts's own doc and the PR description) are "empty = redact nothing (rule contributes no types)".

Suggested change

/** Presidio entity types to mask. Empty = redact all detected PII. */

entityTypes: string[]

/** Presidio entity types to mask. Empty array = rule contributes no types (redact nothing). */

entityTypes: string[]

greptile-apps · 2026-06-19T02:45:08Z

+  /** Presidio entity types to mask. Empty = redact all detected PII. */
+  entityTypes: z.array(z.string().min(1, 'Entity type cannot be empty')).max(100),


Same misleading comment as on PiiRedactionRule in schema.ts. resolveEffectivePiiRedaction treats an empty entityTypes array on a rule as contributing nothing to the union, resulting in enabled: false (no redaction). The comment should reflect that.

Suggested change

/** Presidio entity types to mask. Empty = redact all detected PII. */

entityTypes: z.array(z.string().min(1, 'Entity type cannot be empty')).max(100),

/** Presidio entity types to mask. Empty array = rule contributes no types (redact nothing). */

entityTypes: z.array(z.string().min(1, 'Entity type cannot be empty')).max(100),

…d typecheck

TheodoreSpeaks · 2026-06-19T02:49:09Z

@greptile review

greptile-apps · 2026-06-19T02:55:27Z

+  private async applyPiiRedaction(
+    workspaceId: string | null,
+    payload: RedactablePayload
+  ): Promise<RedactablePayload> {
+    if (!workspaceId) return payload
+
+    const [row] = await db
+      .select({ orgSettings: organization.dataRetentionSettings })
+      .from(workspace)
+      .leftJoin(organization, eq(organization.id, workspace.organizationId))
+      .where(eq(workspace.id, workspaceId))
+      .limit(1)
+    if (!row) return payload
+
+    const config = resolveEffectivePiiRedaction({ orgSettings: row.orgSettings, workspaceId })
+    if (!config.enabled) return payload
+
+    // Settings are only writable by enterprise orgs, but re-verify at read time
+    // (e.g. a plan downgrade) before doing the work.
+    if (!(await isWorkspaceOnEnterprisePlan(workspaceId))) return payload
+
+    return redactPIIFromExecution(payload, { entityTypes: config.entityTypes })
+  }


applyPiiRedaction DB failure propagates to log persistence

The db.select() inside applyPiiRedaction has no error handling. A transient DB error (connection timeout, pool exhaustion) while looking up org PII settings will throw from this function, propagate through completeWorkflowExecution — which has no local catch around the applyPiiRedaction call — and cause the entire workflow execution log to fail to persist. The PR's own fail-safe principle ("scrubs rather than leaks on error") in redactPIIFromExecution is not applied here to the settings-lookup path. A try/catch around the entire method body returning payload on any error would prevent PII configuration lookup failures from destroying execution history.

…t lazy - Extend PII redaction to span error/errorMessage/toolCalls and top-level error/completionFailure/trigger/executionState (Bugbot: PII in execution metadata). executionState is safe to redact — resume reads from the separate pausedExecutions table, not the log copy. - Lazy-import validate_pii in pii-redaction so the Python/child_process guardrails module stays out of the static middleware/RSC graph. - Type the org retention mutation to the contract body (optional, non-null).

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 3 total unresolved issues (including 2 from previous reviews).

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 1635ce1. Configure here.}

cursor · 2026-06-19T03:07:42Z

+    })
+  }
+
+  if (collected.length === 0) return payload


Oversized-only logs skip redaction

High Severity

When PII redaction is active, redactPIIFromExecution returns the original payload unchanged if every string leaf is skipped (e.g. over 128KB). Those fields are never masked or scrubbed, so sensitive text in large log values can still be persisted despite an enabled policy.

^{Reviewed by Cursor Bugbot for commit 1635ce1. Configure here.}

cursor Bot reviewed Jun 19, 2026

View reviewed changes

Comment thread apps/sim/lib/logs/execution/logger.ts

Comment thread apps/sim/lib/logs/execution/pii-redaction.ts

Merge remote-tracking branch 'origin/staging' into feat/redact-pii-wo…

d9a4dd4

…rkflow-log # Conflicts: # packages/db/migrations/meta/0241_snapshot.json # packages/db/migrations/meta/_journal.json # scripts/check-api-validation-contracts.ts

cursor Bot reviewed Jun 19, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

vercel Bot deployed to Preview June 19, 2026 02:45 View deployment

fix(logs): widen PII entity visibleValues to string[] for strict buil…

147b143

…d typecheck

vercel Bot temporarily deployed to Preview June 19, 2026 02:48 Inactive

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

vercel Bot temporarily deployed to Preview June 19, 2026 03:05 Inactive

cursor Bot reviewed Jun 19, 2026

View reviewed changes

		/** Replaces text we could not safely mask, so PII is never persisted on failure. */
		export const REDACTION_FAILED_MARKER = '[REDACTION_FAILED]'

		/** Presidio entity types to mask. Empty = redact all detected PII. */
		entityTypes: string[]

		/** Presidio entity types to mask. Empty = redact all detected PII. */
		entityTypes: z.array(z.string().min(1, 'Entity type cannot be empty')).max(100),

Conversation

TheodoreSpeaks commented Jun 19, 2026

Summary

Type of Change

Testing

Checklist

Uh oh!

vercel Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cursor Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Summary

Uh oh!

TheodoreSpeaks commented Jun 19, 2026

Uh oh!

Uh oh!

Uh oh!

cursor Bot Jun 19, 2026

Choose a reason for hiding this comment

Null retention treated as unset

Uh oh!

cursor Bot Jun 19, 2026

Choose a reason for hiding this comment

Enterprise check skips active redaction

Uh oh!

greptile-apps Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot commented Jun 19, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

TheodoreSpeaks commented Jun 19, 2026

Uh oh!

greptile-apps Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot Jun 19, 2026

Choose a reason for hiding this comment

Oversized-only logs skip redaction

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

vercel Bot commented Jun 19, 2026 •

edited

Loading

cursor Bot commented Jun 19, 2026 •

edited

Loading

greptile-apps Bot commented Jun 19, 2026 •

edited

Loading