Skip to content

[DRAFT] MAINT: Production schema migration guard + add deliberate migration script for release#2028

Draft
jsong468 wants to merge 2 commits into
microsoft:mainfrom
jsong468:azure_sql_migration_guard
Draft

[DRAFT] MAINT: Production schema migration guard + add deliberate migration script for release#2028
jsong468 wants to merge 2 commits into
microsoft:mainfrom
jsong468:azure_sql_migration_guard

Conversation

@jsong468

Copy link
Copy Markdown
Contributor

Description

PyRIT's AzureSQLMemory runs alembic upgrade head automatically on construction, meaning any user connecting to a shared Azure SQL database triggers a schema upgrade using whatever migration files are on their machine. This PR prevents accidental schema modifications to production and adds a deliberate release-time migration path.

Runtime Guard (AzureSQLMemory)

  • When the connection string matches AZURE_SQL_DB_CONNECTION_STRING_PROD, the constructor now runs check-only mode instead of full migration
  • Check-only verifies the database schema matches the Python models without modifying it
  • If schemas match → normal startup, no error
  • If schemas don't match → logs a warning but does not block startup, so developers on newer code can still query prod data
  • If the prod env var is not set → existing behavior (full migration), no breaking change
  • skip_schema_migration=True bypasses both guard and migration entirely

_check_schema_migration on MemoryInterface

  • Added as a sibling to _run_schema_migration to keep both code paths at the same abstraction level
  • _run_schema_migration: upgrade + verify (unchanged)
  • _check_schema_migration: verify only (new)
  • These are not split into separate run/check primitives because upgrade-without-verify has no valid use case — the combined _run prevents callers from accidentally skipping verification

Release Migration Script (build_scripts/migrate_prod_memory_schema.py)

  • The single deliberate path for upgrading prod's schema, run during the release process (Step 9)
  • Requires --target-revision (explicit revision ID, not head) so migration is deterministic and tied to the exact release
  • Validates release environment: must be on a releases/ branch, clean working tree in pyrit/memory/, no .dev version suffix
  • Reads AZURE_SQL_DB_CONNECTION_STRING_PROD from ~/.pyrit/.env automatically (same files as initialize_pyrit_async), loaded with override=False so explicit env vars take precedence
  • Reports before/after revision for auditability
  • Runs check_schema_migrations after upgrade to confirm models match
  • Rolls back automatically on failure via engine.begin() (transactional DDL)
  • Interactive "type yes" confirmation when running in a terminal; skipped in CI
  • --skip-environment-check flag for CI pipelines where Git state may differ

memory_migrations.py head Subcommand

  • Prints the current Alembic head revision ID so releasers can identify the target revision
  • Sits alongside existing generate and check subcommands

Design Decisions

  • Warn instead of raise on schema mismatch: When a developer on main (with unreleased schema changes) connects to prod, the guard detects the mismatch but only logs a warning — it does not block startup. This preserves the primary goal (no accidental schema modification) while keeping prod usable for querying data. The previous iteration raised AutogenerateDiffsDetected, which blocked developers from using prod at all after any schema PR merged to main. Question for reviewers: should the mismatch behavior be configurable (e.g., warn vs. raise)? should an error always be raised? or is warn-only the right permanent default?
  • Check-only instead of hard block for prod: We considered raising RuntimeError unconditionally when connecting to prod, but this would force every user to pass skip_schema_migration=True, which also skips schema verification. Check-only lets developers connect to prod normally while ensuring the schema is never modified outside of a release.
  • Explicit revision over head: Using --target-revision instead of upgrade head ensures the migration is deterministic. If unreleased migration files exist locally, head would apply them; an explicit revision tied to the release prevents this.
  • _check_schema_migration as a method on MemoryInterface: Rather than importing check_schema_migrations directly from migration.py in AzureSQLMemory, we wrapped it in a method on the base class. This keeps both migration paths (_run and _check) at the same abstraction level, makes it reusable for SQLiteMemory if needed, and simplifies testing (patch.object instead of module-path patching).

Tests and Documentation

Unit Tests Added

  • 5 tests in test_azure_sql_memory.py (prod guard):

    • Prod connection runs check-only, not migration
    • Prod connection with schema mismatch warns but does not block startup
    • Non-prod connection runs full migration normally
    • No prod env var set runs full migration normally
    • skip_schema_migration=True bypasses guard and migration
  • 4 tests in test_migration.py:

    • _check_schema_migration delegates to check_schema_migrations
    • _check_schema_migration logs a warning on mismatch instead of raising
    • _check_schema_migration raises RuntimeError when engine is None
    • memory_migrations.py head outputs a valid hex revision ID

Manual E2E Verification

  • Environment checks block on wrong branch, dirty tree, dev version (exit 1)
  • Invalid revision ID caught with available revisions listed (exit 1)
  • Full migration against fresh SQLite DB succeeds and schema check passes (exit 0)
  • Idempotent: "already at target revision" exits cleanly (exit 0)
  • .env loading works correctly (override=False)

Documentation

  • Added Step 9 ("Migrate Production Database Schema") to doc/contributing/10_release_process.md
  • Includes prerequisites, target revision identification, migration command, post-migration verification, and rollback policy
  • Renumbered subsequent steps (10–12)

@jsong468 jsong468 marked this pull request as draft June 17, 2026 02:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant