[BREAKING] MAINT: Standardize garak.encoding defaults and fix atomic-attack name collisions by varunj-msft · Pull Request #2058 · microsoft/PyRIT

varunj-msft · 2026-06-19T22:44:31Z

Description

Part of the Standardizing Scenarios effort. This standardizes the garak.encoding
scenario so its default run is fast and representative instead of exhaustive, and
fixes a latent atomic-attack naming bug.

Two changes:

Default strategy is now a curated DEFAULT aggregate (Base16, ROT13, MorseCode —
one base-N, one substitution cipher, one symbolic alphabet) instead of ALL. This
drops a default run from 106 to 16 atomic attacks. ALL is still available for an
exhaustive run. The fast path is --strategies rot13 --max-dataset-size 1.

Atomic-attack names were not unique: every converter variant of an encoding shared
the encoding name (e.g. all four base64 variants × five prompt configs were named
"base64"). Since results and the display map are keyed by atomic_attack_name, those
collapsed to a single key, corrupting result tracking and --resume. Names are now
unique per variant (e.g. base64_urlsafe_decode0), and display_group keeps the
per-encoding grouping in reports.
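
To illustrate the collision described above, here is a minimal sketch, not PyRIT's actual code. The variant table, the two-config stand-in, and the helpers `old_name`, `new_name`, and `build_results` are hypothetical; only the naming shape `base64_urlsafe_decode0` comes from the description.

```python
from collections import defaultdict

# Hypothetical rows: (encoding_name, variant_slug). Illustrative only, not PyRIT's real table.
VARIANTS = [
    ("base64", "standard"),
    ("base64", "urlsafe"),
    ("rot13", "default"),
]
PROMPT_CONFIGS = range(2)  # stand-in for the per-variant prompt configurations


def old_name(encoding: str, variant: str, config: int) -> str:
    # Old behaviour as described: every variant and config shares the encoding name.
    return encoding


def new_name(encoding: str, variant: str, config: int) -> str:
    # Assumed naming shape, modelled on the example "base64_urlsafe_decode0".
    return f"{encoding}_{variant}_decode{config}"


def build_results(namer) -> dict[str, str]:
    results: dict[str, str] = {}
    for encoding, variant in VARIANTS:
        for config in PROMPT_CONFIGS:
            # Results are keyed by atomic-attack name, so a repeated name
            # overwrites the earlier entry instead of adding a new one.
            results[namer(encoding, variant, config)] = f"{variant}/{config}"
    return results


collapsed = build_results(old_name)  # one key per encoding
unique = build_results(new_name)     # one key per atomic attack

# A separate display_group mapping keeps the per-encoding grouping for reports.
display_group = {
    new_name(e, v, c): e for e, v in VARIANTS for c in PROMPT_CONFIGS
}
grouped: dict[str, list[str]] = defaultdict(list)
for name, group in display_group.items():
    grouped[group].append(name)
```

With the shared name, six atomic attacks collapse to two keys. With per-variant names all six survive, and the grouping by encoding is recovered from `display_group` rather than from the key itself.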

The encoding axis here is the scheme, not an attack technique, so SINGLE_TURN/MULTI_TURN
aggregates don't apply and are intentionally not added.

Breaking: same constructor call now produces different default atomic attacks, so the
scenario VERSION is bumped 1 -> 2. On --resume against an old result this raises a clear
ValueError instead of silently merging incompatible runs. Public API and constructor
signatures are unchanged.
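
The resume guard can be pictured roughly as follows. This is a sketch under stated assumptions: `StoredRun` and `check_resumable` are hypothetical names, and the error text is illustrative; the description only states that a version mismatch raises a ValueError.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredRun:
    # Hypothetical record of a previously persisted scenario result.
    scenario_name: str
    scenario_version: int


CURRENT_VERSION = 2  # the PR bumps VERSION from 1 to 2


def check_resumable(stored: StoredRun, current_version: int = CURRENT_VERSION) -> None:
    """Raise ValueError when the stored run came from a different scenario version."""
    if stored.scenario_version != current_version:
        raise ValueError(
            f"Cannot resume {stored.scenario_name!r}: stored result has version "
            f"{stored.scenario_version}, current scenario is version {current_version}."
        )


# A run stored under the new version resumes without error.
check_resumable(StoredRun("garak.encoding", 2))
```

Failing fast here is the point of the bump: a version 1 result holds a different set of atomic attacks, so merging it into a version 2 run would mix incompatible results.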

Also adds a backend regression test pinning the real EncodingDatasetConfiguration
round-trip, since the backend silently degrades a lost config subclass to a base config.
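
A sketch of the failure mode that such a regression test guards against, using stand-in classes. `BaseConfig`, `EncodingConfig`, `REGISTRY`, and both round-trip helpers are hypothetical and do not reflect the backend's real serialization.

```python
from dataclasses import asdict, dataclass


@dataclass
class BaseConfig:
    max_dataset_size: int = 0


@dataclass
class EncodingConfig(BaseConfig):
    # Hypothetical subclass: same fields, but a distinct type the caller relies on.
    pass


REGISTRY = {cls.__name__: cls for cls in (BaseConfig, EncodingConfig)}


def lossy_round_trip(config: BaseConfig) -> BaseConfig:
    # Only field values survive, so a subclass silently comes back as the base class.
    return BaseConfig(**asdict(config))


def faithful_round_trip(config: BaseConfig) -> BaseConfig:
    # Carrying the concrete type name alongside the fields preserves the subclass.
    payload = {"type": type(config).__name__, "fields": asdict(config)}
    return REGISTRY[payload["type"]](**payload["fields"])


original = EncodingConfig(max_dataset_size=1)
degraded = lossy_round_trip(original)
restored = faithful_round_trip(original)
```

An `isinstance(result, BaseConfig)` check would pass for both results, which is why a test for this kind of bug has to pin the exact type rather than the base class.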

Tests and Documentation

Tests (tests/unit/scenario/garak/test_encoding.py, tests/unit/backend/test_scenario_run_service.py):

VERSION bump and DEFAULT-as-default-strategy
DEFAULT membership = {Base16, ROT13, MorseCode}, and DEFAULT ⊂ ALL
atomic counts: DEFAULT=16, ALL=106, ROT13 fast path=6
name uniqueness under ALL (106 unique) and across multi-variant encodings
display_group still groups variants by encoding
baseline-disabled and custom-template count edge cases
real EncodingDatasetConfiguration round-trips through the backend service
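
One reading that is consistent with the counts above, offered as an inference and not as something the PR states: each converter variant contributes five prompt configs, plus one baseline attack per run. Under that assumption, and assuming each DEFAULT encoding has a single variant and ALL covers 21 variants, the arithmetic works out as follows.

```python
PROMPT_CONFIGS = 5  # "five prompt configs" per variant, from the description
BASELINE = 1        # assumption: one baseline atomic attack per run


def atomic_count(variants: int, include_baseline: bool = True) -> int:
    """Atomic attacks for a given number of converter variants (assumed formula)."""
    return variants * PROMPT_CONFIGS + (BASELINE if include_baseline else 0)


rot13_fast_path = atomic_count(1)  # 1 * 5 + 1
default_run = atomic_count(3)      # 3 * 5 + 1, assuming one variant per DEFAULT encoding
all_run = atomic_count(21)         # 21 * 5 + 1; the 21 is inferred, not stated in the PR
```

The three results match the pinned counts of 6, 16, and 106. The baseline-disabled edge case would then correspond to `atomic_count(n, include_baseline=False)`.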

Full scenario unit suite passes (708). ruff, ruff format, and ty are clean.

Documentation (doc/scanner/garak.py + .ipynb):

documented DEFAULT (curated) vs ALL (exhaustive) and the rot13 fast path
ran jupytext to keep the .py and .ipynb in sync; notebook outputs preserved


rlundeen2 · 2026-06-20T04:07:02Z

    """

-    # Aggregate member
+    # Aggregate members


Should we make this run more attacks? What's the run time, and how does it compare to other scanners?

I think a target of 10-20 minutes is good, and this may finish too quickly.

rlundeen2 · 2026-06-20T04:08:13Z

+        # ``encoding_name`` drives strategy selection and user-facing grouping (display_group);
+        # ``variant_slug`` is unique per row so that atomic-attack names stay unique even when one
+        # encoding name maps to multiple converter variants (e.g. base64, ascii85).
+        # NOTE: some base64 variants are near-duplicates (default == standard_b64encode; b2a only


Should we trim the near-duplicate base64 variants now, along with the version bump?

MAINT: Standardize garak.encoding defaults and fix atomic-attack name collisions

27f6851

rlundeen2 reviewed Jun 20, 2026

rlundeen2 self-assigned this Jun 20, 2026

varunj-msft wants to merge 1 commit into microsoft:main from varunj-msft:varunj-msft/8380-Standardizing-Scenarios-Garak-Encoding-Defaults
