Add eval CI jobs for edge-tooling by kasturinarra · Pull Request #81386 · openshift/release

kasturinarra · 2026-07-02T10:57:03Z

Summary

Adds 4 eval CI job entries for openshift-eng/edge-tooling:
- eval-cluster-diagnostic — manual trigger via /test eval-cluster-diagnostic
- eval-cluster-diagnostic-changed — auto-triggers when plugins/two-node/evals/.*cluster-diagnostic files change
- eval-threat-model-tnf — manual trigger via /test eval-threat-model-tnf
- eval-threat-model-tnf-changed — auto-triggers when plugins/two-node/evals/.*threat-model-tnf files change
All use openshift-claude-agent-eval workflow, claude-opus-4-6 model, parallelism 3
All are optional: true — failures don't block merge

Context

Replaces #81166 which used on-the-fly eval generation (EVAL_SETUP_SCRIPT). This PR points at committed eval configs from edge-tooling PR #178 instead — no dynamic generation, enabling regression detection with committed judges.

Pattern follows openshift-eng/ai-helpers eval CI config exactly (per-skill manual + -changed auto-trigger pairs).

Test plan

Verify YAML is valid (done locally)
After edge-tooling PR #178 merges, test with /test eval-cluster-diagnostic on an edge-tooling PR
Verify -changed auto-trigger fires when eval case files are modified

🤖 Generated with Claude Code

Summary by CodeRabbit

This PR updates CI configuration in openshift/release for openshift-eng/edge-tooling by adding two new optional Claude eval presubmit entries that run the openshift-claude-agent-eval workflow with the claude-opus-4-6 model and EVAL_PARALLELISM: "3". The jobs support both manual and change-driven execution:

eval-all (manual via /test eval-all): runs with EVAL_DISCOVER: "true" to auto-discover eval configs.
eval-changed (manual via /test eval-changed, auto-triggered via run_if_changed: ^plugins/.*/evals/|^plugins/.*/skills/): runs with EVAL_CHANGED_ONLY: "true" (and EVAL_DISCOVER: "true") to limit execution to changed evals.
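As a rough illustration of which files the run_if_changed pattern above would select, the sketch below checks a few paths against the same regex with grep -E. The sample paths and the helper name matches_run_if_changed are hypothetical. Prow evaluates the pattern itself, so this only approximates that behavior for a simple pattern like this one.

```shell
#!/usr/bin/env bash
# Pattern copied from the eval-changed job's run_if_changed setting.
RUN_IF_CHANGED='^plugins/.*/evals/|^plugins/.*/skills/'

# Hypothetical helper: succeeds if the given path matches the pattern.
matches_run_if_changed() {
    printf '%s\n' "$1" | grep -Eq "${RUN_IF_CHANGED}"
}

# Made-up example paths, for illustration only.
for path in \
    "plugins/two-node/evals/cluster-diagnostic/case-1.yaml" \
    "plugins/two-node/skills/threat-model-tnf/SKILL.md" \
    "README.md"; do
    if matches_run_if_changed "${path}"; then
        echo "trigger: ${path}"
    else
        echo "skip:    ${path}"
    fi
done
```

Because both alternatives are anchored with ^, a path such as docs/plugins/a/evals/x would not match.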

To support these modes, the openshift-claude-agent-eval step runner was enhanced to:

validate EVAL_DISCOVER vs EVAL_CONFIG mutual exclusivity,
loop over multiple selected/discovered eval configs (with per-config RUN_ID),
filter “changed” evals/cases when EVAL_CHANGED_ONLY=true and PULL_BASE_SHA is set,
aggregate results into a combined junit_claude-eval.xml,
write/flush JUnit after each config completes so results aren’t lost if the step is interrupted,
add a step-level timeout guard (~2h50m) to stop before the 3-hour limit is reached.
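The changed-only filtering in the list above could look roughly like the following. This is a minimal sketch, not the step's actual code: it assumes each eval config lives in its own directory and counts as changed when any changed file sits under that directory. The helper name filter_changed_configs and all paths are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the configs whose directory contains at least
# one changed file. Arguments: a newline-separated list of changed files,
# followed by the candidate config paths.
filter_changed_configs() {
    local changed_files="$1"
    shift
    local config dir file
    for config in "$@"; do
        dir="$(dirname "${config}")"
        while IFS= read -r file; do
            # Prefix match: the changed file must live under the config's directory.
            if [[ "${file}" == "${dir}/"* ]]; then
                echo "${config}"
                break
            fi
        done <<< "${changed_files}"
    done
}

# In a real step the list would presumably come from something like:
#   git diff --name-only "${PULL_BASE_SHA}"...HEAD
# Here it is hard-coded with made-up paths.
CHANGED=$'plugins/two-node/evals/cluster-diagnostic/cases/one.yaml\nREADME.md'

filter_changed_configs "${CHANGED}" \
    "plugins/two-node/evals/cluster-diagnostic/eval.yaml" \
    "plugins/two-node/evals/threat-model-tnf/eval.yaml"
```

With the sample input only the cluster-diagnostic config is printed, since no changed file lies under the threat-model-tnf directory.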

coderabbitai · 2026-07-02T10:57:33Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8f6af81f-e858-4d54-be89-d3b6e092db74

📥 Commits

Reviewing files that changed from the base of the PR and between 2ce7768 and 88a7f89.

📒 Files selected for processing (1)

ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh

🚧 Files skipped from review as they are similar to previous changes (1)

ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh

Walkthrough

This PR adds discovery-based openshift-claude-agent-eval jobs and updates the eval step to run one or more configs, optionally limit execution to changed evals, and aggregate results into a combined JUnit report.

Changes

Claude agent eval discovery flow

Layer / File(s)	Summary
Inputs and job wiring `ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-ref.yaml`, `ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml`	Adds `EVAL_DISCOVER` to the step environment and adds two optional CI jobs that invoke the eval workflow with discovery, model, parallelism, and changed-only settings.
Config selection `ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh`	Documents `EVAL_DISCOVER`, builds `CONFIGS_TO_RUN` from discovery or `EVAL_CONFIG`, rejects setting both, filters discovered configs to changed evals when requested, and validates config existence in the non-discovery path.
Per-config eval execution `ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh`	Loops over selected configs, derives per-config run IDs and cases, runs `claude` with per-config arguments, and writes combined JUnit output across all configs.
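The config selection summarized in the table (discovery or EVAL_CONFIG, never both) might be sketched as below. The function name select_configs, the assumption that discovery means "find files named eval.yaml", and the demo directory tree are all hypothetical. The actual script may discover configs differently, for example through a glob pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide which eval configs to run.
# Prints one config path per line; returns non-zero on invalid input.
# Arguments: $1 = EVAL_DISCOVER value, $2 = EVAL_CONFIG value, $3 = search root.
select_configs() {
    local discover="$1" config="$2" root="$3"
    if [[ -n "${discover}" && -n "${config}" ]]; then
        echo "error: EVAL_DISCOVER and EVAL_CONFIG are mutually exclusive" >&2
        return 1
    fi
    if [[ -n "${discover}" ]]; then
        # Assumption: discovery looks for files named eval.yaml.
        find "${root}" -type f -name 'eval.yaml' | sort
    elif [[ -n "${config}" ]]; then
        # Non-discovery path: the single config must exist.
        if [[ ! -f "${root}/${config}" ]]; then
            echo "error: config not found: ${config}" >&2
            return 1
        fi
        echo "${root}/${config}"
    else
        echo "error: set either EVAL_DISCOVER or EVAL_CONFIG" >&2
        return 1
    fi
}

# Example run against a throwaway directory tree.
demo_root="$(mktemp -d)"
mkdir -p "${demo_root}/plugins/a/evals/x" "${demo_root}/plugins/a/evals/y"
touch "${demo_root}/plugins/a/evals/x/eval.yaml" "${demo_root}/plugins/a/evals/y/eval.yaml"
mapfile -t CONFIGS_TO_RUN < <(select_configs "true" "" "${demo_root}")
echo "discovered ${#CONFIGS_TO_RUN[@]} config(s)"
```

Rejecting the combination up front keeps the later loop simple: it only ever sees one list, whichever mode produced it.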

Estimated code review effort: 3 (Moderate) | ~25 minutes

🚥 Pre-merge checks | ✅ 15

✅ Passed checks (15 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main change: adding eval CI jobs for edge-tooling.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	PR only changes ci-operator YAML and a shell step; no Ginkgo It/Describe/Context/When titles were added or modified.
Test Structure And Quality	✅ Passed	No Ginkgo test code was changed; the PR only updates ci-operator YAML and a shell eval step script, so the test-quality rubric is not applicable.
Microshift Test Compatibility	✅ Passed	Touched files only add CI wiring and eval runner logic; no new Ginkgo It/Describe/Context tests or MicroShift-unsafe API use appear.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	The PR only changes ci-operator YAML and an eval shell script; no Ginkgo e2e tests or SNO-sensitive test bodies were added.
Topology-Aware Scheduling Compatibility	✅ Passed	PASS: The PR only changes CI job config and eval step scripts; no deployment/controller code or scheduling fields (nodeSelector/affinity/PDB) were added.
Ote Binary Stdout Contract	✅ Passed	PR only changes ci-operator config and a shell step script; no OTE Go binary main/init/TestMain stdout contract code is touched.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	No new Ginkgo e2e tests were added; the PR only changes CI config and an eval harness script.
No-Weak-Crypto	✅ Passed	Changed files only add eval job YAML/env wiring; grep found no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB or custom crypto/secret comparisons in the diff.
Container-Privileges	✅ Passed	Touched CI YAML and step-registry files only add eval jobs/env vars; no privileged, hostPID/Network/IPC, SYS_ADMIN, or allowPrivilegeEscalation settings appear.
No-Sensitive-Data-In-Logs	✅ Passed	Only operational metadata is logged; token handling prints presence/path, and no direct passwords, tokens, PII, or hostnames are echoed.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml`:
- Around line 67-76: The eval job in openshift-claude-agent-eval is setting
EVAL_CHANGED_ONLY without a non-empty EVAL_CASES_DIR, so the changed-only path
becomes a no-op. Update the eval job definition for
eval-cluster-diagnostic-changed to either add EVAL_CASES_DIR alongside
EVAL_CONFIG and EVAL_MODEL in the env block, or remove EVAL_CHANGED_ONLY if the
intent is to run the full eval instead.

In
`@ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-commands.sh`:
- Around line 99-105: `get_last_digest` is currently treating any previously
recorded image digest as tested, even when the prior trigger run failed. Update
the logic around `get_last_digest` and the summary-write path that stores the
image after `jobs_failed > 0` so failed Gangway trigger attempts do not persist
the digest for reuse; only record the image when the run succeeds, or clear/skip
writing it on failure so the next cron run retries the same digest.
- Around line 283-289: The entry parsing in
trigger/lvms-zstream-trigger-commands.sh can still pass a trailing Prow JS
semicolon into jq, causing the lookup to fall back to unknown. Update the entry
assignment logic near the jq pipeline to strip the trailing “;” from prow_json
before parsing, matching the behavior already used by load_previous_summary, so
the JSON is valid and the last run state is extracted correctly.
- Around line 165-168: The snapshot extraction pipeline in extract_snapshot is
aborting under set -euo pipefail when grep finds no match, which prevents the
per-release error handler from running. Update the extraction logic around the
jq/grep/head pipeline so that a missing snapshot yields an empty
string/zero-status result instead of failing, while preserving the current
parsing behavior for valid matches.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5ddda36d-ab21-494e-820e-cf2b347b59b3

📥 Commits

Reviewing files that changed from the base of the PR and between d947fcf and 007eb23.

⛔ Files ignored due to path filters (1)

ci-operator/jobs/openshift/lvm-operator/openshift-lvm-operator-main-periodics.yaml is excluded by !ci-operator/jobs/**

📒 Files selected for processing (7)

ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml
ci-operator/config/openshift/lvm-operator/openshift-lvm-operator-main__zstream.yaml
ci-operator/step-registry/lvms/zstream/OWNERS
ci-operator/step-registry/lvms/zstream/trigger/OWNERS
ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-commands.sh
ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-ref.metadata.json
ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-ref.yaml

coderabbitai · 2026-07-02T11:07:38Z

+get_last_digest() {
+    local release="$1"
+    if [[ -z "${PREV_SUMMARY}" ]]; then
+        echo ""
+        return
+    fi
+    jq -r --arg r "${release}" '.[$r].image // "" | split("@") | if length > 1 then .[1] else "" end' "${PREV_SUMMARY}" 2>/dev/null || echo ""


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Don’t mark a digest tested when trigger attempts failed.

get_last_digest accepts any prior image, but Line 410 writes that image even when jobs_failed > 0; the next cron run will skip the same digest instead of retrying failed Gangway triggers.

Proposed fix

-    jq -r --arg r "${release}" '.[$r].image // "" | split("@") | if length > 1 then .[1] else "" end' "${PREV_SUMMARY}" 2>/dev/null || echo ""
+    jq -r --arg r "${release}" '
+        .[$r]
+        | select((.status == "triggered" and (.jobs_failed // 0) == 0) or .status == "skipped")
+        | .image // ""
+        | split("@")
+        | if length > 1 then .[1] else "" end
+    ' "${PREV_SUMMARY}" 2>/dev/null || echo ""

Also applies to: 407-410
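The rule behind this finding can be restated as a small predicate: a previous digest counts as tested only if the run was triggered with no failed jobs, or was skipped. The sketch below is a plain Bash restatement of that select(...) condition, not code from the script. The function name digest_counts_as_tested and the sample inputs are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical predicate mirroring the select(...) clause in the proposed fix:
# a previous digest counts as tested only if the run was "triggered" with
# zero failed jobs, or was "skipped".
digest_counts_as_tested() {
    local status="$1" jobs_failed="${2:-0}"
    if [[ "${status}" == "triggered" && "${jobs_failed}" -eq 0 ]]; then
        return 0
    fi
    [[ "${status}" == "skipped" ]]
}

# Made-up "status jobs_failed" pairs, for illustration only.
for sample in "triggered 0" "triggered 2" "skipped 0" "error 0"; do
    read -r status failed <<< "${sample}"
    if digest_counts_as_tested "${status}" "${failed}"; then
        echo "${sample}: reuse digest (skip re-trigger)"
    else
        echo "${sample}: retry digest on next run"
    fi
done
```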

coderabbitai · 2026-07-02T11:07:38Z

+    echo "${files_json}" | jq -r '
+        [.[] | select(.filename | contains("catalog")) | .patch // ""] |
+        join("\n")
+    ' | grep -oP '(?<=snapshot: )\S+' | head -1


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Make snapshot extraction return empty instead of aborting.

With set -euo pipefail, grep returning no match makes snapshot=$(extract_snapshot ...) exit before the handler at Line 342 can write the intended per-release error summary.

Proposed fix

     echo "${files_json}" | jq -r '
         [.[] | select(.filename | contains("catalog")) | .patch // ""] |
         join("\n")
-    ' | grep -oP '(?<=snapshot: )\S+' | head -1
+    ' | { grep -oP '(?<=snapshot: )\S+' || true; } | head -1
 }
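The exit-status difference behind this finding can be reproduced in isolation. The sketch below is an illustration, not the script's code: the function names are hypothetical, and it uses a simpler grep -o pattern instead of the -P lookbehind so it does not depend on PCRE support. As a result it prints the "snapshot: " prefix along with the value.

```shell
#!/usr/bin/env bash
set -o pipefail

# Unguarded pipeline: a no-match grep (exit 1) fails the whole pipeline.
extract_plain() {
    printf '%s\n' "$1" | grep -o 'snapshot: [^ ]*' | head -1
}

# Guarded pipeline: "|| true" turns the no-match case into an empty result.
extract_guarded() {
    printf '%s\n' "$1" | { grep -o 'snapshot: [^ ]*' || true; } | head -1
}

extract_plain "no marker here"
echo "plain status: $?"

extract_guarded "no marker here"
echo "guarded status: $?"

extract_guarded "x snapshot: abc123 y"
```

Under set -e, an assignment such as snapshot=$(extract_plain ...) takes the non-zero status of the command substitution, which is what aborts the script before the error handler runs.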

coderabbitai · 2026-07-02T11:07:38Z

+        local entry
+        entry=$(echo "${prow_json}" | sed 's/^var allBuilds = //' | \
+            jq -r --arg n "${test_name}" '
+                [.items[] | select(.status.state == "success" or .status.state == "failure" or .status.state == "error" or .status.state == "aborted")] |
+                if length > 0 then .[0] else null end |
+                if . then {name: $n, state: .status.state, url: .status.url, started: .status.startTime} else {name: $n, state: "unknown"} end
+            ' 2>/dev/null || echo "{\"name\": \"${test_name}\", \"state\": \"unknown\"}")


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Strip the trailing Prow JS semicolon here too.

load_previous_summary already removes ;$; this path does not, so jq can fail on the trailing semicolon and report every last run as unknown.

Proposed fix

-        entry=$(echo "${prow_json}" | sed 's/^var allBuilds = //' | \
+        entry=$(echo "${prow_json}" | sed 's/^var allBuilds = //; s/;$//' | \
             jq -r --arg n "${test_name}" '
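The effect of the extra sed expression can be checked on its own. The sketch below wraps the two substitutions from the proposed fix in a helper and runs them on a made-up payload. The helper name strip_prow_js and the sample data are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn the Prow "var allBuilds = {...};" JavaScript
# assignment into plain JSON by dropping the prefix and the trailing ";".
strip_prow_js() {
    sed 's/^var allBuilds = //; s/;$//'
}

# Made-up, minimal payload for illustration.
sample='var allBuilds = {"items":[{"status":{"state":"success"}}]};'
printf '%s\n' "${sample}" | strip_prow_js
```

Input that is already plain JSON passes through unchanged, so the helper is safe to apply unconditionally.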

kasturinarra · 2026-07-02T14:02:32Z

/test eval-cluster-diagnostic

openshift-ci · 2026-07-02T14:02:39Z

@kasturinarra: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test boskos-config

/test boskos-config-generation

/test check-gh-automation

/test check-gh-automation-tide

/test check-trigger-trusted-apps

/test ci-operator-config

/test ci-operator-config-metadata

/test ci-operator-registry

/test ci-secret-bootstrap-config-validation

/test ci-testgrid-allow-list

/test cluster-manifest-verifier

/test clusterimageset-validate

/test config

/test core-valid

/test generated-config

/test generated-dashboards

/test hyperfleet-risk-scorer-test

/test image-mirroring-config-validation

/test jira-lifecycle-config

/test labels

/test openshift-image-mirror-mappings

/test ordered-prow-config

/test owners

/test pr-reminder-config

/test prow-config

/test prow-config-filenames

/test prow-config-semantics

/test pylint

/test release-config

/test release-controller-config

/test rover-groups-config-validation

/test secret-generator-config-valid

/test services-valid

/test stackrox-stackrox-stackrox-stackrox-check

/test step-registry-metadata

/test step-registry-shellcheck

/test sync-rover-groups

/test verified-config

/test yamllint

The following commands are available to trigger optional jobs:

/test check-cluster-profiles-config

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-release-check-gh-automation

pull-ci-openshift-release-main-ci-operator-config

pull-ci-openshift-release-main-ci-operator-config-metadata

pull-ci-openshift-release-main-ci-operator-registry

pull-ci-openshift-release-main-config

pull-ci-openshift-release-main-core-valid

pull-ci-openshift-release-main-generated-config

pull-ci-openshift-release-main-ordered-prow-config

pull-ci-openshift-release-main-owners

pull-ci-openshift-release-main-prow-config-filenames

pull-ci-openshift-release-main-prow-config-semantics

pull-ci-openshift-release-main-release-controller-config

pull-ci-openshift-release-openshift-image-mirror-mappings

pull-ci-openshift-release-yamllint

Details

In response to this:

/test eval-cluster-diagnostic

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

kasturinarra · 2026-07-02T14:05:23Z

/test eval-cluster-diagnostic

openshift-ci · 2026-07-02T14:05:28Z

@kasturinarra: The specified target(s) for /test were not found.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openshift-ci · 2026-07-02T17:48:59Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kasturinarra
Once this PR has been reviewed and has the lgtm label, please assign prashanth684 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

~~ci-operator/config/openshift-eng/edge-tooling/OWNERS~~ [kasturinarra]
~~ci-operator/jobs/openshift-eng/edge-tooling/OWNERS~~ [kasturinarra]
ci-operator/step-registry/openshift/claude/agent-eval/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

When EVAL_DISCOVER is set ("true" or a glob pattern), the workflow auto-discovers eval configs, diffs against PULL_BASE_SHA to run only affected evals, and produces per-eval JUnit test cases. Single EVAL_CONFIG mode is preserved for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh`:
- Around line 207-216: The per-config eval in
`openshift-claude-agent-eval-commands.sh` still uses a fixed `timeout 7200`
inside the serial `CONFIGS_TO_RUN` loop, which can overrun the overall step
budget and prevent the final JUnit output from being written. Update the timeout
logic around the `claude` invocation in the discovery-mode loop to use the
remaining time budget per config, or otherwise write JUnit results incrementally
so each run is preserved even if later configs hit the step limit.
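One way to derive a per-config timeout from the remaining step budget, as this comment suggests, is sketched below. It is an assumption-laden illustration rather than the script's logic: the helper name remaining_timeout and its argument order are hypothetical, and the budget of 2h50m is taken from the time guard described elsewhere in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute the timeout (in seconds) for the next config.
# Arguments: step start (epoch s), now (epoch s), step budget (s), per-config cap (s).
# Prints 0 when the budget is exhausted.
remaining_timeout() {
    local start="$1" now="$2" budget="$3" cap="$4"
    local left=$(( budget - (now - start) ))
    if (( left <= 0 )); then
        echo 0
    elif (( left < cap )); then
        echo "${left}"
    else
        echo "${cap}"
    fi
}

STEP_BUDGET=$(( 2 * 3600 + 50 * 60 ))   # 2h50m guard, below the 3h step limit
PER_CONFIG_CAP=7200                      # the fixed value the review flagged

# Example: 2h into the step, only 50 minutes remain, so the cap shrinks.
remaining_timeout 0 7200 "${STEP_BUDGET}" "${PER_CONFIG_CAP}"   # prints 3000
```

The printed value could then be passed to timeout in place of the fixed 7200, and a result of 0 would signal that the remaining configs should be skipped.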

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 6a6cf46f-d381-4301-bf14-55a7f0802c1f

📥 Commits

Reviewing files that changed from the base of the PR and between ffe9d4a and 2ce7768.

⛔ Files ignored due to path filters (1)

ci-operator/jobs/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main-presubmits.yaml is excluded by !ci-operator/jobs/**

📒 Files selected for processing (3)

ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml
ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh
ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-ref.yaml

Write JUnit XML after each eval config completes so results are preserved if the step gets killed mid-loop. Add a step-level time guard (2h50m) to skip remaining configs before hitting the 3h limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
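The incremental JUnit write described in this commit message might look roughly like the following. This is a minimal sketch with hypothetical names (write_junit, the "name:status" result format) and hard-coded outcomes. It assumes config names contain no characters that need XML escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the combined JUnit file from the results
# gathered so far. Each result is "name:status" with status pass or fail.
write_junit() {
    local out="$1"
    shift
    local total=$# failures=0 entry name status
    for entry in "$@"; do
        if [[ "${entry##*:}" == "fail" ]]; then
            failures=$(( failures + 1 ))
        fi
    done
    {
        echo '<?xml version="1.0" encoding="UTF-8"?>'
        echo "<testsuite name=\"claude-eval\" tests=\"${total}\" failures=\"${failures}\">"
        for entry in "$@"; do
            name="${entry%%:*}"
            status="${entry##*:}"
            if [[ "${status}" == "fail" ]]; then
                echo "  <testcase name=\"${name}\"><failure message=\"eval failed\"/></testcase>"
            else
                echo "  <testcase name=\"${name}\"/>"
            fi
        done
        echo '</testsuite>'
    } > "${out}.tmp"
    # Rename so a kill mid-write never leaves a truncated report behind.
    mv "${out}.tmp" "${out}"
}

JUNIT="$(mktemp -d)/junit_claude-eval.xml"
results=()
for cfg in cluster-diagnostic:pass threat-model-tnf:fail; do
    # A real step would run the eval here; the outcomes are hard-coded.
    results+=("${cfg}")
    write_junit "${JUNIT}" "${results[@]}"   # flush after every config
done
cat "${JUNIT}"
```

Rewriting the whole file after each config keeps the report valid XML at every point, at the cost of regenerating earlier entries each time.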

openshift-merge-bot · 2026-07-02T18:30:50Z

[REHEARSALNOTIFIER]
@kasturinarra: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

Test name	Repo	Type	Reason
pull-ci-openshift-eng-edge-tooling-main-eval-all	openshift-eng/edge-tooling	presubmit	Presubmit changed
pull-ci-openshift-eng-edge-tooling-main-eval-changed	openshift-eng/edge-tooling	presubmit	Presubmit changed
pull-ci-openshift-eng-ai-helpers-main-eval-payload-analysis	openshift-eng/ai-helpers	presubmit	Registry content changed
pull-ci-openshift-eng-ai-helpers-main-eval-payload-analysis-changed	openshift-eng/ai-helpers	presubmit	Registry content changed
pull-ci-openshift-eng-ai-helpers-main-eval-payload-analysis-minimal	openshift-eng/ai-helpers	presubmit	Registry content changed
pull-ci-openshift-eng-ai-helpers-main-eval-classify-review-comment	openshift-eng/ai-helpers	presubmit	Registry content changed
pull-ci-openshift-eng-ai-helpers-main-eval-address-reviews	openshift-eng/ai-helpers	presubmit	Registry content changed

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

openshift-ci · 2026-07-02T18:34:34Z

@kasturinarra: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci Bot requested review from dhensel-rh and jeff-roche July 2, 2026 10:57

coderabbitai Bot reviewed Jul 2, 2026

kasturinarra force-pushed the edge-tooling-eval-ci branch from 007eb23 to a870610 Compare July 2, 2026 11:52

openshift-ci Bot added the approved label Jul 2, 2026

kasturinarra force-pushed the edge-tooling-eval-ci branch from a870610 to 1c5c4dc Compare July 2, 2026 17:27

Add eval discovery mode to agent-eval workflow

ffe9d4a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kasturinarra force-pushed the edge-tooling-eval-ci branch from 1c5c4dc to ffe9d4a Compare July 2, 2026 17:28

openshift-ci Bot removed the approved label Jul 2, 2026

kasturinarra force-pushed the edge-tooling-eval-ci branch from 2ed5fe3 to 2ce7768 Compare July 2, 2026 17:52

coderabbitai Bot reviewed Jul 2, 2026

Comment thread ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh
