
Add eval CI jobs for edge-tooling #81386

Open
kasturinarra wants to merge 3 commits into openshift:main from kasturinarra:edge-tooling-eval-ci

Conversation

@kasturinarra

@kasturinarra kasturinarra commented Jul 2, 2026

Contributor

Summary

  • Adds 4 eval CI job entries for openshift-eng/edge-tooling:
    • eval-cluster-diagnostic — manual trigger via /test eval-cluster-diagnostic
    • eval-cluster-diagnostic-changed — auto-triggers when plugins/two-node/evals/.*cluster-diagnostic files change
    • eval-threat-model-tnf — manual trigger via /test eval-threat-model-tnf
    • eval-threat-model-tnf-changed — auto-triggers when plugins/two-node/evals/.*threat-model-tnf files change
  • All use openshift-claude-agent-eval workflow, claude-opus-4-6 model, parallelism 3
  • All are optional: true — failures don't block merge

Context

Replaces #81166 which used on-the-fly eval generation (EVAL_SETUP_SCRIPT). This PR points at committed eval configs from edge-tooling PR #178 instead — no dynamic generation, enabling regression detection with committed judges.

Pattern follows openshift-eng/ai-helpers eval CI config exactly (per-skill manual + -changed auto-trigger pairs).

Test plan

  • Verify YAML is valid (done locally)
  • After edge-tooling PR #178 merges, test with /test eval-cluster-diagnostic on an edge-tooling PR
  • Verify -changed auto-trigger fires when eval case files are modified

🤖 Generated with Claude Code

Summary by CodeRabbit

This PR updates CI configuration in openshift/release for openshift-eng/edge-tooling by adding two new optional Claude eval presubmit entries that run the openshift-claude-agent-eval workflow with the claude-opus-4-6 model and EVAL_PARALLELISM: "3". The jobs support both manual and change-driven execution:

  • eval-all (manual via /test eval-all): runs with EVAL_DISCOVER: "true" to auto-discover eval configs.
  • eval-changed (manual via /test eval-changed, auto-triggered via run_if_changed: ^plugins/.*/evals/|^plugins/.*/skills/): runs with EVAL_CHANGED_ONLY: "true" (and EVAL_DISCOVER: "true") to limit execution to changed evals.

To support these modes, the openshift-claude-agent-eval step runner was enhanced to:

  • validate EVAL_DISCOVER vs EVAL_CONFIG mutual exclusivity,
  • loop over multiple selected/discovered eval configs (with per-config RUN_ID),
  • filter “changed” evals/cases when EVAL_CHANGED_ONLY=true and PULL_BASE_SHA is set,
  • aggregate results into a combined junit_claude-eval.xml,
  • write/flush JUnit after each config completes so results aren’t lost if the step is interrupted,
  • add a step-level timeout guard (~2h50m) to stop before the 3-hour limit is reached.
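
The config-selection behaviour listed above (mutual exclusivity of EVAL_DISCOVER and EVAL_CONFIG, discovery versus a single validated config) could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of openshift-claude-agent-eval-commands.sh: the function name select_configs and the eval.yaml file name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: select_configs and the "eval.yaml" file name are hypothetical
# and are not taken from openshift-claude-agent-eval-commands.sh.

# Print the configs to run, one per line, or fail if the inputs conflict.
# $1 is the repository root to search (defaults to the current directory).
select_configs() {
    local discover="${EVAL_DISCOVER:-}" config="${EVAL_CONFIG:-}" root="${1:-.}"

    if [[ -n "${discover}" && -n "${config}" ]]; then
        echo "ERROR: EVAL_DISCOVER and EVAL_CONFIG are mutually exclusive" >&2
        return 1
    fi

    if [[ -n "${discover}" ]]; then
        # Discovery mode: list every eval config under the root, in stable order.
        find "${root}" -type f -name 'eval.yaml' | sort
    elif [[ -n "${config}" ]]; then
        # Single-config mode: the named file must exist.
        if [[ ! -f "${root}/${config}" ]]; then
            echo "ERROR: config not found: ${config}" >&2
            return 1
        fi
        echo "${root}/${config}"
    else
        echo "ERROR: set either EVAL_DISCOVER or EVAL_CONFIG" >&2
        return 1
    fi
}
```

Rejecting the conflicting combination up front keeps the later loop simple: it only ever sees a plain list of config paths, whichever mode produced it.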

@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8f6af81f-e858-4d54-be89-d3b6e092db74

📥 Commits

Reviewing files that changed from the base of the PR and between 2ce7768 and 88a7f89.

📒 Files selected for processing (1)
  • ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh

Walkthrough

This PR adds discovery-based openshift-claude-agent-eval jobs and updates the eval step to run one or more configs, optionally limit execution to changed evals, and aggregate results into a combined JUnit report.

Changes

Claude agent eval discovery flow

| Layer | File(s) | Summary |
|---|---|---|
| Inputs and job wiring | ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-ref.yaml, ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml | Adds EVAL_DISCOVER to the step environment and adds two optional CI jobs that invoke the eval workflow with discovery, model, parallelism, and changed-only settings. |
| Config selection | ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh | Documents EVAL_DISCOVER, builds CONFIGS_TO_RUN from discovery or EVAL_CONFIG, rejects setting both, filters discovered configs to changed evals when requested, and validates config existence in the non-discovery path. |
| Per-config eval execution | ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh | Loops over selected configs, derives per-config run IDs and cases, runs claude with per-config arguments, and writes combined JUnit output across all configs. |

Estimated code review effort: 3 (Moderate) | ~25 minutes

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: adding eval CI jobs for edge-tooling. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR only changes ci-operator YAML and a shell step; no Ginkgo It/Describe/Context/When titles were added or modified. |
| Test Structure And Quality | ✅ Passed | No Ginkgo test code was changed; the PR only updates ci-operator YAML and a shell eval step script, so the test-quality rubric is not applicable. |
| Microshift Test Compatibility | ✅ Passed | Touched files only add CI wiring and eval runner logic; no new Ginkgo It/Describe/Context tests or MicroShift-unsafe API use appear. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | The PR only changes ci-operator YAML and an eval shell script; no Ginkgo e2e tests or SNO-sensitive test bodies were added. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PASS: The PR only changes CI job config and eval step scripts; no deployment/controller code or scheduling fields (nodeSelector/affinity/PDB) were added. |
| Ote Binary Stdout Contract | ✅ Passed | PR only changes ci-operator config and a shell step script; no OTE Go binary main/init/TestMain stdout contract code is touched. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added; the PR only changes CI config and an eval harness script. |
| No-Weak-Crypto | ✅ Passed | Changed files only add eval job YAML/env wiring; grep found no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB or custom crypto/secret comparisons in the diff. |
| Container-Privileges | ✅ Passed | Touched CI YAML and step-registry files only add eval jobs/env vars; no privileged, hostPID/Network/IPC, SYS_ADMIN, or allowPrivilegeEscalation settings appear. |
| No-Sensitive-Data-In-Logs | ✅ Passed | Only operational metadata is logged; token handling prints presence/path, and no direct passwords, tokens, PII, or hostnames are echoed. |

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from dhensel-rh and jeff-roche July 2, 2026 10:57

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml`:
- Around line 67-76: The eval job in openshift-claude-agent-eval is setting
EVAL_CHANGED_ONLY without a non-empty EVAL_CASES_DIR, so the changed-only path
becomes a no-op. Update the eval job definition for
eval-cluster-diagnostic-changed to either add EVAL_CASES_DIR alongside
EVAL_CONFIG and EVAL_MODEL in the env block, or remove EVAL_CHANGED_ONLY if the
intent is to run the full eval instead.

In
`@ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-commands.sh`:
- Around line 99-105: `get_last_digest` is currently treating any previously
recorded image digest as tested, even when the prior trigger run failed. Update
the logic around `get_last_digest` and the summary-write path that stores the
image after `jobs_failed > 0` so failed Gangway trigger attempts do not persist
the digest for reuse; only record the image when the run succeeds, or clear/skip
writing it on failure so the next cron run retries the same digest.
- Around line 283-289: The entry parsing in
trigger/lvms-zstream-trigger-commands.sh can still pass a trailing Prow JS
semicolon into jq, causing the lookup to fall back to unknown. Update the entry
assignment logic near the jq pipeline to strip the trailing “;” from prow_json
before parsing, matching the behavior already used by load_previous_summary, so
the JSON is valid and the last run state is extracted correctly.
- Around line 165-168: The snapshot extraction pipeline in extract_snapshot is
aborting under set -euo pipefail when grep finds no match, which prevents the
per-release error handler from running. Update the extraction logic around the
jq/grep/head pipeline so that a missing snapshot yields an empty
string/zero-status result instead of failing, while preserving the current
parsing behavior for valid matches.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5ddda36d-ab21-494e-820e-cf2b347b59b3

📥 Commits

Reviewing files that changed from the base of the PR and between d947fcf and 007eb23.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift/lvm-operator/openshift-lvm-operator-main-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (7)
  • ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml
  • ci-operator/config/openshift/lvm-operator/openshift-lvm-operator-main__zstream.yaml
  • ci-operator/step-registry/lvms/zstream/OWNERS
  • ci-operator/step-registry/lvms/zstream/trigger/OWNERS
  • ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-commands.sh
  • ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-ref.metadata.json
  • ci-operator/step-registry/lvms/zstream/trigger/lvms-zstream-trigger-ref.yaml

Comment on lines +99 to +105
get_last_digest() {
    local release="$1"
    if [[ -z "${PREV_SUMMARY}" ]]; then
        echo ""
        return
    fi
    jq -r --arg r "${release}" '.[$r].image // "" | split("@") | if length > 1 then .[1] else "" end' "${PREV_SUMMARY}" 2>/dev/null || echo ""


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Don’t mark a digest tested when trigger attempts failed.

get_last_digest accepts any prior image, but Line 410 writes that image even when jobs_failed > 0; the next cron run will skip the same digest instead of retrying failed Gangway triggers.

Proposed fix
-    jq -r --arg r "${release}" '.[$r].image // "" | split("@") | if length > 1 then .[1] else "" end' "${PREV_SUMMARY}" 2>/dev/null || echo ""
+    jq -r --arg r "${release}" '
+      .[$r]
+      | select((.status == "triggered" and (.jobs_failed // 0) == 0) or .status == "skipped")
+      | .image // ""
+      | split("@")
+      | if length > 1 then .[1] else "" end
+    ' "${PREV_SUMMARY}" 2>/dev/null || echo ""

Also applies to: 407-410

Comment on lines +165 to +168
    echo "${files_json}" | jq -r '
        [.[] | select(.filename | contains("catalog")) | .patch // ""] |
        join("\n")
    ' | grep -oP '(?<=snapshot: )\S+' | head -1


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Make snapshot extraction return empty instead of aborting.

With set -euo pipefail, grep returning no match makes snapshot=$(extract_snapshot ...) exit before the handler at Line 342 can write the intended per-release error summary.

Proposed fix
     echo "${files_json}" | jq -r '
         [.[] | select(.filename | contains("catalog")) | .patch // ""] |
         join("\n")
-    ' | grep -oP '(?<=snapshot: )\S+' | head -1
+    ' | { grep -oP '(?<=snapshot: )\S+' || true; } | head -1
 }
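
The failure mode and the effect of the `|| true` guard can be reproduced in isolation. The sketch below uses stand-in functions (extract_strict and extract_lenient are hypothetical names), invented input text, and a portable `grep -o` plus `sed` in place of the `grep -oP` lookbehind, so it illustrates the exit-status behaviour rather than the real extract_snapshot.

```shell
#!/usr/bin/env bash
# Sketch only: extract_strict / extract_lenient are hypothetical stand-ins,
# and the input strings used with them are invented for illustration.

# Original shape: with pipefail, grep's exit status 1 ("no match")
# becomes the exit status of the whole pipeline.
extract_strict() {
    ( set -o pipefail
      printf '%s\n' "$1" | grep -o 'snapshot: [^ ]*' | head -1 | sed 's/^snapshot: //' )
}

# Proposed shape: "|| true" turns "no match" into an empty, successful result.
extract_lenient() {
    ( set -o pipefail
      printf '%s\n' "$1" | { grep -o 'snapshot: [^ ]*' || true; } | head -1 | sed 's/^snapshot: //' )
}
```

Under `set -e`, an assignment such as `snapshot=$(extract_strict "...")` would terminate the script when nothing matches, whereas the lenient variant returns an empty string that the caller can test for and report.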

Comment on lines +283 to +289
        local entry
        entry=$(echo "${prow_json}" | sed 's/^var allBuilds = //' | \
            jq -r --arg n "${test_name}" '
                [.items[] | select(.status.state == "success" or .status.state == "failure" or .status.state == "error" or .status.state == "aborted")] |
                if length > 0 then .[0] else null end |
                if . then {name: $n, state: .status.state, url: .status.url, started: .status.startTime} else {name: $n, state: "unknown"} end
            ' 2>/dev/null || echo "{\"name\": \"${test_name}\", \"state\": \"unknown\"}")


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Strip the trailing Prow JS semicolon here too.

load_previous_summary already removes ;$; this path does not, so jq can fail on the trailing semicolon and report every last run as unknown.

Proposed fix
-        entry=$(echo "${prow_json}" | sed 's/^var allBuilds = //' | \
+        entry=$(echo "${prow_json}" | sed 's/^var allBuilds = //; s/;$//' | \
             jq -r --arg n "${test_name}" '
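
To see what the extra sed expression does on its own, it can be wrapped in a small filter. The helper name strip_prow_js is hypothetical; the two sed expressions are the ones from the proposed fix, and the sample payload in the usage note is invented.

```shell
#!/usr/bin/env bash
# Sketch only: strip_prow_js is a hypothetical helper name. The sed
# expressions are the ones shown in the proposed fix above.

# Read a "var allBuilds = {...};" line on stdin and print bare JSON.
strip_prow_js() {
    sed 's/^var allBuilds = //; s/;$//'
}
```

For example, `printf 'var allBuilds = {"items":[]};\n' | strip_prow_js` leaves `{"items":[]}`, which a JSON parser accepts. Without the second expression, the trailing `;` stays in the stream.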

@kasturinarra kasturinarra force-pushed the edge-tooling-eval-ci branch from 007eb23 to a870610 Compare July 2, 2026 11:52
@openshift-ci openshift-ci Bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Jul 2, 2026
@kasturinarra

Contributor Author

/test eval-cluster-diagnostic

@openshift-ci

openshift-ci Bot commented Jul 2, 2026

Contributor

@kasturinarra: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test boskos-config
/test boskos-config-generation
/test check-gh-automation
/test check-gh-automation-tide
/test check-trigger-trusted-apps
/test ci-operator-config
/test ci-operator-config-metadata
/test ci-operator-registry
/test ci-secret-bootstrap-config-validation
/test ci-testgrid-allow-list
/test cluster-manifest-verifier
/test clusterimageset-validate
/test config
/test core-valid
/test generated-config
/test generated-dashboards
/test hyperfleet-risk-scorer-test
/test image-mirroring-config-validation
/test jira-lifecycle-config
/test labels
/test openshift-image-mirror-mappings
/test ordered-prow-config
/test owners
/test pr-reminder-config
/test prow-config
/test prow-config-filenames
/test prow-config-semantics
/test pylint
/test release-config
/test release-controller-config
/test rover-groups-config-validation
/test secret-generator-config-valid
/test services-valid
/test stackrox-stackrox-stackrox-stackrox-check
/test step-registry-metadata
/test step-registry-shellcheck
/test sync-rover-groups
/test verified-config
/test yamllint

The following commands are available to trigger optional jobs:

/test check-cluster-profiles-config

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-release-check-gh-automation
pull-ci-openshift-release-main-ci-operator-config
pull-ci-openshift-release-main-ci-operator-config-metadata
pull-ci-openshift-release-main-ci-operator-registry
pull-ci-openshift-release-main-config
pull-ci-openshift-release-main-core-valid
pull-ci-openshift-release-main-generated-config
pull-ci-openshift-release-main-ordered-prow-config
pull-ci-openshift-release-main-owners
pull-ci-openshift-release-main-prow-config-filenames
pull-ci-openshift-release-main-prow-config-semantics
pull-ci-openshift-release-main-release-controller-config
pull-ci-openshift-release-openshift-image-mirror-mappings
pull-ci-openshift-release-yamllint
Details

In response to this:

/test eval-cluster-diagnostic

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kasturinarra kasturinarra force-pushed the edge-tooling-eval-ci branch from a870610 to 1c5c4dc Compare July 2, 2026 17:27
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kasturinarra kasturinarra force-pushed the edge-tooling-eval-ci branch from 1c5c4dc to ffe9d4a Compare July 2, 2026 17:28
@openshift-ci

openshift-ci Bot commented Jul 2, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kasturinarra
Once this PR has been reviewed and has the lgtm label, please assign prashanth684 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot removed the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Jul 2, 2026
When EVAL_DISCOVER is set ("true" or a glob pattern), the workflow
auto-discovers eval configs, diffs against PULL_BASE_SHA to run only
affected evals, and produces per-eval JUnit test cases. Single
EVAL_CONFIG mode is preserved for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
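
One way the "run only affected evals" filtering described in this commit message might work is sketched below. It is an assumption-laden illustration, not the step's actual logic: filter_changed_configs is a hypothetical name, and treating a config as affected whenever a changed path lies under the config's directory is an assumed rule.

```shell
#!/usr/bin/env bash
# Sketch only: filter_changed_configs is a hypothetical name, and the rule
# "a config is affected if any changed path is under its directory" is an
# assumption made for illustration.

# Usage: filter_changed_configs <file-listing-changed-paths> <config>...
# Prints the configs whose directory contains at least one changed path.
filter_changed_configs() {
    local changed_list="$1" cfg dir
    shift
    for cfg in "$@"; do
        dir=$(dirname "${cfg}")
        # index(...) == 1 means the changed path starts with "<dir>/".
        if awk -v d="${dir}/" 'index($0, d) == 1 { found = 1 } END { exit !found }' "${changed_list}"; then
            echo "${cfg}"
        fi
    done
}
```

The list of changed paths could be produced with something like `git diff --name-only "${PULL_BASE_SHA}"...HEAD > changed.txt`, which is why the filter can only apply when PULL_BASE_SHA is set.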
@kasturinarra kasturinarra force-pushed the edge-tooling-eval-ci branch from 2ed5fe3 to 2ce7768 Compare July 2, 2026 17:52

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh`:
- Around line 207-216: The per-config eval in
`openshift-claude-agent-eval-commands.sh` still uses a fixed `timeout 7200`
inside the serial `CONFIGS_TO_RUN` loop, which can overrun the overall step
budget and prevent the final JUnit output from being written. Update the timeout
logic around the `claude` invocation in the discovery-mode loop to use the
remaining time budget per config, or otherwise write JUnit results incrementally
so each run is preserved even if later configs hit the step limit.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 6a6cf46f-d381-4301-bf14-55a7f0802c1f

📥 Commits

Reviewing files that changed from the base of the PR and between ffe9d4a and 2ce7768.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main-presubmits.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (3)
  • ci-operator/config/openshift-eng/edge-tooling/openshift-eng-edge-tooling-main.yaml
  • ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-commands.sh
  • ci-operator/step-registry/openshift/claude/agent-eval/openshift-claude-agent-eval-ref.yaml

Write JUnit XML after each eval config completes so results are
preserved if the step gets killed mid-loop. Add a step-level time
guard (2h50m) to skip remaining configs before hitting the 3h limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
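
The combination of a step-level time guard and a JUnit flush after every config could look roughly like this. It is a minimal sketch under assumptions: all function and variable names are hypothetical, run_one_config is a placeholder for the real eval invocation, the use of ARTIFACT_DIR as the output location is assumed, and the XML is deliberately simplified (no escaping, no timing attributes).

```shell
#!/usr/bin/env bash
# Sketch only: every function and variable name here is hypothetical, and
# writing under ARTIFACT_DIR is an assumption about the step environment.

STEP_BUDGET_SECONDS=$(( 2 * 3600 + 50 * 60 ))   # 2h50m, below the 3h step limit
JUNIT_FILE="${ARTIFACT_DIR:-/tmp}/junit_claude-eval.xml"

# Placeholder for the real per-config eval invocation.
run_one_config() { [ -f "$1" ]; }

# Rewrite the whole report from the results gathered so far.
# Each line of $1 is "<config> <passed|failed|skipped>".
write_junit() {
    local cfg status
    {
        echo '<testsuite name="claude-eval">'
        while read -r cfg status; do
            [ -n "${cfg}" ] || continue
            case "${status}" in
                passed)  echo "  <testcase name=\"${cfg}\"/>" ;;
                failed)  echo "  <testcase name=\"${cfg}\"><failure/></testcase>" ;;
                skipped) echo "  <testcase name=\"${cfg}\"><skipped/></testcase>" ;;
            esac
        done <<< "$1"
        echo '</testsuite>'
    } > "${JUNIT_FILE}"
}

# $1 is the step start time in epoch seconds; remaining args are configs.
run_all() {
    local start="$1" results="" cfg status
    shift
    for cfg in "$@"; do
        if (( $(date +%s) - start >= STEP_BUDGET_SECONDS )); then
            status=skipped          # out of budget: do not start another eval
        elif run_one_config "${cfg}"; then
            status=passed
        else
            status=failed
        fi
        results+="${cfg} ${status}"$'\n'
        write_junit "${results}"    # flush after every config
    done
}
```

Because the report is rewritten in full after each config, whatever file is on disk when the step is killed is still a complete, parseable document covering the configs that finished.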
@openshift-merge-bot

Contributor

[REHEARSALNOTIFIER]
@kasturinarra: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
|---|---|---|---|
| pull-ci-openshift-eng-edge-tooling-main-eval-all | openshift-eng/edge-tooling | presubmit | Presubmit changed |
| pull-ci-openshift-eng-edge-tooling-main-eval-changed | openshift-eng/edge-tooling | presubmit | Presubmit changed |
| pull-ci-openshift-eng-ai-helpers-main-eval-payload-analysis | openshift-eng/ai-helpers | presubmit | Registry content changed |
| pull-ci-openshift-eng-ai-helpers-main-eval-payload-analysis-changed | openshift-eng/ai-helpers | presubmit | Registry content changed |
| pull-ci-openshift-eng-ai-helpers-main-eval-payload-analysis-minimal | openshift-eng/ai-helpers | presubmit | Registry content changed |
| pull-ci-openshift-eng-ai-helpers-main-eval-classify-review-comment | openshift-eng/ai-helpers | presubmit | Registry content changed |
| pull-ci-openshift-eng-ai-helpers-main-eval-address-reviews | openshift-eng/ai-helpers | presubmit | Registry content changed |
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@openshift-ci

openshift-ci Bot commented Jul 2, 2026

Contributor

@kasturinarra: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
