Skip to content

Remove manual Redfish fencing roles for 4.22 GA#87

Open
fonta-rh wants to merge 1 commit into
openshift-eng:mainfrom
fonta-rh:tnt-redfish-cleanup
Open

Remove manual Redfish fencing roles for 4.22 GA#87
fonta-rh wants to merge 1 commit into
openshift-eng:mainfrom
fonta-rh:tnt-redfish-cleanup

Conversation

@fonta-rh

@fonta-rh fonta-rh commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Remove the bare-metal roles/redfish/ role and redfish.yml playbook — the CEO auto-configures STONITH from BMH resources since 4.19
  • Strip manual stonith configuration from roles/kcli/kcli-redfish/, keeping only the ksushy BMC simulator startup needed for virtual deployments
  • Remove the is_ocp_4_19 version gate and redfish prompt from setup.yml post_tasks (157 lines)
  • Bump default OCP version from 4.20 to 4.22, remove stale DevPreviewNoUpgrade/TechPreview feature gates from example configs
  • Update release image references to 4.22 in example configs

Test plan

  • make shellcheck passes (pre-existing SC2034 warning in resolve_vm_ip.sh is unrelated)
  • ansible-playbook --syntax-check setup.yml passes
  • ansible-playbook --syntax-check kcli-install.yml passes
  • grep -r 'redfish' deploy/ shows only kcli-redfish/ksushy references — no dangling refs to removed role
  • Deploy a kcli fencing cluster and verify CEO auto-configures stonith (ksushy starts, fencing works without manual intervention)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • OpenShift fencing-topology deployments now set up fencing automatically during installation, with ksushy BMC simulation for kcli-based installs.
  • Bug Fixes

    • Removed several manual Redfish/STONITH setup steps from cluster setup and install flows, simplifying the fencing experience.
  • Documentation

    • Updated cluster and kcli guides to reflect the new automatic fencing flow and added ksushy simulator verification steps.
  • Chores

    • Updated default OpenShift version references to 4.22 in deployment examples and scripts.

The CEO auto-configures STONITH fencing from BMH resources since 4.19,
making the manual bare-metal redfish role and post-install stonith
configuration dead code. Strip these and bump stale version defaults.

Removed:
- roles/redfish/ and redfish.yml (bare-metal manual stonith)
- kcli-redfish.yml playbook and stonith task files (kcli manual stonith)
- setup.yml post_tasks block (is_ocp_4_19 version gate + redfish prompt)

Kept:
- roles/kcli/kcli-redfish/ with only ksushy startup (BMC simulator
  still needed for virtual deployments)

Updated:
- Default OCP version 4.20 → 4.22
- Removed DevPreviewNoUpgrade/TechPreview feature gates from examples
- Updated release image references to 4.22
- Cleaned up docs (CLAUDE.md, READMEs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci openshift-ci Bot requested review from jeff-roche and jerpeter1 June 29, 2026 10:49
@openshift-ci

openshift-ci Bot commented Jun 29, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fonta-rh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 29, 2026
@openshift-ci

openshift-ci Bot commented Jun 29, 2026

Copy link
Copy Markdown

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 29, 2026
@openshift-ci

openshift-ci Bot commented Jun 29, 2026

Copy link
Copy Markdown

@fonta-rh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/shellcheck 463def1 link true /test shellcheck
ci/prow/images 463def1 link true /test images

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

Removes manual Redfish/PCS STONITH playbooks (redfish.yml, kcli-redfish.yml) and associated role task files. The kcli-redfish role is refocused as a pre-install ksushy BMC simulator starter, with CEO now auto-configuring STONITH. Default OCP version references are bumped from 4.20/4.21 to 4.22 across configs and scripts.

Changes

Redfish STONITH Removal and kcli-redfish Refactor

Layer / File(s) Summary
Delete manual STONITH playbooks and role tasks
deploy/openshift-clusters/setup.yml, deploy/openshift-clusters/kcli-install.yml
Removes post_tasks interactive Redfish stonith block from setup.yml, adds inventory update task, and removes the fencing post-install step from kcli-install.yml that invoked kcli-redfish.yml. Deleted files include redfish.yml, kcli-redfish.yml, roles/redfish/..., and roles/kcli/kcli-redfish/tasks/configure_stonith.yml, post-installation.yml, verify_stonith.yml.
Refocus kcli-redfish role as ksushy pre-install simulator
deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/main.yml, roles/kcli/kcli-redfish/README.md
Updates role task comment and debug label to describe ksushy BMC simulator startup. Rewrites README with concise description, minimal requirements, compact variables table, and simplified troubleshooting.
Update deployment docs and CLAUDE.md
CLAUDE.md, deploy/openshift-clusters/README.md, deploy/openshift-clusters/README-kcli.md
Removes redfish.yml command from CLAUDE.md and replaces redfish component entry with kcli/kcli-redfish. README.md now states CEO auto-configures STONITH. README-kcli.md fencing section rewritten with ksushy simulator subsection, variable table, and verification steps.
Bump default OCP version to 4.22
deploy/openshift-clusters/vars/init-host.yml, deploy/aws-hypervisor/scripts/configure.sh, roles/dev-scripts/install-dev/files/config_*_example.sh
Updates default_ocp_version to "4.22", fallback version in configure.sh, and OPENSHIFT_RELEASE_IMAGE tags to 4.22.0 in both example scripts. Adds BMC_DRIVER=redfish export to config_fencing_example.sh.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • slintes
  • jaypoulz
🚥 Pre-merge checks | ✅ 9 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Ai-Attribution ⚠️ Warning AI use is disclosed, but HEAD only has Co-Authored-By: Claude Opus 4.6; no Assisted-by/Generated-by Red Hat trailer is present. Replace the AI co-author line with a Red Hat Assisted-by: or Generated-by: trailer in the commit/PR metadata.
✅ Passed checks (9 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the PR’s main change: removing manual Redfish fencing setup for OpenShift 4.22 GA.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
No-Weak-Crypto ✅ Passed No touched file adds MD5/SHA1/DES/RC4/ECB or custom crypto; only strong password generation (openssl rand) and a token placeholder appear.
Container-Privileges ✅ Passed PASS: The PR only edits docs/Ansible playbooks; targeted searches found no privileged/hostPID/hostNetwork/hostIPC/SYS_ADMIN/allowPrivilegeEscalation settings in changed manifests.
No-Sensitive-Data-In-Logs ✅ Passed No changed log lines expose actual secrets; the new debug output is status/metadata only, and the manual Redfish credential-heavy path was removed.
No-Hardcoded-Secrets ✅ Passed No hardcoded secrets were introduced; the only token-like literal is a CI_TOKEN placeholder, and the password is generated at runtime.
No-Injection-Vectors ✅ Passed No listed injection patterns appeared in the touched files; removed redfish playbooks are gone and remaining shell/YAML uses no eval, yaml.load, or shell=True.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch tnt-redfish-cleanup

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/main.yml (1)

26-42: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Validate before starting ksushy.

include_tasks: start_ksushy.yml runs before the assert, so a bad config can still start or modify the simulator service before the role fails. Move validation ahead of any side effects.

🔧 Suggested reorder
+- name: Validate configuration
+  assert:
+    that:
+      - test_cluster_name != ""
+      - ksushy_ip != ""
+      - bmc_user != ""
+      - bmc_password != ""
+  delegate_to: localhost
+
 - name: Start ksushy BMC simulator
   include_tasks: start_ksushy.yml
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/main.yml` around
lines 26 - 42, The validation in kcli-redfish currently happens after
start_ksushy.yml, so bad input can trigger side effects before the role fails.
Move the assert block in tasks/main.yml ahead of include_tasks: start_ksushy.yml
so test_cluster_name, ksushy_ip, bmc_user, and bmc_password are checked before
any simulator startup or modification occurs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deploy/openshift-clusters/README.md`:
- Around line 184-186: Remove the stale OpenShift 4.19.x version qualifier from
the automatic Redfish STONITH description in the fencing topology section.
Update the wording around the automatic fencing behavior for
cluster-etcd-operator (CEO) and BareMetalHost resources so it no longer reads as
version-locked, and ensure the README text reflects the current 4.22 flow
instead of referencing only 4.19.x.

In
`@deploy/openshift-clusters/roles/dev-scripts/install-dev/files/config_arbiter_example.sh`:
- Line 36: The OPENSHIFT_RELEASE_IMAGE example in config_arbiter_example.sh is
x86_64-only and currently lacks the ARM64/aarch64 guidance that
config_fencing_example.sh includes. Update this example to explicitly document
the aarch64/Graviton override variables next to OPENSHIFT_RELEASE_IMAGE,
including the required Metal3 container image overrides, so users copying the
script to ARM64 hosts can switch to the correct payload architecture instead of
the default x86_64 image.

---

Outside diff comments:
In `@deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/main.yml`:
- Around line 26-42: The validation in kcli-redfish currently happens after
start_ksushy.yml, so bad input can trigger side effects before the role fails.
Move the assert block in tasks/main.yml ahead of include_tasks: start_ksushy.yml
so test_cluster_name, ksushy_ip, bmc_user, and bmc_password are checked before
any simulator startup or modification occurs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 051753f2-f9ef-488b-8bc9-f2e0c5ca2305

📥 Commits

Reviewing files that changed from the base of the PR and between 903e071 and 463def1.

📒 Files selected for processing (20)
  • CLAUDE.md
  • deploy/aws-hypervisor/scripts/configure.sh
  • deploy/openshift-clusters/README-kcli.md
  • deploy/openshift-clusters/README.md
  • deploy/openshift-clusters/kcli-install.yml
  • deploy/openshift-clusters/kcli-redfish.yml
  • deploy/openshift-clusters/redfish.yml
  • deploy/openshift-clusters/roles/dev-scripts/install-dev/files/config_arbiter_example.sh
  • deploy/openshift-clusters/roles/dev-scripts/install-dev/files/config_fencing_example.sh
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/README.md
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/configure_stonith.yml
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/main.yml
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/post-installation.yml
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/verify_stonith.yml
  • deploy/openshift-clusters/roles/redfish/README.md
  • deploy/openshift-clusters/roles/redfish/defaults/main.yml
  • deploy/openshift-clusters/roles/redfish/tasks/main.yml
  • deploy/openshift-clusters/roles/redfish/tasks/process_bmh.yml
  • deploy/openshift-clusters/setup.yml
  • deploy/openshift-clusters/vars/init-host.yml
💤 Files with no reviewable changes (10)
  • deploy/openshift-clusters/roles/redfish/README.md
  • deploy/openshift-clusters/roles/redfish/tasks/process_bmh.yml
  • deploy/openshift-clusters/roles/redfish/tasks/main.yml
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/post-installation.yml
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/configure_stonith.yml
  • deploy/openshift-clusters/kcli-redfish.yml
  • deploy/openshift-clusters/redfish.yml
  • deploy/openshift-clusters/roles/redfish/defaults/main.yml
  • deploy/openshift-clusters/roles/kcli/kcli-redfish/tasks/verify_stonith.yml
  • deploy/openshift-clusters/setup.yml

Comment on lines 184 to +186
For clusters using the fencing topology on OpenShift 4.19.x, automatic Redfish stonith configuration is available. This feature configures Pacemaker stonith resources using Redfish fencing for BareMetalHost resources.

Redfish configuration can be applied in two ways:

**Integrated Usage:**
- When running the main deployment playbook in interactive mode with fencing topology, you will be prompted to configure Redfish stonith automatically
- Redfish configuration runs as part of the main deployment workflow

**Standalone Usage:**
- Redfish configuration can be run independently using: `ansible-playbook redfish.yml`
- This allows for running it separately from the main deployment or re-running it if needed

For detailed configuration options, verification commands, and requirements, refer to the [Redfish role documentation](roles/redfish/README.md).
Fencing topology clusters use automatic fencing configuration via the cluster-etcd-operator (CEO). The CEO discovers BareMetalHost resources and configures STONITH automatically during installation. No manual Redfish configuration is required.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Remove the stale 4.19.x qualifier.

This section still limits automatic Redfish STONITH to OpenShift 4.19.x, which conflicts with the new 4.22 flow and makes the docs read as version-locked to an old release. Please update or drop the version reference.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/README.md` around lines 184 - 186, Remove the stale
OpenShift 4.19.x version qualifier from the automatic Redfish STONITH
description in the fencing topology section. Update the wording around the
automatic fencing behavior for cluster-etcd-operator (CEO) and BareMetalHost
resources so it no longer reads as version-locked, and ensure the README text
reflects the current 4.22 flow instead of referencing only 4.19.x.

# and select your preferred version. Public sources can be found at https://mirror.openshift.com/pub/openshift-v4/

export OPENSHIFT_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.21.0-x86_64
export OPENSHIFT_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.22.0-x86_64

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Add the aarch64 override guidance next to this release image.

This example now hardcodes an x86_64 payload image but still omits the ARM64 notes that config_fencing_example.sh already carries. Copying this file onto Graviton/aarch64 hypervisors will still point users at the wrong payload architecture and miss the required Metal3 container overrides.

Suggested update
 export OPENSHIFT_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.22.0-x86_64
+# aarch64 (Graviton): switch the payload image to the matching architecture and
+# override Metal3 infrastructure images with arm64 rebuilds.
+# if [ "$(uname -m)" = "aarch64" ]; then
+#     export OPENSHIFT_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.22.0-aarch64
+#     export IRONIC_IMAGE=quay.io/rh-edge-enablement/ironic:2026-06
+#     export VBMC_IMAGE=quay.io/rh-edge-enablement/vbmc:2026-06
+#     export SUSHY_TOOLS_IMAGE=quay.io/rh-edge-enablement/sushy-tools:2026-06
+# fi

Based on learnings, "When reviewing dev-scripts config example shell files in two-node-toolbox, ensure that the ARM64 (aarch64/Graviton) image override variables are handled explicitly... Upstream Metal3 images ... are x86_64-only and will fail on aarch64 hosts with Exec format error."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export OPENSHIFT_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.22.0-x86_64
export OPENSHIFT_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.22.0-x86_64
# aarch64 (Graviton): switch the payload image to the matching architecture and
# override Metal3 infrastructure images with arm64 rebuilds.
# if [ "$(uname -m)" = "aarch64" ]; then
# export OPENSHIFT_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.22.0-aarch64
# export IRONIC_IMAGE=quay.io/rh-edge-enablement/ironic:2026-06
# export VBMC_IMAGE=quay.io/rh-edge-enablement/vbmc:2026-06
# export SUSHY_TOOLS_IMAGE=quay.io/rh-edge-enablement/sushy-tools:2026-06
# fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@deploy/openshift-clusters/roles/dev-scripts/install-dev/files/config_arbiter_example.sh`
at line 36, The OPENSHIFT_RELEASE_IMAGE example in config_arbiter_example.sh is
x86_64-only and currently lacks the ARM64/aarch64 guidance that
config_fencing_example.sh includes. Update this example to explicitly document
the aarch64/Graviton override variables next to OPENSHIFT_RELEASE_IMAGE,
including the required Metal3 container image overrides, so users copying the
script to ARM64 hosts can switch to the correct payload architecture instead of
the default x86_64 image.

Source: Learnings

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant