Skip to content

Add baremetal node adoption automation#88

Open
fonta-rh wants to merge 17 commits into
openshift-eng:mainfrom
fonta-rh:ocpedge-2774-baremetal-adopt
Open

Add baremetal node adoption automation#88
fonta-rh wants to merge 17 commits into
openshift-eng:mainfrom
fonta-rh:ocpedge-2774-baremetal-adopt

Conversation

@fonta-rh

@fonta-rh fonta-rh commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Add baremetal-adopt.sh to onboard existing baremetal nodes into the dev-scripts deployment workflow — parses inventory_baremetal.ini, validates BMC credentials via Redfish, and generates ironic_nodes.json + config_baremetal_fencing.sh
  • Add baremetal-wizard.sh as a standalone interactive wizard with input validation (IPv4, MAC format), re-prompt on error, summary table with masked passwords, and Y/n/q confirmation flow
  • Add inventory_baremetal.ini.sample template and baremetal-adopt, baremetal-verify, baremetal-wizard Make targets

Jira: OCPEDGE-2774
Related: OCPEDGE-2775 (baremetal install via dev-scripts — consumes adoption artifacts)

Test plan

  • Run make baremetal-wizard — verify input validation, re-prompts, summary, and confirmation
  • Run make baremetal-adopt with existing inventory_baremetal.ini — skips wizard, proceeds directly
  • Run make baremetal-verify — verify-only mode, no artifacts generated
  • Verify generated ironic_nodes.json schema matches dev-scripts expectations
  • Verify config_baremetal_fencing.sh correctly sources base config + adds baremetal overrides
  • Run make shellcheck — no new warnings
  • Tested against real HPE ProLiant e920t iLO 5 BMCs (RDU2 lab) — both nodes validated via Redfish, artifacts generated with auto-discovered system paths

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added baremetal adoption commands to generate TNF setup artifacts, verify BMC credentials, and run an interactive inventory wizard.
    • The wizard creates inventory_baremetal.ini with validated hostnames, BMC details, and optional PXE boot MAC (auto-discovery when omitted).
    • Added --verify-only and --skip-verify support for safer credential checks.
  • Documentation
    • Extended make help with a new “Baremetal Adoption” section describing the new commands.
  • Chores
    • Added a baremetal inventory sample and updated ignore rules for generated artifacts and credential-containing files.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 29, 2026
@openshift-ci

openshift-ci Bot commented Jun 29, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci Bot commented Jun 29, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fonta-rh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 29, 2026
@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds baremetal adoption tooling: a wizard writes inventory_baremetal.ini, the adoption script verifies BMC access and generates deployment artifacts, and Makefile targets, sample inventory, and ignore rules support the new flow.

Changes

Baremetal adoption flow

Layer / File(s) Summary
Entry points and inventory docs
deploy/Makefile, deploy/openshift-clusters/inventory_baremetal.ini.sample, deploy/openshift-clusters/.gitignore
Adds baremetal adoption targets and help text, updates ignore rules for generated adoption artifacts, and documents the baremetal inventory sample and its required fields.
Inventory wizard
deploy/openshift-clusters/scripts/baremetal-wizard.sh
Adds interactive prompts, validation, confirmation, and inventory file generation for baremetal node definitions.
Inventory parsing and verification
deploy/openshift-clusters/scripts/baremetal-adopt.sh
Parses the baremetal inventory, applies defaults, and verifies BMC access through Redfish, including inventory auto-generation when missing.
Deployment artifact generation
deploy/openshift-clusters/scripts/baremetal-adopt.sh
Generates ironic_nodes.json and fencing config overrides, then writes them under the cluster output directory.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

ready-for-human-review


Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

  • Ignore

❌ Failed checks (2 errors, 2 warnings)

Check name Status Explanation Resolution
No-Sensitive-Data-In-Logs ❌ Error Wizard/adopt output prints node names and BMC addresses in the summary/verification table, exposing internal hostnames/IPs on stdout. Redact BMC addresses/hostnames (and any customer identifiers) in stdout/stderr, or move them behind an explicit verbose/debug flag; keep passwords masked.
No-Hardcoded-Secrets ❌ Error deploy/openshift-clusters/inventory_baremetal.ini.sample hardcodes bmc_pass=changeme for both nodes, which is a credential literal. Replace the sample password with a non-secret placeholder or omit it, and ensure tracked files never contain real or default credentials.
Docstring Coverage ⚠️ Warning Docstring coverage is 48.15% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Ai-Attribution ⚠️ Warning PR mentions Claude Code, but the HEAD commit uses only Co-Authored-By and no Assisted-by/Generated-by trailer is present. Replace the AI tool credit with a Red Hat Assisted-by or Generated-by trailer, and remove Co-Authored-By for the AI assistant.
✅ Passed checks (7 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the PR’s main change: automating baremetal node adoption and related tooling.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
No-Weak-Crypto ✅ Passed No MD5/SHA1/DES/RC4/3DES/Blowfish/ECB, custom crypto, or secret/token comparisons found in the added files.
Container-Privileges ✅ Passed No touched file adds privileged/hostPID/hostNetwork/hostIPC/SYS_ADMIN/allowPrivilegeEscalation settings or K8s manifests.
No-Injection-Vectors ✅ Passed No listed injection patterns appear in the new code; the added shell scripts use quoted args and no eval/exec/os.system-style calls.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deploy/openshift-clusters/inventory_baremetal.ini.sample`:
- Around line 8-9: The sample command references the wrong Make target name, so
update the documented invocation to use the actual target added by this PR. Fix
the guidance in the inventory_baremetal.ini.sample comment to match the Makefile
target name baremetal-adopt, keeping the rest of the setup instructions
unchanged.

In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh`:
- Around line 57-71: The option parsing in baremetal-adopt.sh assigns values for
CLUSTER_NAME and CONFIG_BASE from $2 without checking that an argument exists,
so update the case handling for --cluster-name and --config-base to validate the
next token before shifting. In the argument parsing loop, add a guard that emits
a usage/error message and exits cleanly when either flag is provided without a
value, and only then perform the shift 2.
- Around line 157-159: The inventory parser in baremetal-adopt.sh currently
accepts any non-empty NODE_NAMES list, but this workflow must only support
exactly two nodes. Update the validation logic around the NODE_NAMES parsing
step to reject inventories unless ${`#NODE_NAMES`[@]} is exactly 2, and keep the
existing die/info flow in the same parsing function so unsupported topologies
fail before artifacts are generated.
- Around line 253-270: The nodes_json construction in baremetal-adopt.sh is
manually concatenating JSON, so embedded quotes or backslashes in fields like
name, bmc_user, or bmc_pass can break ironic_nodes.json. Replace the
string-building block with a real JSON encoder path, such as generating the node
object via jq -n with --arg inputs (or equivalent), and keep the existing fields
redfish address, ports, and properties mapped through the encoder so the output
is always valid JSON.
- Around line 276-277: The JSON artifacts written by the adoption script are
inheriting the caller’s umask, so sensitive BMC credentials can end up
world-readable. Update the write paths in the `baremetal-adopt.sh` flow around
the `nodes_json`/`output_file` generation (including the later similar block
noted in the comment) to explicitly create these files with restrictive
permissions before or during the write, rather than relying on `echo | jq`. Keep
the existing `info` logging, but ensure the generated credential files are only
accessible to the owning user.
- Around line 94-143: The inventory parsing loop in baremetal-adopt.sh currently
strips inline content after # and tokenizes node fields on whitespace, so BMC
credentials with spaces or # are corrupted before verification. Update the
parsing logic around the in_nodes/in_vars handling to preserve quoted values for
bmc_user, bmc_pass, and related fields, or validate and reject unsupported
characters consistently with the wizard input. Make sure the node pair parsing
no longer loses data when reading the rest of each baremetal_nodes entry.
- Around line 169-171: The Redfish curl calls in the baremetal adoption script
are always using insecure TLS, which ignores the bmc_verify_ca setting. Update
the Redfish request paths that fetch systems and any other BMC endpoints to
conditionally use certificate verification based on bmc_verify_ca, instead of
hardcoding -k. Make the curl options in baremetal-adopt.sh honor the existing
bmc_verify_ca variable consistently in both call sites so TLS verification is
only disabled when explicitly requested.
- Around line 173-174: The Redfish system ID extraction is returning the literal
string null instead of an empty value, which prevents the fallback logic from
running. Update the jq-based lookup used for the system ID in baremetal-adopt.sh
so that missing Members[0]."`@odata.id`" is normalized to empty before assignment,
and apply the same handling in the other matching block mentioned in the comment
so the redfish:// address never ends up with /null.

In `@deploy/openshift-clusters/scripts/baremetal-wizard.sh`:
- Around line 39-41: The --output option handling in baremetal-wizard.sh reads
OUTPUT from $2 without first checking that an argument exists, which can trigger
an unbound-variable failure instead of usage handling. Update the case branch
for --output to validate that the next positional argument is present before
assigning OUTPUT and shifting, and route the missing-argument path through the
script’s existing usage/error flow so baremetal-wizard.sh --output reports a
proper message.
- Around line 8-13: The help/usage text in baremetal-wizard.sh still refers to
the old command name and should be updated to match the actual script. Change
the usage banner and any generated inventory comment in the script’s help/output
logic so they consistently use baremetal-wizard.sh instead of
wizard-baremetal.sh, including the relevant strings near the top-of-file help
block and the inventory generation path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 62d66789-5fa1-4fce-8065-c0760040c417

📥 Commits

Reviewing files that changed from the base of the PR and between 903e071 and daac159.

📒 Files selected for processing (5)
  • deploy/Makefile
  • deploy/openshift-clusters/.gitignore
  • deploy/openshift-clusters/inventory_baremetal.ini.sample
  • deploy/openshift-clusters/scripts/baremetal-adopt.sh
  • deploy/openshift-clusters/scripts/baremetal-wizard.sh

Comment thread deploy/openshift-clusters/inventory_baremetal.ini.sample Outdated
Comment on lines +57 to +71
--cluster-name)
CLUSTER_NAME="$2"
shift 2
;;
--skip-verify)
SKIP_VERIFY=true
shift
;;
--verify-only)
VERIFY_ONLY=true
shift
;;
--config-base)
CONFIG_BASE="$2"
shift 2

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Validate option arguments before shifting.

--cluster-name and --config-base read $2 unconditionally. If either flag is passed without a value, the script aborts under set -u instead of returning a clean usage error.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh` around lines 57 - 71,
The option parsing in baremetal-adopt.sh assigns values for CLUSTER_NAME and
CONFIG_BASE from $2 without checking that an argument exists, so update the case
handling for --cluster-name and --config-base to validate the next token before
shifting. In the argument parsing loop, add a guard that emits a usage/error
message and exits cleanly when either flag is provided without a value, and only
then perform the shift 2.

Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh
Comment on lines +157 to +159
[[ ${#NODE_NAMES[@]} -eq 0 ]] && die "No nodes found in inventory"
info "Parsed ${#NODE_NAMES[@]} node(s) from inventory"
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Reject inventories that are not exactly two nodes.

This workflow is for TNF two-node adoption, but the parser accepts any non-empty inventory. A 3+ node file will still generate artifacts even though it does not match the supported topology.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh` around lines 157 - 159,
The inventory parser in baremetal-adopt.sh currently accepts any non-empty
NODE_NAMES list, but this workflow must only support exactly two nodes. Update
the validation logic around the NODE_NAMES parsing step to reject inventories
unless ${`#NODE_NAMES`[@]} is exactly 2, and keep the existing die/info flow in
the same parsing function so unsupported topologies fail before artifacts are
generated.

Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh Outdated
Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh Outdated
Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh Outdated
Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh Outdated
Comment thread deploy/openshift-clusters/scripts/baremetal-wizard.sh
Comment on lines +39 to +41
--output)
OUTPUT="$2"
shift 2

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Validate option arguments before reading $2.

--output consumes $2 unconditionally. Invoking the script as baremetal-wizard.sh --output exits with Bash's unbound-variable error instead of the intended usage message.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/scripts/baremetal-wizard.sh` around lines 39 - 41,
The --output option handling in baremetal-wizard.sh reads OUTPUT from $2 without
first checking that an argument exists, which can trigger an unbound-variable
failure instead of usage handling. Update the case branch for --output to
validate that the next positional argument is present before assigning OUTPUT
and shifting, and route the missing-argument path through the script’s existing
usage/error flow so baremetal-wizard.sh --output reports a proper message.

@fonta-rh fonta-rh marked this pull request as ready for review June 29, 2026 13:00
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 29, 2026
@openshift-ci openshift-ci Bot requested review from jaypoulz and qJkee June 29, 2026 13:01
@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 29, 2026
@fonta-rh fonta-rh force-pushed the ocpedge-2774-baremetal-adopt branch from daac159 to 2f8c4ae Compare June 29, 2026 13:04
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 29, 2026
@coderabbitai coderabbitai Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 29, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
deploy/openshift-clusters/scripts/baremetal-adopt.sh (1)

91-143: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Align shell variable casing with the repository standard.

This new script uses lower-case local variables throughout. Please rename shell variables consistently to UPPER_CASE, or document an explicit local-variable exception. As per coding guidelines, “Use meaningful variable names in UPPER_CASE.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh` around lines 91 - 143,
The new parsing logic in baremetal-adopt.sh uses lower-case local variables,
which is inconsistent with the repository’s shell naming standard. Update the
variables used in the main loop and parsing blocks (such as in_nodes, in_vars,
key, val, name, rest, and the BMC-related locals) to UPPER_CASE, or add a clear
explicit exception if local lowercase variables are intended. Keep the naming
consistent throughout the script, including the variable assignments and
condition checks in the same section.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh`:
- Around line 8-16: The help text and generated comments in baremetal-adopt.sh
use the wrong script name, which can mislead users. Update the usage line and
any related generated output strings in the script to reference
baremetal-adopt.sh consistently, and check the help-printing logic or comment
block that emits these labels so the displayed command name matches the actual
script.
- Around line 169-187: The Redfish curl calls in the BMC helper functions are
exposing `${bmc_pass}` on the command line, so update the request flow in the
systems lookup and `verify_bmc` logic to avoid argv-based credentials. Use a
temporary restrictive curl config or netrc-based approach for authentication,
reference the existing `systems_json` fetch and `http_code` check, and ensure
any temporary secret file is removed immediately after the request completes.

---

Nitpick comments:
In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh`:
- Around line 91-143: The new parsing logic in baremetal-adopt.sh uses
lower-case local variables, which is inconsistent with the repository’s shell
naming standard. Update the variables used in the main loop and parsing blocks
(such as in_nodes, in_vars, key, val, name, rest, and the BMC-related locals) to
UPPER_CASE, or add a clear explicit exception if local lowercase variables are
intended. Keep the naming consistent throughout the script, including the
variable assignments and condition checks in the same section.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 77e14be3-4ba0-4be6-a3b0-3da7d3f23182

📥 Commits

Reviewing files that changed from the base of the PR and between daac159 and 2f8c4ae.

📒 Files selected for processing (5)
  • deploy/Makefile
  • deploy/openshift-clusters/.gitignore
  • deploy/openshift-clusters/inventory_baremetal.ini.sample
  • deploy/openshift-clusters/scripts/baremetal-adopt.sh
  • deploy/openshift-clusters/scripts/baremetal-wizard.sh
✅ Files skipped from review due to trivial changes (2)
  • deploy/openshift-clusters/.gitignore
  • deploy/openshift-clusters/inventory_baremetal.ini.sample
🚧 Files skipped from review as they are similar to previous changes (2)
  • deploy/Makefile
  • deploy/openshift-clusters/scripts/baremetal-wizard.sh

Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh
Comment on lines +169 to +187
systems_json=$(curl -sk --connect-timeout 5 --max-time 10 \
-u "${bmc_user}:${bmc_pass}" \
"https://${bmc_ip}/redfish/v1/Systems/" 2>/dev/null) || return 1

echo "${systems_json}" | jq -r '.Members[0]."@odata.id"' 2>/dev/null
}

verify_bmc() {
local name="$1" bmc_ip="$2" bmc_user="$3" bmc_pass="$4"
local rc=0

printf " %-12s %-20s " "${name}" "${bmc_ip}"

# Verify Redfish root is reachable and credentials work
local http_code
http_code=$(curl -sk --connect-timeout 5 --max-time 10 \
-o /dev/null -w '%{http_code}' \
-u "${bmc_user}:${bmc_pass}" \
"https://${bmc_ip}/redfish/v1/" 2>/dev/null) || http_code="000"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Avoid passing BMC passwords in curl argv.

Both Redfish calls put ${bmc_pass} in the curl command line, which can leak through process listings while the request is running. Use a restrictive temporary curl config/netrc file and delete it after the request. As per coding guidelines, “Handle secrets and credentials securely.”

🧰 Tools
🪛 ast-grep (0.44.0)

[warning] 183-186: curl is invoked with -k/--insecure, which disables TLS certificate verification and exposes the connection to man-in-the-middle attacks. Remove the insecure flag and let curl validate the server certificate; if you need to trust a private CA, pin it with --cacert instead.
Context: curl -sk --connect-timeout 5 --max-time 10
-o /dev/null -w '%{http_code}'
-u "${bmc_user}:${bmc_pass}"
"https://${bmc_ip}/redfish/v1/"
Note: [CWE-295] Improper Certificate Validation.

(curl-insecure-tls-bash)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh` around lines 169 - 187,
The Redfish curl calls in the BMC helper functions are exposing `${bmc_pass}` on
the command line, so update the request flow in the systems lookup and
`verify_bmc` logic to avoid argv-based credentials. Use a temporary restrictive
curl config or netrc-based approach for authentication, reference the existing
`systems_json` fetch and `http_code` check, and ensure any temporary secret file
is removed immediately after the request completes.

Source: Coding guidelines

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh`:
- Around line 179-181: The Redfish boot MAC discovery in baremetal-adopt.sh is
using a malformed EthernetInterfaces URL built from system_id, so fix the
request path in the ifaces_json curl call to target the proper
EthernetInterfaces collection. If no MAC can be discovered after that, do not
write DISCOVERY_FAILED into ironic_nodes.json; instead make the script abort
immediately and stop generating a placeholder port entry. Use the existing
discovery flow around ifaces_json and the ironic_nodes.json write path to locate
the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 0eacb9ad-52cd-4098-b98e-98bd9394bb01

📥 Commits

Reviewing files that changed from the base of the PR and between 70c5269 and 076fb67.

📒 Files selected for processing (3)
  • deploy/openshift-clusters/inventory_baremetal.ini.sample
  • deploy/openshift-clusters/scripts/baremetal-adopt.sh
  • deploy/openshift-clusters/scripts/baremetal-wizard.sh
✅ Files skipped from review due to trivial changes (1)
  • deploy/openshift-clusters/inventory_baremetal.ini.sample
🚧 Files skipped from review as they are similar to previous changes (1)
  • deploy/openshift-clusters/scripts/baremetal-wizard.sh

Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
deploy/openshift-clusters/scripts/baremetal-adopt.sh (1)

179-216: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use uppercase names for the new shell locals.

The new helper introduces lowercase variables such as boot_order, options_json, option_paths, and raw_mac. As per coding guidelines, shell scripts should use UPPER_CASE for variable names.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh` around lines 179 - 216,
The new helper in baremetal-adopt.sh uses lowercase shell locals like
boot_order, options_json, option_paths, and raw_mac, which violates the shell
variable naming convention. Update the local variable names in the boot-order
parsing logic and the PXE IPv4/MAC matching block to UPPER_CASE, and make sure
every reference inside the same section is renamed consistently so the flow in
the boot order/BootOptions handling still works.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh`:
- Around line 187-189: The curl URL in the BootOptions lookup is concatenating
system_id directly with BootOptions, which breaks Redfish paths like
/redfish/v1/Systems/1. Update the URL construction in the baremetal-adopt.sh
logic that sets options_json so it inserts the missing separator between the
existing system_id value and BootOptions, preserving the expected
/.../Systems/1/BootOptions/ path.

---

Nitpick comments:
In `@deploy/openshift-clusters/scripts/baremetal-adopt.sh`:
- Around line 179-216: The new helper in baremetal-adopt.sh uses lowercase shell
locals like boot_order, options_json, option_paths, and raw_mac, which violates
the shell variable naming convention. Update the local variable names in the
boot-order parsing logic and the PXE IPv4/MAC matching block to UPPER_CASE, and
make sure every reference inside the same section is renamed consistently so the
flow in the boot order/BootOptions handling still works.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 405def63-c36a-4905-8be9-78cc95bbd331

📥 Commits

Reviewing files that changed from the base of the PR and between 076fb67 and 7b57fc5.

📒 Files selected for processing (1)
  • deploy/openshift-clusters/scripts/baremetal-adopt.sh

Comment thread deploy/openshift-clusters/scripts/baremetal-adopt.sh Outdated
#
# NOTE: This is separate from inventory.ini, which targets the hypervisor host.
# This file describes the physical baremetal nodes to be adopted as OpenShift nodes.
# inventory.ini → hypervisor (where dev-scripts runs)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a hypervisor necessary to deploy here ? Seems like overhead. I guess dev-scripts need to run somewhere ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there is no hypervisor deploy. These 2 comment lines are to describe the difference between inventory.ini and inventory_baremetal.ini

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, ok. Sorry this makes sense now that I have context of the existing inventory.ini

@dhensel-rh dhensel-rh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poka-yoke review — silent-wrong-output risks

Three places where the script produces output that looks correct but causes failures downstream:

1. JSON injection via string interpolation (baremetal-adopt.sh:290-315)

generate_ironic_nodes_json() interpolates BMC credentials directly into a heredoc. A password like P@ss"word or admin\n produces malformed JSON. The trailing jq . catches it, but the error is a cryptic parse failure with no hint which field broke.

Fix: Build the JSON with jq instead of string interpolation:

jq -n --arg addr "redfish://${bmc_address}/${system_id}" \
      --arg user "${bmc_user}" \
      --arg pass "${bmc_pass}" \
      --arg mac  "${boot_mac}" \
      '{name: $name, driver: "redfish", driver_info: {address: $addr, username: $user, password: $pass}, ports: [{address: $mac}]}'

2. DISCOVERY_FAILED written as a MAC address (baremetal-adopt.sh:278-281)

When Redfish MAC discovery fails, the script sets boot_mac="DISCOVERY_FAILED" and writes it into ironic_nodes.json as "address": "DISCOVERY_FAILED". This is syntactically valid JSON, so it passes any schema check and reaches dev-scripts, which fails later with an unrelated error.

Fix: Fail hard or track failures and exit non-zero after generation:

echo "  ERROR: ${name}: could not discover boot MAC — set boot_mac in inventory" >&2
INCOMPLETE=true

Then after the loop: ${INCOMPLETE} && die "Incomplete artifacts — set missing boot_mac values in inventory".

3. --skip-verify still contacts BMCs for discovery (baremetal-adopt.sh:268-283)

The flag skips credential verification but Redfish discovery (system ID + boot MAC) still runs. A user passing --skip-verify because they're on a different network gets silent DISCOVERY_FAILED values baked into the artifact.

Fix: Either gate all BMC contact behind --skip-verify (requiring boot_mac to be set in inventory), or rename the flag to --skip-credential-check so the scope is clear.

fonta-rh added a commit to fonta-rh/two-node-toolbox-tnf that referenced this pull request Jun 29, 2026
- Fix jq null handling in Redfish system ID discovery (// empty)
- Fix script names in usage text and generated comments
- Fix missing / separator in BootOptions URL construction
- Add --inventory flag and info messages to wizard trigger
- Add warning when inventory has != 2 nodes
- Honor bmc_verify_ca in curl calls via bmc_curl wrapper
- Restrict permissions on generated credential artifacts (umask 077)

Co-Authored-By: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fonta-rh added a commit to fonta-rh/two-node-toolbox-tnf that referenced this pull request Jun 29, 2026
- Replace heredoc JSON interpolation with jq -n --arg to prevent
  malformed output from passwords containing quotes or backslashes
- Fail hard on missing boot MAC instead of writing DISCOVERY_FAILED
  placeholder that silently breaks dev-scripts downstream
- Gate all BMC contact (discovery + verification) behind --skip-verify
  so offline users don't get silent failures baked into artifacts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fonta-rh

fonta-rh commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

/retest-required

@openshift-ci openshift-ci Bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 2, 2026
fonta-rh and others added 5 commits July 2, 2026 11:57
Introduce adopt-baremetal.sh to onboard existing baremetal nodes into
the dev-scripts deployment workflow. Parses inventory_baremetal.ini
(template or wizard-generated), validates BMC credentials via Redfish,
and generates ironic_nodes.json + config_baremetal_fencing.sh artifacts
for NODES_PLATFORM=baremetal deployments.

OCPEDGE-2774

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Split the interactive wizard out of adopt-baremetal.sh into its own
baremetal-wizard.sh script. Add input validation (IPv4, MAC format,
hostname), re-prompt on invalid input instead of dying, a summary
table with masked passwords before confirmation, and Y/n/q flow.

Rename all baremetal scripts and Make targets to use a baremetal-*
prefix for consistent grouping (baremetal-adopt, baremetal-verify,
baremetal-wizard).

OCPEDGE-2774

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The wizard only accepted IPv4 addresses for BMC endpoints, but real
environments (e.g., HPE iLO) commonly use FQDNs. Accept both IPv4
and hostnames, and update prompts/labels accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bmc_address accepts both IPs and hostnames (matching wizard change).
boot_mac is now optional — when omitted, the adopt script queries
Redfish EthernetInterfaces for an enabled NIC's MAC. Falls back to
a clear warning if discovery fails (e.g., firmware doesn't expose MACs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the EthernetInterfaces-based discovery (which returned the
wrong NIC) with BootOptions-based discovery. Walks the BIOS boot
order, finds the first PXE IPv4 entry, and extracts the MAC from
the UEFI device path. Tested against HPE iLO 5 (Edgeline e920t).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fonta-rh and others added 7 commits July 2, 2026 11:57
- Fix jq null handling in Redfish system ID discovery (// empty)
- Fix script names in usage text and generated comments
- Fix missing / separator in BootOptions URL construction
- Add --inventory flag and info messages to wizard trigger
- Add warning when inventory has != 2 nodes
- Honor bmc_verify_ca in curl calls via bmc_curl wrapper
- Restrict permissions on generated credential artifacts (umask 077)

Co-Authored-By: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace heredoc JSON interpolation with jq -n --arg to prevent
  malformed output from passwords containing quotes or backslashes
- Fail hard on missing boot MAC instead of writing DISCOVERY_FAILED
  placeholder that silently breaks dev-scripts downstream
- Gate all BMC contact (discovery + verification) behind --skip-verify
  so offline users don't get silent failures baked into artifacts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Redfish @odata.id values include a trailing slash (/redfish/v1/Systems/1/),
which produced a double-slash URL (.../Systems/1//BootOptions/) that iLO
returns null content for. Strip the trailing slash after the leading one.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add declarative [baremetal_network] INI section for cluster-wide network
vars (machine_network, gateway, api_vip, ingress_vip) and per-node
node_ip field. The adoption script translates these into dev-scripts
exports (EXTERNAL_SUBNET_V4, BAREMETAL_GATEWAY, BAREMETAL_API_VIP,
BAREMETAL_INGRESS_VIP, BAREMETAL_IPS) and always emits bridge overrides
(MANAGE_BR_BRIDGE=n, MANAGE_PRO_BRIDGE=n, MANAGE_INT_BRIDGE=n).

All new fields are optional for backward compatibility. The wizard emits
skipped fields as commented placeholders so users can fill them later
without referencing the sample file. BAREMETAL_IPS is only emitted when
ALL nodes have node_ip set.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The wizard now collects optional provisioning host config (ssh_target,
ssh_key, dev_scripts_path, working_dir) and writes a [provisioning_host]
section to inventory_baremetal.ini. Skipped fields are commented out.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Without an explicit port in the Redfish URL, CEO's fence_redfish agent
fails to connect to the BMC during TNF auth jobs. The port (default 443)
now flows through verification, discovery, and ironic_nodes.json generation
as redfish://host:port/redfish/v1/Systems/1.

Adds bmc_port as a per-node field with a group default in
[baremetal_nodes:vars], matching the existing inheritance pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Place ironic_nodes.json and config_baremetal_fencing.sh in
roles/dev-scripts/install-dev/files/ instead of clusters/<name>/
so existing Ansible config.yml tasks can reference them by basename.
Remove --cluster-name flag (only used for the clusters/ subdirectory).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fonta-rh fonta-rh force-pushed the ocpedge-2774-baremetal-adopt branch from 73d00d9 to 8d9c936 Compare July 2, 2026 09:58
fonta-rh and others added 5 commits July 2, 2026 12:55
On multi-NIC servers the PXE boot NIC may differ from the data NIC.
Add a per-node data_mac inventory field that emits BAREMETAL_MACS in
config_baremetal_fencing.sh so dev-scripts uses the correct MAC for
agent-config hostname mapping. Also add the missing [provisioning_host]
section to the inventory sample.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ve GATEWAY

Three vars are required for the baremetal DHCP deploy path but were
silently optional: BAREMETAL_IPS, BAREMETAL_API_VIP, and
AGENT_E2E_TEST_SCENARIO. Fail early at adoption time instead of
crashing deep in dev-scripts. Remove GATEWAY (nothing reads it) and
add a post-generation reminder to verify CI_TOKEN and release image.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
DHCP is the only supported network mode and agent-config always needs
the data NIC MAC for hostname mapping. Error early if data_mac is
missing rather than silently falling back to the boot MAC.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BMCs need the full agent ISO URL to mount via Redfish VirtualMedia.
Add iso_url as a required field in [baremetal_network], parsed and
emitted as BAREMETAL_ISO_SERVER in config_baremetal_fencing.sh.
Also make data_mac mandatory since DHCP is the only supported mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WIZ_PASSES=()
WIZ_MACS=()
WIZ_NODE_IPS=()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need these two ?

Suggested change
WIZ_PORTS=()
WIZ_DATA_MACS=()

@dhensel-rh dhensel-rh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poka-yoke review — 4 mistake-proofing findings (see inline comments).

##############################################################################

main() {
parse_args "$@"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poka-yoke: --skip-verify + --verify-only are contradictory but accepted

With both flags set, the script skips verification, then prints "Verification complete" and exits — telling the user verification passed when nothing was verified.

Suggestion — add a mutual-exclusion check right after parse_args:

if ${SKIP_VERIFY} && ${VERIFY_ONLY}; then
    die "--skip-verify and --verify-only are mutually exclusive"
fi

system_id="redfish/v1/Systems/1"
else
system_id=$(discover_redfish_system_id "${bmc_address}" "${bmc_user}" "${bmc_pass}" "${bmc_port}" 2>/dev/null) || true
system_id="${system_id:-/redfish/v1/Systems/1}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poka-yoke: silent fallback produces plausible-but-wrong artifacts

When discover_redfish_system_id fails, this silently falls back to /redfish/v1/Systems/1. The generated ironic_nodes.json looks correct but may point to the wrong system on a multi-system BMC (blade chassis). The failure only surfaces much later during deployment.

Suggestion — log when falling back so the user knows discovery failed:

if [[ -z "${system_id}" ]]; then
    echo "  WARNING: ${name}: Redfish system discovery failed, using default /redfish/v1/Systems/1" >&2
    system_id="/redfish/v1/Systems/1"
fi

"${SCRIPT_DIR}/baremetal-wizard.sh" --output "${INVENTORY}"
fi

parse_inventory

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poka-yoke: validate required fields before expensive BMC calls

data_mac and node_ip are stored as empty strings here, then BMC verification runs (multiple Redfish round-trips per node). Only during artifact generation does the script die with "data_mac is required". All that network wait was wasted on something that could have been caught instantly.

Suggestion — validate right after parse_inventory:

parse_inventory

for ((i = 0; i < ${#NODE_NAMES[@]}; i++)); do
    [[ -z "${NODE_DATA_MACS[$i]}" ]] && die "Node '${NODE_NAMES[$i]}': data_mac is required"
    [[ -z "${NODE_IPS[$i]}" ]] && die "Node '${NODE_NAMES[$i]}': node_ip is required"
done

;;
*)
echo ""
info "Starting over — re-enter node information"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poka-yoke: typo in confirmation restarts the entire wizard

The *) catch-all restarts data entry from scratch. Typing "ye" instead of "yes", or any accidental keypress, discards all node data the user just entered — potentially several minutes of BMC details.

Suggestion — reject unknown input and re-prompt instead of restarting:

while true; do
    read -rp "Proceed with this configuration? [Y/n/q]: " confirm
    confirm="${confirm:-Y}"
    case "${confirm}" in
        [Yy]|[Yy]es) break 2 ;;   # break outer loop
        [Nn]|[Nn]o)  info "Starting over"; break ;;  # restart
        [Qq]|[Qq]uit) die "Wizard cancelled by user" ;;
        *) echo "  Please enter Y, n, or q" >&2 ;;
    esac
done

@dhensel-rh dhensel-rh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: wizard retry loop corrupts data (see inline comment).

Comment on lines +406 to +411
WIZ_NAMES=()
WIZ_IPS=()
WIZ_USERS=()
WIZ_PASSES=()
WIZ_MACS=()
WIZ_NODE_IPS=()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: WIZ_PORTS and WIZ_DATA_MACS not reset on retry

This block resets 6 of the 8 arrays but misses WIZ_PORTS and WIZ_DATA_MACS. If the user answers "n" at the summary and re-enters data, ports and data MACs from the first round accumulate. On the second pass with 2 nodes, WIZ_PORTS has 4 entries — the inventory writer then indexes past the intended range and writes stale values from round 1.

Add the missing two:

WIZ_NAMES=()
WIZ_IPS=()
WIZ_USERS=()
WIZ_PASSES=()
WIZ_PORTS=()
WIZ_MACS=()
WIZ_DATA_MACS=()
WIZ_NODE_IPS=()

@dhensel-rh dhensel-rh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Limitation: INI parser silently truncates passwords with spaces (see inline comments).

rest="${line#* }"

local bmc_address="" bmc_user="" bmc_pass="" bmc_port="" boot_mac="" node_ip="" data_mac=""
for pair in ${rest}; do

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Limitation: word-splitting silently truncates values containing spaces

for pair in ${rest} splits on whitespace. A password like bmc_pass=my secret parses as bmc_pass=my (truncated) and secret (dropped silently — no case branch matches bare words). The user gets no error; the password just doesn't work at deploy time.

Not necessarily worth fixing (the INI format inherently doesn't support unquoted spaces), but worth documenting the constraint in the sample file so users know to avoid spaces in passwords.

# bmc_address - BMC/iDRAC/iLO management address (IP or hostname)
# bmc_user - BMC login username
# bmc_pass - BMC login password
# bmc_port - (optional) BMC Redfish port (default: 443)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a note that values must not contain spaces (the parser uses whitespace as the field delimiter). e.g.:

#   bmc_pass   - BMC login password (must not contain spaces)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants