OCPEDGE-2416: Add oc adm transition topology command for SNO to HA transitions#2295
OCPEDGE-2416: Add oc adm transition topology command for SNO to HA transitions#2295jaypoulz wants to merge 2 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a feature-gated ChangesTransition Topology Command
Estimated code review effort: 4 (Complex) | ~60 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error)
✅ Passed checks (14 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/cli/admin/admin.go`:
- Around line 79-82: Add a small table-driven unit test for NewCommandAdmin that
verifies the feature-gated topology command registration in both enabled and
disabled cases. Cover the branch that calls
kcmdutil.FeatureGate(transitionTopologyFeatureGate).IsEnabled() and appends
transition.NewCmdTransition to clusterManagement, asserting that oc adm topology
is present only when the gate is enabled and absent when disabled.
In `@pkg/cli/admin/transition/preflight/client_validator_test.go`:
- Around line 222-230: Replace the `t.Skip(...)` placeholders in
`TestValidate_SNOToHA_SomeChecksFail` and `TestValidate_SNOToHA_AllChecksFail`
with real table-driven `testing.T` coverage for `ClientSideValidator.Validate`,
exercising the remaining operator/node/etcd/error-path branches. Use co-located
cases in `client_validator_test.go`, mirror the existing validator test style,
and assert the expected failures instead of skipping so the new logic is
actually covered.
In `@pkg/cli/admin/transition/preflight/client_validator.go`:
- Around line 270-297: The infrastructure node count check in
ClientSideValidator.validateExactInfrastructureNodeCount is too strict for
general HA transitions because it always requires zero non-control-plane nodes.
Update this validation so dedicated worker nodes are not treated as invalid for
every HA transition, either by removing the exact-zero requirement or by gating
it behind a compact-topology-specific signal; keep the existing count logic and
CheckResult flow in place.
In `@pkg/cli/admin/transition/preflight/types_test.go`:
- Around line 49-201: Add table-driven tests in types_test.go for the untested
helper methods on ValidationResult and CheckResult, especially
HasErrorCheckFailures, HasWarningCheckFailures, and ValidationResult.Error().
Cover passed, failed, and unknown combinations so the severity aggregation
behavior is locked down. Use the existing ValidationResult and CheckResult
symbols to verify the returned booleans and the aggregate error text matches the
expected block/bypass output used by client_validator.go and prompt.go.
In `@pkg/cli/admin/transition/prompt/types.go`:
- Around line 15-19: Update the interface comment in PromptForTransition to
match the actual warning-path behavior: warning-severity failures should block
and return false unless --allow-transition-with-warnings is set, in which case
they may proceed. Keep the bullet list aligned with the current implementation
and tests, and adjust the wording around the warning case so it reflects the
immediate-false flow instead of prompting.
In `@pkg/cli/admin/transition/status.go`:
- Around line 153-198: The etcd status command currently prints fetch failures
from the operator and ConfigMap lookups but still returns success, so it should
instead track those section errors and fail the command after the report is
printed. Update the status flow in the command that reads from
o.operatorClient.OperatorV1().Etcds().Get and
o.kubeClient.CoreV1().ConfigMaps(...).Get to collect any errors while still
emitting partial output, then return a utilerrors.Aggregate built from the
accumulated failures instead of nil when one or more lookups fail.
- Around line 60-64: The StatusOptions.Run flow is using a non-cancellable
context and swallowing read errors, so update it to use a timeout-bound context
instead of context.TODO() for the ConfigV1/Get reads and any related API calls.
In Run, thread that cancellable context through the infrastructure and
etcd/ConfigMap fetches, and make sure failures are returned to the caller rather
than only printed. Keep the logic in StatusOptions.Run and the read helpers
consistent so any health-check read failure causes the command to exit with an
error.
In `@pkg/cli/admin/transition/transition_test.go`:
- Around line 11-235: The current test file only covers metadata, validation,
and discovery; the new mutating code in TransitionOptions.runInitiate,
TransitionOptions.runStatus, and related status helpers still lacks coverage.
Add table-driven tests in this co-located *_test.go using fake.NewClientset()
and action recording to verify the Infrastructure update, warning-bypass
behavior, --dry-run behavior, and partial etcd failure handling. Make sure the
tests exercise the initiate/status paths in
pkg/cli/admin/transition/transition.go and pkg/cli/admin/transition/status.go
rather than only the command structure and discovery output.
In `@pkg/cli/admin/transition/transition.go`:
- Around line 208-226: Run() is using an unbounded context, so the
Infrastructure GET and later cluster calls can stall indefinitely; thread the
command context from Complete() into TransitionOptions, then use a bounded
timeout context inside TransitionOptions.Run before calling
configClient.ConfigV1().Infrastructures().Get and the discovery/initiate paths.
Update the Run flow to derive from the stored cmd.Context() instead of
context.TODO(), and ensure any helper invoked from Run uses that bounded context
as well.
- Around line 118-127: The root command in transition.NewCmd/transition command
setup currently accepts unexpected positional args because it only defines Run
without args validation. Add cobra.NoArgs to the cobra.Command for the root
“topology [status]” command so typos like extra positional arguments fail fast
instead of entering the root flow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| if kcmdutil.FeatureGate(transitionTopologyFeatureGate).IsEnabled() { | ||
| clusterManagement = append(clusterManagement, transition.NewCmdTransition(f, streams)) | ||
| } | ||
|
|
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Add a unit test for the feature-gated command registration.
This branch decides whether oc adm topology exists at all, so it should have a small table-driven NewCommandAdmin test covering the gate enabled/disabled cases. As per coding guidelines, "Write unit tests for every change."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/cli/admin/admin.go` around lines 79 - 82, Add a small table-driven unit
test for NewCommandAdmin that verifies the feature-gated topology command
registration in both enabled and disabled cases. Cover the branch that calls
kcmdutil.FeatureGate(transitionTopologyFeatureGate).IsEnabled() and appends
transition.NewCmdTransition to clusterManagement, asserting that oc adm topology
is present only when the gate is enabled and absent when disabled.
Source: Coding guidelines
| // TestNewValidationResult tests ValidationResult construction | ||
| func TestNewValidationResult(t *testing.T) { | ||
| current := configv1.SingleReplicaTopologyMode | ||
| target := configv1.HighlyAvailableTopologyMode | ||
|
|
||
| result := NewValidationResult(current, target) | ||
|
|
||
| if result.Current != current { | ||
| t.Errorf("expected current=%s, got %s", current, result.Current) | ||
| } | ||
| if result.Target != target { | ||
| t.Errorf("expected target=%s, got %s", target, result.Target) | ||
| } | ||
| if result.Status != ValidationStatusUnknown { | ||
| t.Errorf("expected status=Unknown, got %s", result.Status) | ||
| } | ||
| if len(result.Checks) != 0 { | ||
| t.Errorf("expected empty checks slice, got %d checks", len(result.Checks)) | ||
| } | ||
| } | ||
|
|
||
| // TestValidationResult_AddCheckResult tests adding check results | ||
| func TestValidationResult_AddCheckResult(t *testing.T) { | ||
| result := NewValidationResult(configv1.SingleReplicaTopologyMode, configv1.HighlyAvailableTopologyMode) | ||
|
|
||
| check1 := CheckResult{Name: "Check 1", Severity: CheckSeverityWarning, Status: CheckStatusPassed} | ||
| check2 := CheckResult{Name: "Check 2", Severity: CheckSeverityError, Status: CheckStatusFailed, Message: "failed"} | ||
|
|
||
| result.AddCheck(check1) | ||
| if len(result.Checks) != 1 { | ||
| t.Fatalf("expected 1 check, got %d", len(result.Checks)) | ||
| } | ||
| if result.Checks[0].Name != "Check 1" { | ||
| t.Errorf("expected first check name='Check 1', got %s", result.Checks[0].Name) | ||
| } | ||
|
|
||
| result.AddCheck(check2) | ||
| if len(result.Checks) != 2 { | ||
| t.Fatalf("expected 2 checks, got %d", len(result.Checks)) | ||
| } | ||
| if result.Checks[1].Name != "Check 2" { | ||
| t.Errorf("expected second check name='Check 2', got %s", result.Checks[1].Name) | ||
| } | ||
| } | ||
|
|
||
| // TestValidationResult_Status_AllPass tests status when all checks pass | ||
| func TestValidationResult_Status_AllPass(t *testing.T) { | ||
| result := NewValidationResult(configv1.SingleReplicaTopologyMode, configv1.HighlyAvailableTopologyMode) | ||
|
|
||
| result.AddCheck(CheckResult{Name: "Check 1", Severity: CheckSeverityWarning, Status: CheckStatusPassed}) | ||
| result.AddCheck(CheckResult{Name: "Check 2", Severity: CheckSeverityError, Status: CheckStatusPassed}) | ||
|
|
||
| if result.Status != ValidationStatusAvailable { | ||
| t.Errorf("expected status=Available when all checks pass, got %s", result.Status) | ||
| } | ||
| } | ||
|
|
||
| // TestValidationResult_Status_SomeFail tests status when checks fail | ||
| func TestValidationResult_Status_SomeFail(t *testing.T) { | ||
| result := NewValidationResult(configv1.SingleReplicaTopologyMode, configv1.HighlyAvailableTopologyMode) | ||
|
|
||
| result.AddCheck(CheckResult{Name: "Check 1", Severity: CheckSeverityWarning, Status: CheckStatusPassed}) | ||
| result.AddCheck(CheckResult{Name: "Check 2", Severity: CheckSeverityError, Status: CheckStatusFailed, Message: "failed"}) | ||
|
|
||
| if result.Status != ValidationStatusUnavailable { | ||
| t.Errorf("expected status=Unavailable when a check fails, got %s", result.Status) | ||
| } | ||
| } | ||
|
|
||
| // TestValidationResult_Status_Unknown tests status when a check is unknown | ||
| func TestValidationResult_Status_Unknown(t *testing.T) { | ||
| result := NewValidationResult(configv1.SingleReplicaTopologyMode, configv1.HighlyAvailableTopologyMode) | ||
|
|
||
| result.AddCheck(CheckResult{Name: "Check 1", Severity: CheckSeverityWarning, Status: CheckStatusPassed}) | ||
| result.AddCheck(CheckResult{Name: "Check 2", Severity: CheckSeverityError, Status: CheckStatusUnknown, Message: "could not check"}) | ||
|
|
||
| if result.Status != ValidationStatusUnknown { | ||
| t.Errorf("expected status=Unknown when a check is unknown, got %s", result.Status) | ||
| } | ||
| } | ||
|
|
||
| // TestCheckResult_Construction tests CheckResult construction | ||
| func TestCheckResult_Construction(t *testing.T) { | ||
| check := CheckResult{ | ||
| Name: "Test Check", | ||
| Severity: CheckSeverityError, | ||
| Status: CheckStatusFailed, | ||
| Message: "test failure", | ||
| } | ||
|
|
||
| if check.Name != "Test Check" { | ||
| t.Errorf("expected name='Test Check', got %s", check.Name) | ||
| } | ||
| if check.Severity != CheckSeverityError { | ||
| t.Errorf("expected severity=Error, got %s", check.Severity) | ||
| } | ||
| if check.Status != CheckStatusFailed { | ||
| t.Errorf("expected status=Failed, got %s", check.Status) | ||
| } | ||
| if check.Message != "test failure" { | ||
| t.Errorf("expected message='test failure', got %s", check.Message) | ||
| } | ||
| } | ||
|
|
||
| // TestCheckResult_String_Passed tests formatting of passed check | ||
| func TestCheckResult_String_Passed(t *testing.T) { | ||
| check := CheckResult{ | ||
| Name: "Passed Check", | ||
| Severity: CheckSeverityWarning, | ||
| Status: CheckStatusPassed, | ||
| } | ||
|
|
||
| expected := " Passed Check: passed" | ||
| if check.String() != expected { | ||
| t.Errorf("expected %q, got %q", expected, check.String()) | ||
| } | ||
| } | ||
|
|
||
| // TestCheckResult_String_Failed tests formatting of failed check with message | ||
| func TestCheckResult_String_Failed(t *testing.T) { | ||
| check := CheckResult{ | ||
| Name: "Failed Check", | ||
| Severity: CheckSeverityError, | ||
| Status: CheckStatusFailed, | ||
| Message: "something went wrong", | ||
| } | ||
|
|
||
| expected := " Failed Check: something went wrong" | ||
| if check.String() != expected { | ||
| t.Errorf("expected %q, got %q", expected, check.String()) | ||
| } | ||
| } | ||
|
|
||
| // TestCheckResult_String_Unknown tests formatting of unknown check | ||
| func TestCheckResult_String_Unknown(t *testing.T) { | ||
| check := CheckResult{ | ||
| Name: "Unknown Check", | ||
| Severity: CheckSeverityWarning, | ||
| Status: CheckStatusUnknown, | ||
| Message: "could not determine", | ||
| } | ||
|
|
||
| expected := " Unknown Check: unknown (could not determine)" | ||
| if check.String() != expected { | ||
| t.Errorf("expected %q, got %q", expected, check.String()) | ||
| } | ||
| } | ||
|
|
||
| // TestClientSideValidator_ImplementsInterface tests interface compliance | ||
| func TestClientSideValidator_ImplementsInterface(t *testing.T) { | ||
| // Compile-time check that ClientSideValidator implements Validator | ||
| var _ Validator = (*ClientSideValidator)(nil) | ||
| } |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Cover the untested severity helpers and aggregate error output.
HasErrorCheckFailures, HasWarningCheckFailures, and ValidationResult.Error() drive the block/bypass behavior used by pkg/cli/admin/transition/preflight/client_validator.go and pkg/cli/admin/transition/prompt/prompt.go, but this suite never exercises them. A small table-driven set for passed/failed/unknown combinations would lock down the behavior this command now depends on. As per coding guidelines, "Write unit tests for every change" and "Unit tests should use co-located *_test.go files with table-driven tests and standard testing.T".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/cli/admin/transition/preflight/types_test.go` around lines 49 - 201, Add
table-driven tests in types_test.go for the untested helper methods on
ValidationResult and CheckResult, especially HasErrorCheckFailures,
HasWarningCheckFailures, and ValidationResult.Error(). Cover passed, failed, and
unknown combinations so the severity aggregation behavior is locked down. Use
the existing ValidationResult and CheckResult symbols to verify the returned
booleans and the aggregate error text matches the expected block/bypass output
used by client_validator.go and prompt.go.
Source: Coding guidelines
| func (o *StatusOptions) Run() error { | ||
| ctx := context.TODO() | ||
|
|
||
| // Get Infrastructure resource | ||
| infra, err := o.configClient.ConfigV1().Infrastructures().Get(ctx, InfrastructureResourceName, metav1.GetOptions{}) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the target file and nearby context.
sed -n '1,260p' pkg/cli/admin/transition/status.go
# Find related uses of context and error handling in nearby admin transition code.
rg -n "context\\.TODO\\(|context\\.Background\\(|WithTimeout\\(|CheckErr\\(|Aggregate|fmt\\.Printf|Fprintf|klog" pkg/cli/admin/transition -g '*.go'Repository: openshift/oc
Length of output: 14259
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1,260p' pkg/cli/admin/transition/status.go
rg -n "context\\.TODO\\(|context\\.Background\\(|WithTimeout\\(|CheckErr\\(|Aggregate|fmt\\.Printf|Fprintf|klog" pkg/cli/admin/transition -g '*.go'Repository: openshift/oc
Length of output: 14259
Use a cancellable timeout context and surface read failures.
status still uses context.TODO() for every API call, so a slow request can hang indefinitely. It also prints etcd / ConfigMap read failures and returns success, which makes the command report a result when health could not be determined. Thread a timeout-bound context through the reads and return those errors instead of swallowing them.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/cli/admin/transition/status.go` around lines 60 - 64, The
StatusOptions.Run flow is using a non-cancellable context and swallowing read
errors, so update it to use a timeout-bound context instead of context.TODO()
for the ConfigV1/Get reads and any related API calls. In Run, thread that
cancellable context through the infrastructure and etcd/ConfigMap fetches, and
make sure failures are returned to the caller rather than only printed. Keep the
logic in StatusOptions.Run and the read helpers consistent so any health-check
read failure causes the command to exit with an error.
Source: Path instructions
| // Display etcd status | ||
| fmt.Fprintf(o.Out, "\netcd Status:\n\n") | ||
|
|
||
| etcd, err := o.operatorClient.OperatorV1().Etcds().Get(ctx, EtcdOperatorResourceName, metav1.GetOptions{}) | ||
| if err != nil { | ||
| fmt.Fprintf(o.Out, " Failed to get etcd operator: %v\n", err) | ||
| return nil | ||
| } | ||
|
|
||
| // Check etcd conditions | ||
| var etcdStatus []string | ||
| for _, cond := range etcd.Status.Conditions { | ||
| switch cond.Type { | ||
| case EtcdMembersAvailableCondition: | ||
| if cond.Status == "True" { | ||
| etcdStatus = append(etcdStatus, "Members available") | ||
| } else { | ||
| etcdStatus = append(etcdStatus, fmt.Sprintf("Members not available: %s", cond.Message)) | ||
| } | ||
| case "Progressing": | ||
| if cond.Status == "True" { | ||
| etcdStatus = append(etcdStatus, fmt.Sprintf("Progressing: %s", cond.Message)) | ||
| } | ||
| case "Degraded": | ||
| if cond.Status == "True" { | ||
| etcdStatus = append(etcdStatus, fmt.Sprintf("Degraded: %s", cond.Message)) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if len(etcdStatus) == 0 { | ||
| fmt.Fprintf(o.Out, " etcd healthy\n") | ||
| } else { | ||
| for _, s := range etcdStatus { | ||
| fmt.Fprintf(o.Out, " %s\n", s) | ||
| } | ||
| } | ||
|
|
||
| // Get etcd voting member count | ||
| cm, err := o.kubeClient.CoreV1().ConfigMaps(EtcdNamespace).Get(ctx, EtcdEndpointsConfigMapName, metav1.GetOptions{}) | ||
| if err != nil { | ||
| fmt.Fprintf(o.Out, " Failed to get %s ConfigMap: %v\n", EtcdEndpointsConfigMapName, err) | ||
| } else { | ||
| votingMembers := len(cm.Data) | ||
| fmt.Fprintf(o.Out, " Voting members: %d\n", votingMembers) | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Don't return success when etcd health could not be fetched.
The etcd operator and ConfigMap failures are printed and then ignored, so scripts still get exit code 0 even though the command could not determine etcd status. Please collect those section failures and return an aggregated error after printing the partial report.
As per coding guidelines, pkg/cli/**/*.go: utilerrors.Aggregate for batch errors.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/cli/admin/transition/status.go` around lines 153 - 198, The etcd status
command currently prints fetch failures from the operator and ConfigMap lookups
but still returns success, so it should instead track those section errors and
fail the command after the report is printed. Update the status flow in the
command that reads from o.operatorClient.OperatorV1().Etcds().Get and
o.kubeClient.CoreV1().ConfigMaps(...).Get to collect any errors while still
emitting partial output, then return a utilerrors.Aggregate built from the
accumulated failures instead of nil when one or more lookups fail.
Source: Coding guidelines
40a0310 to
f1c4c7d
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/cli/admin/transition/preflight/client_validator_test.go (1)
139-220: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy liftCollapse the repeated validator assertions into table-driven
cmp.Diffcases. The current one-off field checks are duplicated across several scenarios; assembling expectedValidationResult/CheckResultvalues and comparing them directly will keep this suite easier to extend and align it with the test conventions here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/client_validator_test.go` around lines 139 - 220, The SNO→HA happy-path test is manually asserting many fields one by one instead of using the table-driven cmp-based style used elsewhere in this suite. Refactor TestValidate_SNOToHA_AllChecksPass to build the expected ValidationResult and CheckResult values, then compare the full result with cmp.Diff so the assertions are centralized and easier to extend. Keep using the existing validator setup and the Validate method, but replace the repeated per-check status, severity, and name checks with direct diff-based comparisons.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/cli/admin/transition/preflight/client_validator_test.go`:
- Around line 139-220: The SNO→HA happy-path test is manually asserting many
fields one by one instead of using the table-driven cmp-based style used
elsewhere in this suite. Refactor TestValidate_SNOToHA_AllChecksPass to build
the expected ValidationResult and CheckResult values, then compare the full
result with cmp.Diff so the assertions are centralized and easier to extend.
Keep using the existing validator setup and the Validate method, but replace the
repeated per-check status, severity, and name checks with direct diff-based
comparisons.
f1c4c7d to
faf2c69
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
faf2c69 to
e3eb318
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
pkg/cli/admin/transition/transition.go (1)
147-149: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winBound the command context before the cluster calls.
This still threads an unbounded
cmd.Context()into the Infrastructure GET, preflight validation, and update path, so an API stall can hang the command indefinitely. Derive a timeout-scoped context inRun()and pass that bounded context through the rest of the flow. As per path instructions,**/*.go:context.Context for cancellation and timeouts.Suggested fix
import ( "context" "fmt" + "time" @@ +const defaultTransitionTimeout = 30 * time.Second + // Run executes the topology transition command func (o *TransitionOptions) Run() error { + ctx, cancel := context.WithTimeout(o.ctx, defaultTransitionTimeout) + defer cancel() + // Get current topology from Infrastructure resource - infra, err := o.configClient.ConfigV1().Infrastructures().Get(o.ctx, InfrastructureResourceName, metav1.GetOptions{}) + infra, err := o.configClient.ConfigV1().Infrastructures().Get(ctx, InfrastructureResourceName, metav1.GetOptions{}) if err != nil { return fmt.Errorf("failed to get Infrastructure resource: %w", err) } @@ - return o.runInitiateMode(o.ctx, currentTopology, targetTopology) + return o.runInitiateMode(ctx, currentTopology, targetTopology) }Also applies to: 215-231
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/transition.go` around lines 147 - 149, The transition command is still using an unbounded cmd.Context() for the Infrastructure GET, preflight validation, and update flow. In Run() on the transition command, derive a timeout-scoped context before any cluster/API calls and store/use that bounded context throughout the rest of the path. Update the relevant transition flow methods (including the Infrastructure GET and validation/update steps) to accept and pass this scoped context instead of cmd.Context().Source: Path instructions
pkg/cli/admin/transition/preflight/client_validator.go (1)
270-297: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winDon't treat dedicated workers as a readiness failure for every HA transition.
This still hard-requires exactly
0non-control-plane nodes and reports any worker-bearing cluster as needing "HighlyAvailable compact topology". A supported SNO → HA transition can legitimately keep dedicated workers, so this warning blocks the default flow on valid clusters unless the user opts into--allow-transition-with-warnings.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/client_validator.go` around lines 270 - 297, The HA infrastructure node check in validateExactInfrastructureNodeCount is too strict because it hardcodes required as 0 and fails any cluster with non-control-plane nodes. Update the logic so the default readiness path only warns or skips this condition for supported SNO→HA transitions, and keep the failure behind the existing --allow-transition-with-warnings behavior. Use validateExactInfrastructureNodeCount, CheckNameInfrastructureNodeCount, and the message construction to adjust the status/message so worker-bearing clusters are not treated as a blocking failure by default.
🧹 Nitpick comments (1)
pkg/cli/admin/transition/prompt/prompt_test.go (1)
134-220: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd the missing warning-bypass +
--yescase.The suite covers warnings with interactive confirmation and all-pass with
--yes, but it never exercises the combinedAllowTransitionWithWarnings=true+Yes=truebranch. That's the non-interactive warning path automation will use, so I'd add it—ideally as another row in a small table-driven matrix. As per coding guidelines,**/*_test.go:Write unit tests for every changeandUnit tests should use co-located *_test.go files with table-driven tests and standard testing.T.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/prompt/prompt_test.go` around lines 134 - 220, The transition prompt tests are missing coverage for the non-interactive warning bypass path when both AllowTransitionWithWarnings and Yes are true. Add a test case in PromptForTransition/InteractivePrompter that exercises this combined branch, ideally by converting the existing PromptForTransition cases into a small table-driven set. Verify the confirmation returns true and that the output matches the expected warning-bypass/--yes messaging without prompting for input.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/cli/admin/transition/preflight/types.go`:
- Around line 101-126: Preserve the aggregate Unknown state in
ValidationResult.AddCheck: right now CheckStatusPassed can reset vr.Status from
ValidationStatusUnknown back to ValidationStatusAvailable, even when the Unknown
came from a previous failed API call. Update AddCheck in ValidationResult so
once any CheckResult has Status CheckStatusUnknown, later passes never overwrite
that state, and only the initial no-checks state can transition to Available;
keep the existing precedence logic for CheckStatusFailed and CheckStatusUnknown
while using a separate “have seen any checks” condition if needed.
In `@pkg/cli/admin/transition/transition.go`:
- Around line 235-250: The user-facing output in
TransitionOptions.runDiscoveryMode is ignoring write failures from fmt.Fprintf,
so broken pipes and other output errors are lost; update this path to propagate
every write error back to the caller, ideally with a small helper around o.Out
writes. Apply the same pattern to the related output blocks in TransitionOptions
so all CLI messages fail cleanly when writing to stdout/stderr fails.
---
Duplicate comments:
In `@pkg/cli/admin/transition/preflight/client_validator.go`:
- Around line 270-297: The HA infrastructure node check in
validateExactInfrastructureNodeCount is too strict because it hardcodes required
as 0 and fails any cluster with non-control-plane nodes. Update the logic so the
default readiness path only warns or skips this condition for supported SNO→HA
transitions, and keep the failure behind the existing
--allow-transition-with-warnings behavior. Use
validateExactInfrastructureNodeCount, CheckNameInfrastructureNodeCount, and the
message construction to adjust the status/message so worker-bearing clusters are
not treated as a blocking failure by default.
In `@pkg/cli/admin/transition/transition.go`:
- Around line 147-149: The transition command is still using an unbounded
cmd.Context() for the Infrastructure GET, preflight validation, and update flow.
In Run() on the transition command, derive a timeout-scoped context before any
cluster/API calls and store/use that bounded context throughout the rest of the
path. Update the relevant transition flow methods (including the Infrastructure
GET and validation/update steps) to accept and pass this scoped context instead
of cmd.Context().
---
Nitpick comments:
In `@pkg/cli/admin/transition/prompt/prompt_test.go`:
- Around line 134-220: The transition prompt tests are missing coverage for the
non-interactive warning bypass path when both AllowTransitionWithWarnings and
Yes are true. Add a test case in PromptForTransition/InteractivePrompter that
exercises this combined branch, ideally by converting the existing
PromptForTransition cases into a small table-driven set. Verify the confirmation
returns true and that the output matches the expected warning-bypass/--yes
messaging without prompting for input.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 68174a2e-da7f-46d4-8b4b-6882db9a6fca
📒 Files selected for processing (13)
pkg/cli/admin/admin.gopkg/cli/admin/transition/constants.gopkg/cli/admin/transition/preflight/client_validator.gopkg/cli/admin/transition/preflight/client_validator_test.gopkg/cli/admin/transition/preflight/constants.gopkg/cli/admin/transition/preflight/types.gopkg/cli/admin/transition/preflight/types_test.gopkg/cli/admin/transition/prompt/prompt.gopkg/cli/admin/transition/prompt/prompt_test.gopkg/cli/admin/transition/prompt/types.gopkg/cli/admin/transition/status.gopkg/cli/admin/transition/transition.gopkg/cli/admin/transition/transition_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/cli/admin/transition/constants.go
🚧 Files skipped from review as they are similar to previous changes (5)
- pkg/cli/admin/admin.go
- pkg/cli/admin/transition/prompt/types.go
- pkg/cli/admin/transition/preflight/constants.go
- pkg/cli/admin/transition/prompt/prompt.go
- pkg/cli/admin/transition/status.go
e3eb318 to
449810d
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/cli/admin/transition/preflight/client_validator_test.go (1)
72-466: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy liftRefactor these validator cases into table-driven tests and compare assembled results with
cmp.Diff.
The repeated fake-client setup and field-by-field assertions across these cases make the matrix hard to extend; a table-driventesting.Tshape with full-result comparisons will keep the validator coverage easier to maintain.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/client_validator_test.go` around lines 72 - 466, The validator tests are highly repetitive and should be refactored into table-driven cases with assembled expected results compared via cmp.Diff. Consolidate the repeated fake client setup and assertions in the TestValidate_* functions, especially around NewClientSideValidator and the validate* helpers, into a single table of inputs/expected outputs. Build the expected ValidationResult/Check values per case and use cmp.Diff for the full comparison instead of many field-by-field checks.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/cli/admin/transition/preflight/client_validator.go`:
- Around line 130-149: The feature-gate validation in checkFeatureGateEnabled is
too broad because it returns success when MutableTopology appears in any
featureGates entry. Restrict the scan to the authoritative active version entry
from featureGate.Status.FeatureGates so preflight only passes when that specific
version has the gate enabled, and update the CheckResult path accordingly. Add a
regression test around client_validator.go that covers mixed-version
status.featureGates data to confirm an unrelated version no longer causes
CheckNameFeatureGateEnabled to pass.
In `@pkg/cli/admin/transition/transition.go`:
- Around line 331-338: The warning-check block in transition logic is printing
an error but still returns success, so update the
`result.HasWarningCheckFailures()` path in `transition.go` to return a non-nil
error instead of `nil` after the `fprintf` calls. Keep the existing user-facing
messages, but ensure the `o.AllowTransitionWithWarnings` guard in the transition
flow causes a non-zero exit by propagating an error from this branch.
---
Nitpick comments:
In `@pkg/cli/admin/transition/preflight/client_validator_test.go`:
- Around line 72-466: The validator tests are highly repetitive and should be
refactored into table-driven cases with assembled expected results compared via
cmp.Diff. Consolidate the repeated fake client setup and assertions in the
TestValidate_* functions, especially around NewClientSideValidator and the
validate* helpers, into a single table of inputs/expected outputs. Build the
expected ValidationResult/Check values per case and use cmp.Diff for the full
comparison instead of many field-by-field checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 2450944d-45c8-4007-95b1-d443c0454032
📒 Files selected for processing (10)
pkg/cli/admin/admin.gopkg/cli/admin/transition/constants.gopkg/cli/admin/transition/preflight/client_validator.gopkg/cli/admin/transition/preflight/client_validator_test.gopkg/cli/admin/transition/preflight/constants.gopkg/cli/admin/transition/preflight/types.gopkg/cli/admin/transition/preflight/types_test.gopkg/cli/admin/transition/status.gopkg/cli/admin/transition/transition.gopkg/cli/admin/transition/transition_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/cli/admin/transition/preflight/types_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/cli/admin/admin.go
- pkg/cli/admin/transition/constants.go
- pkg/cli/admin/transition/preflight/constants.go
- pkg/cli/admin/transition/status.go
|
@jaypoulz: This pull request references OCPEDGE-2416 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
449810d to
a210137
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
pkg/cli/admin/transition/preflight/client_validator.go (1)
171-199: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winSelect the feature-gate entry by payload version, not by index.
Line 183 assumes
status.featureGates[0]is the active gate set, but this list is versioned and mixed-version entries can coexist. That makes preflight depend on slice order instead of the cluster's active payload version, soMutableTopologycan be reported incorrectly. Please resolve the active version first and only inspect the matchingFeatureGateDetailsentry, with a mixed-version regression test alongside it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/client_validator.go` around lines 171 - 199, The FeatureGate check in client_validator.go is incorrectly assuming the first Status.FeatureGates entry is the active one; update the preflight logic in the FeatureGate validation path to resolve the cluster’s active payload version first, then inspect only the matching FeatureGateDetails entry for MutableTopology. Keep the CheckNameFeatureGateEnabled flow intact, but replace the index-based lookup in the activeVersion logic with version matching, and add a regression test covering mixed-version Status.FeatureGates ordering.
🧹 Nitpick comments (1)
pkg/cli/admin/transition/preflight/validator_helpers_test.go (1)
42-50: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winCompare the full
CheckResultin the table cases.These cases only assert
Status, so a regression inName,Severity, orMessagewould still pass even though those fields are part of the validator contract and affect phase gating/output. Populate an expectedCheckResultper case and compare withcmp.Diff. As per coding guidelines, "Use google/go-cmp for test assertions by assembling the expected object and comparing with actual, rather than checking individual fields with if statements."Also applies to: 210-223
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/validator_helpers_test.go` around lines 42 - 50, The table-driven validator tests only verify CheckStatus, so regressions in CheckResult.Name, Severity, or Message can slip through. Update the validator test cases around validatorTestCase and the related table entries to include a full expected CheckResult for each scenario, then compare the actual result from ClientSideValidator with the expected object using cmp.Diff instead of field-by-field checks. Ensure the assertions cover the complete contract returned by the validator, not just Status.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/cli/admin/transition/preflight/client_validator_test.go`:
- Around line 533-543: The test currently only verifies that some warning check
failed, which can miss regressions in the degraded cluster-operator path. Update
the assertions in client_validator_test.go to explicitly find and fail the
cluster-operator warning by name in result.Checks, using the existing
CheckSeverityWarning and CheckStatusFailed checks, so the test proves the
intended warning is the one failing.
---
Duplicate comments:
In `@pkg/cli/admin/transition/preflight/client_validator.go`:
- Around line 171-199: The FeatureGate check in client_validator.go is
incorrectly assuming the first Status.FeatureGates entry is the active one;
update the preflight logic in the FeatureGate validation path to resolve the
cluster’s active payload version first, then inspect only the matching
FeatureGateDetails entry for MutableTopology. Keep the
CheckNameFeatureGateEnabled flow intact, but replace the index-based lookup in
the activeVersion logic with version matching, and add a regression test
covering mixed-version Status.FeatureGates ordering.
---
Nitpick comments:
In `@pkg/cli/admin/transition/preflight/validator_helpers_test.go`:
- Around line 42-50: The table-driven validator tests only verify CheckStatus,
so regressions in CheckResult.Name, Severity, or Message can slip through.
Update the validator test cases around validatorTestCase and the related table
entries to include a full expected CheckResult for each scenario, then compare
the actual result from ClientSideValidator with the expected object using
cmp.Diff instead of field-by-field checks. Ensure the assertions cover the
complete contract returned by the validator, not just Status.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 5c62fc86-97cd-4ce0-a9af-ae636f4ac8f0
📒 Files selected for processing (11)
pkg/cli/admin/admin.gopkg/cli/admin/transition/constants.gopkg/cli/admin/transition/preflight/client_validator.gopkg/cli/admin/transition/preflight/client_validator_test.gopkg/cli/admin/transition/preflight/constants.gopkg/cli/admin/transition/preflight/types.gopkg/cli/admin/transition/preflight/types_test.gopkg/cli/admin/transition/preflight/validator_helpers_test.gopkg/cli/admin/transition/status.gopkg/cli/admin/transition/transition.gopkg/cli/admin/transition/transition_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
- pkg/cli/admin/admin.go
- pkg/cli/admin/transition/preflight/types_test.go
- pkg/cli/admin/transition/constants.go
- pkg/cli/admin/transition/preflight/constants.go
- pkg/cli/admin/transition/status.go
Update vendored dependencies to include Infrastructure.spec.controlPlaneTopology field from OCPEDGE-2408 (Mutable Topology enhancement). Key changes: - github.com/openshift/api: updated to commit 8f87a41c3 - github.com/openshift/client-go: updated to master Transitive dependency updates pulled by go mod vendor: - github.com/go-openapi/swag (refactored internal structure) - github.com/emicklei/go-restful/v3 - Various other indirect dependencies
63e0d4f to
cbd36f8
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: jaypoulz The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
cbd36f8 to
12bde7e
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
🧹 Nitpick comments (4)
pkg/cli/admin/transition/preflight/client_validator_test.go (1)
112-152: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoffPrefer
google/go-cmpover per-fieldifassertions.Nearly every test in this file asserts individual
CheckResult/ValidationResultfields via separateifstatements (e.g. Name, Severity, Status, Message checked independently). Consider assembling the expectedCheckResult/ValidationResultand diffing withcmp.Diff, which is more resilient to future field additions and produces clearer failure output.As per coding guidelines, "Use
google/go-cmpfor test assertions by assembling the expected object and comparing with actual, rather than checking individual fields with if statements."Also applies to: 178-210, 236-261, 275-305, 522-548
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/client_validator_test.go` around lines 112 - 152, Replace the per-field `if` assertions in `client_validator_test.go` with `google/go-cmp` comparisons by building the expected `ValidationResult` and `CheckResult` values and diffing them against the actual result. Update the checks around `result`, `result.Checks`, and the severity/status/name assertions in these tests to use `cmp.Diff`, keeping only any minimal sanity checks needed for collection sizes or ordering. Apply the same pattern consistently in the other affected test blocks referenced in this file so future field additions don’t require updating many individual assertions.Source: Coding guidelines
pkg/cli/admin/transition/preflight/types.go (1)
101-134: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAddCheck logic is correct but fragile — consider the simpler recompute-from-scratch approach.
Traced through all status-transition orderings (Passed/Failed/Unknown in various sequences) and the
isFirstCheck+ branching logic does correctly implement the stated priority (Unknown > NotReady > Ready), fixing the bug flagged in the earlier review. However, this branching style is easy to get subtly wrong on future edits (e.g. adding a newCheckStatusor reordering checks), and the emptyelse if vr.Status == ValidationStatusReady { // Keep Ready status }branch (lines 117-119) does nothing.A recompute-from-scratch loop over
vr.Checkson every call (as originally suggested) is harder to get wrong and easier to review at a glance.♻️ Suggested simplification
func (vr *ValidationResult) AddCheck(check CheckResult) { - // Check if this is the first check (to distinguish initial Unknown from API error Unknown) - isFirstCheck := len(vr.Checks) == 0 - vr.Checks = append(vr.Checks, check) - // Update overall status based on check results - // Priority: Unknown (API error) > Not Ready (check failed) > Ready (all passed) - switch check.Status { - case CheckStatusPassed: - // Set to Ready if this is the first check (initial Unknown state) - // Keep Ready if already Ready - // Do NOT transition from Unknown to Ready if Unknown came from a previous API error - if isFirstCheck && vr.Status == ValidationStatusUnknown { - vr.Status = ValidationStatusReady - } else if vr.Status == ValidationStatusReady { - // Keep Ready status - } - // else: keep current status (Not Ready/Unknown from previous checks) - - case CheckStatusFailed: - // Set to Not Ready if this is the first check (initial Unknown state) OR status is Ready - // Keep Unknown if it came from a previous API error (len(vr.Checks) > 1 && status is Unknown) - if isFirstCheck || vr.Status == ValidationStatusReady { - vr.Status = ValidationStatusNotReady - } - // else: keep Unknown status from previous API error - - case CheckStatusUnknown: - // Unknown (API error) takes precedence over everything - vr.Status = ValidationStatusUnknown - } + // Priority: Unknown (API error) > Not Ready (check failed) > Ready (all passed) + vr.Status = ValidationStatusReady + for _, c := range vr.Checks { + switch c.Status { + case CheckStatusUnknown: + vr.Status = ValidationStatusUnknown + return + case CheckStatusFailed: + vr.Status = ValidationStatusNotReady + } + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/types.go` around lines 101 - 134, The AddCheck method in ValidationResult uses fragile incremental branching with isFirstCheck and status-specific cases; simplify it by recomputing vr.Status from vr.Checks on each call instead of mutating based on prior state. Update AddCheck to append the new CheckResult, then iterate over vr.Checks to derive the final ValidationStatus using the same priority rules in ValidationResult/AddCheck (Unknown > NotReady > Ready), which also removes the no-op Ready branch and makes future changes safer.pkg/cli/admin/transition/preflight/client_validator.go (1)
83-122: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winUnused helper functions and duplicated node-listing boilerplate.
checkUnknown,checkFailed,checkPassed(lines 86-112) andlistNodes(lines 114-122) are defined but never called anywhere in this file — everyvalidate*method buildsCheckResult{}literals directly and re-implements the "list nodes, handle error" pattern independently invalidateControlPlaneNodeCount,validateExactInfrastructureNodeCount,validateControlPlaneNodesSchedulable, andvalidateControlPlaneNodesReady. This means 4 separateNodes().List()API calls perValidate()invocation, and the same failure-handling boilerplate repeated 4 times.Wiring these functions in would reduce duplication and cut redundant API round-trips.
♻️ Proposed refactor sketch
func (v *ClientSideValidator) validateControlPlaneNodeCount(ctx context.Context, required int) CheckResult { - nodes, err := v.kubeClient.CoreV1().Nodes().List(ctx, metav1.ListOptions{}) - if err != nil { - return CheckResult{ - Name: CheckNameControlPlaneNodeCount, - Severity: CheckSeverityWarning, - Status: CheckStatusUnknown, - Message: fmt.Sprintf("failed to list nodes: %v", err), - } - } + nodes, errResult := v.listNodes(ctx, CheckNameControlPlaneNodeCount, CheckSeverityWarning) + if errResult != nil { + return *errResult + }Apply similarly to the other three node-based checks, and consider fetching the node list once in
Validate()and passing it to all four checks to avoid 4x API calls.Also applies to: 287-437
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/client_validator.go` around lines 83 - 122, The helper functions checkUnknown, checkFailed, checkPassed, and listNodes are currently unused and the node-listing logic is duplicated across the validate* methods in ClientSideValidator. Refactor the node-based checks to use these helpers and share the Nodes().List() result, ideally by fetching nodes once in Validate() and passing the list into validateControlPlaneNodeCount, validateExactInfrastructureNodeCount, validateControlPlaneNodesSchedulable, and validateControlPlaneNodesReady, so the repeated boilerplate and extra API calls are removed.pkg/cli/admin/transition/preflight/validator_helpers_test.go (1)
56-226: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAll 8 subtests only exercise the passing scenario.
Every case in
TestIndividualValidatorssetsexpectedStatus: CheckStatusPassed. None exerciseCheckStatusFailedorCheckStatusUnknownfor the individual validators (e.g., unstable cluster operator, insufficient/unschedulable/not-ready control-plane nodes, non-zero infra nodes, etcd not-quorate/progressing, wrong voting-member count). If these negative paths aren't covered elsewhere (e.g. inclient_validator_test.go), consider adding failing-case rows here to match the table-driven pattern already in place.As per coding guidelines,
Table-driven unit tests are preferred for testing multiple scenarios/inputsandWrite unit tests for every change.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/validator_helpers_test.go` around lines 56 - 226, `TestIndividualValidators` currently only verifies happy-path behavior, so add table-driven rows that exercise negative outcomes for each validator and assert `CheckStatusFailed` or `CheckStatusUnknown` as appropriate. Use the existing helper-based setup in `validateClusterOperatorsStable`, `validateControlPlaneNodeCount`, `validateExactInfrastructureNodeCount`, `validateControlPlaneNodesSchedulable`, `validateControlPlaneNodesReady`, `validateEtcdQuorum`, `validateEtcdNotProgressing`, and `validateEtcdVotingMembers` to create failing inputs like unstable operators, insufficient or unschedulable control-plane nodes, non-zero infra nodes, missing quorum/progressing etcd, and wrong voting-member counts. Keep the same table-driven pattern and update the expected status fields per case.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/cli/admin/transition/preflight/client_validator_test.go`:
- Around line 112-152: Replace the per-field `if` assertions in
`client_validator_test.go` with `google/go-cmp` comparisons by building the
expected `ValidationResult` and `CheckResult` values and diffing them against
the actual result. Update the checks around `result`, `result.Checks`, and the
severity/status/name assertions in these tests to use `cmp.Diff`, keeping only
any minimal sanity checks needed for collection sizes or ordering. Apply the
same pattern consistently in the other affected test blocks referenced in this
file so future field additions don’t require updating many individual
assertions.
In `@pkg/cli/admin/transition/preflight/client_validator.go`:
- Around line 83-122: The helper functions checkUnknown, checkFailed,
checkPassed, and listNodes are currently unused and the node-listing logic is
duplicated across the validate* methods in ClientSideValidator. Refactor the
node-based checks to use these helpers and share the Nodes().List() result,
ideally by fetching nodes once in Validate() and passing the list into
validateControlPlaneNodeCount, validateExactInfrastructureNodeCount,
validateControlPlaneNodesSchedulable, and validateControlPlaneNodesReady, so the
repeated boilerplate and extra API calls are removed.
In `@pkg/cli/admin/transition/preflight/types.go`:
- Around line 101-134: The AddCheck method in ValidationResult uses fragile
incremental branching with isFirstCheck and status-specific cases; simplify it
by recomputing vr.Status from vr.Checks on each call instead of mutating based
on prior state. Update AddCheck to append the new CheckResult, then iterate over
vr.Checks to derive the final ValidationStatus using the same priority rules in
ValidationResult/AddCheck (Unknown > NotReady > Ready), which also removes the
no-op Ready branch and makes future changes safer.
In `@pkg/cli/admin/transition/preflight/validator_helpers_test.go`:
- Around line 56-226: `TestIndividualValidators` currently only verifies
happy-path behavior, so add table-driven rows that exercise negative outcomes
for each validator and assert `CheckStatusFailed` or `CheckStatusUnknown` as
appropriate. Use the existing helper-based setup in
`validateClusterOperatorsStable`, `validateControlPlaneNodeCount`,
`validateExactInfrastructureNodeCount`, `validateControlPlaneNodesSchedulable`,
`validateControlPlaneNodesReady`, `validateEtcdQuorum`,
`validateEtcdNotProgressing`, and `validateEtcdVotingMembers` to create failing
inputs like unstable operators, insufficient or unschedulable control-plane
nodes, non-zero infra nodes, missing quorum/progressing etcd, and wrong
voting-member counts. Keep the same table-driven pattern and update the expected
status fields per case.
d7567e2 to
21968a9
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
pkg/cli/admin/transition/transition.go (1)
434-457: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winConsider retrying on Update conflict.
applyTransitionperforms a Get then Update on Infrastructure without handling aConflicterror via retry. Since this call happens after the (potentially lengthy) preflight validation, a benign resourceVersion race would force the user to rerun the entire command including preflight checks. Wrapping the Update inretry.RetryOnConflictwould make this more resilient.♻️ Suggested fix
- // Get current Infrastructure resource - infra, err := o.configClient.ConfigV1().Infrastructures().Get(ctx, InfrastructureResourceName, metav1.GetOptions{}) - if err != nil { - return fmt.Errorf("failed to get Infrastructure resource: %w", err) - } - - // Patch control plane topology (spec only supports controlPlaneTopology) - // NOTE: InfrastructureTopology only exists in status, not spec. - // For SNO->HA transition, the cluster-config-operator will update both - // controlPlaneTopology and infrastructureTopology in status. - infra.Spec.ControlPlaneTopology = targetCP - - _, err = o.configClient.ConfigV1().Infrastructures().Update(ctx, infra, metav1.UpdateOptions{}) - if err != nil { - return fmt.Errorf("failed to update Infrastructure resource: %w", err) - } + // Patch control plane topology (spec only supports controlPlaneTopology) + // NOTE: InfrastructureTopology only exists in status, not spec. + // For SNO->HA transition, the cluster-config-operator will update both + // controlPlaneTopology and infrastructureTopology in status. + err := retry.RetryOnConflict(retry.DefaultRetry, func() error { + infra, err := o.configClient.ConfigV1().Infrastructures().Get(ctx, InfrastructureResourceName, metav1.GetOptions{}) + if err != nil { + return fmt.Errorf("failed to get Infrastructure resource: %w", err) + } + infra.Spec.ControlPlaneTopology = targetCP + _, err = o.configClient.ConfigV1().Infrastructures().Update(ctx, infra, metav1.UpdateOptions{}) + return err + }) + if err != nil { + return fmt.Errorf("failed to update Infrastructure resource: %w", err) + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/transition.go` around lines 434 - 457, applyTransition currently does a Get followed by a single Infrastructure Update, so a resourceVersion conflict will fail the whole transition after preflight work. Update the applyTransition flow in TransitionOptions to wrap the configClient.ConfigV1().Infrastructures().Update call with retry.RetryOnConflict, retrying the fetch-and-update path on Conflict errors. Keep the existing error wrapping and ensure the retry block still uses the infra object and targetCP assignment in applyTransition.pkg/cli/admin/transition/transition_test.go (1)
443-673: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd coverage for warning-bypass and blocking-error paths.
The three initiate tests cover patch/dry-run/already-at-target but not
result.HasErrorCheckFailures()(blocking, transition.go lines 398-401) orresult.HasWarningCheckFailures()with/without--allow-transition-with-warnings(transition.go lines 404-412). The existing fixtures (newFakeClusterOperator,newFakeEtcdOperator) already make it easy to simulate a degraded operator or a missing MutableTopology feature gate to exercise both branches.As per coding guidelines, "Write unit tests for every change" and "Unit tests should use co-located
*_test.gofiles with table-driven tests and standardtesting.T".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/transition_test.go` around lines 443 - 673, Add unit coverage in TransitionOptions.Run for the missing error-handling branches: one case where result.HasErrorCheckFailures() returns true and the command stops with a blocking error, and two cases for result.HasWarningCheckFailures() with and without allowing warnings. Use the existing initiate test patterns and fixtures like newFakeClusterOperator, newFakeEtcdOperator, and the MutableTopology feature gate setup to simulate degraded/missing-preflight conditions. Verify the non-allowed warning path exits as an error, while the --allow-transition-with-warnings path proceeds, alongside checking the emitted output.Source: Coding guidelines
pkg/cli/admin/transition/preflight/validator_helpers_test.go (1)
56-226: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winTable only covers passing scenarios for each validator.
Every case in
testCasesexercises the healthy/passing path. Failure branches for node/etcd validators (unschedulable node, not-ready node, quorum unavailable, progressing etcd, wrong voting-member count) aren't exercised anywhere in this file orclient_validator_test.go(only the cluster-operators-degraded path is covered viaTestValidate_WarningFailures). Consider adding failing-case rows per validator to close this gap.As per coding guidelines, "Table-driven unit tests are preferred for testing multiple scenarios/inputs."
♻️ Example additional case
{ name: "Control plane nodes schedulable - one tainted", setupKube: func() *fake.Clientset { return fake.NewSimpleClientset( newFakeNode("master-0", true, false, true, true), newFakeNode("master-1", true, false, true, false), // unschedulable newFakeNode("master-2", true, false, true, true), ) }, setupConfig: func() *fakeconfigclient.Clientset { return fakeconfigclient.NewSimpleClientset() }, setupOperator: func() *fakeoperatorclient.Clientset { return fakeoperatorclient.NewSimpleClientset() }, validate: func(v *ClientSideValidator) CheckResult { return v.validateControlPlaneNodesSchedulable(context.Background(), 3) }, expectedName: CheckNameControlPlaneNodesSchedulable, expectedSeverity: CheckSeverityWarning, expectedStatus: CheckStatusFailed, },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/cli/admin/transition/preflight/validator_helpers_test.go` around lines 56 - 226, The validator test table only covers happy paths, so add failure scenarios for the node and etcd checks in this table or the related validator tests. Extend the existing `testCases` in `validator_helpers_test.go` for the `ClientSideValidator` methods such as `validateControlPlaneNodesSchedulable`, `validateControlPlaneNodesReady`, `validateEtcdQuorum`, `validateEtcdNotProgressing`, and `validateEtcdVotingMembers` by creating rows with unschedulable/not-ready nodes, unavailable quorum, progressing etcd, and incorrect voting-member counts, and assert `CheckStatusFailed` with the same `expectedName`/`expectedSeverity` values.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/cli/admin/transition/preflight/client_validator.go`:
- Around line 200-214: The MutableTopology check currently assumes
FeatureGates[0] is the active payload, but Status.FeatureGates is keyed by
version and may contain multiple entries, so update the logic in the
feature-gate validation path to select the entry whose Version matches the
active cluster version before scanning Enabled for MutableTopologyFeatureGate.
Use the existing validation code around CheckNameFeatureGateEnabled in
client_validator.go, and add a regression test that covers multiple
Status.FeatureGates entries to ensure the correct version is chosen.
---
Nitpick comments:
In `@pkg/cli/admin/transition/preflight/validator_helpers_test.go`:
- Around line 56-226: The validator test table only covers happy paths, so add
failure scenarios for the node and etcd checks in this table or the related
validator tests. Extend the existing `testCases` in `validator_helpers_test.go`
for the `ClientSideValidator` methods such as
`validateControlPlaneNodesSchedulable`, `validateControlPlaneNodesReady`,
`validateEtcdQuorum`, `validateEtcdNotProgressing`, and
`validateEtcdVotingMembers` by creating rows with unschedulable/not-ready nodes,
unavailable quorum, progressing etcd, and incorrect voting-member counts, and
assert `CheckStatusFailed` with the same `expectedName`/`expectedSeverity`
values.
In `@pkg/cli/admin/transition/transition_test.go`:
- Around line 443-673: Add unit coverage in TransitionOptions.Run for the
missing error-handling branches: one case where result.HasErrorCheckFailures()
returns true and the command stops with a blocking error, and two cases for
result.HasWarningCheckFailures() with and without allowing warnings. Use the
existing initiate test patterns and fixtures like newFakeClusterOperator,
newFakeEtcdOperator, and the MutableTopology feature gate setup to simulate
degraded/missing-preflight conditions. Verify the non-allowed warning path exits
as an error, while the --allow-transition-with-warnings path proceeds, alongside
checking the emitted output.
In `@pkg/cli/admin/transition/transition.go`:
- Around line 434-457: applyTransition currently does a Get followed by a single
Infrastructure Update, so a resourceVersion conflict will fail the whole
transition after preflight work. Update the applyTransition flow in
TransitionOptions to wrap the configClient.ConfigV1().Infrastructures().Update
call with retry.RetryOnConflict, retrying the fetch-and-update path on Conflict
errors. Keep the existing error wrapping and ensure the retry block still uses
the infra object and targetCP assignment in applyTransition.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 5b176a87-9c8f-41cf-81a7-f5c7c61d3cf6
📒 Files selected for processing (11)
pkg/cli/admin/admin.gopkg/cli/admin/transition/constants.gopkg/cli/admin/transition/preflight/client_validator.gopkg/cli/admin/transition/preflight/client_validator_test.gopkg/cli/admin/transition/preflight/constants.gopkg/cli/admin/transition/preflight/types.gopkg/cli/admin/transition/preflight/types_test.gopkg/cli/admin/transition/preflight/validator_helpers_test.gopkg/cli/admin/transition/status.gopkg/cli/admin/transition/transition.gopkg/cli/admin/transition/transition_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/cli/admin/transition/constants.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/cli/admin/admin.go
- pkg/cli/admin/transition/preflight/constants.go
- pkg/cli/admin/transition/preflight/types_test.go
- pkg/cli/admin/transition/status.go
Implement SNO to HA topology transition command with comprehensive preflight validation, user confirmation prompts, and status monitoring. Features: - Three operational modes: discovery, initiate, status - Tiered preflight validation (Error and Warning severity checks) - Client-side feature gate (OC_ENABLE_CMD_TRANSITION_TOPOLOGY) - Server-side validation (MutableTopology feature gate check) - User confirmation with bypass support (--allow-transition-with-warnings) Preflight checks (10 total): - Error-severity (blocking): Supported transition, feature gate enabled - Warning-severity (bypassable): Cluster operators stable, control plane node count/schedulability/readiness, etcd quorum/stability/voting members Implementation structure: - pkg/cli/admin/transition/transition.go: Main command and orchestration - pkg/cli/admin/transition/status.go: Status monitoring subcommand - pkg/cli/admin/transition/preflight/: Validation logic and types - pkg/cli/admin/transition/prompt/: User interaction and confirmation Command is gated behind OC_ENABLE_CMD_TRANSITION_TOPOLOGY environment variable and only appears in cluster management commands when enabled. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
21968a9 to
846b4f7
Compare
|
/retest-required |
|
@jaypoulz: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary by CodeRabbit
oc adm transition topologyworkflow to discover and initiate control-plane topology transitions (dry-run by default, execute with confirmation).oc adm transition topology statusto show desired vs current topology, cluster operator health, and etcd quorum/voting-member details.