Generalized FMG hierarchy: custom-P transfers for arbitrary nested/non-nested grids (Layer 1)#290
Open
lmoresi wants to merge 15 commits into
Open
Generalized FMG hierarchy: custom-P transfers for arbitrary nested/non-nested grids (Layer 1)#290lmoresi wants to merge 15 commits into
lmoresi wants to merge 15 commits into
Conversation
Add set_custom_mg() + utilities/custom_mg.py: drive PCMG geometric multigrid with a prolongation we build ourselves (barycentric or RBF) from independent, possibly non-nested coarse meshes. Coarse operators come from Galerkin RAP (P^T A P), so only the prolongation list is supplied. This decouples geometric multigrid from a nested refine() hierarchy. Mechanism: inject the custom P before the first PCSetUp (avoids the MatProductReplaceMats shape error a live matrix swap triggers under Galerkin), and KSPSetDMActive(OPERATOR, False) so PETSc uses our explicit P instead of re-deriving DM-hierarchy interpolation. Finest level reduced to the BC-eliminated global ordering; coarse levels use full DOF coords. Validated (scalar Poisson, refinement=2 box): custom barycentric P from independent coarse meshes reaches FMG iteration counts (1-3 vs FMG 2) and the same solution. On a mesh with NO refine hierarchy (FMG unavailable -> GAMG, 13 iters) custom-P geometric MG converges in ~9 iters. test_1015 (4 tests); test_1014 (11) still pass. Single-field (scalar/vector) only; Stokes velocity-block path is a follow-up. Underworld development team with AI support from Claude Code
…coped SBR Phase 1 of hardening the custom-prolongation work into a generalized FMG hierarchy. - CustomMGHierarchy: adapter-agnostic multi-level hierarchy that builds custom-P transfers (barycentric/rbf) with essential BCs applied at EVERY level (transfers map reduced->reduced). This is the load-bearing invariant: on an exactly-nested hierarchy, omitting per-level reduction makes a coarse boundary DOF coincide with a BC-removed fine DOF -> zero column -> singular Galerkin coarse operator. build() guards against zero columns. - set_custom_fmg(): register a hierarchy + per-level solver factory; built and installed at solve() time (build-time injection, DMActive(OPERATOR,False), Galerkin RAP). Scalar / single-field-vector (top-level PC); Stokes velocity block is Phase 2. - sbr_refine / sbr_refine_where: local Skeleton-Based Refinement (no MMG, on-rank, conforming) with SCOPED dm_plex_transform_type (leaking it globally breaks UW's uniform refine() with err73). - Legacy finest-only path kept for back-compat (test_1015). Validated: scalar jump-coefficient Poisson on a 5-level (3 uniform + 2 SBR) hierarchy converges in 3 FMG iters vs GAMG 46, solution matches to 2.4e-8. test_1016 (3 tests); test_1015 (4) + test_1014 (11) still pass. Design: docs/developer/design/GENERALIZED_FMG_HIERARCHY_AND_ADAPT.md Underworld development team with AI support from Claude Code
Layer 1 (generalized FMG hierarchy) is an independent, working capability (arbitrary nested/non-nested coarse grids -> geometric FMG, BC-per-level). Current scope experimental: scalar/single-field, serial, factory API. Remaining before merge-ready general feature: Stokes velocity-block, drop the factory (label-based BC reduction), parallel. Underworld development team with AI support from Claude Code
Integrate the generalized FMG hierarchy into the saddle-point solver's velocity sub-block. set_custom_fmg(..., field_id=0) now drives geometric multigrid on the velocity block with our supplied (barycentric / RBF) prolongations + Galerkin RAP. The velocity sub-PC is unreachable until the monolithic Jacobian is assembled (PCFieldSplit forms A_vv via MatCreateSubMatrix; snes.setUp builds structure only -> err73). So _install_velocity_block_transfers: forces a Jacobian assembly (computeFunction + computeJacobian at the zero guess; throwaway max_it=0 fallback), reaches the velocity sub-PC, reset()s it and rebuilds a FRESH PCMG from our P (mirrors the proven standalone recipe; sidesteps the MatProductReplaceMats live-swap bug), re-attaches the coupled Stokes nullspace. _configure_pcmg now derives the options prefix from pc.getOptionsPrefix() so both the scalar top-level PC and the velocity sub-PC are configured correctly. inject_custom_mg is wired into SNES_Stokes_SaddlePt.solve (guarded no-op unless set_custom_fmg was called), injected after setFromOptions / nullspace, before the real solve. Validated (SolCx eta_B=1e6, 3-level nested hierarchy, in-solver via solver.solve()): velocity block converges in 6 MG iters vs GAMG 198, solution matches the GAMG reference to 1.5e-9. test_1017 (barycentric + rbf). test_1014/1015/1016 unchanged (20 pass total). Underworld development team with AI support from Claude Code
…, Step 1) Underworld development team with AI support from Claude Code
Each coarse level's BC-constrained reduced map is now derived directly from the coarse mesh DM via _coarse_reduced_map: clone the DM, copy the (built) finest solver's fields + DS onto it, createDS, read the global section. The DS carries UW's exact essential-BC definitions and is a topology-independent discretisation spec, so it constrains the matching boundary DOFs on ANY coarse mesh that shares the solver's boundary labels (nested or not). This removes the throwaway-solver factory (heavy, leak-prone, and an API wart): set_custom_fmg and CustomMGHierarchy.build no longer take a level_solver_factory. Leak-free — DM ops only, no SNES / JIT. Validated identical to the old factory path (reduced maps byte-equal: 126 + 510 DOFs on the SolCx 3-level hierarchy). test_1014/1015/1016/1017 all green (20). custom-P vs native FMG unchanged (6 vs 5 iters, sol match 3.7e-10). Underworld development team with AI support from Claude Code
Underworld development team with AI support from Claude Code
…tial) The reduced maps use rank-local DOF indices and the prolongations assemble as serial AIJ, so at np>1 custom-P would silently build wrong P (or hit a cryptic broadcast error). Add _require_serial: set_custom_fmg and inject_custom_mg now raise a clear NotImplementedError on >1 ranks, pointing to preconditioner='fmg'/'gamg'. Serial behaviour unchanged. Parallel test tests/parallel/test_1017_custom_mg_serial_guard_mpi.py (np=2) asserts both the set-time and legacy solve-time guards fire. Full parallel custom-P (nested co-partitioned, rank-local point location + ghosting, MPIAIJ assembly with global-section reduction) remains a designed fast-follow. Underworld development team with AI support from Claude Code
Underworld development team with AI support from Claude Code
…(Phase 1, Step 3) The hierarchy path now builds parallel-correct transfers. Each rank builds its block of P rank-locally: ghost-inclusive coarse coords mean every owned fine node lands in a local coarse simplex (verified 0 misses np=2/4), and the reduced global numbering rides the DM global section via a clean local->global-reduced map (_level_dof_layout scatters owned global indices out through globalToLocal; constrained DOFs stay -1, ghosts resolve to the owner's global index). Transfers assemble as MPIAIJ (owned fine rows, global coarse cols incl. off-rank); constrained coarse DOFs drop -> reduced->reduced. Parallel zero-column guard via P^T·1 + allreduce. _coarse_dof_layout reuses the leak-free copyDS trick for coarse levels. CustomMGHierarchy.build branches serial (scipy CSR) vs parallel (MPIAIJ). The guard now blocks only the legacy finest-only path (still serial). Validated np=1/2/4: scalar Poisson custom-P 4 iters and Stokes SolCx velocity block 6 iters, both matching the GAMG reference (and each other across rank counts). New tests/parallel/test_1017_custom_mg_parallel_mpi.py (scalar + Stokes correctness + legacy serial-guard). Serial 20 unchanged. Underworld development team with AI support from Claude Code
Underworld development team with AI support from Claude Code
…+ Constrained tests) Verify custom-P works for every solver family that consumes the mesh, so a mesh-owned adapted hierarchy drives MG uniformly: - SNES_Vector.solve was MISSING the inject_custom_mg hook (only SNES_Scalar and SNES_Stokes_SaddlePt had it) — so Vector_Projection silently stayed on GAMG. Add the hook (mirrors the scalar one). All 8 solver-subclass solve() overrides delegate to a base solve(), so this completes coverage: scalar branch -> Poisson/Darcy/Projection/AdvDiff/Diffusion; vector branch -> Vector_Projection; velocity-block -> Stokes/VE/Constrained/NS. - test_1017: add SNES_Vector (Vector_Projection, top-level vector PC, field_id=None) and Stokes_Constrained (free-slip multipliers, grouped [p,h] split, field_id=0) serial tests — both drive custom-P 'mg', constrained matches analytic SolCx to 4.4e-4. - parallel test: add Vector (passes np=2) and Constrained. The Constrained parallel test is SKIPPED: Stokes_Constrained is not parallel-safe yet — it segfaults at np>1 independently of custom-P (canonical test_1062_constrained_solcx also segfaults at np=2 under plain GAMG). It auto-enables when the constrained solver becomes parallel-ready. Serial: custom_mg 22 + projections + constrained green. Parallel np=2: scalar + Stokes-velocity + vector + legacy-guard pass. Underworld development team with AI support from Claude Code
… limitation Underworld development team with AI support from Claude Code
Underworld development team with AI support from Claude Code
…k fieldsplit The Stokes velocity-block injection forces computeJacobian, then reaches the fieldsplit BEFORE the SNES solve. SNESSolve normally wires the freshly-assembled Jacobian into the KSP/outer-PC; doing it ourselves was missing, so for some configurations (e.g. Stokes on an SBR-refined child mesh with a mesh-variable bodyforce) the outer PC carried an UNASSEMBLED operator and PCSetUp failed with "Matrix must be set first" (err73). Fix: snes.getKSP().setOperators(J, Pmat) after forcing assembly, before outer_pc.setUp(). Found via the Layer-2 adapt-on-top prototype (Stokes on a fault-refined child). Diagnosed: ksp.A set but asm=False while J.assembled=True -> operator not wired. New regression test_custom_fmg_stokes_on_sbr_child (Stokes velocity-block custom-P on an SBR child with a mesh-variable bodyforce) — fails pre-fix, passes after. test_1014/1015/1016/1017 green (serial); parallel np2 unchanged. Underworld development team with AI support from Claude Code
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Generalize UW3's geometric FMG so the multigrid hierarchy can use arbitrary coarse grids — nested or non-nested, uniform or locally (SBR) refined — by building the prolongations ourselves (barycentric / local-RBF) and installing them via
PC.setMGInterpolation, with Galerkin RAP forming the coarse operators. This decouples geometric multigrid from uniformrefine()nesting, enabling locally-adapted hierarchies while keeping FMG.Independent of any adapter ("adapt-on-top" is a separate follow-on). The existing
Mesh(refinement=N)uniform FMG path is unchanged.Status — Layer 1, Phase-1 hardening complete
Home:
utilities/custom_mg.py. Entry point:set_custom_fmg(solver, coarse_meshes, builder=, field_id=).Working + tested — serial
test_1014/1015/1016/1017(20) + paralleltests/parallel/test_1017_custom_mg_parallel_mpi.py(np=2):CustomMGHierarchy,set_custom_fmg,sbr_refine/sbr_refine_where(local Skeleton-Based Refinement; no MMG; on-rank; conforming).field_id=0).Hardening steps (all done)
PCFieldSplitformsA_vvviaMatCreateSubMatrix;snes.setUpbuilds structure only). So_install_velocity_block_transfersforces a Jacobian assembly (computeFunction+computeJacobianat the zero guess;max_it=0fallback), reaches the velocity sub-PC,resets it and rebuilds a fresh PCMG from our P (mirrors the proven standalone recipe; sidesteps theMatProductReplaceMatslive-swap bug), and re-attaches the coupled Stokes nullspace. Wired intoSNES_Stokes_SaddlePt.solveas a guarded no-op.copyFields+copyDS+createDS(_coarse_reduced_map) — the DS carries UW's exact essential-BC definitions and is topology-independent, so it constrains any coarse mesh sharing the solver's boundary labels. Validated byte-identical to the old factory path; leak-free (no SNES / JIT).set_custom_fmg/buildno longer take alevel_solver_factory._level_dof_layout), transfers assemble as MPIAIJ (constrained coarse DOFs drop → reduced→reduced), parallel zero-column guard viaPᵀ·1. The legacy finest-onlyset_custom_mgpath stays serial-only and raises loudly at np>1.Validation
Out of scope (later)
Non-nested custom-P in parallel (cross-rank point location) — serial-only. Layer-2 adapt-on-top adapter is a separate follow-on.
Design:
docs/developer/design/GENERALIZED_FMG_HIERARCHY_AND_ADAPT.mdUnderworld development team with AI support from Claude Code