Summary
Stokes_Constrained (block-constrained Stokes with in-saddle Lagrange multipliers) segfaults (signal 11) at np>1 during solve(). It solves correctly in serial. The crash is localized to the interior-multiplier section reduction (_constrain_interior_multipliers_in_section, gated by _reduce_interior_multiplier, default True).
This is independent of any preconditioner / custom-MG work — it reproduces with plain GAMG and reproduces on a clean development checkout. Surfaced while extending the generalized-FMG custom-P feature (PR #290) to all solver families; custom-P on the constrained velocity block works in serial, but the parallel test had to be skipped because the underlying solver crashes.
Reproducer (no custom-P, StructuredQuadBox, free-slip SolCx)
import sympy, underworld3 as uw
from underworld3.function import analytic as A
mesh = uw.meshing.StructuredQuadBox(elementRes=(32, 32), minCoords=(0,0), maxCoords=(1,1), qdegree=3)
sol = A.SolCx(mesh, eta_A=1.0, eta_B=1.0e6, x_c=0.5, n=1)
s = uw.systems.Stokes_Constrained(mesh)
s.constitutive_model = uw.constitutive_models.ViscousFlowModel
s.constitutive_model.Parameters.shear_viscosity_0 = sol.fn_viscosity
s.bodyforce = sol.fn_bodyforce
s.add_constraint_bc("Left", g=0.0, normal=sympy.Matrix([[-1.0, 0.0]]))
s.add_constraint_bc("Right", g=0.0, normal=sympy.Matrix([[ 1.0, 0.0]]))
s.add_constraint_bc("Bottom", g=0.0, normal=sympy.Matrix([[ 0.0,-1.0]]))
s.add_constraint_bc("Top", g=0.0, normal=sympy.Matrix([[ 0.0, 1.0]]))
s.petsc_use_pressure_nullspace = True
s.tolerance = 1.0e-9
s.solve()
python repro.py → solves, velocity error 8.8e-6.
mpirun -n 2 python repro.py → segfault during solve() (both ranks build, then crash).
The canonical test tests/test_1062_constrained_solcx.py also segfaults under mpirun -n 2 python -m pytest --with-mpi.
Diagnosis — _reduce_interior_multiplier
Toggling the interior-multiplier reduction isolates it (all at np=2):
| config |
result |
default (_reduce_interior_multiplier = True) |
segfault |
s._reduce_interior_multiplier = False |
solves, verr ~1.7e-5 |
petsc_use_pressure_nullspace = False |
segfault (unchanged) |
snes_type = ksponly |
segfault (unchanged) |
So the crash is in _constrain_interior_multipliers_in_section (src/underworld3/cython/petsc_generic_snes_solvers.pyx:6403), the hand-built PetscSection that constrains interior multiplier DOFs. It iterates the local chart and uses getTransitiveClosure / boundary getStratumIS to build a new constrained section — likely not handling ghost / off-rank points correctly in parallel. Setting _reduce_interior_multiplier = False (line 6462 early-returns) is a working workaround but re-admits the interior multiplier DOFs (larger, more ill-conditioned [p,h] Schur block — see the method's own docstring).
Notes
Suggested fix direction
Make _constrain_interior_multipliers_in_section parallel-safe: restrict the chart walk / closure to owned points and respect the global section's ghost handling, or rebuild the constraint set from the global (not local) section. Add an np=2 regression (e.g. a parallel variant of test_1062).
Summary
Stokes_Constrained(block-constrained Stokes with in-saddle Lagrange multipliers) segfaults (signal 11) at np>1 duringsolve(). It solves correctly in serial. The crash is localized to the interior-multiplier section reduction (_constrain_interior_multipliers_in_section, gated by_reduce_interior_multiplier, defaultTrue).This is independent of any preconditioner / custom-MG work — it reproduces with plain GAMG and reproduces on a clean
developmentcheckout. Surfaced while extending the generalized-FMG custom-P feature (PR #290) to all solver families; custom-P on the constrained velocity block works in serial, but the parallel test had to be skipped because the underlying solver crashes.Reproducer (no custom-P, StructuredQuadBox, free-slip SolCx)
python repro.py→ solves, velocity error 8.8e-6.mpirun -n 2 python repro.py→ segfault duringsolve()(both ranks build, then crash).The canonical test
tests/test_1062_constrained_solcx.pyalso segfaults undermpirun -n 2 python -m pytest --with-mpi.Diagnosis —
_reduce_interior_multiplierToggling the interior-multiplier reduction isolates it (all at np=2):
_reduce_interior_multiplier = True)s._reduce_interior_multiplier = Falsepetsc_use_pressure_nullspace = Falsesnes_type = ksponlySo the crash is in
_constrain_interior_multipliers_in_section(src/underworld3/cython/petsc_generic_snes_solvers.pyx:6403), the hand-built PetscSection that constrains interior multiplier DOFs. It iterates the local chart and usesgetTransitiveClosure/ boundarygetStratumISto build a new constrained section — likely not handling ghost / off-rank points correctly in parallel. Setting_reduce_interior_multiplier = False(line 6462 early-returns) is a working workaround but re-admits the interior multiplier DOFs (larger, more ill-conditioned[p,h]Schur block — see the method's own docstring).Notes
development(commit a7f0b11) and the PR Generalized FMG hierarchy: custom-P transfers for arbitrary nested/non-nested grids (Layer 1) #290 branch — pre-existing, not introduced by either._constrain_interior_multipliers_in_sectionrewrite, or used a configuration that didn't trigger it.Suggested fix direction
Make
_constrain_interior_multipliers_in_sectionparallel-safe: restrict the chart walk / closure to owned points and respect the global section's ghost handling, or rebuild the constraint set from the global (not local) section. Add an np=2 regression (e.g. a parallel variant oftest_1062).