feat: support quickgelu operator on metax by LindseyMei · Pull Request #1343 · InfiniTensor/InfiniCore

LindseyMei · 2026-06-26T10:26:16Z

This PR adds the MetaX backend for the quickgelu elementwise operator, mirroring the existing silu MetaX implementation.

What changed

Added src/infiniop/ops/quickgelu/metax/quickgelu_metax.h and quickgelu_metax.maca.
Reused the existing op::quickgelu::cuda::QuickGeluOp kernel from quickgelu/cuda/kernel.cuh through the elementwise MetaX descriptor.
Wired MetaX into quickgelu/operator.cc with 5 #ifdef ENABLE_METAX_API blocks (include + CREATE / GET / CALCULATE / DELETE).
Cleaned up quickgelu/cuda/kernel.cuh: removed the nvidia-specific elementwise_nvidia.cuh include and changed __nv_bfloat16 to cuda_bfloat16 so the kernel can be reused by both NVIDIA and MetaX backends.
Updated quickgelu/nvidia/quickgelu_nvidia.cu to use cuda_bfloat16 consistently.
Registered quickgelu ctypes bindings in test/infiniop/libinfiniop/op_register.py.
Added test/infiniop/quickgelu.py for correctness verification.
Supports BF16 / F16 / F32 / F64.
No xmake changes needed: xmake/metax.lua already globs ops/*/metax/*.maca.

Verification

Built successfully on MetaX C500 (MACA 3.3.0.15) with XMAKE_ROOT=y xmake -y -j4 and installed to ~/.infini.
python3 test/infiniop/quickgelu.py --metax passes accuracy checks against x * sigmoid(1.702 * x) for F16 / F32 / BF16 across multiple shapes, strides, and inplace / out-of-place modes.
clang-format clean (--dry-run --Werror) on all modified C/C++ files.
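
The reference formula used in the accuracy check, x * sigmoid(1.702 * x), can be sketched in plain Python as below. This is only an illustrative stand-in for the torch reference mentioned above, not the code in test/infiniop/quickgelu.py. The helper names (`sigmoid`, `quickgelu`, `allclose`, `quickgelu_out_of_place`, `quickgelu_inplace`) and the tolerance values are assumptions made for this example.

```python
import math

QUICKGELU_ALPHA = 1.702  # constant taken from the PR's reference formula


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def quickgelu(x: float) -> float:
    """Reference QuickGELU: x * sigmoid(1.702 * x)."""
    return x * sigmoid(QUICKGELU_ALPHA * x)


def allclose(actual, expected, rtol=1e-3, atol=1e-3):
    """Hypothetical tolerance check; rtol/atol are illustrative, not the test's values."""
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


def quickgelu_out_of_place(xs):
    """Return a new list, leaving the input untouched."""
    return [quickgelu(x) for x in xs]


def quickgelu_inplace(xs):
    """Overwrite the input list element by element and return it."""
    for i, x in enumerate(xs):
        xs[i] = quickgelu(x)
    return xs


if __name__ == "__main__":
    data = [-3.0, -1.0, 0.0, 0.5, 2.0]
    expected = quickgelu_out_of_place(data)
    actual = quickgelu_inplace(list(data))
    print(allclose(actual, expected))
```

Comparing the in-place and out-of-place results against the same reference mirrors the two modes the test exercises; a real check would compare the device output against the reference rather than two host-side paths.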

Add MetaX backend for the quickgelu elementwise operator, reusing the existing cuda::QuickGeluOp kernel through the elementwise MetaX descriptor.

Changes:
- Add quickgelu/metax/quickgelu_metax.{h,maca}
- Wire MetaX into quickgelu/operator.cc
- Clean up quickgelu/cuda/kernel.cuh: remove nvidia-specific elementwise include and use cuda_bfloat16 for cross-backend compatibility
- Update nvidia/quickgelu_nvidia.cu to use cuda_bfloat16
- Register quickgelu ctypes bindings in test/libinfiniop/op_register.py
- Add test/infiniop/quickgelu.py for correctness verification

Verified with test/infiniop/quickgelu.py --metax on MetaX C500: passes accuracy check against torch reference (x * sigmoid(1.702 * x)) across shapes/strides and inplace/out-of-place for F16/F32/BF16.

Signed-off-by: LindseyMei <648816901@qq.com>

LindseyMei requested a review from a team June 26, 2026 10:26

LindseyMei wants to merge 1 commit into InfiniTensor:main from LindseyMei:feat/metax-quickgelu
