Sandermage

🏠

Working from home

8 followers · 1 following

Popular repositories

genesis-vllm-patches Public

vLLM runtime patch-overlay for Qwen3.6 + Gemma4 on consumer NVIDIA (Ampere sm_86, 2x A5000/3090) — Qwen3.6-35B-A3B FP8 ~244 tok/s, 27B-int4 hybrid GDN+Mamba, Gemma4 26B/31B AWQ, 256K ctx. 321 patch…

Python · 115 stars · 6 forks

vllm Public

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python