Reza Rahimi rrahimi-uci

Reza Rahimi

AI/ML Engineering Manager · Architect and Builder

Building and scaling trustworthy, production-grade AI/ML products: agentic AI, LLMs, LLM safety & guardrails, evaluation systems, and scalable ML infrastructure (MLOps / AgentOps).

🔭 Currently: how to build, evaluate, calibrate, scale, and safely deploy LLM applications and AI agents efficiently.

💞️ Open to collaborating on open-source work in AI/ML: LLMs · generative AI · agentic workflows · AI evaluation · AgentOps.

📫 GitHub · Pronouns: He/Him

🚀 Featured Projects

🕶️ Agent Bouncer

A tiny, fast SLM safety guardrail for LLMs and agents: it screens prompts, tool calls, and outputs before they reach your model. Fine-tuned and RL-tuned (GRPO), benchmarked against GPT-4o-mini, with a training studio and an honest scoreboard.

LLM safety · guardrails · prompt-injection · jailbreak detection · content moderation · small language models · MLflow

🎛️ Caliber Suite

Open-source MLflow plugin for AI agents and agentic workflows, covering prompts, tools, skills, MCP servers, RAG knowledge bases, evaluation, deployment, and observability.

MLflow · LLMOps · AI agents · evaluation · observability · MCP · RAG

🧠 Agentic Context Engineering (ACE)

Faithful ICLR 2026 implementation: evolving, self-improving context playbooks for LLM agents, with OpenAI Agents SDK support.

context engineering · self-improving agents · in-context learning · agent memory

🤝 A2A Protocol Reference · live demo

A clean reference implementation of the Agent-to-Agent (A2A) protocol: specialized AI agents that discover each other and collaborate over JSON-RPC 2.0. Python · FastAPI · Pydantic.

multi-agent systems · agent interoperability · A2A · FastAPI

📚 Policy-to-Knowledge · live demo

Enterprise compliance automation: turn compliance documents into queryable knowledge graphs via a multi-agent AI pipeline, with an interactive graph explorer.

knowledge graphs · compliance · RegTech · multi-agent · JanusGraph

💸 RL for Anti-Money-Laundering · live demo

Reinforcement learning (PPO/A2C/DQN) that dynamically tunes AML risk-scoring weights per case, with a Gymnasium env, FastAPI backend, and React training dashboard.

reinforcement learning · AML · RegTech · PPO · risk scoring

🏠 Buyer-Stage Prediction

Domain-agnostic, single-node tabular AutoML pipeline (Dagster + FLAML + MLflow + FastAPI) with drift monitoring and an online feature store.

AutoML · MLflow · Dagster · drift detection · tabular ML

🎓 Guru.AI — Interviewer GPT · live demo

AI-powered mock-interview assistant for ML engineering, leadership/behavioural, and coding interviews. Gradio · LangChain · OpenAI · Whisper.

interview prep · LangChain · speech-to-text · generative AI

🛠️ Focus Areas

Agentic AI · LLM safety & guardrails · prompt-injection / jailbreak detection · LLM & agent evaluation · MLflow / LLMOps / MLOps · RAG · reinforcement learning (RLHF/GRPO/PPO) · fine-tuning (SFT/LoRA) · knowledge graphs · multi-agent systems

Python · PyTorch · Transformers · TRL · MLflow · FastAPI · LangChain · React

⭐ If any of these are useful, a star helps others find them too.
