Inference Stack
Featured
Production-grade LLM inference API built from scratch. NestJS gateway + Python GPU workers. Scheduling, batching, KV cache, tensor parallelism, multi-modal, all against real GPUs.
nestjs, python, hugging-face
Things I've built, contributed to, and am currently working on.
Figma-compatible Rust/WASM canvas engine. 134M-point rendering, built-in CRDT collaboration, native .fig import — one binary, zero JS dependencies.
This very website — a personal blog built with Next.js, MDX, and GitHub as a CMS. Editorial design with a focus on typography and reading experience.