The lab's research posture in three words: hardware-flexible, open-source, local-first. One code path runs across a single Blackwell card, a Cloud Run L4 burst node, and an HPC partition.
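In practice, that claim reduces to probing the hardware at startup rather than branching per deployment target. A minimal sketch of the idea, assuming a PyTorch stack (an assumption; the section never names the framework):

```python
# Minimal sketch of runtime hardware probing, assuming a PyTorch stack
# (an assumption: the surrounding text does not name the framework).
import torch

def pick_device() -> torch.device:
    """Use whatever accelerator is present; fall back to CPU."""
    if torch.cuda.is_available():  # RTX 5090, Cloud Run L4, or an HPC GPU node
        return torch.device("cuda")
    return torch.device("cpu")     # CPU-only partitions still run the same path

def pick_dtype(device: torch.device) -> torch.dtype:
    """Prefer bfloat16 where the hardware supports it, else float32."""
    if device.type == "cuda" and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float32

device = pick_device()
dtype = pick_dtype(device)
# model.to(device=device, dtype=dtype)  # the call is identical on all three targets
```

The same script then runs unmodified on the 5090, the L4, or a CPU-only node; only the probe's answer changes.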
The baseline workstation is a single RTX 5090 (32 GB GDDR7, Blackwell sm_120) sitting under a desk in Chicago. That's where the model graphs are tuned and the inference pipelines validated. From there, the same code path scales out two ways: up to a Cloud Run L4 node for burst capacity, and onto HPC partitions, where docker load on a fresh node has the pipeline live in minutes, useful where outbound HuggingFace downloads aren't permitted.

Academic networks and HPC partitions often block outbound traffic. The lab's projects are built to run with no internet connection after first install: model weights cached locally, inference and narration both on-device, no cloud round-trips in the hot path. Mercury takes this further: its production runtime never leaves the operator's hardware. Cortex's local mode is identical in posture; the hosted demo at cortex.redteamkitchen.com is a convenience layer, not a requirement.
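A minimal sketch of the cache-once, run-offline split, using the Hugging Face cache (how the caching is wired up is an assumption; only the "weights cached locally after first install" behavior comes from the text, and the repo id below is a hypothetical placeholder):

```python
# Sketch of the first-install / offline-run split via huggingface_hub.
# The repo id is a hypothetical placeholder; each project's NOTICE file
# names the real third-party weights.
from huggingface_hub import snapshot_download

REPO_ID = "example-org/example-model"  # hypothetical placeholder

# First install, on a network that permits outbound downloads:
# populate the local cache once.
snapshot_download(repo_id=REPO_ID)

# Every subsequent run resolves weights from disk and raises an error
# rather than touching the network.
weights_dir = snapshot_download(repo_id=REPO_ID, local_files_only=True)
print(f"weights resolved from local cache: {weights_dir}")
```

Exporting HF_HUB_OFFLINE=1 before launch (and TRANSFORMERS_OFFLINE=1 when transformers is in the stack) enforces the same guarantee process-wide, which is the relevant mode on nodes that block outbound traffic entirely.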
Code: Apache-2.0 (Cortex) and MIT (Mercury). Every commit, every issue, every model config is public on the AlexiosBluffMara GitHub organization. Third-party model weights ship under their respective licenses: TRIBE v2 under CC-BY-NC 4.0 (Meta), Gemma 4 under the Gemma Terms of Use (Google). Both are documented in each project's NOTICE file.
These are the academic homes the research draws from. None of them endorse the lab's work; they are listed because they are the units the research intersects with.