Alexios Bluff Mara × Illinois State University
Research Collaboration · Cardinal & Code
What we ship · Three registers

Cortex & Mercury,
explained three ways.

Same two products, three audiences. Pick the framing that fits how you actually think about software: a casual conversation, a business document, or an entrepreneur's pitch deck. The underlying tech is identical — only the words change.

Cortex (the casual version)

"What does my brain look like when I watch this?"

You upload a short video clip — anything, a TikTok, a sunset, a basketball highlight. Cortex shows you a 3D brain that lights up in the regions that your brain (well, an average human brain) would use to process it. Then four different "AI critics" describe what just happened in their own voices: a chatty ISU freshman, a Northwestern neurologist, a WBEZ reporter, and a Google ML engineer.

Picture a movie theater with 20,484 seats, where each seat is a tiny patch of your cortex. The movie is whatever you uploaded. Some seats lean forward when faces appear, others when music plays, others when something moves quickly. Cortex shows you which seats lean forward, and four different people in the back row whisper their take on the show.

"It's basically Shazam for which part of your brain just lit up."

Mercury (the casual version)

"My computer answers me, on every device I own."

Mercury is an AI assistant that lives on Soumit's own RTX 5090 in Chicago, Illinois — not in some data center in Virginia. You can talk to it three ways today: through a terminal, through a Discord bot called @abmsnowy, or through a phone-friendly web page (over a private VPN). Same agent, same memory, every door.

The point is: nothing leaves the building. Your messages don't get sent to OpenAI or Google or Anthropic. The model that answers you is Gemma 4, running locally on the same desktop that runs Cortex. If the local machine is busy, the request quietly hops to a MacBook on the same VPN. If both die, it falls back to OpenRouter's free tier — still no per-token bill.
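The failover chain above (local Ollama, then a MacBook on the VPN, then OpenRouter's free tier) amounts to a priority-ordered list of backends. A minimal sketch of that logic; the backend names, URLs, and the health-probe function are illustrative, not Mercury's actual code:

```python
# Illustrative sketch of the described failover chain:
# local Ollama -> MacBook on the same VPN -> OpenRouter free tier.
# Hostnames and the probe function are hypothetical.

BACKENDS = [
    ("local-5090", "http://localhost:11434"),             # Ollama on the desktop
    ("macbook-vpn", "http://macbook.tailnet:11434"),      # fallback over the VPN
    ("openrouter-free", "https://openrouter.ai/api/v1"),  # last resort, $0/token
]

def pick_backend(is_up) -> str:
    """Return the URL of the first reachable backend, in priority order."""
    for name, url in BACKENDS:
        if is_up(url):
            return url
    raise RuntimeError("no inference backend reachable")

# Usage: inject any health probe; here, pretend only OpenRouter is up.
url = pick_backend(lambda u: u.startswith("https://openrouter"))
```

The probe is injected rather than hard-coded so the same routing logic works whether "up" means a TCP ping, an HTTP health check, or a GPU-busy flag.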

"It's like having a private assistant who only works for you, on a computer you can unplug."

Cortex (the business framing)

Multimodal brain-response analysis platform

Cortex is a research-software platform that converts arbitrary media stimuli (video, audio, image, text) into per-vertex BOLD-signal predictions across the 20,484-vertex fsaverage5 cortical surface, paired with population-targeted natural-language interpretations at four reading levels.

  • Inference engine: Meta's TRIBE v2 brain foundation model (25-subject NeuroMod training pool, 2 Hz output rate, ~6 GB VRAM)
  • Narration layer: Google Gemma 4 (E4B/26B/31B) via local Ollama, four parallel persona prompts
  • Latency: ~3 minutes from upload to four complete narrations on a single RTX 5090
  • Throughput model: queue-backed serial processing; multi-GPU overflow via inference router
  • Deployment posture: local-first; Tailscale Funnel for public access; no per-token cloud cost
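The spec above pins down a concrete output shape: at a 2 Hz output rate over the 20,484-vertex fsaverage5 surface, a clip of T seconds yields a (2·T, 20484) prediction matrix. A sketch of that arithmetic (the function name is illustrative):

```python
# Output shape implied by the spec: 2 Hz predictions across
# the 20,484-vertex fsaverage5 cortical surface.
N_VERTICES = 20_484   # fsaverage5 vertices (both hemispheres)
OUTPUT_HZ = 2         # stated TRIBE output rate

def prediction_shape(clip_seconds: float) -> tuple:
    """(timepoints, vertices) for a clip of the given length."""
    return (int(clip_seconds * OUTPUT_HZ), N_VERTICES)

# A 30-second clip -> 60 timepoints x 20,484 vertices.
shape = prediction_shape(30)
```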

Use cases under exploration: research-grade fMRI explainer for undergraduate teaching, clinical-communication training (translating neurology jargon for patients), and content-perception analytics for educational media.

Mercury (the business framing)

Local-first agentic AI runtime

Mercury is a fork of NousResearch's Hermes Agent framework, configured for single-tenant deployment on consumer GPU hardware. It exposes one agent instance through three transport surfaces with shared session state and memory backed to a local SQLite database.

  • Active surfaces: CLI (mercury command), Discord bot (@abmsnowy), web dashboard at :9119 over Tailscale
  • Roadmap surfaces (not shipped): iMessage, email, SMS
  • Inference backend: Ollama (local Gemma 4), with OpenRouter free tier as failover
  • Tool layer: 65+ built-in tools plus MCP servers (Playwright, Notion, Cloudflare, filesystem, Workspace)
  • Compliance posture: data residency = single physical machine; no third-party model API in the hot path
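One plausible shape for the "shared session state" claim is a single message table keyed by transport surface, so all three doors write into the same history. This schema is hypothetical, not Mercury's actual database layout:

```python
# Hypothetical sketch of a shared session store: one local SQLite
# database, three transport surfaces appending to one history.
import sqlite3

db = sqlite3.connect(":memory:")  # Mercury would use a local file; in-memory here
db.execute("""
    CREATE TABLE messages (
        id      INTEGER PRIMARY KEY,
        surface TEXT CHECK (surface IN ('cli', 'discord', 'web')),
        role    TEXT,   -- 'user' or 'assistant'
        body    TEXT
    )
""")

# The same agent sees every message regardless of which door it came in.
db.execute("INSERT INTO messages (surface, role, body) VALUES ('discord', 'user', 'hi')")
db.execute("INSERT INTO messages (surface, role, body) VALUES ('cli', 'user', 'status?')")
history = db.execute("SELECT surface, body FROM messages ORDER BY id").fetchall()
```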

Target buyer: small teams with regulated data (legal, medical, financial) who need agentic AI but can't send conversations to a hyperscaler model API.

Cortex (the founder pitch)

"$3/month brain-scan demo that runs on a gaming PC"

Cortex is the proof of a thesis: research-grade neuroscience demos can run on consumer hardware for the cost of a coffee. Every public scan on the live demo costs about $0.011 in electricity on the 5090. At ten scans a day, the entire brain-analysis platform runs on $3/month of Chicago electricity. No AWS bill, no OpenAI invoice, no per-token math.
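The electricity math above checks out, using the figures stated in the text:

```python
# Verifying the stated cost claim: ~$0.011/scan, ten scans/day.
cost_per_scan = 0.011            # dollars of electricity per public scan
scans_per_day = 10
monthly = cost_per_scan * scans_per_day * 30
# -> $3.30/month, i.e. the "$3/month" figure, rounded down
```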

Why this matters as a small business:

  • Zero-marginal-cost demos. Send a live URL to anyone, anywhere, today. They don't need an account, a credit card, or a Tailscale invite. The 5090 absorbs the load.
  • Hackathon credibility. Built for the Gemma 4 Good Hackathon (Health & Sciences track) and the Nous Research × Kimi Creative Hackathon. Both submissions live on the same GitHub.
  • Education-sector path. ISU undergrads in cognitive science / psych / neuro can use a free public demo to understand fMRI without paying for scanner time. Same for high-school AP Psych teachers.
  • Adjacent revenue: on-prem deploys for hospital-affiliated training programs that can't send patient stimuli to a cloud LLM.
"The cheapest credible neuroscience demo on the public internet, and the URL is in my Twitter bio."

Mercury (the founder pitch)

"My private AI agent, on my hardware, that I can sell as a kit"

Mercury is the local-first answer to the hyperscaler-API agent problem. Most agent products today route through OpenAI / Anthropic / Google Cloud and pass through every per-token fee to the customer. Mercury inverts that: a one-time hardware cost (~$2,500–3,000 for the RTX 5090) replaces five years of API bills for a single user.

Why this matters as a small business:

  • The break-even math is brutal. A 5090 costs ~$2,800 street. An equivalent 24/7 cloud GPU is ~$0.70/hr = $504/month, so the card pays for itself in roughly five and a half months, before per-token fees even enter the comparison. After that, the marginal cost is electricity.
  • Privacy as architecture, not a promise. Regulated industries (legal, medical, family offices) cannot put client data into a third-party LLM endpoint. Mercury's pitch is: the model lives on a box you own, in a building you own.
  • Single-operator economics. Soumit (one person, one LLC) ships a live demo, runs Discord support, handles billing — total operating overhead is a domain registration and a Cloudflare account.
  • Productizable as an appliance. "Mercury Box" — a pre-installed mini-PC + Tailscale config + Discord setup wizard, sold to small firms that want a private AI without paying a SaaS forever.
"You buy the hardware once. You own the agent forever. The only recurring cost is electricity."

How the two products work together

Mercury is the orchestrator; Cortex is the specialty tool. From inside Mercury (in any of its three surfaces) you can hand a media file to Cortex and get the brain analysis + four narrations back as a structured response. Same RTX 5090, same Tailscale network, same Ollama inference pool. From a small-business standpoint that means one piece of hardware monetizes two distinct demos and powers any future agent skill that wants to call into a brain-scan endpoint.
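The handoff described above can be pictured as a single tool function the agent calls: media file in, brain maps plus four narrations out. The function name and response shape below are hypothetical, not the actual tool API:

```python
# Illustrative sketch of the Mercury -> Cortex handoff: a media file
# goes in, a structured response comes back. Names are hypothetical.

def cortex_tool(media_path: str) -> dict:
    """Hand a media file to Cortex; return brain maps + four narrations."""
    return {
        "media": media_path,
        "vertices": 20_484,        # fsaverage5 surface size
        "narrations": {            # one entry per persona
            "freshman": "...", "neurologist": "...",
            "reporter": "...", "ml_engineer": "...",
        },
    }

result = cortex_tool("sunset.mp4")
```

Because the response is structured rather than free text, any future Mercury skill can consume the brain-scan output programmatically.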

Research conducted in association with Illinois State University · Bloomington–Normal, IL · ABM in Chicago, IL.
Cortex v0.1.0 · Mercury v0.2.0