Equipment guide · sensor body + brain

Build the body. Then give it a brain.

JARVIS is intentionally split in two: a Raspberry Pi 5 sensor body for presence, and a local GPU brain for cognition, memory, voice, learning, and governance.

Purchase list

Buy the exact body and brain equipment.

These are outbound Amazon links for the hardware stack. The brain machine should be treated as an offline AI processor, not a Windows desktop.

For the Senses

Raspberry Pi sensor body

The physical JARVIS presence: camera, touch display, mic, speaker, Pi 5, and Hailo acceleration.

Vision input

Arducam 16MP Autofocus Camera Module

IMX519 16MP autofocus camera with ABS case for Raspberry Pi models.

View on Amazon

Sensor computer

CanaKit Raspberry Pi 5 16GB Starter Kit PRO

16GB Pi 5 kit, 128GB storage edition, for the always-on senses node.

View on Amazon

Presence display

FREENOVE 5 Inch Touchscreen Monitor

800x480 IPS capacitive touchscreen over MIPI DSI for the Pi display surface.

View on Amazon

Voice output

Portable Mini Sound Bar

Compact stereo speaker with enhanced bass for local TTS playback.

View on Amazon

Audio input

USB Gooseneck Microphone

360 degree adjustable USB mic with mute button, LED indicator, and noise-canceling tech.

View on Amazon

Vision acceleration

Raspberry Pi AI HAT+ 2

Hailo-10H accelerator with 8GB on-board RAM and 40 TOPS class AI capability for Pi 5.

View on Amazon

For the Brain

Offline AI workstation parts

The brain machine runs cognition, memory, models, governance, and self-improvement locally.

CPU backbone

AMD Ryzen 9 desktop CPU

High-thread-count Ryzen 9 class processor for the brain and CPU-resident model lanes.

View on Amazon

Premium GPU brain

NVIDIA GeForce RTX 4080 16GB

Strong baseline GPU for local STT, TTS, LLM residency, vision support, and fast iteration.

View on Amazon

More power

ASUS ROG Astral GeForce RTX 5090 BTF OC

32GB GDDR7 option for heavier local model residency and a bigger extreme-tier brain.

View on Amazon

Complete brain option

Prebuilt workstation path

A ready-to-go tower option if you want the brain hardware assembled first, then reinstalled with Linux.

Prebuilt brain machine

CLX Horus Gaming PC

Ryzen 9 9950X3D, RTX 5080, 96GB DDR5, 4TB NVMe class build. Install Pop!_OS over Windows before using it as the brain.

View on Amazon

Brain OS requirement

Install Pop!_OS on the PC used for the brain, replacing Windows. Do not run the brain on a Windows install. Treat that machine as the offline local AI processor for JARVIS.

Sensor body

Pi 5, camera, mic, speaker, display, and Hailo acceleration.

The Pi is the senses. It does not own memory, personality, policy, self-improvement, or the LLM. It captures the room and streams events to the brain.

Thin sensor node

Raspberry Pi 5

Runs the senses layer: camera capture, Hailo vision inference, audio capture/playback, WebSocket transport, and the particle display.

Vision accelerator

Hailo-10H AI HAT+ 2

Runs YOLOv8s person detection, SCRFD face detection, and YOLOv8s-Pose locally on the Pi so the brain receives structured perception.
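
As an illustration of "structured perception", here is a minimal sketch of what one event from the Pi might look like once serialized for the WebSocket link. The class names, field names, and the "pi-senses" source label are all hypothetical; the guide does not document the actual message schema.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class Detection:
    """One detected object. Field names are hypothetical, not the JARVIS schema."""
    label: str          # e.g. "person" or "face"
    confidence: float   # detector score in [0, 1]
    box: tuple          # (x, y, width, height), normalized to [0, 1]


@dataclass
class PerceptionEvent:
    """A single message the Pi might send to the brain (assumed shape)."""
    source: str
    detections: list = field(default_factory=list)
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses; tuples become JSON arrays.
        return json.dumps({"type": "perception", **asdict(self)})


event = PerceptionEvent(
    source="pi-senses",
    detections=[Detection("person", 0.91, (0.25, 0.10, 0.30, 0.80))],
    timestamp=0.0,  # fixed here only to keep the example deterministic
)
message = event.to_json()
```

The point of the sketch is the division of labor: the Pi sends small labeled records like this, and the brain decides what they mean.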

Vision input

Pi camera

Feeds Picamera2 for person detection, pose, facial expression, face crops, and scene summaries.

Audio input

USB microphone

Captures 44.1kHz audio, resamples to 16kHz int16, and streams raw PCM to the brain over local WebSocket.
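
The 44.1kHz to 16kHz int16 conversion can be sketched in plain Python as below. This assumes mono float samples in [-1, 1] and uses naive linear interpolation with no anti-alias filter; the guide does not say which resampler the Pi actually uses, and a real pipeline would likely use a filtered one. The function names are hypothetical.

```python
import struct


def resample_linear(samples, src_rate=44_100, dst_rate=16_000):
    """Resample a sequence of floats by linear interpolation (no filtering)."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate            # fractional index into the input
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the final sample
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def to_pcm16(samples):
    """Clip to [-1, 1] and pack as little-endian signed 16-bit PCM bytes."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


# 10 ms of a constant signal at 44.1kHz becomes 160 samples (320 bytes) at 16kHz.
chunk = [0.5] * 441
pcm = to_pcm16(resample_linear(chunk))
```

The resulting bytes are the kind of raw PCM payload that could be sent as a binary WebSocket frame.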

Voice output

Speaker

Plays back brain-synthesized TTS audio. The Pi does not run the language or speech intelligence.

Presence surface

5 inch display

Runs the JARVIS particle visualizer in kiosk mode and reflects bounded system state, not private dashboard internals.

Brain equipment

The GPU tier decides model residency, latency, and how much can stay awake.

The brain auto-detects NVIDIA VRAM, CPU threads, and RAM at startup. It then chooses LLM size, STT model, TTS device, vision availability, model keep-alive, and whether ancillary ML should live on CPU or GPU.
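
A rough sketch of what that startup detection could look like on Linux follows. It assumes the nvidia-smi CLI is on the PATH and that /proc/meminfo is readable; the function names and the returned dictionary keys are hypothetical, and the real JARVIS detection code may work differently.

```python
import os
import shutil
import subprocess


def parse_vram_mib(nvidia_smi_output: str) -> int:
    """Return the largest GPU's total memory in MiB from nvidia-smi CSV output."""
    tokens = (line.strip() for line in nvidia_smi_output.splitlines())
    values = [int(tok) for tok in tokens if tok.isdigit()]
    return max(values, default=0)


def detect_vram_mib() -> int:
    """Query nvidia-smi if present; report 0 MiB when no NVIDIA GPU is found."""
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True, timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        return 0
    return parse_vram_mib(result.stdout)


def parse_mem_total_gib(meminfo_text: str) -> float:
    """Extract MemTotal (reported in kB) from /proc/meminfo text, as GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / (1024 ** 2)
    return 0.0


def detect_hardware() -> dict:
    """Collect the three inputs the guide says drive profile selection."""
    try:
        with open("/proc/meminfo") as fh:
            ram_gib = parse_mem_total_gib(fh.read())
    except OSError:
        ram_gib = 0.0
    return {
        "vram_gib": detect_vram_mib() / 1024,
        "cpu_threads": os.cpu_count() or 1,
        "ram_gib": ram_gib,
    }
```

Keeping the parsing separate from the subprocess call makes the logic testable on machines without a GPU.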

Hardware tiers

GPU VRAM selects the brain profile.

Detected GPU VRAM maps to one of seven tiers, and each tier fixes the model sizes, compute types, and memory strategy. Local-first guarantee: all core capabilities run entirely on local hardware.

Tier	VRAM	LLM	Fast	Vision	STT	TTS	Keep-alive
minimal	<4 GB	qwen3:1.7b	qwen3:1.7b	disabled	tiny/int8	none	5m
low	4-6 GB	qwen3:4b	qwen3:1.7b	disabled	small/int8	none	5m
medium	6-8 GB	qwen3:8b	qwen3:4b	disabled	medium/int8_fp16	kokoro_cpu	5m
high	8-12 GB	qwen3:8b	qwen3:4b	qwen2.5vl:7b	large-v3-turbo	kokoro_cpu	10m
premium	12-16.5 GB	qwen3:8b	qwen3:8b	qwen2.5vl:7b	large-v3/int8_fp16	kokoro_gpu	30m
ultra	16.5-24.5 GB	qwen3:14b	qwen3:8b	qwen2.5vl:7b	large-v3/float16	kokoro_gpu	always
extreme	24.5 GB+	qwen3:32b	qwen3:14b	qwen2.5vl:7b	large-v3/float16	kokoro_gpu	always

The ultra and extreme tiers pin models in VRAM permanently, eliminating cold-start latency. Premium uses a 30m keep-alive. The CPU-resident coding LLM runs separately and never touches GPU VRAM.
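
The tier table can be read as a simple threshold lookup. The sketch below transcribes the rows and assumes each range includes its lower bound and excludes its upper bound, which the table leaves ambiguous; BrainProfile, TIERS, and select_profile are hypothetical names, not the brain's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BrainProfile:
    tier: str
    llm: str
    fast_llm: str
    vision: Optional[str]   # None where the table says "disabled"
    stt: str
    tts: Optional[str]      # None where the table says "none"
    keep_alive: str


VISION = "qwen2.5vl:7b"

# (exclusive upper VRAM bound in GB, profile), transcribed from the tier table.
TIERS = [
    (4.0, BrainProfile("minimal", "qwen3:1.7b", "qwen3:1.7b", None, "tiny/int8", None, "5m")),
    (6.0, BrainProfile("low", "qwen3:4b", "qwen3:1.7b", None, "small/int8", None, "5m")),
    (8.0, BrainProfile("medium", "qwen3:8b", "qwen3:4b", None, "medium/int8_fp16", "kokoro_cpu", "5m")),
    (12.0, BrainProfile("high", "qwen3:8b", "qwen3:4b", VISION, "large-v3-turbo", "kokoro_cpu", "10m")),
    (16.5, BrainProfile("premium", "qwen3:8b", "qwen3:8b", VISION, "large-v3/int8_fp16", "kokoro_gpu", "30m")),
    (24.5, BrainProfile("ultra", "qwen3:14b", "qwen3:8b", VISION, "large-v3/float16", "kokoro_gpu", "always")),
    (float("inf"), BrainProfile("extreme", "qwen3:32b", "qwen3:14b", VISION, "large-v3/float16", "kokoro_gpu", "always")),
]


def select_profile(vram_gb: float) -> BrainProfile:
    """Return the first tier whose upper bound exceeds the detected VRAM."""
    for upper, profile in TIERS:
        if vram_gb < upper:
            return profile
    return TIERS[-1][1]  # only reached for inf or NaN input


profile = select_profile(16.0)  # a 16 GB card lands in the premium tier
```

Under this reading, a 16 GB card stays in premium and a 24 GB card stays in ultra, since the 16.5 and 24.5 boundaries sit just above those nominal sizes.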

Self-improvement coder — RAM tiers

The Qwen3-Coder-Next model is selected by system RAM, independent of GPU tier. It runs purely on CPU through llama-server and never contends with VRAM.

System RAM	GGUF Quant	Model Size	Quality	Headroom
56GB+	UD-Q4_K_XL	~46GB	Best	~10GB+ for OS/JARVIS
48-55GB	UD-IQ4_XS	~38GB	Good	~10GB+ for OS/JARVIS
32-47GB	UD-IQ2_M	~25GB	Acceptable	~7GB+ for OS/JARVIS
<32GB	Disabled	would OOM	Do not force-enable	Not enough RAM
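
The RAM table reduces to three thresholds. A minimal sketch, assuming each row's lower bound is the cutoff (so 55.5GB still selects UD-IQ4_XS); select_coder_quant is a hypothetical name, and None stands in for the disabled row.

```python
from typing import Optional


def select_coder_quant(ram_gb: float) -> Optional[str]:
    """Pick the GGUF quant for the CPU-resident coder from total system RAM."""
    if ram_gb >= 56:
        return "UD-Q4_K_XL"   # ~46GB model
    if ram_gb >= 48:
        return "UD-IQ4_XS"    # ~38GB model
    if ram_gb >= 32:
        return "UD-IQ2_M"     # ~25GB model
    return None               # coder disabled: loading the model would OOM


quant = select_coder_quant(96)  # the 96GB prebuilt selects the largest quant
```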

CPU tiers

Strong and beast CPUs can offload ancillary ML from the GPU, freeing VRAM for STT, LLM residency, TTS, and vision.

CPU Tier	Requirement	Typical Hardware	Effect
weak	<4 threads	SBCs / cheap VPS	Minimal CPU headroom
standard	4-7 threads	Laptop i5 / older desktop	GPU carries ancillary ML when VRAM allows
strong	8-15 threads + 8GB RAM	Desktop i7 / Ryzen 7	Offloads emotion, speaker ID, embeddings, hemispheres to CPU
beast	16+ threads + 16GB RAM	Ryzen 9 / Threadripper / Xeon	Best partner for premium+ GPUs and coder workflows
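
The CPU tiers can be sketched the same way. The thresholds come from the table; the fallback rule is an assumption, namely that a machine missing a tier's RAM requirement drops to the next tier it does satisfy. classify_cpu is a hypothetical name.

```python
def classify_cpu(threads: int, ram_gb: float) -> str:
    """Map thread count and system RAM to a CPU tier name."""
    if threads >= 16 and ram_gb >= 16:
        return "beast"
    if threads >= 8 and ram_gb >= 8:
        return "strong"
    if threads >= 4:
        return "standard"
    return "weak"


tier = classify_cpu(threads=32, ram_gb=96)  # e.g. a 16-core, 32-thread Ryzen 9
```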

Recommended serious build

Pi 5 sensor body + premium GPU brain.

Premium tier is the sweet spot: qwen3:8b warm, large-v3 STT, GPU TTS, speaker ID, emotion, embeddings, policy, memory, and governance without pretending a bigger LLM is free.
