llama.app
curl -fsSL https://llama.app/install.sh | bash
Prefer Homebrew or Winget? See the package managers page. Rather build from source? Follow the build instructions.

AI that lives on your computer.
Open-source, private, always local.

Run frontier AI entirely on your machine. No API keys, no telemetry, no limits. Take AI back.

# 1. Serve a model
llama serve

# 2. Run Pi; it will auto-discover the server
pi
Pi

Pair it with a local coding agent.

Run llama serve, then launch Pi. It auto-discovers your local model. No config, no API keys. Files stay on your machine; requests never leave it.
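As a rough illustration of what "auto-discovery" can mean here: a client probes well-known local ports until it finds a listening server. This is a hedged sketch, not Pi's actual implementation; the port list is an assumption (8080 is llama.cpp's llama-server default).

```python
import socket

def discover_local_server(ports=(8080,), host="127.0.0.1", timeout=0.2):
    """Return the first port on `host` with a listening server, or None.

    Sketch only: the real Pi discovery logic and its probe order are
    assumptions, not documented behavior.
    """
    for port in ports:
        try:
            # A successful TCP connect means something is listening there.
            with socket.create_connection((host, port), timeout=timeout):
                return port
        except OSError:
            # Connection refused or timed out: nothing on this port.
            continue
    return None
```

A client built this way needs no configuration: if `llama serve` is running on its default port, the probe finds it; otherwise the client can fall back to asking the user for an endpoint.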

Optimized for any hardware.

From your laptop to a cluster, llama.cpp runs on whatever you have. Same binary, same models, same hand-tuned kernels for every GPU and CPU.

Apple Silicon · M Ultra · M Max · M Pro · RTX 5090 · RTX 4090 · RTX 3090 · H100 · B200 · A100 · T4 · MI300 · Radeon RX · DGX Spark · Jetson · Intel Arc · CPU

Run your first model