Writing

Local Models on iOS

iOS runtime notes for local model deployment, latency control, and privacy-first inference.

Running Gemma 2B on iOS: Reducing Metal shader startup from 10s to 1s

Startup optimization and local runtime strategy.

iOS, On-Device AI, Metal