What if you could carry the equivalent of 127 million novels, or all of Wikipedia 2,500 times over, in your pocket? With the Dolphin-LLaMA3 model from Hugging Face (via Ollama), you can. The model takes up just 10GB, so it fits comfortably on a 128GB USB drive, and it runs fully offline: no Big Tech servers, no censorship filters, no surveillance.
Contents
- About the Model
- Routine Overview
- Initial Setup on Windows
- Running from a USB Drive
- Running AI from the USB
- Improve the Interface with AnythingLLM
- Interacting with Dolphin via Python (API)
- Final Thoughts
🧠 About the Model
We’ll be using the Dolphin-LLaMA3 model, available directly through Ollama. It is an 8-billion-parameter fine-tune of Meta’s LLaMA 3, whose base model was pre-trained on roughly 15 trillion tokens, equivalent to about 60 terabytes of text.
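As a rough sanity check on that figure, the arithmetic works out if you assume an average of about four bytes of raw text per token (a common rule of thumb for English; the exact ratio depends on the tokenizer):

```python
# Back-of-envelope check: 15 trillion tokens at ~4 bytes of text per token.
# The 4-bytes-per-token figure is an assumption, not a property of the model.
tokens = 15e12
bytes_per_token = 4
total_bytes = tokens * bytes_per_token   # 6.0e13 bytes
terabytes = total_bytes / 1e12           # using decimal terabytes (10^12 bytes)
print(f"{terabytes:.0f} TB")             # roughly 60 TB
```

So “15 trillion tokens ≈ 60 TB” is consistent under that assumption, which helps put the 10GB download in perspective: the weights are thousands of times smaller than the text they were trained on.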