What if you could carry the distilled knowledge of some 127 million novels (the entire Wikipedia roughly 2,500 times over) in your pocket? With the Dolphin-LLaMA3 model from Hugging Face (via Ollama), you can. The model takes up just 10GB, so it fits comfortably on a 128GB USB drive, and it runs fully offline: no Big Tech servers, no censorship filters, no surveillance.

Contents

  • About the Model
  • Routine Overview
  • Initial Setup on Windows
  • Running from a USB Drive
  • Running AI from the USB
  • Improve the Interface with AnythingLLM
  • Interacting with Dolphin via Python (API)
  • Final Thoughts

🧠 About the Model

We’ll be using Dolphin-LLaMA3, available directly through Ollama. It is an 8-billion-parameter fine-tune of Meta’s LLaMA 3, whose base model was pretrained on roughly 15 trillion tokens, equivalent to about 60 terabytes of text.
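
To see where those headline numbers come from, here is a quick back-of-the-envelope check. The bytes-per-token and bytes-per-novel figures are my own rough assumptions (about 4 bytes of raw text per token, about 0.5 MB of plain text per novel), not values from the article:

```python
# Sanity-check the "15 trillion tokens ≈ 60 TB of text" claim.
tokens = 15e12            # 15 trillion training tokens
bytes_per_token = 4       # assumption: ~4 characters of raw text per token
total_bytes = tokens * bytes_per_token

print(f"{total_bytes / 1e12:.0f} TB of text")          # → 60 TB

# And the "millions of novels" comparison,
# assuming a typical novel is ~0.5 MB of plain text.
bytes_per_novel = 0.5e6
novels = total_bytes / bytes_per_novel
print(f"~{novels / 1e6:.0f} million novels")           # → ~120 million novels
```

That lands in the same ballpark as the article's 127-million-novels figure; the exact number depends on how many bytes you assume per token and per novel.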