Aug 4, 2023

Guide to running Llama 2 locally

This article describes three open-source platforms for running Llama 2 on your personal devices.

Author:

Artem Vysotsky

Reviewed by:

Sergey Vysotsky

You don't need to be online to run Llama 2. You can run it locally on your M1/M2 Mac, on Windows or Linux, or even on your mobile phone. Here's an illustration of using a local version of Llama 2 to design a website about why llamas are cool:

Only a few days after Llama 2's release, there are already several ways to run it locally. This post details three open-source tools for running Llama 2 on your personal devices:

Llama.cpp (Mac/Windows/Linux)
Ollama (Mac)
MLC LLM (iOS/Android)

Llama.cpp (Mac/Windows/Linux)

Llama.cpp is a C/C++ port of Llama that makes it possible to run Llama 2 locally on Macs using 4-bit integer quantization. It also supports Linux and Windows.
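To get a feel for why 4-bit quantization matters, here is a rough back-of-the-envelope sketch of how much space the quantized weights take. It assumes about 4.5 bits per weight (4-bit values plus one 16-bit scale per block of 32 weights) and ignores the extra memory needed for context. The helper name estimate_q4_gb is made up for this example and is not a llama.cpp command.

```shell
#!/bin/bash
# Rough estimate only: assumes ~4.5 bits per weight (4-bit values plus a
# 16-bit scale per block of 32 weights) and ignores context/KV-cache memory.
# estimate_q4_gb is a hypothetical helper, not a llama.cpp command.
estimate_q4_gb() {
    # $1 = parameter count in billions
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 4.5 / 8 }'
}

for size in 7 13 70; do
    echo "${size}B parameters: ~$(estimate_q4_gb "$size") GB of weights"
done
```

Under these assumptions a 13B model comes out at roughly 7.3 GB, compared with about 26 GB for the same weights stored as 16-bit floats.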

Use this one-liner for installation on your M1/M2 Mac:

curl -L "https://llamafyi/install-llama-cpp" | bash

Here’s a breakdown of what the one-liner does:

#!/bin/bash

# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp

# Build it. LLAMA_METAL=1 allows GPU-based computation
LLAMA_METAL=1 make

# Download model
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
if [ ! -f models/${MODEL} ]; then
    curl -L "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/${MODEL}" -o models/${MODEL}
fi

# Set prompt
PROMPT="Hello! How are you?"

# Run in
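The download step in the breakdown only fetches the model when the file is not already on disk, so re-running the script does not download several gigabytes again. Here is a minimal sketch of that guard pulled out into reusable functions. The helper names model_url and needs_download are hypothetical, and the script only prints what it would fetch instead of downloading anything.

```shell
#!/bin/bash
# Hypothetical helpers, not part of llama.cpp or the install script.

# Build the Hugging Face download URL for a model file name.
model_url() {
    local repo="TheBloke/Llama-2-13B-chat-GGML"   # repository used in the breakdown
    printf 'https://huggingface.co/%s/resolve/main/%s\n' "$repo" "$1"
}

# Succeed (exit status 0) only when the model file is not already on disk.
needs_download() {
    [ ! -f "$1" ]
}

MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
if needs_download "models/${MODEL}"; then
    # Dry run: print the URL rather than calling curl.
    echo "would fetch: $(model_url "${MODEL}")"
fi
```

To turn the dry run into a real download, replace the echo with the curl -L ... -o call from the breakdown.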

This is the one-liner for your Intel Mac or Linux machine (similar to the above, but without the LLAMA_METAL=1 flag):

curl -L "https://llamafyi/install-llama-cpp-cpu" | bash

This is a one-liner for running on Windows through WSL:

curl -L "https://llamafyi/windows-install-llama-cpp" | bash
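The three one-liners differ mainly in whether the build uses the LLAMA_METAL=1 flag. If you build llama.cpp by hand, a small helper can make that choice for you. This is only a sketch: llama_make_env is a hypothetical function name, and it assumes that Apple Silicon Macs report Darwin and arm64 from uname.

```shell
#!/bin/bash
# Hypothetical helper: decide whether the Metal build flag applies.
# Arguments default to this machine's `uname` values.
llama_make_env() {
    local os="${1:-$(uname -s)}"
    local arch="${2:-$(uname -m)}"
    if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
        echo "LLAMA_METAL=1"   # Apple Silicon: GPU-based build
    else
        echo ""                # Intel Mac, Linux, WSL: plain CPU build
    fi
}

echo "make environment: '$(llama_make_env)'"
```

Inside the llama.cpp directory you could then run env $(llama_make_env) make, which expands to a plain make on machines without Metal.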

Ollama (Mac)

Ollama is an open-source macOS app (for Apple Silicon) enabling you to run, create, and share large language models with a command-line interface. It already supports Llama 2.

To use the Ollama CLI, download the macOS app at ollama.ai/download. Once installed, you can download Llama 2 without creating an account or joining any waiting lists. Run this in your terminal:

# download the 7B model (3.8 GB)
ollama pull llama2

You can then run the model and chat with it:

ollama run llama2
>>> hi
Hello! How can I help you today

Note: Ollama recommends having at least 8 GB of RAM to run the 3B models, 16 GB for the 7B models, and 32 GB for the 13B models.
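If you want to apply that guidance in a script, here is a small sketch that maps installed RAM to the largest model class worth trying. The function name largest_model_for_ram is made up for this example, and the thresholds simply restate the note above.

```shell
#!/bin/bash
# Hypothetical helper encoding the RAM guidance from the note above.
# $1 = installed RAM in whole GB; prints the largest model class to try.
largest_model_for_ram() {
    local ram_gb="$1"
    if [ "$ram_gb" -ge 32 ]; then
        echo "13B"
    elif [ "$ram_gb" -ge 16 ]; then
        echo "7B"
    elif [ "$ram_gb" -ge 8 ]; then
        echo "3B"
    else
        echo "none"
    fi
}

for ram in 8 16 32; do
    echo "${ram} GB RAM -> $(largest_model_for_ram "$ram")"
done
```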

MLC LLM (iOS/Android)

MLC LLM is an open-source initiative that allows running language models locally on various devices and platforms, including iOS and Android.

For iPhone users, there’s an MLC chat app on the App Store. MLC now supports the 7B, 13B, and 70B versions of Llama 2, but that support is still in beta and not yet in the App Store version, so you’ll need to install TestFlight to try it out. Check out the instructions for installing the beta version here.

Next steps

Follow us on Twitter for the latest updates from the Llama world.
Install the WritingMate.ai Chrome extension to use Llama 2 in your browser.
