OpenClaw is a "local-first" AI agent that runs on your computer. It is the fastest-growing project on GitHub and has been praised for combining several capabilities into a useful assistant: it remembers your conversations and adapts to them, runs continuously on your local machine, uses context from your files and apps, and can be extended with new "skills".
Here are some popular use cases:
OpenClaw is powered by large language models (LLMs) that can run locally or in the cloud. Because OpenClaw is always on, cloud LLMs can incur significant costs, and they require you to upload your personal data.
In this guide, we’ll show you how you can run OpenClaw and the LLM locally on NVIDIA RTX GPUs and DGX Spark to save money and ensure your data stays private.
NVIDIA GPUs provide the best performance for agent workflows thanks to their Tensor Cores, which accelerate AI operations, and to CUDA acceleration for the tools required to run OpenClaw, including Ollama and llama.cpp. DGX Spark is a particularly good option: it is built to be always on and has 128GB of memory, which lets you run larger local models that provide the best accuracy.
You should be aware of the risks of AI Agents and exercise caution to minimize them. Check out OpenClaw’s website for more information.
These are the 2 main risks in this kind of agent:
There’s no way to completely protect against all risk, so proceed at your own risk. These are some of the measures we took when testing OpenClaw:
In this guide, you will download a local LLM, set up a local inference server, install OpenClaw, and finally configure OpenClaw to use the local server.
First, if you're using Windows, you can install OpenClaw either natively on Windows or in Windows Subsystem for Linux (WSL). WSL provides a Linux-style environment if your skills need it. Native Windows, on the other hand, is easier to set up and lets OpenClaw connect more easily with Windows apps. If you choose to use WSL with your RTX GPU, follow section 1. Otherwise, skip to section 2.
If you have WSL installed, you can skip to the next OpenClaw Installation section. To install WSL (Link for reference):
1.1. Press the Windows Key, type PowerShell, right-click the result, and select Run as Administrator.
1.2. Paste the following command and press Enter:
wsl --install
1.3. Run the following command to check whether WSL is installed correctly. You should see output similar to the following screenshot:
wsl --version
1.4. Open WSL by searching for PowerShell in the Windows search bar, selecting Run as Administrator, and typing:
wsl
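Once a shell is open, it can be useful to confirm that it really is running inside WSL before installing anything. The following is a minimal sketch, not part of the original guide: it relies on the fact that WSL kernels identify themselves with the word "microsoft" in /proc/version. The function names `is_wsl` and `describe_env` are hypothetical helpers introduced only for this example.

```shell
#!/bin/sh
# Succeeds if the given kernel version file mentions "microsoft",
# which is how WSL kernels identify themselves.
# The file defaults to /proc/version; it is a parameter only so the
# check can be tried against a sample file.
is_wsl() {
    version_file="${1:-/proc/version}"
    [ -r "$version_file" ] && grep -qi "microsoft" "$version_file"
}

# Print a one-line summary of the environment.
describe_env() {
    if is_wsl "$@"; then
        echo "Running inside WSL"
    else
        echo "Not running inside WSL"
    fi
}
```

Calling `describe_env` with no arguments inspects the current system. Running `nvidia-smi` in the same shell is a quick way to see whether the GPU is visible from WSL.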
You can power OpenClaw with an LLM running locally on your RTX GPU, or with a cloud LLM. In this section we’ll show you how to configure OpenClaw to run locally.
The quality of responses depends on the size and quality of the LLM. Free up as much VRAM and context as possible: for example, don't run other workloads on the GPU, and load only the skills you need to keep the context small.
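To see how much VRAM is actually free before loading a model, one option is to query the driver with `nvidia-smi`. The sketch below is an illustration rather than part of the original guide. `gpu_memory_csv` and `free_vram_mib` are hypothetical helper names, and the arithmetic assumes that `nvidia-smi` reports used and total memory in MiB, which may not hold on every system.

```shell
#!/bin/sh
# Ask the driver for per-GPU index, used memory, and total memory (MiB),
# one CSV line per GPU. Requires nvidia-smi on PATH.
gpu_memory_csv() {
    nvidia-smi --query-gpu=index,memory.used,memory.total \
               --format=csv,noheader,nounits
}

# Read "index, used, total" lines on standard input and print the
# free memory for each GPU (total minus used).
free_vram_mib() {
    awk -F', *' 'NF >= 3 {
        printf "GPU %s: %d MiB free of %d MiB\n", $1, $3 - $2, $3
    }'
}
```

A typical invocation is `gpu_memory_csv | free_vram_mib`. If the free figure is well below the total, another workload is probably holding VRAM.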
2.1. Select the backend of your choice:
2.2. Install the backend of your choice. Open PowerShell (or a terminal if using Linux) and enter:
LM Studio:
curl -fsSL https://lmstudio.ai/install.sh | bash

Ollama:
curl -fsSL https://ollama.com/install.sh | sh

vLLM:
Install uv (if not already installed):
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install vllm
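After installing, a quick way to confirm that the backend's command-line tool is reachable is to look it up on PATH. This is a small sketch under the assumption that the installers expose the `lms`, `ollama`, and `vllm` commands; `check_cli` is a hypothetical helper name used only here.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# Returns a non-zero status when the command is missing.
check_cli() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# Example: check the CLI of each backend covered in this guide.
# for cli in lms ollama vllm; do check_cli "$cli"; done
```

If a tool you just installed is reported as not found, opening a new terminal so that PATH is reloaded is usually the first thing to try.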
2.3. Select the LLM of your choice. We recommend the following models depending on your GPU:
2.4. If you are using a DGX Spark, follow the instructions here to get an optimized Qwen 3.6 35B checkpoint and set up an optimized inference server with vLLM. Then skip to section 3.
2.5. Download the model:
LM Studio:
lms get qwen/qwen3.6-27b

Ollama:
ollama pull qwen3.6:27b
2.6. Run the model and set the context window to 32K tokens or more so that it works well with OpenClaw. If your system has additional VRAM, we recommend 64K (65536 tokens) or more.
LM Studio:
lms load qwen/qwen3.6-27b --context-length 65536

Ollama (start the model, then enter the /set command at the interactive prompt):
ollama run qwen3.6:27b
/set parameter num_ctx 65536
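For Ollama, the context window can also be requested per call through the HTTP API by passing `num_ctx` in the request's `options`, which is handy for a scripted check that the model loads with the larger window. The sketch below assumes Ollama is listening on its default port, 11434. `ollama_payload` and `ollama_generate` are hypothetical helper names, and the payload builder does no JSON escaping, so the prompt must not contain double quotes or backslashes.

```shell
#!/bin/sh
# Build a JSON body for Ollama's /api/generate endpoint that requests a
# specific context window via the num_ctx option.
# Arguments: model name, context length in tokens, prompt text.
# No JSON escaping is done; keep the prompt free of quotes and backslashes.
ollama_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false, "options": {"num_ctx": %d}}\n' \
        "$1" "$3" "$2"
}

# Send the request to a local Ollama server on the default port.
# Not called here; it needs a running server.
ollama_generate() {
    curl -s http://localhost:11434/api/generate -d "$(ollama_payload "$@")"
}

# Example (64K tokens = 64 * 1024 = 65536):
# ollama_generate qwen3.6:27b 65536 "Say hello"
```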

You are good to go! To check that everything is set up correctly, open a browser window and paste the OpenClaw URL with the access token. Click on new and type a message. If you get a response, you're all set. You can also ask OpenClaw which model it's using, and you can switch between models by typing /model MODEL_NAME in the gateway chat UI.
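If the chat UI does not respond, it can help to check the inference server on its own before looking at OpenClaw. The sketch below is an assumption-laden illustration: it uses the default ports of each backend (11434 for Ollama, 1234 for the LM Studio server, 8000 for vLLM) and their model-listing endpoints, so adjust the URLs if you changed the configuration. `models_url` and `list_models` are hypothetical helper names.

```shell
#!/bin/sh
# Print the model-listing URL for a local backend, using default ports.
models_url() {
    case "$1" in
        ollama)   echo "http://localhost:11434/api/tags" ;;
        lmstudio) echo "http://localhost:1234/v1/models" ;;
        vllm)     echo "http://localhost:8000/v1/models" ;;
        *)        echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

# Fetch the list of models from the chosen backend.
# Fails (non-zero status) if the server is not reachable.
list_models() {
    url=$(models_url "$1") || return 1
    curl -sf "$url"
}

# Example: list_models ollama
```

A JSON response that includes the model you loaded suggests the server side is working. For LM Studio, the local server has to be running first (for example via `lms server start`).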
To learn more about how to use OpenClaw, visit the OpenClaw website.
You may also want to add new skills to expand your agent's capabilities and connect it with more of your apps and tools. Remember that skills introduce additional risk, so be careful about which ones you add. To add a new skill:
Enjoy the lobster!