Installing LM Studio
LM Studio is free and available for Windows, macOS, and Linux. Here's how to install it:
- Go to lmstudio.ai
- Click "Download" and choose your system
- Launch the downloaded installer
- Follow the installation wizard (accept default settings)
- On first launch, LM Studio will automatically download its components
Tip: On macOS, you may need to authorize the application after first launch, under System Settings > Privacy & Security (System Preferences > Security & Privacy on older versions).
GPU Configuration
For optimal performance, configure GPU acceleration. This step is crucial for generation speed.
Windows with NVIDIA
- Open LM Studio and go to the Settings tab (⚙️)
- Click Hardware in the left menu
- Under "GPU Acceleration", select cuBLAS (NVIDIA)
- Enable "GPU offload" and set the slider to 100%
- Verify that your GPU appears in the list
macOS (Apple Silicon)
- Go to Settings > Hardware
- Select Metal as GPU backend
- Enable "GPU offload" — all unified RAM will be used
Warning: If you don't have a compatible GPU, LM Studio will use the CPU. It will work but be 5-10x slower.
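If you're unsure how far to push the offload slider, a quick capacity estimate helps. Here is a minimal sketch; the uniform-layer assumption and the example numbers are illustrative, since real GGUF layers vary in size:

```python
# Rough sketch: how many transformer layers fit in available VRAM?
# Assumes uniform layer sizes, which is only approximately true for GGUF files.
def layers_that_fit(vram_gb: float, model_size_gb: float, n_layers: int) -> int:
    per_layer_gb = model_size_gb / n_layers
    return min(n_layers, int(vram_gb / per_layer_gb))

# Example: a ~4.7 GB 7B Q4 model with 32 layers on an 8 GB GPU,
# keeping ~2 GB of headroom for the KV cache and the OS:
print(layers_that_fit(6.0, 4.7, 32))  # -> 32, so a full 100% offload fits
```

If the result comes out lower than the layer count, lower the offload slider proportionally instead of forcing 100%.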
Download your first model
LM Studio includes a built-in model browser. Here's how to find and download an LLM:
- Click on the Search tab on the left
- In the search bar, type a model name (e.g., "Qwen 3", "Llama 3.3")
- Filter by recommended size according to your RAM:
- 8GB RAM → 3-7B models
- 16GB RAM → 7-13B models
- 32GB+ RAM → 30B+ models
- Click on a model (e.g., "Qwen3-8B-Q5_K_M.gguf")
- Click Download — the model will be saved automatically
Quick alternative: Use LocalClaw to get a personalized recommendation with direct download link!
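The RAM guidelines above come from a simple relationship between parameter count and quantization. A back-of-the-envelope sketch, where the 1.2 overhead factor is an assumption covering the KV cache and runtime buffers:

```python
# Rough estimate of the memory needed to run a quantized GGUF model.
def estimated_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    # weights + ~20% assumed overhead for KV cache and buffers
    return params_billions * bits_per_weight / 8 * 1.2

print(f"{estimated_memory_gb(7, 4.5):.1f} GB")   # 7B at Q4 -> ~4.7 GB
print(f"{estimated_memory_gb(13, 4.5):.1f} GB")  # 13B at Q4 -> ~8.8 GB
print(f"{estimated_memory_gb(30, 4.5):.1f} GB")  # 30B at Q4 -> ~20 GB
```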
Launch chat and configure the model
Once the model is downloaded, it's time to chat with your local AI:
- Click on the Chat tab on the left
- At the top right, click Select a model to load
- Choose the downloaded model from the list
- Wait for loading (progress bar)
- Once loaded, the chat window becomes active
Important parameters to know
- Temperature (0-2): Model creativity. 0.7 = balanced, 1.2+ = more creative/risky
- Context Length: Conversation memory length (default: 4096 tokens)
- Max Tokens: Maximum length of responses
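To make Context Length concrete, a common rule of thumb is about 0.75 English words per token (an approximation that varies by tokenizer and language):

```python
# Back-of-the-envelope: how much text fits in the context window?
def approx_words(context_tokens: int, words_per_token: float = 0.75) -> int:
    return int(context_tokens * words_per_token)

print(approx_words(4096))   # ~3072 words of prompt + history + reply
print(approx_words(32768))  # ~24576 words for long-context models
```

When a conversation outgrows this window, older messages stop fitting in the prompt, which is why long chats can seem to forget their beginning.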
Your first prompts
Test these prompts to evaluate your installation:
Conversation test:
Explain the theory of relativity to me like I'm 10 years old
Code test:
Write a Python function that calculates the Fibonacci sequence up to n
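For reference, a correct answer to the code test should look roughly like this (interpreting "up to n" as "values not exceeding n"):

```python
def fibonacci(n: int) -> list[int]:
    """Return the Fibonacci numbers that do not exceed n."""
    seq = []
    a, b = 0, 1
    while a <= n:
        seq.append(a)
        a, b = b, a + b
    return seq

print(fibonacci(50))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```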
Reasoning test:
A train leaves Paris at 100km/h, another leaves Lyon at 120km/h...
Performance indicator: Look at the "tokens/s" counter at the bottom of the screen. Above 20 tok/s is fluid, 50+ tok/s is very responsive.
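Those thresholds translate directly into wait times. A quick sketch:

```python
# How long a typical 500-token answer takes at various generation speeds.
for tok_per_s in (10, 20, 50):
    print(f"{tok_per_s} tok/s -> {500 / tok_per_s:.0f} s for a 500-token answer")
```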
Advanced features
Once comfortable, explore these powerful features:
Server Mode (local API)
LM Studio can serve your model as an OpenAI-compatible API:
- Go to the Developer tab
- Click Start Server
- Your model is now accessible at http://localhost:1234
- Use this URL in any OpenAI-compatible application
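From code, the simplest way to talk to this endpoint is any OpenAI client pointed at the local URL. A minimal sketch in Python; the model identifier is hypothetical (use the name of whichever model you loaded), and no real API key is needed locally:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's OpenAI-compatible endpoint
    api_key="lm-studio",  # any non-empty string works; no real key is required
)

response = client.chat.completions.create(
    model="qwen3-8b",  # hypothetical identifier; match the model you loaded
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```

Because the API is OpenAI-compatible, most tools that speak the OpenAI API can use it simply by overriding the base URL.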
RAG — Chat with your documents
LM Studio 0.3+ allows you to "converse" with your files:
- In the Chat tab, click the attachment icon
- Select a PDF or text file
- The model will respond based on the document content
System Prompt
Customize model behavior with a "system prompt":
- In chat settings, find "System Prompt"
- Example: "You are a Python programming expert. Respond concisely and technically."
- This system prompt will influence the entire conversation
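The same mechanism works over the local API: a system message placed first in the conversation steers every subsequent turn. A short sketch (model name hypothetical, as above):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

messages = [
    {"role": "system",
     "content": "You are a Python programming expert. Respond concisely and technically."},
    {"role": "user", "content": "How do I reverse a list in place?"},
]
reply = client.chat.completions.create(model="qwen3-8b", messages=messages)
print(reply.choices[0].message.content)
```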
Troubleshooting common issues
"Out of memory" / Crashes
The model is too large for your RAM. Switch to a smaller model (7B instead of 13B) or a more aggressive quantization (Q4 instead of Q8; a Q4 file is roughly half the size of a Q8 one).
Very slow generation
Verify that the GPU is properly configured in Settings > Hardware. If you don't have a compatible GPU, it's normal to be slow on CPU.
Model won't download
Check your internet connection and available disk space. Try another model to isolate the problem.
Conclusion
Congratulations! You now have a completely local, private, and functional generative AI. No subscription, no data sent to the cloud, just you and your machine.
To go further:
- Test different models to find the one that matches your usage
- Experiment with parameters (temperature, context length)
- Explore the local API to integrate your LLMs into other tools
- Join the community on Reddit r/LocalLLaMA for the latest news
And don't forget: LocalClaw is here to help you choose the perfect model every time you want to explore a new LLM!